[Doc-SIG] Lazy paragraph indentation

Sat, 21 Jul 2001 19:04:20 +1000

> > It sure looks to me like a requirement for a switchable mode in the
> > parser. Different applications can choose different defaults.
>
> This is workable. If you can come up with consistent, unambiguous,
> safe rules for lazy indentation, then Wikis and other apps could use
> the lazy variant.

Wicked.

> > Or, the parser could attempt to automatically figure it out.
>
> That's a dangerous path. Explicit is better.

Fair enough.

This quote is out of order because it's more important:

> Although not explicitly stated in the spec (yet), the way
> I've implemented the parser is to do line/block parsing first,
> then inline markup parsing afterwards (standalone URI parsing last).

"Sorry, changing the parser order is just too hard at this stage to
relax the requirement for blank lines between list entries in lazy mode"
is a perfectly reasonable argument in favour of that requirement, and
I'm entirely happy to accept it. [3]_

.. _[3] You have no idea how frustrated my fiancée gets when, after
   attempting to justify a decision with several sadly illogical [4]_
   arguments in its favour and listening to me patiently dissect
   and dismiss each one, she discovers that "I feel like it, okay?"
   was all that she needed to say.

   Well, maybe you have a slight idea. :)

   One of these days, I'm going to clue up and ask right after the
   first one whether she just feels like it. It'll save a lot of
   time and angst. Similarly, I should have asked up front whether
   the implementation of my proposal was going to be difficult.

.. _[4] No, this is not an attempt to slyly call your arguments
   illogical. Misguided, much Frowned upon by God, and if not
   abandoned sure to lead to your Eternal Damnation in Hell,
   but not illogical by any shake of the stick. :)

The now sadly irrelevant argument in favour of a less strict lazy mode
follows anyway.

Summarizing the issue of badly wrapped lists and lazy mode:

* We're either in lazy mode, or not. No automatic selection. Cool.

* The following example is still contentious::

    - This is list item 1. Here's a formula: "x = x
    - 1".
    - Here's list item 2. Sure looks like item 3 though.

* The strict approach to the example:

  * A parser permitting lazy indentation without insisting upon
    blank lines would interpret the above as "three lists", and
  * A human reader strictly reading the specification would
    reach a similar conclusion, but
  * That's obviously not what the user intended when they wrote ::

     - This is list item 1. Here's a formula: "x = x - 1".

    before their editor badly wrapped the line.
  * We can call this disconnect "ambiguity",
  * In the parser world, "ambiguity" is a bad word,
  * Therefore blank lines **must** be insisted upon between list items
    in lazy mode.

* Arguments in favour of being more forgiving:

  *Ambiguity ain't always that ambiguous*:
    The kind of ambiguity we're most worried about is *circumstances
    for which the parser's behaviour is undefined*. The parser needs
    to be able to consistently make a decision, and programmers
    implementing parsers need to be able to make a decision.

    This clearly isn't such a case. The user will be typing a list.
    When they see the results, they'll mutter dark words about the
    stupid editor their company insist they use, and they'll fix
    the markup somehow (see below).

    If asked "hey, do you consider what just happened ambiguous?",
    I don't imagine many users would reply in the affirmative.
    They explicitly typed something. Their editor explicitly stuffed
    it up. The parser explicitly interpreted the text, and the user
    explicitly said expletives and explicitly fixed the problem.

    Any confusion in the user's mind when seeing the output will
    disappear when the system sends them their text back for editing
    and they see what their text editor did.

  *Consider the user impact*:
    This kind of strict "never suffer ambiguity to live" attitude
    imposes a heavy burden on the user every time they use a list
    (probably quite often) in order to save them from something
    untoward that might happen to them only once a year, if ever.

    A comparison might be made to money handling. If your current
    cash register techniques occasionally allow minor mistakes to be
    made, you could well lose hundreds of dollars per year.
    Insisting that all totals are manually verified by a supervisor
    will save those hundreds of dollars, but cost tens of thousands
    in additional salary. Moreover, all of your customers might
    abandon your store because they're sick of the hassle.

  *Users can avoid the problem very, very easily*:

    Any user aware that their editor wraps lines for them, and
    aware that a copy of the list delimiter unfortunately wrapped
    to the beginning of the line will cause the parser to start
    a new list item, will do one of the following:

    * Manually wrap such a long item well before the wrap point::

        - This is list item 1.
          Here's a formula: "x = x - 1".
        - Here's list item 2.

    * Choose a different list delimiter.

    * Use literals (assuming the parser is changed so that literals
      bind harder than the beginning of list items).

    * Drop into strict mode temporarily: [2]_ ::

        .. strict::

        - This is list item 1, which contains a formula that
          I'm not sure will wrap appropriately, so I'm going
          to drop into strict mode and manually wrap each and
          every line well before the wrap point.

          Anyway, here's the formula: "x = x - 1".

        - Here's list item 2.

        .. lazy::

    I suspect the first two will be slightly more popular. :)

    Any user waking up regularly dripping with sweat because of
    recurring nightmares about having to go back and fix their
    markup will, I think, go to the effort of finding an editor
    that will write their markup for them.

  *What would the user choose?*:
     Given a choice between the following:

     * a *strict* mode that insists that users manually wrap each
       and every line well before their editor's wrap point *and*
       manually indent those lines as well,
     * a *strictly lazy* mode that relaxes the requirements for
       manual wrapping and indentation but insists upon blank lines
       between all list items, and
     * a hypothetical *bloody lazy* [1]_ mode that doesn't insist
       upon those blank lines but that requires users to consider
       editor wrap points when putting list delimiters in the middle
       of list items,

     I somewhat suspect that many users would end up being bloody
     lazy. Certainly, if bloody laziness were the default, I
     sincerely doubt that many people would bother switching to a
     stricter mode, even if they got caught out once or twice.

.. _[1] There's the `Queen's English` again.

.. _[2] Well, there's an example of a parser directive, if we need
   one.
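
To make the difference between the two lazy variants concrete, here is a
minimal sketch in Python. It is not how the reStructuredText parser works:
``split_items``, the ``BULLET`` pattern and the ``require_blank_lines``
flag are names invented for illustration, and the model ignores nesting
and indentation entirely. It only shows how the badly wrapped example
splits under each rule.

```python
import re

# Hypothetical bullet pattern: "-", "*" or "+" followed by whitespace.
BULLET = re.compile(r"^\s*[-*+]\s+")


def split_items(lines, require_blank_lines):
    """Group lines into list items.

    require_blank_lines=True  models the "strictly lazy" rule: a bullet
    only opens a new item at the start of the text or after a blank line.
    require_blank_lines=False models the "bloody lazy" rule: any line
    that starts with a bullet opens a new item.
    """
    items = []
    previous_blank = True  # the start of the text counts as a blank line
    for line in lines:
        if not line.strip():
            previous_blank = True
            continue
        looks_like_bullet = bool(BULLET.match(line))
        starts_item = looks_like_bullet and (
            previous_blank or not require_blank_lines
        )
        if starts_item or not items:
            # Strip the bullet (if any) and open a new item.
            items.append(BULLET.sub("", line, count=1).strip())
        else:
            # Lazy continuation: glue the line onto the current item.
            items[-1] += " " + line.strip()
        previous_blank = False
    return items


# The contentious example, after the editor wrapped the formula.
wrapped = '''\
- This is list item 1. Here's a formula: "x = x
- 1".
- Here's list item 2. Sure looks like item 3 though.
'''.splitlines()

# The same text with a blank line between the real list items.
spaced = '''\
- This is list item 1. Here's a formula: "x = x
- 1".

- Here's list item 2. Sure looks like item 3 though.
'''.splitlines()

print(len(split_items(wrapped, require_blank_lines=False)))  # 3 items
print(len(split_items(spaced, require_blank_lines=True)))    # 2 items
```

With blank lines required, the stray ``- 1".`` is folded back into item 1
and the formula survives intact. Without them, it becomes an item of its
own, which is the surprise under discussion.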

> Yes, but we are trying to avoid surprises when accidental bad
> wrapping takes place. The user doesn't always have control.
> My email client wraps my paragraphs, even if I don't want it to.

Well, exactly, but there's nothing wrong with surprises if the user can
figure out how to respond to the surprise. Users are going to be
stuffing up quite often, will be surprised to see that what they did
didn't work, and will look at their markup again and maybe refer to the
specification to figure out what happened and what to do about it.

If we're not worried about that (leading to directives like: "users must
never write their own markup, but must use an editor that doesn't let
them make mistakes"), why are we worried about this wrapping and list
items issue?

The user has enough control over the wrapping to force a wrap earlier
than the editor did, which is more than s/he needs to either dodge or
fix the problem.

> > * The *user* might be a little confused for a moment.
> >
> > The user is going to spend a lot of time confused regardless.
>
> Confusion is OK, as long as it stems from ignorance;
> education/experience fixes that. Confusion stemming from surprising
> (even if *very occasionally* surprising) side-effects of the markup,
> that's not acceptable.

Call it a side-effect of the editor. If anyone gets particularly
detail-oriented and angst ridden about the whole thing, direct them to
the list archives (of which I'm sure I'm going to be sufficiently
embarrassed), point out that it's all my fault, and give them my email
address. :)

> Writing the spec and implementing the parser, I've tried to
> avoid surprises and ambiguity wherever possible. If avoidance is
> not possible, then the possible surprises have to be minimized,
> explicitly documented, and warned of by the parser. Also, there
> has to be an "out" or workaround (which is where
> backslash-escapes come in handy).

Let's say that it were impossible to insist on the blank lines for
non-technical reasons (the managing director hates them). I think the
possible surprises are minimal, I'll write the documentation, I'll try
and figure out a way to warn about the situation (spotting a broken
literal is the easiest way until we climb into the ordered list
rat-hole), and there's an easy out. Close enough?
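
As a rough idea of what "spotting a broken literal" could look like, here
is a small Python sketch. It assumes that a "literal" is simply text in
double quotes and that the parser has already split the text into list
items; ``unbalanced_quote_warnings`` is a made-up helper, not part of any
existing parser. An item with an odd number of double quotes is flagged,
and two flagged items in a row are a strong hint that a wrap split a
quoted formula.

```python
def unbalanced_quote_warnings(items):
    """Return (index, text) for every item with an odd number of
    double quotes, i.e. a quoted span that opens or closes mid-item."""
    warnings = []
    for index, text in enumerate(items):
        if text.count('"') % 2 == 1:
            warnings.append((index, text))
    return warnings


# The items a lazy parse without blank lines would produce from the
# badly wrapped example.
items = [
    'This is list item 1. Here\'s a formula: "x = x',
    '1".',
    "Here's list item 2. Sure looks like item 3 though.",
]

for index, text in unbalanced_quote_warnings(items):
    print(f"warning: item {index + 1} has an unbalanced quote: {text!r}")
```

This only catches wraps that land inside a quoted span. A bare
``x = x - 1`` wrapped at the minus sign would slip through, so it is a
warning aid rather than a fix.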

> > More glibly put: two out of three ain't bad. I think
> > they'll cope. :)
>
> You're a programmer. Imagine if Python had funny edge cases. Would
> you *cope*? Or would you scream bloody murder?

Python surprises me every week. Then I figure out that my editor broke
the indentation. I fix what my editor broke, and keep working. I cope.
:)

> Out of respect for the eventual users of reStructuredText, we can't
> allow *any* surprises.

	We're doing it for your own good!

Out of respect for people already suffering crummy editors, I'm trying
to cut them as many breaks as I can. Users who absolutely cannot stand
surprises can always turn on strictness or strict laziness, eh?

It just occurred to me that I've spent more time discussing this than I
could possibly have spent as a user swearing about needing to put blank
lines in. Sorry about that.

I'm mainly worried about people cutting and pasting mail into their web
browser (it'll happen). Saving them the effort of breaking the bullet
lists apart seems like a fair thing.

> It will be great if you can come up with a consistent
> indentation-minimized syntax; I'm all for it.

Still working on it!

Oh, the shame: a Python programmer trying to figure out how to avoid
indenting...

> All you need to do is devise an alternative representation of
> hierarchical structures, one that doesn't use indentation
> or begin/end markers. If it *does* use begin/end markers,
> we'll call it something else ;-), and start another parser
> component project for it.

If it had to use begin and end markers, we may as well write it in
*Perl*. Ewwwww...

Regards,
Garth.