[Doc-SIG] Lazy paragraph indentation

Garth T Kidd garth@deadlybloodyserious.com
Fri, 20 Jul 2001 12:24:39 +1000


.. This also used to be in the ``re: reStructuredText`` thread, but it's
   sufficiently contentious that I think it deserves its own subject
   line.

    ... being able to recognise indented paragraphs solely by
    their first line, so that people can lazily just keep typing
    (like I'm doing now) without having to manually terminate
    lines and indent the next one.

  Unfortunately, that syntax is ambiguous if blank lines between
  list items are optional, which reStructuredText allows. You can
  have one or the other, not both.

.. The spec supports nested block quotes, right?

Consider:

* People who want to use reStructuredText in docstrings need the
  blank lines between list items to be optional, and will be using
  proper programming editors that can handle indentation for them.

  This group is quite obvious at the moment.

* People who want to write reStructuredText in mail clients and
  web browsers will be constantly frustrated if they are forced to
  manually indent everything, and won't mind at all if they're
  forced to put blank lines between list items.

  Nobody appears to have spent much time considering the requirements
  of this group yet.

Those are both sizable target groups, right? Now, I believe the
following:

* We don't want to exclude either of those target groups.

* On the other hand, we don't want to make reStructuredText
  ambiguous.

It sure looks to me like a requirement for a switchable mode in the
parser. Different applications can choose different defaults. Or, the
parser could attempt to automatically figure it out. If the very first
bullet point or indented paragraph you see looks like this, you probably
want to select for lazy paragraph indenting::

   * If the line after the first bullet point or indented paragraph
   starts at column zero and is not empty, lazy paragraph indenting
   can be assumed by applications that expect that some users might
   be using crummy editors.

Docstring processors would explicitly suppress such automatic selection.
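
A minimal sketch of that automatic selection, assuming a hypothetical
helper named ``looks_lazy`` and a bullet set of ``*``, ``+`` and ``-``
(neither comes from the spec). It looks only at bullet items, not indented
paragraphs, and it adds one assumption of its own: a following line that is
itself a bullet is treated as no evidence either way, so compact
docstring-style lists are not misread as lazy.

```python
import re

# Assumed bullet characters; the real set is whatever the spec defines.
BULLET = re.compile(r"^\s*[*+-] ")


def looks_lazy(lines):
    """Guess whether the author is relying on lazy paragraph indenting.

    Only the first bullet item is examined: if the line after it is
    non-empty, starts at column zero, and is not a bullet itself, the
    item's continuation was left unindented.
    """
    for i, line in enumerate(lines):
        if BULLET.match(line):
            if i + 1 >= len(lines):
                return False  # nothing follows the first bullet
            nxt = lines[i + 1]
            return (
                bool(nxt.strip())
                and not nxt[0].isspace()
                and not BULLET.match(nxt)
            )
    return False  # no bullet found at all
```

A mail or Wiki front end could call something like this once per document
to pick its mode; a docstring processor would simply never call it.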

You point out ambiguity in your example of a badly wrapped paragraph
containing the bullet selector::

  - This is list item 1. Here's a formula: "x = x
  - 1".
  - Here's list item 2. Sure looks like item 3 though.

.. _abuse of the word "ambiguous":

To me, that's not ambiguous. The bad wrapping makes it explicitly a
three item list. It's not what the user intended [2]_, but there are so
many ways for the user to unambiguously fix it I don't think it's a
problem:

 * Manually wrap it closer to column zero::

     - This is list item 1.
     Here's a formula: "x = x - 1"
     - Here's list item 2...

 * Use a different bullet::

     * This is list item 1. Here's a formula: "x = x
     - 1".
     * Here's list item 2.

   Implication: a rule in the parser that says that blank lines
   are required between adjacent but different lists at the same
   indentation level, even if lazy paragraph formatting is turned
   on.

   That nicely matches the

 * Use an inline literal::

     - This is list item 1. Here's a formula: ``x = x
     - 1``.
     - Here's list item 2, as the parser considers the second
     line in this example part of the literal started in line 1.
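
The three fixes above can be sketched as one item-splitting rule for lazy
mode. This is an illustration under stated assumptions, not the parser:
``split_lazy_items`` is a hypothetical name, the input is assumed to be the
lines of a single list whose first line starts with a bullet, and an open
inline literal is detected simply by counting double-backquote markers.

```python
def split_lazy_items(lines):
    """Group the lines of one lazily wrapped bullet list into items.

    A line starts a new item only if it begins with the same bullet
    character as the first item and no inline literal is still open.
    Every other line is folded into the current item.
    """
    bullet = lines[0][0]  # the list's bullet character, e.g. "-" or "*"
    items = []
    for line in lines:
        starts_item = line.startswith(bullet + " ")
        # An odd number of `` markers so far means a literal is still open.
        in_literal = bool(items) and items[-1].count("``") % 2 == 1
        if starts_item and not in_literal:
            items.append(line[2:])
        else:
            items[-1] += " " + line
    return items
```

Fed the badly wrapped example, this yields three items, as argued above;
with a different bullet or an inline literal around the formula, the
``- 1`` line is folded back into item 1 and the list has two items.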

> The Doc-SIG historical record shows that allowing intra-list-item
> blank lines to be optional is more in demand.

I can *readily* imagine that intra-list-item blank lines being optional
is more in demand at the moment.

The majority of the people discussing this specification are probably
Python programmers who want to use it for Python code (in docstrings)
and the documentation for their Python code which they'll probably be
editing in the same indentation-smart text editor they use for their
code.

> Opinions or counter-arguments anyone?

I'm not sure we should dig our heels in and assert that
reStructuredText should *only* be useful for Python programmers with an
indentation-smart text editor.

There are hundreds of billions [1]_ of frustrated Wiki users out there
pounding their heads against the Wiki markup syntax, and almost as many
ZWiki users ripping their hair out because StructuredText is just as bad
or worse. Telling them we're not going to throw them a line and rescue
them from shark-infested water because they might get our precious rope
wet seems a tad... stingy.

Getting into the mud on ambiguity:

.. _explicit discussion of ambiguity:

I'm going to come under some well-deserved flak for my `abuse of the
word "ambiguous"`_ above, so I'm going to break it out a little. If the
specification is changed as I suggest, *and* the parser is implemented
as I'm saying, *and* the user tries to do what David suggests, *and*
their text gets badly wrapped in the position David indicates, then:

* The *specification* is not ambiguous, and
* The *parser* won't find the input ambiguous, but
* The *user* might be a little confused for a moment.

The user is going to spend a lot of time confused regardless. Every time
I try and represent a bullet list for which each item owns a literal
block, for example, I forget to indent the literal block and have to go
back and fix it. Users are going to spend a lot of time going back and
fixing things that they got wrong. Going back and fixing the list won't
be any additional hassle.

I'm wary of insisting upon serious inconvenience to a large segment of
the user population [3]_ just to save inconvenience to the occasional
user who stumbles across the edge case of a list item that happens to
have a list delimiter just after the wrap column.

More glibly put: two out of three ain't bad. I think they'll cope. :)

.. [1] I counted them. Really!

.. [2] Before firing missiles on my use of the word "ambiguous",
   please see my `explicit discussion of ambiguity`_, upon which
   you can unload entire batteries if you want. :)

.. [3] e.g. having to manually indent every single list item as
   punishment for using an editor that doesn't handle indentation
   properly and that wraps long paragraphs with newlines.


Regards,
Garth.