[Doc-SIG] formalizing StructuredText

Edward Welbourne <eddy@chaos.org.uk>
Sat, 31 Mar 2001 11:16:38 +0100 (BST)


> ... there's an argument to be made for having the environments in
> which emph can start/end be the same as the environments that literal
> can start/end in.
hmm.  Good point.

> ... allow things like '  x  '.  And actually, thinking
> about it now, I don't see why we would want to..

Inlines, whether code or verbatim - at least when appearing in a
paragraph which user agents are at liberty to fit neatly into a width
chosen by the user, using the standard text-flow idioms - must live by
the rules of paragraph-text: which say that any sequence of horizontal
space characters may be replaced by a single space or by a
newline-and-indentation; and that these `mean' the same thing as one
another - if the reader chooses where to break the paragraph into lines,
the author's choice of where to do so in the source file should be
ignored.  Furthermore, someone modifying a doc-string paragraph may fail
to notice a verbatim fragment at the far end of the paragraph and,
having changed the text, ask some random editor to do what it thinks
of as paragraph reformat; in the action of which the most we can rely on
is that its treatment of white space will not interfere with the rules
above.

Ergo, before looking for any colouring markup, *including* 'verbatim',
within a flowable paragraph, the paragraph *should* have each
newline-and-indent replaced with a single white space (optionally
swallowing any trailing space on the line thus ended) and *may* have
each sequence of horizontal space replaced with a single horizontal
space at the same time, on the grounds that if what the text then says
isn't `the same' as the author's intent, the author will be tripping
over someone's editor's formatting about as soon as someone who doesn't
know about the fragile inline gets to edit the doc-string.  None the
less (especially for verbatim inlines) the user agent is equally at
liberty to honour horizontal space, once each newline-and-indent has
turned into a space.
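
A minimal sketch of that normalisation step, in Python.  The function
name `normalise_paragraph` and its `collapse_horizontal` flag are
invented here for illustration and are not part of any doc-string
toolkit; the sketch assumes it is handed a single flowable paragraph
(no blank lines) before any inline markup has been parsed.

```python
import re

# Newline-and-indent: optional trailing space, a newline, then any indentation.
_NEWLINE_INDENT = re.compile(r"[ \t]*\n[ \t]*")
# Any run of horizontal space (spaces or tabs).
_HORIZONTAL_RUN = re.compile(r"[ \t]+")


def normalise_paragraph(text, collapse_horizontal=False):
    """Hypothetical helper: apply the paragraph rule before parsing inlines."""
    # The *should* step: each newline-and-indent becomes one space,
    # swallowing any trailing space on the line just ended.
    flat = _NEWLINE_INDENT.sub(" ", text.strip())
    if collapse_horizontal:
        # The *may* step: every run of horizontal space becomes one space.
        flat = _HORIZONTAL_RUN.sub(" ", flat)
    return flat
```

With the flag left off, a double space inside an inline survives and
only the line break is flattened; with it on, the inline is squeezed
too, which is exactly the risk the author of a fragile inline takes.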

To forbid inlines, including verbatim, from including spaces would be
excessive; to allow them, we must allow inlines, at least within
paragraphs, to straddle line-breaks; and authors must accept that space
may be re-arranged - either by the maintainer's editor's
paragraph-reformatter, or by the user agent displaying the text as a
paragraph.  To avoid conflict with what it's reasonable to expect
authoring and display tools to do, the doc-string toolkit should avoid
promising that whitespace within paragraphs will be taken literally; by
avoiding that promise, it can allow inlines to straddle line boundaries
within paragraphs, with the newline-and-indent understood as a simple
space, and on the understanding that the author has enough sense not to
use inline fragments which will be sensitive to space-munging.

This must equally hold for `verbatim' inlines, at least when they appear
in text-paragraphs: but, if they're honest citizens of the paragraph,
they shouldn't *mind* being `interpreted' in the mild degree called for;
and the tools are at liberty to *only* reduce newline-and-indent,
without taking the trouble to normalise other sequences of space.
Either the fragment shouldn't be inlined in a text-paragraph, or its
meaning should be unchanged by messing with its whitespace.  The way to
write a verbatim fragment which doesn't keep to those rules is like this::

	 '  x  '

Putting it in a separate block gives subsequent folk editing the
doc-string the option of reformatting the paragraphs which look like
paragraphs, safe in the knowledge that anything this might mess up will
be isolated in its own block.  The separate block will also be needed if
the doc-tools are to present the fragment to the user agent in a style
which *does* promise to preserve space.
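
As a rough illustration of why the separate block is safe, here is a
Python sketch of a reformatter that flattens ordinary paragraphs and
leaves literal blocks alone.  The name `reflow_docstring` is
hypothetical, and the sketch makes a simplifying assumption: a literal
block is exactly the one blank-line-delimited block that follows a
paragraph ending in `::`.  A real parser would go by indentation.

```python
import re


def reflow_docstring(text):
    """Hypothetical sketch: flatten paragraphs, keep '::' blocks verbatim."""
    # Blocks are separated by blank lines (possibly holding stray spaces).
    blocks = re.split(r"\n[ \t]*\n", text.strip("\n"))
    out = []
    literal_next = False
    for block in blocks:
        if literal_next:
            # Literal block: every space and tab is preserved as written.
            out.append(block)
            literal_next = False
        else:
            # Ordinary paragraph: newline-and-indent becomes a single space.
            out.append(re.sub(r"[ \t]*\n[ \t]*", " ", block.strip()))
            # Assumption: a trailing '::' announces one literal block.
            literal_next = block.rstrip().endswith("::")
    return "\n\n".join(out)
```

The paragraphs either side of the literal block may be rewrapped
freely; the fragment itself passes through untouched.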

Anything inside a paragraph should be understood as equivalent to
whatever valid text flow could turn it into (which only ever reduces
sequences of blank characters (space, newline, tab, form-feed, etc.)
to a single space or a newline-and-indent, so an inline only containing
these will never `expand' to contain repeats or other blanks).
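
The equivalence described above can be sketched in Python as follows.
The names `canonical`, `equivalent` and `is_fragile` are made up for
this illustration; the sketch assumes Python's `\s` class is an
acceptable stand-in for "blank characters".

```python
import re

# A run of blank characters: space, newline, tab, form-feed, etc.
_BLANK_RUN = re.compile(r"\s+")
# Newline-and-indent, with any trailing space on the line just ended.
_NEWLINE_INDENT = re.compile(r"[ \t]*\n[ \t]*")


def canonical(text):
    """Reduce every run of blanks to a single space."""
    return _BLANK_RUN.sub(" ", text)


def equivalent(a, b):
    """Two paragraph texts mean the same if text flow could turn one into the other."""
    return canonical(a) == canonical(b)


def is_fragile(inline):
    """True if an inline holds blanks that reflowing could alter."""
    # A line break inside an inline is fine: it is read as one space.
    flat = _NEWLINE_INDENT.sub(" ", inline)
    # Anything that canonicalisation would still change is space-sensitive.
    return flat != canonical(flat)
```

An inline for which `is_fragile` is true is a candidate for moving out
of the paragraph and into a block of its own.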

Now, outside the paragraph, one may note that verbatim fragments, e.g.
used as the items in a list, have a right to have fancy spacing
honoured: but it may be simpler to apply the same rule as in a paragraph
(especially if your grammar regards `list item' as a variety of text
paragraph).  But, at least inside paragraphs, inlines, even verbatim,
should be interpreted `space-insensitively' with each newline-and-indent
normalised to a single space.

	Eddy.