[Doc-SIG] Re: reStructuredText Markup Specification

David Goodger dgoodger@bigfoot.com
Wed, 13 Jun 2001 00:26:15 -0400


I will address some of Wolfgang's points (the relevant ones :-) here, then
propose a compromise solution in another post.

on 2001-06-06 05:46, Wolfgang Lipp (castor@snafu.de) wrote:
> In other words, there is a chance not only to produce a
> standard for the limited use of docstrings, there is
> definitely a chance to set up a framework in which the
> sources of: (1) the programming language, (2a) its inline
> documentation and (2b) other information materials, as
> well as (3a) configurational files and (3b) databases(*)
> are interpretable.

reStructuredText addresses only 2a (inline documentation) and possibly 2b,
if that means documentation in general, as in standalone files. Types 1, 3a
and 3b are irrelevant to this discussion; references to them are merely
distractions.

reStructuredText is what-you-see-is-what-you-get plaintext. That's challenge
enough!

> Next, consider what happens when you want to copy
> Levers.TypeA.outer to Bolts:
[example of erroneous editing of 'significant title adornment' sections,
omitted]

Yes, this is a significant issue, one where indentation definitely wins with
current tools.

> Following are my objections.
> 
> 1:  Indention Unnatural?
...
> Therefore, indention is indeed highly 'natural' and
> also 'appropriate'.

Yes, indentation is natural and appropriate for structures with a very
limited, local extent, such as lists, block quotes, and literal blocks. It
is not natural for section structures which can span multiple pages or
screenfuls. Imagine a textbook containing indented sections, say 4 levels
deep (such textbooks do exist, though I can't think of any titles). When
you're on the fourth page of section 5.4.3.2, how does indentation help? It
doesn't. Of course, 'significant title adornment' sections are no better in
that situation. On this point it's a wash; neither approach wins.

> However, David's argument in itself is a bit of a
> problem because it is claimed that we users of editors
> should mimick them typographers in their ways (and
> shun indention, because, you see, in books they don't
> use it either), while it is at the same time
> acknowledged that we lack the means to do so (we don't
> have big type, so let's use underline). This is
> contradictory.

It's not "let's mimic the typographers," it's "let's distinguish the titles
somehow." Indentation alone isn't enough; it's used for too many other
things. Typographers use heavy display type for titles because it jumps out
at you. They use boldface and italics in running text for the same reason.
In plaintext we have neither. For inline emphasis, Setext chose **various**
_styles_ `_of_` ~emphasis~; for reStructuredText I chose a coherent,
self-consistent and unambiguous subset.

reStructuredText uses underlining to distinguish the titles, to make the
title text stand out.
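
To make the idea of 'significant title adornment' concrete, here is a rough
Python sketch that picks underlined titles out of plaintext. It is not the
reStructuredText parser. The names `is_adornment` and `find_titles` are
hypothetical, and the rule used (an underline is a run of one repeated
punctuation character, at least as long as the title text) is a simplifying
assumption made for this example only.

```python
import string


def is_adornment(line):
    """True if the line is a run of a single repeated punctuation character."""
    line = line.rstrip()
    return (bool(line)
            and line[0] in string.punctuation
            and line == line[0] * len(line))


def find_titles(text):
    """Return (title, adornment_char) pairs for underlined titles.

    Assumption for this sketch: a title is a non-blank line immediately
    followed by an adornment line at least as long as the title text.
    """
    lines = text.splitlines()
    titles = []
    for title, underline in zip(lines, lines[1:]):
        title = title.strip()
        underline = underline.rstrip()
        if (title and not is_adornment(title)
                and is_adornment(underline)
                and len(underline) >= len(title)):
            titles.append((title, underline[0]))
    return titles


sample = """\
Levers
======

Type A
------

Body text of the section.
"""

print(find_titles(sample))
```

The sketch reads the adornment character as a hint of section level, which
is why it returns the character alongside the title text.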

> Secondly, it is stated that indention "is usually the
> formatted end-result and is there for aesthetic rather
> than structural purposes" -- well, it seems to me that
> David's underlined section headings are rather
> 'aesthetically' than 'structurally' motivated, at
> least when compared to indention.

Yes, the underlined titles are aesthetic. One interpretation of "aesthetic"
is "parsable to the human eye and brain". The unadorned titles of indented
StructuredText sections are not easily parsed by my eye, even though they
may be no problem to software. They look like one-line paragraphs. Followed
by indented blocks, sure, but where's the significance of that? What's the
precedence?

> It is us who is doing this, and what we are
> looking for is a practical, manageable, readable and
> pleasing way of doing the job, in other words, we are
> looking for a beautiful solution of the problem.

Yes. Beautiful == consistent and coherent and implementable.

['it' == 'indentation':]
> So, typographers don't do it (well, not all of the
> time), but they have other means. We are programmers
> and documentation authors, working on software
> typewriters, so let's do it.

Not a convincing argument.

I understand that many people like indented sections. Many don't. *I* don't,
and I won't use them.

But in order to come up with something everyone might agree on, I've devised
a compromise. See my proposal post.

> 2:  Indention Awkward?
> 
> Indention is elegant. Trying to convince Python people
> of the elegance of indention is unnecessary, they're
> already convinced of this. It should be hard for a
> programmer to accept a scheme that is purportedly
> 'natural' in its indication of 'structure' when it
> uses arbitrary, highly context-sensitive and ambiguous
> lines instead of (any or all of) indention,
> parentheses, begin-end-commands, i.e. those means that
> are, for a programmer, the most logical choices.

This is not programming, this is prose. Different domains.

I've tried to come up with a workable, coherent, parsable interpretation of
what people do anyway. Like a text version of speech or handwriting
recognition. You can only get people to alter or constrain their speech or
handwriting patterns so far before they refuse to use the software (notable
exception: Palm's Graffiti). Same with writing styles. StructuredText was
beyond my usability threshold; reStructuredText is usable.

> 2b: Structural Changes Difficult?
> 2c: Indention-Capable Software Not Available?

Enough oration! I get and accept the point. Again, please see my proposal
post.

(Do you realize that all this is in response to my original 35 lines in
problems.txt? It's a 20-to-1 ratio! Include the 60 lines in the spec itself
and it's still over 7-to-1. Obviously a serious concern! ;-)

> Moreover, if indention is only available in
> "relatively advanced text editors", as David observes,
> then, please! where is the editor, apart from Emacs,
> that supports the proposed table format?

As I wrote at the beginning of the spec, "Less often-used constructs and
extensions to the basic reStructuredText syntax may have more elaborate or
explicit markup." Tables are a less-often-used construct, requiring more
elaborate markup by their very nature. I came up with the syntax before I
knew the Emacs table mode existed (I did a search in hopes it existed,
*after* I came up with the syntax, and was pleasantly surprised).
reStructuredText tables look like tables; that's one of the goals of the
markup.

Tables have nothing to do with 'indented sections' vs. 'significant title
adornment sections'.

> According to David's proposal, docstrings would suffer
> from the same lack of sound generalization, with all
> difficulties, as HTML documents.

I'm not alone in refusing to use XML or YAML in docstrings. These kinds of
markup are absolutely regular, but regular markup is unwieldy for authors to
use. reStructuredText walks a very fine line; it's an attempt to find an
ideal markup balance between regular and freeform.

This is an argument that has been raised and shot down many times before.

> One More Remark, A Caveat And Conclusion
> 
> Apart from the treatment of indention in the proposal, I
> also have some doubts about the fitness of the proposed
> markups for definition lists (number 8 of the proposal)
> and literal blocks (number 9).

The numbers refer not to the proposal itself
(http://structuredtext.sf.net/spec/reStructuredText.txt), but to the
analysis of StructuredText's problems
(http://structuredtext.sf.net/spec/problems.txt).

> In the first case, the
> proposed markup appears somehow too volatile to me, in the
> second, it is quite arbitrary.

Definition list syntax: why volatile? It uses indentation! ;-)

Literal blocks: the syntax comes from StructuredText, and it works.

If they are unsatisfactory, please suggest alternatives (other than the
following).

> Again, wouldn't we be better off with markup to signify
> 'commands' or 'role indicators'?

That's been proposed and shot down before. However, reStructuredText *does*
take exactly this approach with syntax extensions ('directives') and inline
markup ('interpreted text'): explicit command/role names. But those are
extensions, uncommon exceptions. Common constructs have implicit markup.
reStructuredText is all about exploiting implicit markup to the fullest.

You're free to propose and implement a command-based markup. In fact, please
do! Less talk, more code! (Or at least a concrete proposal, please.)

> (BTW, is there a distinction between 'literal' segments
> and 'code' segments? That would be important for coloring
> and formatting).

There are literal blocks and Doctest blocks. The Doctest blocks are Python,
and could be syntax-colored, but they are interactive Python sessions and
not always suitable for showing code excerpts. Literal blocks *could* be
examined for code content and processed accordingly; that's a tools issue,
not a markup issue. Perhaps someone could suggest a syntax for Python blocks,
where syntax coloring is the aim? Until there's an acceptable syntax, using a
'python' directive would suffice.
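
As an illustration of why Doctest blocks are a natural target for tools,
the sketch below uses Python's standard `doctest` module to pull the source
statements out of an interactive session embedded in a docstring. A
syntax-coloring tool could then work on those statements. The helper name
`extract_session_source` and the sample docstring are hypothetical, and
nothing here is part of reStructuredText itself.

```python
import doctest


def extract_session_source(docstring):
    """Return the Python statements found in interactive-session examples."""
    parser = doctest.DocTestParser()
    # Each Example carries the statement typed after the ">>> " prompt in
    # its .source attribute, which always ends with a newline.
    return [example.source.rstrip("\n")
            for example in parser.get_examples(docstring)]


docstring = '''
Add two numbers.

>>> 1 + 2
3
>>> print("spam")
spam
'''

print(extract_session_source(docstring))
```

Ordinary literal blocks carry no such prompt, so a tool would have to guess
whether their content is code at all.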

> Now for the caveat announced above. Yes, the proposal is
> right, those underlines do somehow stick out, I admit
> that. But, isn't that more appropriately effected with a
> line of dashes in a comment within the docstring?

Yuck! I've included comments along with directives in reStructuredText
because I know there will be a need for them, but to *force* authors to use
comments? No thanks. And if you're saying, "use a comment if you *want* an
underline, otherwise indentation alone is good enough", I disagree strongly.
See my proposal post.

> Since lines of dashes and the like would then be free
> again I suggest that a concrete markup is used for
> horizontal rulers.

HRs would not be inconsistent with the rest of reStructuredText, nor
difficult to parse. They are, however, purely presentational and
HTML-centric. I'll add them to the "Notes" file, but not to the spec. What
do others think?

> Concluding, I urge everybody not to abolish indention.
> That wouldn't be very Python, I'm afraid.

If anybody wants to propose and implement an input parser that indicates
structure purely with indentation (like YAML), please go ahead. Personally,
*I* don't believe in it, and *I* wouldn't use it. But mine is just one
voice.

Thank you, Wolfgang, for the feedback. It's very useful in refining one's
ideas. Next time, though, perhaps less personal invective?

Coming up next: the proposal...

-- 
David Goodger    dgoodger@bigfoot.com    Open-source projects:
 - Python Docstring Processing System: http://docstring.sf.net
 - reStructuredText: http://structuredtext.sf.net
 - The Go Tools Project: http://gotools.sf.net