[Doc-SIG] Structured Text

Goodger, David dgoodger@atsautomation.com
Tue, 6 Mar 2001 14:39:40 -0500


Hi Edward,

As Tibbs mentioned (thanks for the plugs, Tony! ;-), I wrote some articles
in November on the very subject you're researching. You can find the
articles here:

  - A Plan for Structured Text
    http://mail.python.org/pipermail/doc-sig/2000-November/001239.html

  - Problems With StructuredText
    http://mail.python.org/pipermail/doc-sig/2000-November/001240.html

  - reStructuredText: Revised Structured Text Specification
    http://mail.python.org/pipermail/doc-sig/2000-November/001241.html

Unfortunately, since then I haven't had much time to work on the issues. I
have refined some of the issues in my head, but not on "paper". Right now my
computer at home is still in a box a week after we moved, six weeks after we
began a major renovation project in our new house. The renovations continue,
and take priority, so the computer will remain boxed for a few more days at
least. I am writing from work, therefore I must be brief.

> I'm not sure if this is the correct
> forum for such questions.

I know of no better forum. Another place might be the StructuredTextNG ZWiki
or mailing lists over on Zope's pages. In addition, I'd highly recommend
scanning over the archives of Doc-SIG, since these issues have come up many
times in the past. Most of the principals have given up and moved on to less
controversial pastures.

> 1. Does every string value have an interpretation as a Structured
>    Text?
...
> 2. If it is true that every string value has an interpretation as a
>    Structured Text, does it make sense to officially "discourage"
>    certain types of strings [...] ?

One way you can think of a StructuredText interpreter is like a computer
language compiler/interpreter: illegal input generates warnings (if you're
lucky :) and errors. I like Tibbs' reply to this one. Ideally, a tool would
make a best guess and generate warnings, without crapping out (unless
explicitly told to do so).
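
A minimal sketch of that "best guess plus warnings" behaviour, in Python. It
is not taken from any existing StructuredText code: the names interpret,
MarkupError and strict are hypothetical, and the only markup handled is
*emphasis*.

```python
import re

# Assumption: emphasis is '*text*' with no whitespace just inside the stars.
EMPH = re.compile(r"\*([^\s*](?:[^*]*[^\s*])?)\*")


class MarkupError(ValueError):
    """Raised only when the caller explicitly asks for strict processing."""


def interpret(text, strict=False):
    """Return (html, messages): a best-guess rendering plus any warnings."""
    messages = []
    html = EMPH.sub(r"<em>\1</em>", text)
    if "*" in html:
        # A leftover star could not be paired; keep it as literal text.
        msg = "unmatched '*' treated as literal text"
        if strict:
            raise MarkupError(msg)
        messages.append(msg)
    return html, messages
```

With strict left at False, interpret("a *b c") hands the text back unchanged
along with one warning message; with strict=True the same input raises
MarkupError instead.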

> 3. Which types of "code coloring" (emph, inline, etc.) can "wrap" over
>    lines, and which can't?  E.g., can I have an *emph statement that
>    continues to the next line?*

I don't see why not.
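
As a rough illustration of why wrapping need not be hard: if inline markup
is matched within a whole paragraph rather than line by line, a line break
inside the emphasis is just more text. The sketch below assumes that
paragraphs are separated by blank lines and that emphasis may not cross a
paragraph boundary; both are assumptions, and the names are hypothetical.

```python
import re

# [^*] also matches newlines, so the emphasized span may wrap across lines.
EMPH = re.compile(r"\*([^\s*](?:[^*]*[^\s*])?)\*")


def emphasize(text):
    """Apply *emphasis* inside each blank-line-separated paragraph."""
    paragraphs = re.split(r"\n\s*\n", text)
    return "\n\n".join(EMPH.sub(r"<em>\1</em>", p) for p in paragraphs)
```

For example, emphasize("an *emph statement that\ncontinues to the next
line?*") wraps both lines in a single <em> element, while a star in one
paragraph is never paired with a star in the next.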

> 4. Is there any official precedence ordering on the different types of
>    "code coloring?"  Will there be anytime soon?  Any rules about what
>    types of code coloring can be contained in what other types?

Nothing official other than the existing codebase, which is not complete.
See my block diagram in "reStructuredText: Revised Structured Text
Specification" for my take on the (high-level) hierarchy.

> 5. Does structural formatting or code coloring take precedence?  For
>    example, if a paragraph starts with "* foo *," will it be a normal
>    paragraph with an emphasized first element, or a list item?  (It'll
>    be much easier for me to write formal rules if structure takes
>    precedence. ;) )

I think structure has to take precedence. You've provided one of an infinite
number of conceivable edge-cases where it'll be tricky to get a program to
process its input correctly 100% of the time.
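
One way to picture "structure takes precedence" is a two-pass scheme: decide
the block type from the start of the paragraph first, and only then look for
inline markup in what remains. The sketch below is hypothetical (the name
classify and the choice of '*' and '-' as bullet characters are assumptions,
not a specification).

```python
import re

# Assumption: a bullet item is '*' or '-' followed by whitespace.
BULLET = re.compile(r"^[*\-]\s+(.*)$", re.DOTALL)


def classify(paragraph):
    """Return (block_type, body); the block check runs before any inline pass."""
    match = BULLET.match(paragraph)
    if match:
        # The leading star is consumed as structure, never as emphasis.
        return ("bullet_item", match.group(1))
    return ("paragraph", paragraph)
```

Under these rules "* foo *, and more" is a bullet item whose body is
"foo *, and more", leaving the second star for the inline pass to warn
about, whereas "*foo* bar" stays an ordinary paragraph because no whitespace
follows the first star.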

> 6. Among the list types, which take precedence?  For example, if a
>    paragraph starts with "1. foo -- bar", is it an ordered list item
>    or a descriptive list item?

I would consider it an ordered list item *containing* a descriptive list
item. But I'm certain others would differ.
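
That reading falls out naturally if the list rules are tried in a fixed
order and the body of a matched item is parsed again. A small sketch, with
hypothetical names and the assumption that " -- " separates the term from
its description:

```python
import re

ORDERED = re.compile(r"^(\d+)\.\s+(.*)$", re.DOTALL)
DESCRIPTIVE = re.compile(r"^(.+?)\s+--\s+(.*)$", re.DOTALL)


def parse_item(text):
    """Try the ordered rule first, then re-parse the item's body."""
    match = ORDERED.match(text)
    if match:
        return ("ordered", int(match.group(1)), parse_item(match.group(2)))
    match = DESCRIPTIVE.match(text)
    if match:
        return ("descriptive", match.group(1), match.group(2))
    return ("text", text)
```

Here parse_item("1. foo -- bar") yields an ordered item numbered 1 whose
body is a descriptive item with term "foo" and description "bar". Swapping
the order of the two checks would give the other interpretation, which is
exactly the kind of choice a specification would need to pin down.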

> 7. What is meant by saying that SGML text passes through?  SGML isn't
>    even a mark-up language, so I assume that the intent is something
>    like "XML and HTML text passes through."  But does that mean that
>    in an expression like '<TAG>a*b*</TAG>', the '*'s will be ignored?
>    That seems unreasonably difficult to implement.  What about an
>    expression like '<T a="*x*"/>'?  Does this mean I can't say things
>    like if 'x<y *and* y>z'?  Is there strong support for the
>    notion of letting "SGML" text pass through, or is it something that
>    might be dropped?  (I would certainly vote for dropping it. :) )

SGML *is* a markup language (that's what the ML stands for), but it's a
meta-markup language. XML is too. Only HTML (of the three) is a specific
markup language. What they meant was simply that "<TAGS>text like
this</TAGS>" would pass through. This is one place where the original
StructuredText definition is sorely lacking, IMHO.

Hopefully I can get the renovations done before too long and get back to
more cerebral pursuits.

David Goodger
Systems Administrator & Programmer, Advanced Systems
Automation Tooling Systems Inc., Automation Systems Division
direct: (519) 653-4483 ext. 7121    fax: (519) 650-6695
e-mail: dgoodger@atsautomation.com