[Doc-SIG] formalizing StructuredText

Edward D. Loper edloper@gradient.cis.upenn.edu
Thu, 22 Mar 2001 08:35:35 EST


> > > 3. Local references (which look like '[this]' or '[1]') are now
> > > supported. 

So, if I understand correctly, this ST::

    This is a [test]
    ..[test] of local references

Would be rendered in HTML as::

    <p>
        This is a <a href="#test">[test]</a>
    </p>
    <p>
        <a name="test">[test]</a> of local references
    </p>

?  I'm not sure how you'd render it in LaTeX..
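
For concreteness, here is a minimal Python sketch of the HTML rendering guessed at above. It assumes that a line starting with '..[name]' always opens its own paragraph and that names are plain word characters; neither assumption has been confirmed in this thread, and the function names are made up for illustration.

```python
import html
import re

# Assumed syntax: "..[name] text" defines an anchor; any other "[name]" refers to it.
ANCHOR_RE = re.compile(r"^\.\.\[(\w+)\]\s*(.*)$")
REF_RE = re.compile(r"\[(\w+)\]")


def split_paragraphs(source):
    """Split on blank lines, and also start a new paragraph at each anchor line."""
    paragraphs, current = [], []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("..["):
            if current:
                paragraphs.append(" ".join(current))
            current = [stripped] if stripped else []
        else:
            current.append(stripped)
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


def render_paragraph(text):
    """Render one paragraph as a <p> element."""
    match = ANCHOR_RE.match(text)
    if match:
        name, rest = match.groups()
        body = '<a name="%s">[%s]</a> %s' % (name, name, html.escape(rest))
        return "<p>%s</p>" % body.strip()
    # Ordinary paragraph: turn every [name] into a link to the anchor.
    body = REF_RE.sub(r'<a href="#\1">[\1]</a>', html.escape(text))
    return "<p>%s</p>" % body


def render(source):
    return "\n".join(render_paragraph(p) for p in split_paragraphs(source))


print(render("This is a [test]\n..[test] of local references"))
```

The sketch does not answer the questions below about what an anchor may contain; it treats everything after the anchor as plain text.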

Can anchors appear anywhere in the document?
Do they have to be their own paragraphs?
Can anchors be treated as footnotes (e.g., by LaTeX)?
What can their contents be?  E.g., can they contain a list item::

    ..[test] * This is a list item

? A list::

    ..[test]
        * Item1
        * Item2

? etc.

> it means a user can regard::
> 
> 	[This] is a local reference
> 
> and::
> 
> 	"This":#this is a local reference
> 
> as the same, which isn't much use *within* a document, but is *very*
> useful for allowing links from outside.

Are we expecting people to *want* to link into a document from 
outside?  I can't see ever having any use for that when writing
API docs...

> A tool like TeX would need some untangling of the
> '#this' to just 'this' for use in its '\xref', but that's hardly
> difficult.

Hm.. maybe I just don't know enough LaTeX. :)

[about handling multi-paragraph list items]
>>   1. some text
>>
>>      some more text
> and this gets "flattened" to be::
> 
> 	<oitem>
> 	<para>

I would argue that it would be more appropriate to use::

    <oitem>
        <bullet>1.</bullet>
        <para>some text</para>
        <para>some more text</para>
    </oitem>

Also, what would your "flattening" do with::

    1. some text

       some more text

           even more indented

> (if we had::
> 
> 	This is a paragraph.
> 
> 	   And so is this.
> 
> then the flattening phase would say to itself "aha - a paragraph within
> a paragraph - presumably the user *meant* something by that", and in
> this case it would produce::
> 
> 	<para>
> 	<block>
> 	   <para>

Can these nest arbitrarily deeply, if they keep indenting?
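
If the answer is yes, a stack of indentation levels is enough to recover the nesting. The following Python sketch is only an illustration of that idea, not the flattening code being discussed: it assumes paragraphs are separated by blank lines, that a paragraph's level is the indentation of its first line, and the name `parse_blocks` is hypothetical.

```python
import re


def parse_blocks(source):
    """Return a nested list of (text, children) tuples, nested by indentation."""
    root = []
    # Stack of (indent, children list); the root sits below any real indent.
    stack = [(-1, root)]
    for para in re.split(r"\n\s*\n", source.expandtabs().strip("\n")):
        if not para.strip():
            continue
        indent = len(para) - len(para.lstrip(" "))
        text = " ".join(para.split())
        # Pop back to the nearest enclosing paragraph with a smaller indent.
        while stack[-1][0] >= indent:
            stack.pop()
        children = []
        stack[-1][1].append((text, children))
        stack.append((indent, children))
    return root


def depth(nodes):
    """Depth of the deepest paragraph (0 for an empty document)."""
    return max((1 + depth(children) for _, children in nodes), default=0)


sample = (
    "This is a paragraph.\n\n"
    "   And so is this.\n\n"
    "      Deeper still.\n\n"
    "Back out."
)
tree = parse_blocks(sample)
print(depth(tree))
```

Nothing in this approach limits the depth, so any limit would have to be imposed by the formalism rather than by the parser.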

> > > 5. The RE used for detecting URLs has become more
> > > sophisticated. There are some associated rules
> >
> > Hm.. I don't look forward to formalizing this, and trying to get STNG
> > to agree with your regexps :)
> 
> STNG has its own REs. They don't make much sense to me (or didn't last
> time I looked at them). In some cases, they just didn't work very well.
> Oh well.

Well, then, we should convince them to change them! :)

> But I don't see why *formalising* it is a problem?

It's just nice to have formalisms that don't contain big,
difficult-to-explain regular expressions.  They make the
formalism harder to understand.

> > Note also that it should be possible to generate the "long RE
> > expression" in a *principled* way, given a formalization, so that
> > it will detect *all* errors, not just *common* errors.
> 
> This I don't understand - I'm not sure what you mean by "in a principled
> way", and I'm also not sure what you mean by "all errors, not just
> common errors".
> But this will doubtless become clearer to me as STminus progresses (I
> begin to suspect you may regret that name some day, as it becomes more
> capable and more clearly sufficient-to-itself).

Anything whose meaning is not defined by the formalism is invalid.  It
should be possible for a user to ask a tool to tell them if they use
any invalid forms -- that way, they are guaranteed that what they
have written will be interpreted as specified by the *formalism*,
regardless of which implementation/tool they happen to be using (unless
that tool has a bug).  And given a formalism, it is possible to detect
invalid forms in a principled way.
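
As a rough illustration of what "principled" means here, the Python sketch below uses a made-up toy formalism (it is not STminus, and the four line forms are assumptions chosen for the example). The checker does not list known mistakes; it accepts only lines the formalism defines and reports everything else.

```python
import re

# Hypothetical toy formalism: every non-blank line must match one of these forms.
# In this toy, plain text may not start with '.', '*' or a digit.
LINE_FORMS = {
    "anchor": re.compile(r"^\.\.\[\w+\] \S.*$"),
    "bullet": re.compile(r"^\s*\* \S.*$"),
    "ordered": re.compile(r"^\s*\d+\. \S.*$"),
    "text": re.compile(r"^\s*[^\s.*\d].*$"),
}


def classify(line):
    """Return the name of the first form the line matches, or None."""
    for name, pattern in LINE_FORMS.items():
        if pattern.match(line):
            return name
    return None


def find_invalid(source):
    """Return (line number, line) for every non-blank line with no defined form."""
    return [
        (number, line)
        for number, line in enumerate(source.splitlines(), 1)
        if line.strip() and classify(line) is None
    ]


sample = "\n".join([
    "Intro text",
    "* item",
    "*broken item",
    "1. one",
    "..[ref]missing space",
    "..[ref] fine",
])
for number, line in find_invalid(sample):
    print("line %d is not defined by the formalism: %r" % (number, line))
```

Because validity is defined by the forms themselves, an unusual mistake is caught exactly as reliably as a common one.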

-Edward