[Doc-SIG] Re: directives and fields

Edward D. Loper edloper@gradient.cis.upenn.edu
Fri, 20 Apr 2001 00:17:09 EDT


[These responses are a bit out of order.  They're in the order
 that I felt like responding to them in.]

[David said:]
> Please use a different name for what you're doing and let's be done
> with it. Lots of room for competition (the field's wide open right
> now!  ;-). The more the merrier.

I don't think I've used any name related to ST to refer to the markup
language I'm talking about for two or three weeks (ever since we
decided that we didn't need to try to maintain compatibility with ST).
If you like, we can call "my" language "epytext," because that's what
I called the parser module I've been writing (edloper's version of
pytext).

But really, I'm not just trying to design a markup language that *I*
like.  If that were my goal, I'd write a parser and be done with it.
My goal is to produce a markup language for docstrings that the Python
community can embrace as a whole.  You may say that it's not going to
happen, but at least 2 languages (Perl and Java) have managed it, and
I don't see why we can't come up with a good standard ML for Python.

Of course, not everyone would use the same features of the markup
language as everyone else.  Some people might use emph inline markup,
some might not; some might use fields, and some might not.  But they
would all be using the same markup language, just like I can write
in LaTeX and decide not to use \emph{}.  And as a result, people can
write tools for the markup language.

>>> "raw text should be as readable as possible, even to the
>>> uninitiated"
>
> I'd say, for the Setext/StructuredText approach, it *is* the most
> fundamental goal. If it's not yours, you'll save yourself a lot of
> grief by using XML or TeX.

In designing a good markup language for docstrings, I think we really
need to balance a number of goals.  XML and TeX do well with some
goals (e.g., they're formal, and XML is simple), but not so well with
others (they're not very easy to write, and not easy for the
uninitiated to read).  I think it's dangerous to concentrate too much
on any one goal.

> Then implement a POD-like language or a JavaDoc-like language or
> whatever.  This is clearly the dividing line: do you "buy in" to the
> Setext/StructuredText concept or not?

Do I have to "buy in" to all of it?  For example, to things like
saying that "*" is an asterisk if it appears once in a paragraph, but
an emph delimiter if it appears twice?  I appreciate many of the
features of ST-like languages.  I think that there's great potential
for clean/simple structuring, using them.  I think there's good
potential for simple colorizing, as long as we restrict it so that
it's "safe."  I think that we could potentially use one of those
without using the other.
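
To make the worry about "unsafe" colorizing concrete, here is a rough
sketch of a naive pairing rule.  It is not ST's actual algorithm; the
regular expression, the function name, and the <em> output are
assumptions chosen only to show how a "once vs. twice" rule can misfire:

```python
import re

# Hypothetical rule: any two asterisks in a paragraph delimit emphasis.
# A lone asterisk is left alone because there is nothing to pair it with.
NAIVE_EMPH = re.compile(r"\*(.+?)\*")

def naive_colorize(paragraph):
    """Wrap text between paired asterisks in <em> tags (illustrative only)."""
    return NAIVE_EMPH.sub(r"<em>\1</em>", paragraph)

one_star = naive_colorize("Compute 2 * 3 first.")
two_stars = naive_colorize("Compute 2 * 3 and 4 * 5.")
```

With one asterisk the paragraph comes back unchanged.  With two, the
multiplication signs are consumed as delimiters and " 3 and 4 " ends up
emphasized, which is the kind of surprise a "safe" rule would have to
rule out.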

> Again, read through the archives. Everyone has different opinions,
> everyone wants different levels of control. If you don't want to use
> a particular feature, don't. But someone else does. Please don't
> limit *me*.

But constructing a standard embraced by the community is really all
about limiting *you* (the user).  Without limitations on the user, we
can't write compatible tools.

One option, if you like it, is to say that any paragraph starting with
".. " will generate an error unless it starts a directive that a
parser knows about...  And anyone who uses directives should know that
they are making their docstrings less standard and less portable
across tools..  And, perhaps, "standard" directives can be added
to the language as time goes on, which will *not* result in less
standard/portable docstrings.
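
A minimal sketch of that option, assuming a parser that keeps a set of
directive names it understands.  The names here (KNOWN_DIRECTIVES,
UnknownDirectiveError, check_paragraph) are made up for illustration,
and "keywords" is just the directive used as an example in this thread:

```python
# Hypothetical set of "standard" directives this parser knows about.
KNOWN_DIRECTIVES = {"keywords"}

class UnknownDirectiveError(ValueError):
    """Raised when a '.. ' paragraph does not start a known directive."""

def check_paragraph(paragraph, known=KNOWN_DIRECTIVES):
    """Return the directive name, or None for an ordinary paragraph.

    Any paragraph starting with '.. ' that does not name a known
    directive (in the assumed 'name::' form) is reported as an error.
    """
    if not paragraph.startswith(".. "):
        return None
    first_line = paragraph.splitlines()[0]
    name, sep, _rest = first_line[3:].partition("::")
    name = name.strip()
    if not sep or name not in known:
        raise UnknownDirectiveError(
            "unrecognized directive paragraph: %r" % first_line)
    return name
```

Adding a directive to the standard would then amount to adding its name
to the known set, so docstrings using it would stay portable.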

> That's a big enough domain with enough controversy to make the
> feature necessary. See the archives. See this discussion! :-) It's
> been going on for years, you know.

I know it has.  I thought maybe we could end it.  :) But if you manage
to convince me otherwise, I guess I *will* go off and write my own
parser/docstring tools. ;)

> > I think that external URL
> > hyperlinks should be implemented with colorizing, if at all.
> 
> They're definitely required. I used readability as the overriding
> criterion in making that decision. Which is more readable?

I would argue for either::

  I love using the Python programming language (http://www.python.org).

or::

  I love using the Python programming language (U{http://www.python.org}).

...  But I know you'll disagree. :)

> It is my opinion that incomplete, minimal markup schemes are doomed
> to failure, because *your* minimal set of features doesn't match
> *my* set or *anybody else's*. At least at the discussion level. ;-)

I was trying to base my minimal set on previous successful docstring
markup languages (POD and JavaDoc).  If you think we need to add more
features, then we should talk about what features to add.  But only if
we're still trying to work towards coming up with a "community
standard" markup language (i.e., something we can put in a PEP).
Otherwise, we might as well just go off and implement our own little
markup languages. :)  But at the end of the day, (perhaps I should say
end of the year? ;), I would like to have a simple, straightforward,
*bounded* markup language.

> - whose first line begins with '.. ' in column 1,
> - whose second and subsequent lines are indented relative to the first, and
> - which ends with a blank or unindented line.

Hm..  My bad.  I skipped past the "Comments and Directives" section to
the "Directives" subsection.  From the example in that subsection
(which was presumably not correctly formatted), I assumed that the
second and subsequent lines didn't need to be indented::

    .. keywords::
    Author: Anne Elk (Miss)
    Revision: 1

So I guess we basically agree.  Is it ok with you to change that to
"and ends with an unindented line" for now (in our discussion of
directives)?
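
As a rough sketch of the rule as amended above: a block starts at a
line beginning with '.. ' in column 1, continues over indented lines,
and ends at the first unindented line.  Treating blank lines inside the
block as part of it (and trimming trailing blanks) is my assumption; the
function name is hypothetical:

```python
def directive_blocks(lines):
    """Collect directive blocks from a list of lines (illustrative only)."""
    blocks = []
    i = 0
    while i < len(lines):
        if lines[i].startswith(".. "):
            block = [lines[i]]
            i += 1
            # Continue over blank or indented lines; stop at an unindented one.
            while i < len(lines) and (not lines[i].strip()
                                      or lines[i][0].isspace()):
                block.append(lines[i])
                i += 1
            # Assumption: trailing blank lines do not belong to the block.
            while not block[-1].strip():
                block.pop()
            blocks.append(block)
        else:
            i += 1
    return blocks

indented = directive_blocks([
    ".. keywords::",
    "   Author: Anne Elk (Miss)",
    "   Revision: 1",
    "Next paragraph.",
])
flat = directive_blocks([
    ".. keywords::",
    "Author: Anne Elk (Miss)",
    "Revision: 1",
])
```

Under this rule the indented version yields one three-line block, while
the flat version (formatted like the example I misread) yields a block
containing only the ".. keywords::" line.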

> There are essentially two types of directives: extensions, which
> apply to their blocks only; and plugins, which may change the
> behaviour of the parser for some defined part of the input.

I would like to allow *only* "extensions."  If we allow "plugins,"
then a parser that doesn't recognize a directive really has no choice
but to fail.  I really don't see the need for plugins.

One advantage of just using fields is that we can deal with unknown
fields in a reasonable way: put their contents in a section labeled
with the name of the field.
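
A minimal sketch of that fallback, assuming fields arrive as
(name, body) pairs.  The set of known fields, the function name, and
the plain-text output format are all made up for illustration:

```python
# Hypothetical set of fields the tool knows how to format specially.
KNOWN_FIELDS = {"author", "version"}

def render_fields(fields, known=KNOWN_FIELDS):
    """Render (name, body) pairs as plain text (illustrative only)."""
    out = []
    for name, body in fields:
        if name.lower() in known:
            # Known field: formatted however the tool likes.
            out.append("%s: %s" % (name.capitalize(), body))
        else:
            # Unknown field: fall back to a section titled with its name.
            out.append("%s\n%s\n%s" % (name, "-" * len(name), body))
    return out

rendered = render_fields([("author", "Anne Elk (Miss)"), ("Revision", "1")])
```

Here "author" gets its special formatting, and the unrecognized
"Revision" field still shows up, as a small section of its own, instead
of making the parser fail.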

-Edward