[Python-Dev] PEP 287: reStructuredText Standard Docstring Format

David Goodger goodger@users.sourceforge.net
Fri, 05 Apr 2002 22:10:57 -0500


Guido van Rossum wrote:
> But if you ask me "should we use this for the standard library" I
> think I'll have to say no.

If the question is "Should we convert all library documentation and
stuff it into docstrings in the source?", I would agree with you
wholeheartedly.  There are some people who would like it that way, but
I'm not one of them.

People seem to have gotten the impression that I'm advocating taking
the library docs out of LaTeX and stuffing them into module
docstrings, for later extraction & processing.  That is *not* the
case!  I don't know *how* anyone got that idea!  <ahem> <ahem>  The
docs are safe from me. ;-)

> Given this status quo, docstrings in the Python standard library
> should not try to duplicate the library reference documentation;
> instead, they should be no more than concise hints.

I agree completely.

> For such docstrings, a markup language, even reStructuredText, is
> inappropriate.
> 
> IOW, reStructuredText is not even an option for new standard library
> modules.

What about existing docstrings?  There is plenty of informal markup in
there already.

For the standard library I would suggest that, once the tools are up
to it (e.g., once there's reStructuredText support in pydoc),
*existing* docstrings could be *minimally* converted to formalize the
implicit markup that's *already there*.  For example, here's the
module docstring for the string.py module:

"""A collection of string operations (most are no longer used in
Python 1.6).

Warning: most of the code you see here isn't normally used nowadays.
With Python 1.6, many of these functions are implemented as methods on
the standard string object. They used to be implemented by a built-in
module called strop, but strop is now obsolete itself.

Public module variables:

whitespace -- a string containing all characters considered whitespace
lowercase -- a string containing all characters considered lowercase l
uppercase -- a string containing all characters considered uppercase l
letters -- a string containing all characters considered letters
digits -- a string containing all characters considered decimal digits
hexdigits -- a string containing all characters considered hexadecimal
octdigits -- a string containing all characters considered octal digit
punctuation -- a string containing all characters considered punctuati
printable -- a string containing all characters considered printable

"""

(I wrapped the first two paragraphs and truncated the list so email
wouldn't wreck it.)

As it stands, this is almost valid reStructuredText (strictly speaking
it is already valid, but the list would be parsed as one paragraph, so
its lines would run together).  The list of variables needs a bit of
work; it could be
turned into a bullet list or a definition list.  The variable
identifiers themselves could be marked up as "interpreted text"
(e.g. rendered in a different face, with links to each identifier's
docstring if it exists).  The warning could be left as-is, or spruced
up.  Here is the fully converted docstring:

"""A collection of string operations (most are no longer used in
Python 1.6).

.. Warning:: most of the code you see here isn't normally used
   nowadays.  With Python 1.6, many of these functions are implemented
   as methods on the standard string object. They used to be
   implemented by a built-in module called strop, but strop is now
   obsolete itself.

Public module variables:

`whitespace`
    a string containing all characters considered whitespace
`lowercase`
    a string containing all characters considered lowercase letters
`uppercase`
    a string containing all characters considered uppercase letters
`letters`
    a string containing all characters considered letters
`digits`
    a string containing all characters considered decimal digits
`hexdigits`
    a string containing all characters considered hexadecimal digits
`octdigits`
    a string containing all characters considered octal digits
`punctuation`
    a string containing all characters considered punctuation
`printable`
    a string containing all characters considered printable

"""
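
The definition-list step of that conversion is mechanical enough to
sketch in a few lines of Python.  This is only an illustration, not
part of pydoc or the reStructuredText tools: the helper name
to_definition_list and its regular expression are assumptions made up
for this example, and it only handles single-line "name -- description"
entries like the ones in string.py.

```python
import re

# Hypothetical helper (not part of pydoc or Docutils): matches the informal
# "name -- description" convention used in the string.py docstring.
_ENTRY = re.compile(r"^(\w+) -- (.+)$")


def to_definition_list(lines):
    """Rewrite "name -- description" lines as reStructuredText
    definition-list items; all other lines pass through unchanged."""
    out = []
    for line in lines:
        match = _ENTRY.match(line)
        if match:
            name, description = match.groups()
            out.append("`%s`" % name)        # term, marked as interpreted text
            out.append("    " + description)  # indented definition
        else:
            out.append(line)
    return out


informal = [
    "Public module variables:",
    "",
    "whitespace -- a string containing all characters considered whitespace",
    "digits -- a string containing all characters considered decimal digits",
]
converted = to_definition_list(informal)
print("\n".join(converted))
```

Anything that does not match the pattern (the heading line, blank
lines, ordinary paragraphs) is left alone, which is the point: only the
markup that is already implicit gets formalized.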

The conversion is minimal (it could be even less), the docstring is
still perfectly readable, and the difference in the converted output
is significant.  Please take a look at the converted output (1 or 2)
and compare it to the output for vanilla pydoc (3).

1. http://structuredtext.sf.net/spec/string.html
2. http://structuredtext.sf.net/spec/string2.html (bullet list instead
   of definition list)
3. the first section of http://web.pydoc.org/2.2/string.html

(Note that the HTML uses a CSS1 stylesheet, so a recent browser is
required.  An HTML writer for older browsers is on the to-do list.)

In any case, nothing needs to be done any time soon.  What do you
think?

> I agree with Jeremy that the PEP needs to be clear and explicit
> about this.

Will do.

-- 
David Goodger    goodger@users.sourceforge.net    Open-source projects:
 - Python Docstring Processing System: http://docstring.sourceforge.net
 - reStructuredText: http://structuredtext.sourceforge.net
 - The Go Tools Project: http://gotools.sourceforge.net