[Doc-SIG] Docstring Standards

David Goodger goodger@users.sourceforge.net
Tue, 01 Oct 2002 00:25:43 -0400


Mahrt, Dallas wrote:
> Background: I am in the process of defining some internal
> programming standards for my company. One aspect we are keenly
> interested in is defining the docstring syntax such that we can
> facilitate documentation generation (similar to Doxygen or JavaDoc
> which we use currently) I have read about the docutils project and
> feel it is probably the best fit (in the long term) for our needs,
> however I have a question.

"In the long term" is important here, because (as you know from our
previous correspondence) Docutils doesn't have docstring extraction,
**yet**.  It will, and hopefully soon, but that depends on the time
and effort of volunteers.  (Not just me, hopefully.  It looks like
Richard Jones is dabbling in the pysource sandbox today, which is
encouraging.)

> 1) Method signatures.
> In Doxygen and JavaDoc, there is an explicit ''@param'' syntax for
> defining the documentation related to a parameter. There are similar
> constructs for ''@exception'' and ''@return''.

What JavaDoc has done is establish a syntax that enables a certain
documentation methodology, or standard *semantics*.  JavaDoc is not just
syntax; it prescribes a methodology.  I began to document some ideas
about semantics, available here:

    http://docutils.sf.net/spec/semantics.html

I haven't explored documentation methodology more because, in my
opinion, it is a completely separate issue from syntax, and it's even
more controversial than syntax.  Nobody wants to be told how to lay
out their documentation, a la JavaDoc.  I think the JavaDoc way is
butt-ugly, but it *is* an established standard for the Java world.
Any standard documentation methodology has to be formal enough to be
useful but remain light enough to be usable.  If the methodology is
too strict, too heavy, or too ugly, many/most will not want to use it.

One thing I've experimented with is expressed in the above document
thus:

    Use field lists or definition lists for "tagged blocks".

By this I mean that field lists can be used similarly to JavaDoc's
@tag syntax.  That's actually one of the motivators behind field
lists.  For example, we could have::

    """
    :Parameters:
        - `lines`: a list of one-line strings without newlines.
        - `until_blank`: Stop collecting at the first blank line if
          true (1).
        - `strip_indent`: Strip common leading indent if true (1,
          default).

    :Return:
        - a list of indented lines with minimum indent removed;
        - the amount of the indent;
        - whether or not the block finished with a blank line or at
          the end of `lines`.
    """

In fact, this is taken straight out of docutils/statemachine.py, in
which I experimented with a simple documentation methodology.  Another
variation I've thought of exploits the Grouch_-compatible "classifier"
element of definition lists.  For example::

    :Parameters:
        `lines` : [string]
            List of one-line strings without newlines.
        `until_blank` : boolean
            Stop collecting at the first blank line if true (1).
        `strip_indent` : boolean
            Strip common leading indent if true (1, default).

.. _Grouch: http://www.mems-exchange.org/software/grouch/
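
To illustrate how a tool might read the classifier variant above, here
is a minimal sketch using only the Python standard library.  It is not
how Docutils parses definition lists; the names `parse_parameters`,
`TERM`, and `DOCSTRING` are made up for this example, and it assumes
each term is written exactly as `` `name` : type `` with the
description indented below it.

```python
import re
import textwrap

# Sample docstring text in the definition-list-with-classifier style.
DOCSTRING = """
:Parameters:
    `lines` : [string]
        List of one-line strings without newlines.
    `until_blank` : boolean
        Stop collecting at the first blank line if true (1).
    `strip_indent` : boolean
        Strip common leading indent if true (1, default).
"""

# Matches a term with a classifier, e.g. "`lines` : [string]".
# (Assumed layout for this sketch, not a full reStructuredText grammar.)
TERM = re.compile(r"^`(?P<name>[^`]+)`\s+:\s+(?P<type>.+)$")


def parse_parameters(docstring):
    """Return a list of (name, type, description) tuples."""
    params = []
    in_field = False
    for line in textwrap.dedent(docstring).splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if not line.startswith(" "):
            # An unindented line either opens the Parameters field
            # or starts some other field/paragraph.
            in_field = stripped == ":Parameters:"
            continue
        if not in_field:
            continue
        match = TERM.match(stripped)
        if match:
            params.append([match.group("name"), match.group("type"), ""])
        elif params:
            # Any other indented line continues the current description.
            params[-1][2] = (params[-1][2] + " " + stripped).strip()
    return [tuple(p) for p in params]


if __name__ == "__main__":
    for name, type_, description in parse_parameters(DOCSTRING):
        print(name, "|", type_, "|", description)
```

The point of the classifier is visible here: the type lands in its own
slot without any extra markup in the description text.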

Field lists could even be used in a one-to-one correspondence with
JavaDoc @tags, although I don't know if I'd recommend it.  The entire
question of methodology requires more serious thought than I can
afford at present.  I think a standard methodology would benefit the
Python community, but it would be a hard sell.  A PEP would be the
place to start.

> The only thing similar I have found in docutils documentations is
> the use of explicit roles. Ex.
> 
>     def foo(bar):
>         """This is foo.
> 
>         :parameter"`foo` - This is a foo
>         """

I think this should be::

    :parameter:`foo`

I don't think I'd use that syntax (interpreted text with explicit
roles) because it's very verbose and cumbersome.  Interpreted text has
syntax in reStructuredText but it hasn't really been implemented yet,
and may be rethought if something better shows up.  It's there in
anticipation of future need (yes, I know, not the XP way), especially
for Python docstring extraction.

> Is this the *standard* way of documenting parameters?

No.

> If not is there a standard?

No.  There have been attempts though.  Several ports of JavaDoc's @tag
methodology exist in Python, most recently Ed Loper's "epydoc_".
There's Frederic Giacometti's `iPhrase Python documentation
conventions`_.  I'm sure there've been others.

.. _epydoc: http://epydoc.sf.net/
.. _iPhrase Python documentation conventions:
   http://mail.python.org/pipermail/doc-sig/2001-May/001840.html

> If so is there a similar concept for raised exceptions and return
> values? 

Easy enough to do.  See the docstring sample above.

> The only thing I've noticed in practice are English descriptions with
> literal references. Ex.
> 
>     def foo(bar):
>         """This is foo.
> 
>         Passes `foo` which is a foo.
>         """
> 
> This seems to be more difficult to extract the description from the
> identifier for a more tabular representation (like JavaDoc)

Agreed.  However, for most human-readable documentation needs, the
free-form text approach is adequate.  You'd only need a formal
methodology if you want to extract the parameters into a data
dictionary, index, or summary of some kind.
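
As a rough sketch of what such an extraction could look like for the
bulleted `:Parameters:` field shown earlier, the following uses only
the standard library (`inspect` and `re`).  It is an illustration under
stated assumptions, not a Docutils feature: `collect_block` is a
hypothetical stand-in function, `data_dictionary` and `ITEM` are names
invented here, and the parser assumes each entry starts with
``- `name`:`` on its own line.

```python
import inspect
import re


def collect_block(lines, until_blank=0, strip_indent=1):
    """
    Hypothetical stand-in function; only its docstring matters here.

    :Parameters:
        - `lines`: a list of one-line strings without newlines.
        - `until_blank`: Stop collecting at the first blank line if
          true (1).
        - `strip_indent`: Strip common leading indent if true (1,
          default).
    """


# Matches one bullet entry, e.g. "- `lines`: a list of ...".
ITEM = re.compile(r"^- `(?P<name>[^`]+)`:\s*(?P<text>.*)$")


def data_dictionary(func):
    """Map each documented parameter name to its description."""
    doc = inspect.getdoc(func) or ""  # getdoc() normalizes indentation
    entries = {}
    current = None
    in_field = False
    for line in doc.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if not line.startswith(" "):
            # Unindented text is a field name or ordinary prose.
            in_field = stripped == ":Parameters:"
            current = None
            continue
        if not in_field:
            continue
        match = ITEM.match(stripped)
        if match:
            current = match.group("name")
            entries[current] = match.group("text")
        elif current is not None:
            # Wrapped line: append to the entry being collected.
            entries[current] += " " + stripped
    return entries


if __name__ == "__main__":
    for name, text in data_dictionary(collect_block).items():
        print(f"{name:<14}{text}")
```

The free-form style in the quoted example gives such a tool nothing to
anchor on, which is the practical argument for a field-list convention
when a tabular summary is wanted.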

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/