[Doc-SIG] looking for prior art

David Goodger goodger@python.org
Fri, 06 Dec 2002 21:47:58 -0500


Doug Hellmann wrote:
> Does compiler include comments?  I had to write a separate parser to
> pull comments out.

As Michael said, no.  That's another reason for using compiler and
tokenize in parallel.
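
To illustrate the parallel approach, here is a minimal sketch.  It
assumes a modern Python, where the standard "ast" module stands in
for the compiler package (which was removed in Python 3).  The
function names and the sample source are made up for illustration;
this is not the Docutils code.

```python
import ast
import io
import tokenize


def extract_comments(source):
    """Return (line_number, text) pairs for every comment token."""
    readline = io.StringIO(source).readline
    return [(tok.start[0], tok.string)
            for tok in tokenize.generate_tokens(readline)
            if tok.type == tokenize.COMMENT]


def function_names(source):
    """Return (line_number, name) pairs from the syntax tree.

    The tree carries no comments, which is why the token pass
    above is needed alongside it.
    """
    return [(node.lineno, node.name)
            for node in ast.walk(ast.parse(source))
            if isinstance(node, ast.FunctionDef)]


SAMPLE = "x = 1  # the answer\n# standalone\ndef f():\n    pass\n"
comments = extract_comments(SAMPLE)
functions = function_names(SAMPLE)
```

Both passes report line numbers, so the two results can be matched up
afterwards by position.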

>> The Docutils Python reader component will transform this AST into a
>> Python-specific doctree, and then a `stylist transform`_ would
>> further transform it into a generic doctree.  Namespaces will have
>> to be compiled for each of the scopes, but I'm not certain at what
>> stage of processing.
> 
> Why perform all of those transformations?  Why not go from the AST
> to a generic doctree?  Or, even from the AST to the final output?

I want the docutils.readers.python.moduleparser.parse_module()
function to produce a standard documentation-oriented AST that can be
used by any tool.  We can develop it together without having to
compromise on the rest of our design (i.e., HappyDoc doesn't have to
be made to work like Docutils, and vice-versa).  It would be a
higher-level version of what the compiler package provides.

The Python reader component transforms this generic AST into a
Python-specific doctree (it knows about modules, classes, functions,
etc.), but this is specific to Docutils and cannot be used by HappyDoc
or others.  The stylist transform does the final layout, converting
Python-specific structures ("class" sections, etc.) into a generic
doctree using primitives (tables, sections, lists, etc.).  This
generic doctree does *not* know about Python structures any more.  The
advantage is that this doctree can be handed off to any of the output
writers to create any output format we like.
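
A rough sketch of the three stages described above, under heavy
assumptions: plain dicts and tuples stand in for real node classes,
"ast" stands in for the compiler package, and every function name
here is hypothetical (none of this is the actual Docutils API).

```python
import ast


def sketch_parse_module(source):
    """Stage 1: tool-neutral documentation tree.

    Docstrings are kept as raw text; nothing interprets them here.
    """
    tree = ast.parse(source)
    return {
        "type": "module",
        "docstring": ast.get_docstring(tree),
        "children": [
            {"type": "class" if isinstance(n, ast.ClassDef) else "function",
             "name": n.name,
             "docstring": ast.get_docstring(n)}
            for n in tree.body
            if isinstance(n, (ast.ClassDef, ast.FunctionDef))
        ],
    }


def to_python_doctree(module_tree):
    """Stage 2: Python-specific doctree (still knows about classes)."""
    return [("%s_section" % child["type"], child["name"],
             child["docstring"] or "")
            for child in module_tree["children"]]


def to_generic_doctree(py_doctree):
    """Stage 3: one possible stylist; emits only generic primitives."""
    return [("section", "%s %s" % (kind.split("_")[0], name),
             [("paragraph", doc)])
            for kind, name, doc in py_doctree]
```

Only the output of the first stage would be shared between tools; the
second and third stages are where a tool's own design lives.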

The latter two transforms are separate because I want to be able to
have multiple independent layout styles (multiple runtime-selectable
"stylist transforms").  Each of the existing tools (HappyDoc, pydoc,
epydoc, Crystal, etc.) has its own fixed format.  I personally don't
like the tables-based format produced by these tools, and I'd like to
be able to customize the format easily.  That's the goal of stylist
transforms, which are independent from the Reader component itself.
One stylist transform could produce HappyDoc-like output, another
could produce output similar to module docs in the Python library
reference manual, and so on.
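
One way runtime selection could look, as a sketch only: the registry,
the stylist names, and the tuple-based node shapes are all
assumptions made for this example, not an existing interface.

```python
def sections_stylist(py_doctree):
    """Lay out each Python object as its own section."""
    return [("section", name, [("paragraph", doc)])
            for kind, name, doc in py_doctree]


def table_stylist(py_doctree):
    """Lay out all Python objects as rows of a single table."""
    rows = [("row", [kind, name, doc]) for kind, name, doc in py_doctree]
    return [("table", rows)]


# Hypothetical registry: a stylist is chosen by name at runtime.
STYLISTS = {"sections": sections_stylist, "table": table_stylist}


def apply_stylist(name, py_doctree):
    """Convert a Python-specific doctree with the named stylist."""
    try:
        stylist = STYLISTS[name]
    except KeyError:
        raise ValueError("unknown stylist: %r" % name)
    return stylist(py_doctree)


PY_DOCTREE = [("class", "Parser", "Parse a module."),
              ("function", "main", "Entry point.")]
as_sections = apply_stylist("sections", PY_DOCTREE)
as_table = apply_stylist("table", PY_DOCTREE)
```

The input is the same in both calls; only the layout decision
changes, which is the point of keeping stylists separate from the
Reader.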

It's for exactly this reason:

>> It's very important to keep all docstring processing out of this,
>> so that it's completely generic and not tool-specific.

... but it goes past docstring processing.  It's also important to
keep style decisions and tool-specific data transforms out of this
module parser.

-- 
David Goodger  <goodger@python.org>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/