[Doc-SIG] A promise

Fred L. Drake, Jr. fdrake@acm.org
Mon, 27 Nov 2000 12:49:34 -0500 (EST)


Laurence Tratt writes:
 > However, when the grammar changes other things tend to change too,
 > the sorts of changes that might well affect other parts of the
 > docutils system. I may be being a little too paranoid here... But
 > there again I wouldn't want to be the poor git maintaining parsers
 > for Python 1.5.2, 1.6, 2.0, Jython 1.0, Jython 1.1 etc... That's
 > just multiplying the number of chances for things to go seriously
 > wrong.

  What my (still unpublished) code does is use a generic tree matcher
with some variables specified in the pattern, and return the matching
tree and a dictionary mapping the variables to the subtrees that
matched.  I only needed a few patterns; we don't care about most of
the Python grammar, only enough to get docstrings out.
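
  A rough sketch of that kind of matcher, for illustration only (this
is not the unpublished code).  It assumes parse trees are nested
tuples and that a one-element list such as ['docstring'] marks a
pattern variable; the names match() and find() and the toy tree are
made up for the example.

```python
def match(pattern, tree, bindings):
    """Return True if tree has the shape of pattern, filling in bindings."""
    if isinstance(pattern, list):
        # A variable: bind it to whatever subtree sits at this position.
        bindings[pattern[0]] = tree
        return True
    if isinstance(pattern, tuple):
        # An interior node: same arity, and every child must match.
        if not isinstance(tree, tuple) or len(pattern) != len(tree):
            return False
        return all(match(p, t, bindings) for p, t in zip(pattern, tree))
    # A leaf: must compare equal.
    return pattern == tree


def find(pattern, tree):
    """Return (subtree, bindings) for the first match in tree, else None."""
    bindings = {}  # fresh dictionary per attempt, so failed tries leave nothing behind
    if match(pattern, tree, bindings):
        return tree, bindings
    if isinstance(tree, tuple):
        for child in tree:
            result = find(pattern, child)
            if result is not None:
                return result
    return None


# Toy tree (assumed shape, not real parser output).
tree = ("file_input",
        ("stmt", ("expr", ("STRING", "'''Doc.'''"))),
        ("ENDMARKER", ""))
pattern = ("stmt", ("expr", ("STRING", ["docstring"])))
result = find(pattern, tree)
```

  Only the handful of patterns that matter need to be written out; the
rest of the grammar never has to be described.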

 > I'll put my neck on the line: long term, tracking the current
 > Python release (even if not using the built in parser interface) is
 > the way to go. In the short term, you might get away with coping

  This is certainly the way to gain support for the current version.
I don't think there's likely to be much difficulty with extracting
most interesting information; the hardest part will be pulling apart
things like parameter lists, which have changed between 1.5.2 and
2.0.  That sort of information also raises semantic issues, and it
means pulling out a lot more than docstrings.  (Though I think
*that's* pretty valuable.)

 > with multiple versions but if eg the type/class dichotomy is solved
 > (did I see a PEP for that? Can't remember), then that might have
 > ramifications beyond the grammar probably ruining easily maintained
 > multiple version support.

  I don't think solving the type/class dichotomy has grammar
implications.

 > If you only use the tokenize module, you effectively have to write
 > your own grammar (be that for a parser system or implicitly in

  Yes, but if we're only extracting docstrings, the grammar can be
dirt simple.  The difficulties really only appear when pulling out
more detailed information than module/class structures and method
names/docstrings.
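
  To give an idea of how simple, here is a minimal sketch built on the
tokenize module, assuming a modern Python 3.  The "grammar" is just:
a name following def or class opens a scope once its indented body
starts, and a string that forms the whole first statement of a body
(or of the module) is the docstring.  The function name docstrings()
and the dotted-name keys are assumptions made for the example;
one-line bodies such as `def f(): "doc"` are deliberately ignored.

```python
import ast
import io
import tokenize


def docstrings(source):
    """Return {dotted_name: docstring}; the module docstring is under ''."""
    toks = list(tokenize.generate_tokens(io.StringIO(source).readline))
    found = {}
    scopes = []     # (name, indent depth of body) for enclosing def/class
    depth = 0
    pending = None  # def/class name whose indented body has not started
    want = ""       # scope awaiting its first statement, or None
    last = ""       # text of the previous significant token
    for i, tok in enumerate(toks):
        if tok.type in (tokenize.COMMENT, tokenize.NL):
            continue
        if tok.type == tokenize.NEWLINE:
            if last != ":":
                pending = None  # one-line body: no new scope opens
            continue
        if tok.type == tokenize.INDENT:
            depth += 1
            if pending is not None:
                scopes.append((pending, depth))
                want = ".".join(name for name, _ in scopes)
                pending = None
            continue
        if tok.type == tokenize.DEDENT:
            depth -= 1
            while scopes and scopes[-1][1] > depth:
                scopes.pop()
            continue
        # First significant token of a body: is it a lone string?
        if (want is not None and tok.type == tokenize.STRING
                and toks[i + 1].type in (tokenize.NEWLINE, tokenize.ENDMARKER)):
            try:
                value = ast.literal_eval(tok.string)
            except ValueError:
                value = None    # e.g. an f-string on older Pythons
            if isinstance(value, str):
                found[want] = value
        want = None
        if tok.type == tokenize.NAME and last in ("def", "class"):
            pending = tok.string
        last = tok.string
    return found


example = '''"""Module doc."""

class Spam:
    """Class doc."""

    def eggs(self, a, b=1):
        """Method doc."""
        return a

def bare(): pass

if True:
    x = "not a docstring"
'''
result = docstrings(example)
```

  Note that the parameter list of eggs() is skipped over without ever
being interpreted, which is exactly the part that gets hard.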


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations