[Doc-SIG] docstring grammar

Guido van Rossum guido@CNRI.Reston.VA.US
Fri, 03 Dec 1999 08:52:02 -0500


> That's why gendoc has a switch to be able to either parse
> the module or import it. Note that imports are the only way
> to extract information from C extensions.

Hm...  C extensions are also the most dangerous (in some cases) to
import, and furthermore this restricts you to generating
documentation for modules that actually work on your current
platform.  Not a good idea.

> Perhaps there is a way to only extract class/function/method
> __doc__ strings from pyc-modules without actually running them,
> since those are really our only targets.
> 
> [looks at some module code objects...]
> 
> It wasn't obvious from the code objects I just looked at,
> but there could be a way... after all, the information must be hidden
> somewhere between those byte codes ;-)

Quite easily:

    & python
    Python 1.5.2+ (#929, Aug  4 1999, 13:59:33)  [GCC 2.8.1] on sunos5
    Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
    [startup.py ...]
    [startup.py done]
    >>> import string
    >>> fn = string.__file__
    >>> fn
    '/usr/local/lib/python1.5/string.pyc'
    >>> import marshal
    >>> f = open(fn, "rb")
    >>> f.seek(8)    # skip the 8-byte .pyc header (magic number + mtime)
    >>> c = marshal.load(f)
    >>> f.close()
    >>> c
    <code object ? at 104a60, file "/usr/local/lib/python1.5/string.py", line 0>
    >>> dir(c)
    ['co_argcount', 'co_code', 'co_consts', 'co_filename', 'co_firstlineno', 'co_flags', 'co_lnotab', 'co_name', 'co_names', 'co_nlocals', 'co_stacksize', 'co_varnames']
    >>> print c.co_consts[0]
    Common string manipulations.

    Public module variables:

    whitespace -- a string containing all characters considered whitespace
    lowercase -- a string containing all characters considered lowercase letters
    uppercase -- a string containing all characters considered uppercase letters
    letters -- a string containing all characters considered letters
    digits -- a string containing all characters considered decimal digits
    hexdigits -- a string containing all characters considered hexadecimal digits
    octdigits -- a string containing all characters considered octal digits


    >>> codes = filter(lambda x: type(x).__name__ == "code", c.co_consts)
    >>> for x in codes: print x.co_consts[0]; print "-"*20

    lower(s) -> string

	    Return a copy of the string s converted to lowercase.


    --------------------
    (etc.)
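
For readers on a current CPython, the same trick might look roughly like
the sketch below.  It is not part of the session above and rests on a few
assumptions: CPython 3.7 or later, where the .pyc header is 16 bytes
rather than the 8 skipped above; the implementation detail that a
function's docstring, when present, is the first entry of co_consts; and
a restriction to functions and methods, since module and class code
objects are laid out differently nowadays.  The names HEADER_SIZE,
load_code, function_docstrings, demo, and the sample module are all made
up for illustration.

```python
import inspect
import marshal
import os
import py_compile
import tempfile

# Assumption: CPython 3.7+ writes a 16-byte .pyc header
# (the 1.5.2 session above skips 8 bytes instead).
HEADER_SIZE = 16

# Newer CPython releases flag code objects that carry a docstring;
# fall back to 0 (flag unavailable) on releases without it.
_HAS_DOC_FLAG = getattr(inspect, "CO_HAS_DOCSTRING", 0)

# Hypothetical sample module, used only for this demonstration.
SAMPLE_SOURCE = '''
def lower(s):
    """Return a lowercase copy of s."""
    return s.lower()

class Greeter:
    def greet(self):
        """Say hello."""
        return "hello"
'''


def load_code(pyc_path):
    """Unmarshal the top-level code object of a .pyc without running it."""
    with open(pyc_path, "rb") as f:
        f.seek(HEADER_SIZE)
        return marshal.load(f)


def function_docstrings(code, found=None):
    """Walk nested code objects and collect {function name: docstring}."""
    if found is None:
        found = {}
    for const in code.co_consts:
        if not inspect.iscode(const):
            continue
        # Function bodies have CO_OPTIMIZED set; class bodies do not.
        # Names like "<lambda>" are skipped because they cannot have docstrings.
        is_function = bool(const.co_flags & inspect.CO_OPTIMIZED)
        if is_function and not const.co_name.startswith("<") and const.co_consts:
            first = const.co_consts[0]
            if _HAS_DOC_FLAG:
                has_doc = bool(const.co_flags & _HAS_DOC_FLAG)
            else:
                # Older releases store None first when there is no docstring.
                has_doc = isinstance(first, str)
            if has_doc:
                found[const.co_name] = first
        # Recurse so that methods inside class bodies are found as well.
        function_docstrings(const, found)
    return found


def demo():
    """Compile the sample module to a .pyc, then read docstrings back from it."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "sample.py")
        pyc = os.path.join(tmp, "sample.pyc")
        with open(src, "w") as f:
            f.write(SAMPLE_SOURCE)
        py_compile.compile(src, cfile=pyc, doraise=True)
        return function_docstrings(load_code(pyc))


if __name__ == "__main__":
    for name, doc in demo().items():
        print(name, "->", doc)
```

Note that marshal data is specific to the Python version that wrote it,
so a .pyc should be read with the same interpreter version that compiled
it.  Nothing from the sample module is ever imported or executed here;
only its code objects are inspected.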

--Guido van Rossum (home page: http://www.python.org/~guido/)