[Doc-SIG] docstring grammar

M.-A. Lemburg mal@lemburg.com
Tue, 30 Nov 1999 17:58:36 +0100


Edward Welbourne wrote:
> 
> > Thus:
> > * Any line starting with a word followed by a colon can be considered
> > a keyword.  If you don't want this, just make sure it's not the first
> > word on the line.
> 
> Not happy.  A paragraph of text which precedes an example may be relied
> upon to end in `for example:', in which the last contiguous block of
> non-space characters is of length 8; if I modify an earlier part of the
> paragraph, I'm going to ask my authoring tool (python-mode.el) to
> reformat the paragraph, without necessarily being aware of a gotcha
> waiting for me at the paragraph's end; my margins will be within 72
> characters of one another, giving a roughly 1 in 9 chance that
> `example:' ends up being alone on the last line ... gotcha.
> 
> A cure for this would just be to do keyword-recognition case
> sensitively, and Capitalise keywords; otherwise, we have to insist on
> either a dedent or a blank line preceding any keyword.  Which offends
> folk worse: case sensitivity or needing a dedent/vspace ?

Why not just raise an exception ? I don't think that the
usage of "some text:" is common in doc strings, except
maybe for examples, which should then be adapted to use the
new "Example:" keyword.

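A rough sketch of that strict approach, assuming a fixed keyword
set. The names KEYWORDS and DocstringSyntaxError are made up for
illustration and are not part of any proposal:

```python
import re

# Hypothetical keyword set; the real list is whatever the proposal settles on.
KEYWORDS = {"Arguments", "Returns", "History", "Example", "References"}

# A keyword line is a single word followed by a colon and nothing else.
KEYWORD_LINE = re.compile(r"^\s*([A-Za-z]+):\s*$")


class DocstringSyntaxError(Exception):
    """Raised when a line looks like a keyword but is not a known one."""


def find_keywords(docstring):
    """Return (line_number, keyword) pairs, raising on unknown keywords."""
    found = []
    for lineno, line in enumerate(docstring.splitlines(), 1):
        match = KEYWORD_LINE.match(line)
        if match is None:
            continue
        word = match.group(1)
        if word not in KEYWORDS:
            raise DocstringSyntaxError(
                "line %d: unknown keyword %r" % (lineno, word))
        found.append((lineno, word))
    return found
```

With this, a reflowed paragraph that leaves `example:' alone on its
last line is reported as an error instead of being silently taken
for a keyword.
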
Here's an example docstring... the format looks pretty nice,
IMHO.

"""
foo(bar,rab,oof) -> integer -- single line description

Longer description spanning
multiple lines

Arguments:
    bar -- some string
    rab -- another string
    oof -- an integer       

Returns:
    42 in most cases

History:
    19991130 MAL -- Added oof argument
    19991101 MAL -- Created

"""

Not sure if this is already somewhere in the proposal, but
I would like to see '--' as indicator of a single line
text block. This would be useful in vertically compressing
the docstrings somewhat (and it is already being used in the
signature line for such a purpose).
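
Splitting such a line takes very little code. A minimal sketch,
assuming the separator is always ' -- ' with a space on either side;
the helper name split_entry is hypothetical:

```python
def split_entry(line):
    """Split 'name -- text' into (name, text); return None without ' -- '."""
    name, sep, text = line.strip().partition(" -- ")
    if not sep:
        return None  # an ordinary text line, not a single-line entry
    return name.strip(), text.strip()
```

Applied to the example above, "    bar -- some string" splits into the
name "bar" and the text "some string", while "    42 in most cases"
is left alone as plain text.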
 
> > * A star or dash starting a line can be considered a new list item.
> > Again, if it is truly a hyphen or whatever else, just adjust your line
> > wrap slightly so it is no longer the first word.
> 
> Alternatively, all lists use the same `item-introducer' character and
> follow it with an optional character indicating what bullet to use.
> Thus one might have (taking ~ as the introducer for the illustration)
> 
> ...

Let's leave this to some list parser (are we starting to head
for NP-completeness again ;-).

> > Other random thoughts:
> > * The [blah] notation is good, but needs to be well defined.  eg,
> > "[module.function]" when used in the context of a package should use
> > the same "module scoping" that Python itself uses.

Right. It should ideally perform the same lookup as Python would
in the global namespace. The resulting object could then either
be handled recursively by the doc tool or simply stored by reference
for later use (e.g. via the file name of a module or the id of an
object).
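
Something along these lines could do the lookup, assuming the doc
tool is handed the module's global namespace as a dict. The function
name resolve is hypothetical, and package-relative lookups are not
handled:

```python
import builtins


def resolve(name, namespace):
    """Look up a dotted name in namespace, then in builtins."""
    first, *rest = name.split(".")
    if first in namespace:
        obj = namespace[first]
    elif hasattr(builtins, first):
        obj = getattr(builtins, first)
    else:
        raise NameError("name %r is not defined" % first)
    for attr in rest:
        obj = getattr(obj, attr)  # AttributeError if the path is wrong
    return obj
```

The returned object can then be documented recursively or just
stored by reference, as described above.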
 
> The thing that saves [this] from being problematic is that the format in
> which it was introduced presumed that one was going to use a brief
> mnemonic as [this] word and end the docstring with a chunk which
> explains the cross-references (new keyword: Xrefs ?) and, in particular,
> tells the doc-string-reader which [tokens] actually have a translation,
> the rest being left as typed; thus, if this paragraph appeared in a
> docstring which says how to translate [this] (giving an xref and -
> optionally - a text to use (default `this') in place of [this]), the
> digested form would duly replace [this] but leave [tokens] as it is.
> 
> To further simplify life, I'd understood the [this] keys that are
> translatable to insist on [nowhitespace] to save the parser most of its
> `this might be an xref' pending decisions - which is why the Xrefs
> section needs to at least have the option of specifying the text to be
> used in place of [this] as well as the Xref to point it at.  What we're
> doing is citation, which is widely done with [].
> 
> No need for [this] to be a [module.function] or anything like - the
> Xrefs section provides the translation.
> 
> Xrefs:
>    [gendoc] http://www.python.org/contrib/gendoc/
>    [this] http://www.python.org/lists/doc-sig/hideous?with=data&as=you+will The present message
>    [copy] string.copy the standard string copy function
>    [etc] location sub sti tute
> 
> [sorry, all exhibited xrefs are bogus - illustrative only]
> I'm sure that's only a minor paraphrase of a spec I saw a while ago on
> this list ...
> 
> Of course, Xrefs might better be called Bibliography.

Or perhaps "References:" as in David's proposal ?!
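
Whatever the section ends up being called, the entries in the quoted
illustration all have the shape "[key] location optional text". A
sketch of reading them, assuming one entry per line and a location
without embedded whitespace; ENTRY and parse_references are
hypothetical names:

```python
import re

# One entry: "[key] location optional replacement text".
ENTRY = re.compile(r"^\s*\[(\S+?)\]\s+(\S+)(?:\s+(.*\S))?\s*$")


def parse_references(lines):
    """Map each key to (location, text); text defaults to the key itself."""
    table = {}
    for line in lines:
        match = ENTRY.match(line)
        if match is None:
            continue  # not an entry; a real tool might complain here
        key, location, text = match.groups()
        table[key] = (location, text or key)
    return table
```

The default of using the key as the replacement text follows the
"(default `this')" remark in the quoted message.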

> We can use as `location' some pythonic reference that can be resolved in
> the ways that the suggested module.function semantics point to: indeed,
> I would take this as what to try first, falling back on recognising
> other stuff as URLs and similar.
> 
> > ... However, the use
> > of brackets may conflict with people who use inline code (rather than
> > an example "block") - maybe something like "@" could be used?
> > @module.function@ would be reasonable.
> 
> With the above, can we evade this ?
> The fact that [citations] are so widely used argues for the [form]; and
> the fact that [anything with space in it] isn't a citation should make
> all the `ordinary text' and `python denotations' [usages] unproblematic,
> while leaving untranslated ones as [literal] uses of [ and ].  If
> nothing else, I find my eye latches onto [cite] better than @cite@ ...
> and bear in mind that @ has some other magic uses,
> 
> parser error - unclosed citation at line 137:
>       Sender: eddyw@lsl.co.uk
> 
> All told, we seem to have a fairly good spec ... save for some
> nitpickery ;^>

Since [] is only used for lists and indexing in Python, we could
define the RE '\[[a-zA-Z0-9_.]+\]' for our purposes (note: no
whitespace, no commas) and raise an exception in case the
enclosed reference cannot be mapped to a symbol in the global
namespace which evaluates to either a function, method, module
or reference object.
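
A sketch of that check, assuming the namespace is passed in as a
dict. UnresolvedReference and check_references are hypothetical
names, and the "reference object" case is left out because its type
is not pinned down here:

```python
import re
import types

# The RE from above, with a group around the enclosed name.
REFERENCE = re.compile(r"\[([a-zA-Z0-9_.]+)\]")

# Object kinds a reference may point at (functions, methods, modules).
ALLOWED = (types.FunctionType, types.BuiltinFunctionType,
           types.MethodType, types.ModuleType)


class UnresolvedReference(Exception):
    """Raised when a [reference] cannot be mapped to a suitable object."""


def check_references(docstring, namespace):
    """Return {name: object} for every [name] found in docstring."""
    resolved = {}
    for name in REFERENCE.findall(docstring):
        first, *rest = name.split(".")
        try:
            obj = namespace[first]
            for attr in rest:
                obj = getattr(obj, attr)
        except (KeyError, AttributeError):
            raise UnresolvedReference(name) from None
        if not isinstance(obj, ALLOWED):
            raise UnresolvedReference(name)
        resolved[name] = obj
    return resolved
```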

Doc strings like "...use [None]*10 as argument..." will fail,
but are easily avoided by inserting some extra whitespace, e.g.
"...use [ None ] * 10 as argument...".

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    31 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/