[Doc-SIG] docstring grammar

Tony J Ibbs (Tibs) tony@lsl.co.uk
Mon, 29 Nov 1999 09:50:49 -0000


I would *love* to see a standard for doc strings, and although I've often
objected to specific proposals in the past, by now I'd take almost anything.
Well, no, that's NEVER true, but David's proposal doesn't cause *too* many
knee-jerk reactions...

David Ascher wrote:
>   Paragraphs are separated by one or more blank lines.

As you say later on, I think this does cause some over-use of whitespace...

>   Characters between # signs and the end of the line are stripped by
>   the docstring parser.

This is a Bad Thing - I have quite often needed to discuss things in doc
strings which include use of the "#" character - not least if I'm parsing a
little language that uses "#" as its comment character! So losing text this
way would be a real problem. Either (a) tell me why we need comments in doc
strings at all, or (b) provide a way to escape the "#" character.
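
A minimal sketch of option (b), purely as illustration. The backslash
escape and the name strip_comment are assumptions; nothing in the proposal
fixes an escape character.

```python
import re

# Assumed rule: "#" starts a comment unless a backslash comes right before it.
_COMMENT = re.compile(r"(?<!\\)#.*$")

def strip_comment(line):
    """Drop an unescaped '#' comment, then turn each escaped '#' into '#'."""
    without_comment = _COMMENT.sub("", line).rstrip()
    return without_comment.replace("\\#", "#")
```

With this rule a line of a doc string can still talk about "#" by writing
it escaped, at the cost of one more special character to remember.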

(Also, if one were using Tim Peters' "test using the doc string as template"
thingy, one needs to be able to put generic Python code in the doc strings,
and that means that stopping comment characters from going through to the
ultimate documentation may be a bad thing.)
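
To make that concrete, here is a small sketch using doctest (presumably
the tool meant above; it is in today's standard library). The function
fields and the helper run_docstring are invented for the illustration. The
same examples are run twice: once from the doc string as written, and once
after naively dropping everything from "#" to the end of each line.

```python
import doctest
import re

def fields(line):
    """Split a record on '#'.

    >>> fields("a#b")   # both the comment and the literal contain '#'
    ['a', 'b']
    """
    return line.split("#")

def run_docstring(text, globs):
    """Run the examples found in text; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(text, dict(globs), "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    result = runner.run(test, out=lambda s: None)  # discard failure reports
    return result.failed, result.attempted

globs = {"fields": fields}
intact = run_docstring(fields.__doc__, globs)
# Naive reading of the proposal: drop everything from '#' to end of line.
stripped = run_docstring(re.sub(r"#.*", "", fields.__doc__), globs)
```

The intact doc string passes, while the stripped one fails, because the
string literal in the example is cut off at its "#".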

>   A 'keyword-tagged block' is nested much like Python code.  Just like
>   in Python, the block can either be on the same line as the keyword
>   if it is one-line long

I *like* this.

>       Contributors::               # The value is a block of lines.
>
>           John Doe
>
>           Ronald Reagan
>
>           Francois Mitterand

but the above gets over-verbose. I suppose one could instead use a list
syntax:

	Contributors::
		- John Doe
		- Ronald Reagan
		- Francois Mitterand

since I don't see the ambiguity in allowing the omission of the vertical
whitespace here, *if* one allows that some care would be needed with
hyphenation! (i.e., one can't allow one's hyphens to start a line, which is
awkward but probably not too bad). Another possibility might be to allow
"Python list" syntax - I started off disliking this, but over the last few
minutes it has grown on me:

	Contributors::
		[ John Doe,
		  Ronald Reagan,
		  Francois Mitterand ]

(again, hijacking Python's syntax).
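
As a rough sketch of how little work a parser would need to accept all
three layouts of the block's body (blank-line separated, "-" list, and
"Python list"), assuming the keyword line has already been removed. The
name parse_names is hypothetical, and names containing commas are not
handled.

```python
def parse_names(block):
    """Return the names from a block written in any of the three layouts."""
    text = block.strip()
    if text.startswith("[") and text.endswith("]"):
        # "Python list" layout: comma-separated names inside brackets.
        items = text[1:-1].split(",")
    else:
        # One name per line, optionally introduced by "- ".
        items = []
        for line in text.splitlines():
            line = line.strip()
            if line.startswith("- "):
                line = line[2:]
            items.append(line)
    # Blank separator lines and stray whitespace disappear here.
    return [item.strip() for item in items if item.strip()]
```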

>   Text blocks can be followed by indented blocks as well -- those are
>   'children' blocks of the outdented block.

And this solves my "I want a list item to have multiple paragraphs" problem,
which has been a bugbear of mine in the past with other proposals... The exact
indentation of a second paragraph in a list item (whether aligned with the
bullet or the text) would need addressing later, but I don't much care
(provided it is with the text, of course).

>   'text' blocks which start with * or - are tagged as 'bullet items'
>   for rendering.  The bullet marker has to be consistent within a
>   given level of indentation.
>
>     Example:
>
>        * this is one bullet
>
>           - this is a sub-bullet
>
>           - this is another sub-bullet
>
>        * this is another bullet

Again, sometimes I'd like to allow the blank lines to be missing. Another
way to do this is to have a "special" character to introduce the bullet
items - so maybe instead:

	Example:
		@* this is one bullet
		   @- this is a sub-bullet

but that's horrible in its own way - maybe the white space is just what we
have to live with (I certainly WOULD live with it if it was the only thing
standing in the way of adopting the proposal!).

No, on thinking about it, I would vote for either:

	1) use of white space as David proposes
	   (pro: utter simplicity,
	    con: doesn't quite look as nice as I'd like)
	2) allow Python list syntax
	   (pro: emphasises this is for short lists,
	    con: a bit odd)
	3) detect bullet characters at the "start of line"
	   (pro: still fairly simple,
	    con: one has to take care about, e.g., dashes in text)
	   Ah - I just realised that negative numbers at the start of a line
	   probably kill that one...
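
A sketch of option 3, to show where it bites. It assumes a bullet is "*"
or "-" at the start of a line followed by at least one space; that rule
and the name bullet_text are my own, not part of David's proposal.

```python
import re

# Assumed rule: optional indentation, "*" or "-", whitespace, then the text.
BULLET = re.compile(r"^\s*[*-]\s+(.*)$")

def bullet_text(line):
    """Return the item text if the line looks like a bullet, else None."""
    match = BULLET.match(line)
    return match.group(1) if match else None
```

Requiring the space rescues a line that starts with "-5", but a wrapped
line that happens to begin with "- " (a dash used as punctuation) is still
taken for a bullet, so the author does have to take care.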

Could we do numbered/lettered/named lists by, for instance:

	*1 This list item is numbered, and one expects all items
	   at this indentation in this list to be numbered

	   -a Ditto for "lettered" items in this list

	       @fred   And this sub-list has item names

         -2 This may well get flagged as a mistake

	*B Unless we're allowing the author to do odd things
	   if they like...

(is that simple enough?)
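
For what it's worth, recognising such markers looks cheap. A sketch, with
the marker characters taken from the example above and everything else
(the name classify, the four kind names) assumed for illustration:

```python
import re

# Marker character, optional label glued to it, whitespace, item text.
MARKER = re.compile(r"^\s*([*\-@])(\S*)\s+(.*)$")

def classify(line):
    """Return (kind, marker, label, text) for a list item, else None."""
    match = MARKER.match(line)
    if match is None:
        return None
    marker, label, text = match.groups()
    if label == "":
        kind = "bullet"
    elif label.isdigit():
        kind = "numbered"
    elif len(label) == 1 and label.isalpha():
        kind = "lettered"
    else:
        kind = "named"
    return kind, marker, label, text
```

Flagging a "-2" among lettered items would then be a matter of comparing
the kind of each item with the first item at the same indentation.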

>   Is there value in having string interpolation?  David Arnold mentioned
>
>        __version__ = "$Revision$[11:-2]
>        __date__ = "$Date$

There's also a semi-convention I've seen where a module's doc string is also
used as its documentation for Unix commands, and one substitutes in
sys.argv[0] - i.e., the command used to invoke the script - as a string into
the "Usage:" line. It's a rather hacky trick, and perhaps not worth worrying
about too much.
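
For anyone who hasn't met it, the trick looks roughly like this. The
"%(prog)s" placeholder and the names EXAMPLE_DOC and usage are just one
way of spelling it, chosen for the sketch; in a real script one would pass
__doc__ and sys.argv[0].

```python
# Stand-in for a module doc string that doubles as the usage message.
EXAMPLE_DOC = """Usage: %(prog)s [options] FILE

Count the lines in FILE.
"""

def usage(doc, argv0):
    """Substitute the invoking command name into the doc string."""
    return doc % {"prog": argv0}
```

The catch is that any literal "%" in the doc string then has to be written
as "%%", which is exactly the sort of thing a documentation tool would need
to know about.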

> The sharper-eyed will note that I stacked the
> deck in my favor in the above proposal by including what Guido does
> naturally as valid in the proposed grammar.

Yea, go for it!

desperately hoping this will get off the ground, but with no time to do
anything more than comment on it, Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.demon.co.uk/
2 wheels good + 2 wheels good = 4 wheels good?
3 wheels good + 2 wheels good = 5 wheels better?
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)