[Doc-SIG] docstring grammar

Tony J Ibbs (Tibs) tony@lsl.co.uk
Tue, 30 Nov 1999 10:31:43 -0000


All of the following are minor nit-pickings, because it all looks VERY GOOD.
(Personally, I'm not too worried about the tool as such; I just want the
grammar defined so I can use it!).

David Ascher wrote:
> I forgot two markups:  *this* is bold and _this_ is italic.  Bold and
> italic markups must begin and end within a paragraph (I'd say 'within a
> sentence' but I don't want to complicate the parser with a sentence type).
> No space allowed between *'s and _'s and their contents.

And I hope it's also possible to nest them arbitrarily, with some "sensible"
effect (yes, this *is* useful in English text, and I would not want to lose
it in documentation!). [Technically, that's a viewer problem, but I want the
grammar to *say* this can be done, so the software writers have an onus on
them to cope with it.]
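
A minimal sketch of what "nest them arbitrarily" could mean in practice,
assuming the rules quoted above (no space just inside the markers, markup
confined to one paragraph). The HTML-style output tags and the names
INLINE_RE, TAGS and render are illustrative assumptions, not part of any
proposal in this thread:

```python
import re

# An opening * or _ must be followed by a non-space character, and the
# closing marker (the same character, via \1) must be preceded by one.
# DOTALL lets markup span line breaks inside a single paragraph.
INLINE_RE = re.compile(r'([*_])(?=\S)(.+?)(?<=\S)\1', re.DOTALL)
TAGS = {'*': 'b', '_': 'i'}  # assumed output tags, for illustration only

def render(paragraph):
    """Convert *bold* and _italic_ in one paragraph, handling nesting."""
    def repl(match):
        tag = TAGS[match.group(1)]
        # Recurse into the contents so that nested markup is converted too.
        return '<%s>%s</%s>' % (tag, render(match.group(2)), tag)
    return INLINE_RE.sub(repl, paragraph)

print(render('yes, this *is* useful'))
print(render('*bold with _italic_ inside*'))
```

Note that this naive version would also treat the underscores in a name
like some_python_name as italic markers, which a real grammar would have
to rule out.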

Marc-Andre Lemburg wrote:
> I'd suggest using '^ *[a-zA-Z_]+[a-zA-Z_0-9]*: *' as RE for
> keywords, i.e. keywords are Python identifiers immediately followed
> by a colon starting a line of a doc string. That should avoid
> most complications, I guess.

Sounds sensible to me - the advantages outweigh the disadvantages.
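
For concreteness, here is a small sketch of how the RE quoted above behaves
on a docstring. The RE itself is taken verbatim from the proposal; the
MULTILINE flag, the helper name find_keywords and the sample text are
assumptions made for the example:

```python
import re

# The proposed RE, compiled with MULTILINE so that ^ matches at the
# start of every line of the docstring, not just the first.
KEYWORD_RE = re.compile(r'^ *[a-zA-Z_]+[a-zA-Z_0-9]*: *', re.MULTILINE)

def find_keywords(docstring):
    """Return the keyword names found at the start of lines, in order."""
    return [m.group(0).strip().rstrip(':')
            for m in KEYWORD_RE.finditer(docstring)]

sample = """Frobnicate the widget.

Arguments: widget -- the thing to frobnicate
Returns: None
  See also http://example.com for details
"""
print(find_keywords(sample))
```

Because the identifier must be followed directly by the colon, the "http:"
in the last line is not picked up: it does not start the line.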

On Tim Peters' test texts - I think this is actually an important enough
idea that it might warrant its own keyword - perhaps "TestScript" (no, I
know that's clumsy) - thus giving subliminal encouragement to the concept
(hmm - must use it someday, he said guiltily). This would also allow us to
distinguish odd chunks of code which are NOT test scripts (a new ability,
since at the moment the tester will try to use all >>> text?), which I think
could sometimes be useful...
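
A rough sketch of the distinction, assuming a hypothetical "TestScript:"
keyword whose block ends at the next line indented no deeper than the
keyword itself. Both the keyword name and that block rule are assumptions
for illustration; nothing here reflects how any existing tester works:

```python
def split_examples(docstring, keyword='TestScript:'):
    """Return (tests, other): >>> lines under the keyword vs. elsewhere."""
    tests, other = [], []
    in_test = False
    keyword_indent = 0
    for line in docstring.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines do not end a block in this sketch
        indent = len(line) - len(line.lstrip())
        if stripped.startswith(keyword):
            in_test, keyword_indent = True, indent
            continue
        if in_test and indent <= keyword_indent:
            in_test = False  # a dedent to the keyword's level ends the block
        if stripped.startswith('>>>'):
            (tests if in_test else other).append(stripped[3:].strip())
    return tests, other

doc = """Add two numbers.

    >>> add(1, 2)

TestScript:
    >>> add(2, 2)
    4
Returns: the sum
>>> not_a_test()
"""
tests, other = split_examples(doc)
print(tests)
print(other)
```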

David Ascher wrote:
> How about another keyword?
>
>  List:
>     * foo
>     * bar
>     * spam

I would vote against that, firstly on the grounds that it doesn't read well,
and secondly that it is probably the sort of thing that people wouldn't do
(!). Like others, I believe we can hack lists without the
keyword (is this now the consensus?).

In another message, David continued:
> I propose that part of the definition of a keyword is (along with any
> special parsing rules) whether it can be duplicated in a docstring.

Hmm - then I think we're going to need some serious support in "The Standard
Editors" to give a hint about whether something can be included more than
once, since I have a sneaky feeling we're getting quite a lot of keywords
(is it about 7 things that humans remember easily?). On the other hand,
modulo the clever people's time, I rate that as "not a problem".

NB: how picky is the tool going to be about getting the indentation exactly
right? I'm not fussed by it being very picky, but I know I'm odd that way.

David Hammond votes for doing lists by detecting the bullets (good), but I'd
like to reserve more than two characters (hyphen and asterisk are OK, but I
do sometimes use 3-level lists, and would like another one - on the other
hand, I'm not sure what, other than @, and he wants that for something else...
hmm - if we're not worried by hyphen confusing us with negative numbers,
maybe plus would be sensible).
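
One way the negative-number worry might be handled is to require
whitespace after the bullet character. The sketch below assumes exactly
that rule and the three bullet characters discussed above (hyphen,
asterisk, plus); BULLET_RE and parse_bullet are names invented for the
example:

```python
import re

# A bullet is one of - * + at the start of a line (after optional
# indentation), followed by at least one whitespace character.
# That whitespace is what keeps "-1" and "+2" from looking like list items.
BULLET_RE = re.compile(r'^(?P<indent> *)(?P<bullet>[-*+])\s+(?P<text>\S.*)$')

def parse_bullet(line):
    """Return (indent, bullet, text) for a list item, or None otherwise."""
    m = BULLET_RE.match(line)
    if m is None:
        return None
    return len(m.group('indent')), m.group('bullet'), m.group('text')

print(parse_bullet('  - second level'))
print(parse_bullet('-1 is a negative number'))
```

The same rule also keeps a line starting with *bold* text from being
mistaken for a list item, though "- 1 apple" would still be read as one.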

I also tend to agree with Davids Hammond and Ascher that [ and ] are very
valuable AS TEXT. The use of @..@ is visually very obvious to me, which is
presumably a good thing in context, so I also vote for that (gosh, I've just
voted positively for something delimited by the same character at start and
end - obviously the start of the slippery road to hell).

Whilst I don't know owt about parsing (well, more precisely, parse trees
scare me), I don't see any of the proposals so far as giving any great
problems with extracting information from the text.

David (Ascher) - is it time to re-release your initial "docstring grammar"
email with the comments you're happy with edited in? I *really* don't have
time to do it, or I already would...

Tibs
--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.demon.co.uk/
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
[I've read it twice. I've thought it over. I'm sending it anyway.]