[Doc-SIG] docstring grammar

Tim Peters tim_one@email.msn.com
Wed, 1 Dec 1999 17:40:26 -0500


[David Ascher]
> The only question I suppose is whether one should require a
> keyword (Test: or other) to keep the top-level syntax trivial, or
> special-case the recognition of >>>-beginning paragraphs.
>
> I'm leaning for the former, as it can evolve to the latter if there
> is sufficient call for it from the user base, and I think it does
> keep the code simpler.  But I'm willing to be swayed.

[Tony J Ibbs (Tibs)]
> No - keep the keyword. My reasoning is (a) I like it [emotional
> reaction, which is the real reason (parse that as "it feels more
> elegant")],

Note that I'm not asking to get rid of the keyword ("Test: or other" -- btw,
the fact that David can't think of a compelling name is the very reason
">>>" is so highly desirable:  the latter is the only choice that isn't
fabricated out of thin air -- ">>>" is *natural*).  Use a keyword if you
like -- doctest doesn't care, so long as it finds ">>>" sooner or later.

> and (b) I still have the feeling that on occasion I might want non-
> test Python script in there,

It's certainly odd that people who don't use doctest are suddenly worried
about how to stop it from testing their code <wink/frown> -- doctest doesn't
run tests unless you run doctest.  If you do run doctest and have Python
script you don't want tested, simply refrain from starting it with ">>>"!
For example, doctest won't touch

    Example:
        m = MyClass(4, "red")
        assert len(m.color()) != m.int()

I'm not advocating that *all* code examples start with ">>>", just that
">>>" be accepted as one of the ways of introducing an example.  I have (at
least) hundreds of code examples already in that format, and they already
look nice and work great (people recognize them instantly for what they are,
and create their own with ease).
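A minimal sketch of that behavior, assuming the doctest module as found in
the current standard library (the DocTestParser class used here postdates
this message, and the two sample strings are made up for illustration).
The parser finds nothing to run in the keyword-introduced block and two
examples in the ">>>" block:

```python
import doctest

# An example introduced by a keyword, with no ">>>" prompt anywhere.
# MyClass is never defined; that is fine, because nothing here is executed.
PLAIN = '''
Example:
    m = MyClass(4, "red")
    assert len(m.color()) != m.int()
'''

# The same kind of thing written as a pasted interactive session.
PROMPTED = '''
>>> x = 2
>>> x + 1
3
'''

parser = doctest.DocTestParser()
plain_examples = parser.get_examples(PLAIN)
prompted_examples = parser.get_examples(PROMPTED)

print(len(plain_examples))     # 0
print(len(prompted_examples))  # 2
```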

> and (c) it *is* a 'logical' subdivision of the text in exactly the
> same way as the other major divisions, and so deserves its own place.

Ah -- I don't view it as a major division at all.  So far as a doc parser is
concerned, at the "major" level a code block should be a single token (it
has no internal structure of interest).  I'd say it's much less of a complication
than the baroque proposed rules for recognizing bulleted lists, but is of
the same nature:  "if a line begins with such-and-such a sequence of
characters, interpret it as meaning so-and-so".

Looking at it from that view, the requirement that I write my doctest
examples as:

Test:
   >>> x + 1
   3

instead of as

>>> x + 1
3

is like requiring that everyone write:

Unordered-List:
    List-Item:
        First point.
    List-Item:
        Second point.

instead of as e.g.

+ First point.
+ Second point.

Since I have 100x more doctest examples in my modules than bulleted lists of
any flavor, the idea that the latter should be made especially easy but the
former made artificially clumsy does tend to grate <wink>.
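To make the "single token" view concrete, here is a rough sketch of a
top-level pass that classifies each paragraph by its leading characters.
The tokenize function, the token names, and the "+ " list prefix are all
assumptions for illustration, not any proposed Doc-SIG grammar:

```python
import re

def tokenize(docstring):
    """Split a docstring into (kind, text) pairs, one per paragraph.

    A paragraph starting with ">>>" becomes one opaque "code" token;
    its contents are not examined any further at this level.
    """
    tokens = []
    for para in re.split(r"\n\s*\n", docstring.strip()):
        first = para.lstrip()
        if first.startswith(">>>"):
            kind = "code"
        elif first.startswith("+ "):
            kind = "list"
        else:
            kind = "text"
        tokens.append((kind, para))
    return tokens

DOC = """Adds one to x.

>>> x + 1
3

+ First point.
+ Second point.
"""

kinds = [kind for kind, _ in tokenize(DOC)]
print(kinds)  # ['text', 'code', 'list']
```

The ">>>" branch is the same shape as the list-item branch: one prefix
test, one token kind.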

> (to reply to Robin Friedrich later on - although I also don't understand
> his point about more tags making it harder to parse things (unless he
> means "for humans to parse")).

As well as for humans to write and to remember.

> Am I allowed to disagree with Tim Peters

Certainly!

>> You'll end up recognizing that with a regexp, like
>>
>>    r"^\s*Example:\s*"

> No! No! Whilst I realise that any General Purpose, Released With
> Python tool will probably have to use re 'cos that's all it has,
> *I* (for one) would never end up recognising anything much with
> a regexp. Follow the One True Way - convert to mxTextTools (gosh,
> I feel better now).

I didn't mean to proselytize on that issue one way or the other.
Recognizing ">>>" is near-trivial with mxTextTools too, or even with
string.find -- I'm trying to introduce <wink> some sanity against the notion
that ">>>" is some kind of *burden* for a programmed parser to recognize.
It's not:  it's a fixed string that's extremely unlikely to appear by
accident, and by that measure is less a headache than list-item prefixes.
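For instance, both of the following detect a ">>>" prompt, one with a
plain substring search and one with a regexp anchored at the start of a
line. The helper names are made up for this sketch, and str.find stands in
for the string.find function mentioned above:

```python
import re

# Optional leading whitespace, then the prompt, at the start of any line.
PROMPT_RE = re.compile(r"^\s*>>>", re.MULTILINE)

def has_prompt_find(text):
    """True if ">>>" occurs anywhere in text (plain substring search)."""
    return text.find(">>>") != -1

def has_prompt_re(text):
    """True if some line of text begins with ">>>" after optional indent."""
    return PROMPT_RE.search(text) is not None

session = "Usage:\n    >>> x + 1\n    3\n"
prose = "Example:\n    m = MyClass(4, 'red')\n"

print(has_prompt_find(session), has_prompt_re(session))  # True True
print(has_prompt_find(prose), has_prompt_re(prose))      # False False
```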

> ...
> because I have a sneaky feeling Tim's doc-code-tester *wants* to test
> all code given as examples to make sure they all work (or "fail in the
> right way"). Hmm.

doctest tests all and only stuff it finds in ">>>" blocks, and I've never
seen a ">>>" block in a docstring *unless* it was put there specifically for
doctest to find.  People writing "plain old" (not-to-be-tested) examples
simply don't paste interactive sessions into their docstrings, so there's no
">>>", so doctest leaves their examples alone.  Instead they mix prose with
inline code fragments that fail to work as advertised 3 hours after the docs
are written <0.8 wink>.
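A small sketch of "all and only", again assuming the current standard
library doctest API rather than the 1999 release. The area function and
its docstring are invented for illustration: the prose example is wrong
and goes unnoticed, while the ">>>" example is the one thing that runs.

```python
import doctest

def area(w, h):
    """Return the area of a w-by-h rectangle.

    Prose example, never executed (and in fact wrong):
        area(2, 3) gives 5

    >>> area(2, 3)
    6
    """
    return w * h

parser = doctest.DocTestParser()
# Build a test from the docstring alone; globs supplies the names it needs.
test = parser.get_doctest(area.__doc__, {"area": area}, "area", None, 0)

runner = doctest.DocTestRunner(verbose=False)
# Discard any failure report text; only the counts matter here.
results = runner.run(test, out=lambda text: None)

print(results.attempted, results.failed)  # 1 0
```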

Changing what doctest does isn't an option here:  in practice, it's proved
to be an essentially perfect solution to the problems it tried to address,
and part of "perfection" was making it dirt simple enough that even
sub-average programmers can and do use it successfully within minutes of
downloading the pkg.  I'm not mucking with the hard-won qualities that made
this possible!  doctest will continue to work fine no matter what we do
about doc markup; the only question I have here is whether Doc-SIG markup
will play nice with existing and future doctest-using modules.

the-difference-is-about-one-line-of-code<0.5-wink>-ly y'rs  - tim