[Doc-SIG] Doc-String Syntax

Greg Ward gward@cnri.reston.va.us
Thu, 3 Feb 2000 09:21:54 -0500


On 03 February 2000, Moshe Zadka said:
> There are two kinds of tags: "short" tags and "long" tags.

Good distinction.  Common markup -- emphasis, code snippets -- must be
dead easy to type.

> Short tags have the syntax exemplified by [emph this is emphasized] (the
> first word is the tag, and it continues to a nested bracket).
> Rationale: this way tags can be written without touching the "shift" keys
> on most keyboards. 

Well, if you're going to get so anal-retentive as to worry about the
"difficulty" of using shift keys, then I will retaliate in kind and
insist that angle brackets are easier to type than square brackets:
square brackets either require use of the right-hand pinky, or a
right-hand jump to use the bigger fingers.  For me, angle brackets are
closer to the middle finger, so I find them easier to type.

A usability experiment: type the same chunk of text using three possible
syntaxes; we're trying to test both typability and readability.  First,
straight POD as documented in the "perlpod" man page:

  It must be easy to B<emphasise> certain words, and to mark others as
  C<code> (or even whole C<code snippets>).

Similar, but a little more verbose and explicit (mix POD with Moshe's
idea):

  It must be easy to emph<emphasise> certain words, and to mark others
  as code<code> (or even whole code<code snippets>).

Finally, Moshe's syntax:

  It must be easy to [emph emphasise] certain words, and to mark others
  as [code code] (or even whole [code code snippets]).

First, I was wrong; square versus angle brackets make very little
difference to typability.  However, for readability I like seeing the
tag separated from the text it tags, i.e. I prefer "emph<foo>" to "[emph
foo]" not because of the shape of the brackets, but because the tag is
outside of the brackets.

So if you're really keen on square brackets, how about this:

  It must be easy to emph[emphasise] certain words, and to mark others
  as code[code] (or even whole code[code snippets]).
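
As a rough sketch of how the tag-outside-brackets form could be picked
out mechanically: the pattern and function name below are made up for
illustration and are not part of anyone's proposal, and nested brackets
are deliberately not handled.

```python
import re

# Hypothetical pattern for the tag[text] form: a word, then bracketed text
# that contains no further square brackets (so nesting is not supported).
INLINE_TAG = re.compile(r"(\w+)\[([^\[\]]*)\]")

def find_inline_tags(text):
    """Return a list of (tag, content) pairs, one per tag[content] span."""
    return INLINE_TAG.findall(text)

sample = ("It must be easy to emph[emphasise] certain words, and to mark "
          "others as code[code] (or even whole code[code snippets]).")

for tag, content in find_inline_tags(sample):
    print(tag, "->", content)
```

With the tag outside the brackets, the tag name and the tagged text fall
out as two separate groups, which is much the same separation I like when
reading the markup by eye.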

Oh, here's another thing to consider regarding the shape of brackets:
what kind of bracket is more likely to occur inside typical Python code
snippets?  I.e. are you more likely to write

  ... code[foo[0]] is always an integer ...

or

  ... if code[x < 0], the function returns ...

?  (Here, whether the tag is inside or outside the brackets is
irrelevant.  How the particular bracket used is escaped does matter,
though!)
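
To make the escaping question concrete, here is a minimal sketch that
finds the end of a tag by counting bracket depth.  The function name and
the depth-counting rule are assumptions for illustration only; no
escaping mechanism has actually been proposed yet.

```python
def tag_content(text, start, open_ch="[", close_ch="]"):
    """Return the text between the bracket at text[start] and its match.

    Nested open/close pairs are counted, so the content may itself hold
    balanced brackets.  Returns None if the bracket is never closed.
    """
    if text[start] != open_ch:
        raise ValueError("text[start] is not an opening bracket")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == open_ch:
            depth += 1
        elif text[i] == close_ch:
            depth -= 1
            if depth == 0:
                return text[start + 1:i]
    return None

# Balanced square brackets inside the snippet are fine without escaping.
print(tag_content("code[foo[0]] is always an integer", 4))

# A lone "<" inside square brackets is no problem either.
print(tag_content("if code[x < 0], the function returns", 7))

# With angle brackets, the lone "<" unbalances the count: no match found.
print(tag_content("if code<x < 0>, the function returns", 7, "<", ">"))
```

Under this rule, subscripts and list literals survive inside square
brackets because they are balanced, whereas a comparison operator inside
angle brackets would need escaping.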

> Long tags have the syntax exemplified by
> 
> arg name=s type=string::
> 	string to be parsed

I'm confused: is this a syntax diagram or an example?  I'm guessing the
latter: you're talking about a long tag called "arg", which takes "name"
and "type" arguments, and is followed by text describing the argument.
In other words, this is similar to the Javadoc

   @param s string to be parsed

except in Python we (so far) need a way to specify the argument type in
the documentation, since it's not in the code.  Is my interpretation
correct?
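
If that reading is right, the header line of a long tag could be taken
apart along these lines.  This is only a sketch of my interpretation:
the function name is invented, and I am assuming that arguments are
whitespace-separated key=value pairs and that the header ends in "::".

```python
def parse_long_tag(header):
    """Split a long-tag header into (tag_name, {argument: value}).

    Assumes the form 'tag key=value key=value::', which is a guess at
    the intended syntax rather than a specification.
    """
    header = header.strip()
    if not header.endswith("::"):
        raise ValueError("long tag header must end with '::'")
    words = header[:-2].split()
    if not words:
        raise ValueError("missing tag name")
    tag, args = words[0], {}
    for word in words[1:]:
        key, sep, value = word.partition("=")
        if not sep:
            raise ValueError("expected key=value, got %r" % word)
        args[key] = value
    return tag, args

print(parse_long_tag("arg name=s type=string::"))
print(parse_long_tag("example::"))
```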

> The special long tag 'example' will not be interpreted by the program, 
> so 
> 
> example::
> 	['this', 'is', 'a', 'list']
> 
> will work with no problems

Looks reasonable.  I assume the idea is that the text following
"example::" will be set in a fixed-width font, indented and vertically
separated from the surrounding text?  How do you know when the example
ends?  (I know -- use "]]>" as the end marker!  [Just kidding!])
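
One possible answer, offered purely as an assumption since the proposal
does not say: let indentation end the block, so the example runs until
the first non-blank line that is not indented.  A minimal sketch, with a
made-up function name:

```python
def split_example(lines, start):
    """Collect the indented body that follows lines[start] ('example::').

    Returns (body_lines, next_index), where next_index is the first line
    after the block.  Assumes the block ends at the first non-blank line
    that does not begin with whitespace.
    """
    body = []
    i = start + 1
    while i < len(lines):
        line = lines[i]
        if line.strip() and not line[0].isspace():
            break
        body.append(line)
        i += 1
    # Blank lines at the end belong to the separator, not to the example.
    while body and not body[-1].strip():
        body.pop()
    return body, i

doc = [
    "example::",
    "\t['this', 'is', 'a', 'list']",
    "",
    "will work with no problems",
]
body, rest = split_example(doc, 0)
print(body)
print(doc[rest])
```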

> the special long tag 'code' will work the same way, but will be translated
> to inline code:
> 
> you can also write
> code::
> 	(lambda x: [x])(5)
> to return a list containing 5.

Yuck.  Inline code should be marked up inline.  See my code[...] or
code<...> examples above.

> Paragraphs are separated by new lines.

You mean blank lines (/\n{2,}/), I hope...
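
In Python terms, that regex amounts to something like the following
sketch.  The function name is mine, and treating a line that holds only
spaces as non-blank is a simplification.

```python
import re

def paragraphs(text):
    """Split text into paragraphs on runs of two or more newlines."""
    return [p for p in re.split(r"\n{2,}", text.strip()) if p]

doc = "First paragraph,\nstill the first.\n\nSecond paragraph.\n\n\nThird."
for para in paragraphs(doc):
    print(repr(para))
```

A single newline stays inside its paragraph, so hard-wrapped docstring
text is not chopped into one-line paragraphs.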

> Another long tag, which is only valid in a class's docstring, is
> 'instance-attrs' (unlike 'data', which would be class attributes).

If we call them "class attributes" colloquially, why not call them
"class attributes" in the documentation?  Also, may I tentatively
suggest just "attribute" for instance attributes, because they are far
more common than class attributes?

> All of the current semantic markup would be moved to short tags: 
> [module urllib], [class URLOpener] and [function urlopen] are all
> examples.

Good.  That means just about the only use for the code[] tag would be,
well, code (as opposed to names).  Err, what about variables (including
function parameters): is there special markup for them, or would you
just use code[]?

        Greg
-- 
Greg Ward - software developer                    gward@cnri.reston.va.us
Corporation for National Research Initiatives    
1895 Preston White Drive                           voice: +1-703-620-8990
Reston, Virginia, USA  20191-5434                    fax: +1-703-620-0913