[Doc-SIG] Doc-String Syntax

Moshe Zadka <mzadka@geocities.com>
Fri, 4 Feb 2000 02:14:16 +0200 (IST)


On Thu, 3 Feb 2000, Greg Ward wrote:

> On 03 February 2000, Moshe Zadka said:
> > There are two kinds of tags: "short" tags and "long" tags.
> 
> Good distinction.  Common markup -- emphasis, code snippets -- must be
> dead easy to type.

Except I'm still having trouble with code snippets -- mainly the escaping
thing you noted.

Let me just note one thing here about the schedule of suggestions.
Today I'll write up:

1. The complete inline syntax
2. The complete intermediary format (this will be in XML)

However, the OOL syntax, which will include "links" that are snarfed
from a module's documentation, and which is transformed to the
intermediary format, will wait a bit. SGML experts are hereby invited to
mail me about minimization in SGML and how hard it is to support. The two
minimizations which bother me most are <this>test</this> and <this/test/.

...
<Greg Ward does actual user testing to see which characters are harder
to type>
> First, I was wrong; square versus angle brackets make very little
> difference to typability.  

Hmmm...Greg, I see Randy has really gotten to you. Shouldn't we just meet
at noon for a flamewar and settle it like real internetters? <0.8 wink>

> However, for readability I like seeing the
> tag separated from the text it tags, ie. I prefer "emph<foo>" to "[emph
> foo]" not because of the shape of the brackets, but because the tag is
> outside of the brackets,

Fair enough

> So if you're really keen on square brackets, how about this:
> 
>   It must be easy to emph[emphasise] certain words, and to mark others
>   as code[code] (or even whole code[code snippets]).

The only problem is that it takes too much lookahead for a human to parse
it. It's all right for single letters, which is what POD uses, but is hardly
good enough for words. Let me just note that my initial suggestion was
[emph= like this] or [emph: like this]. I'm still debating. I'll have
an answer ready this afternoon.

> Oh, here's another thing to consider regarding the shape of brackets:
> what kind of bracket is more likely to occur inside typical Python code
> snippets?  Ie. are you more likely to write
> 
>   ... code[foo[0]] is always an integer ...
> 
> or
> 
>   ... if code[x < 0], the function returns ...
> 
> ?  (Here, whether the tag is inside or outside the brackets is
> irrelevant.  How the particular bracket used is escaped does matter,
> though!)

Amazingly enough, I did consider that. Note that in most Python code
snippets, [ and ] are matched and < and > are not, so for most Python 
code snippets, the rule, "continue until matching ], but treat this
as raw text" works wonderfully, hence it is less important how I escape
it. BTW: my escape char will probably be @, because it isn't used in
Python code, and is already used as a special character by texinfo, so it
won't be *that* novel.
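
The "continue until matching ]" rule can be sketched in a few lines of
Python. This is only an illustration: the name read_code_span is
hypothetical, and reading @ as "take the next character literally" is an
assumption, since the escape syntax is not settled here.

```python
def read_code_span(text, start, escape="@"):
    """Scan a code[...] span whose body begins at index `start`.

    `start` is the index just past the opening '['.  Returns the raw
    snippet and the index just past the matching ']'.
    """
    depth = 1          # we are already inside one '['
    out = []
    i = start
    while i < len(text):
        ch = text[i]
        if ch == escape and i + 1 < len(text):
            # Assumed escape rule: keep the next character literally
            # and do not count it as a bracket.
            out.append(text[i + 1])
            i += 2
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth == 0:
                return "".join(out), i + 1
        out.append(ch)  # everything else is raw text
        i += 1
    raise ValueError("unterminated code[...] span")


snippet, end = read_code_span("code[foo[0]] is always an integer", 5)
# snippet is "foo[0]": the inner brackets matched, so no escaping was needed
```

Only an unbalanced bracket inside the snippet would need the escape
character, which is why the choice of escape matters less with [ and ].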

> > Long tags have the syntax exemplified by
> > 
> > arg name=s type=string::
> > 	string to be parsed
> 
> I'm confused: is this a syntax diagram or an example?  I'm guessing the
> latter

And you're right. Bad example on my part; I just realized on rereading this
that it *can* go both ways.

return type=string::
	concatenation of all argument strings

is probably a clearer example.
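
To make the shape of a long-tag header concrete, here is a rough sketch of
how one such line might be recognized. The names HEADER, ATTR and
parse_long_tag_header are hypothetical, and so is the exact grammar (a tag
name, then zero or more key=value pairs, then the colons). The sketch
accepts both "::" and ":" as the terminator because the examples in this
message use both.

```python
import re

# Assumed grammar: tag, then key=value pairs, then ':' or '::'.
HEADER = re.compile(
    r"^\s*([A-Za-z][\w-]*)((?:\s+[\w-]+=[^\s:]+)*)\s*::?\s*$"
)
ATTR = re.compile(r"([\w-]+)=([^\s:]+)")


def parse_long_tag_header(line):
    """Return (tag, attributes) for a long-tag header line, else None."""
    match = HEADER.match(line)
    if match is None:
        return None
    tag, attr_text = match.groups()
    return tag, dict(ATTR.findall(attr_text))


header = parse_long_tag_header("return type=string::")
# header is ("return", {"type": "string"})
```

Ordinary prose lines do not match, so the function returns None for them.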

> except in Python we (so far) need a way to specify the argument type in
> the documentation, since it's not in the code.  Is my interpretation
> correct?

Yes. Not that the type will stay a human-only string for a long time.
Oh, the joys of quasi-defined interfaces like "file-like".

> Looks reasonable.  I assume the idea is that the text following
> "example::" will be set in a fixed-width font, indented and vertically
> separated from the surrounding text?  How do you know when the example
> ends?  (I know -- use "]]>" as the end marker!  [Just kidding!])

The indentation stops.

Here's a snippet from a future docstring (Guido was kind enough to let me
use his time machine)
'''
Let us consider this example:

example::
	a, b = 1, 1L
	while 1:
		print a
		a, b = b, a+b

This example runs [emph forever], computing the Fibonacci function.
'''
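
The "indentation stops" rule can be sketched as follows. The helper names
indent_of and read_block are hypothetical, and counting indentation as the
number of leading whitespace characters is a simplification that assumes
tabs and spaces are not mixed.

```python
import textwrap


def indent_of(line):
    """Number of leading whitespace characters (simplified measure)."""
    return len(line) - len(line.lstrip())


def read_block(lines, header_index):
    """Collect the indented body that follows a long-tag header line.

    Returns the dedented body text and the index of the first line that
    is no longer part of the block.
    """
    base = indent_of(lines[header_index])
    body = []
    i = header_index + 1
    while i < len(lines):
        line = lines[i]
        if line.strip() and indent_of(line) <= base:
            break  # the indentation stopped, so the block is over
        body.append(line)
        i += 1
    # Blank lines between the block and the next paragraph are not body.
    while body and not body[-1].strip():
        body.pop()
    return textwrap.dedent("\n".join(body)), i


doc = (
    "Let us consider this example:\n\n"
    "example::\n"
    "\ta, b = 1, 1L\n"
    "\twhile 1:\n"
    "\t\tprint a\n"
    "\t\ta, b = b, a+b\n\n"
    "This example runs [emph forever]."
)
lines = doc.split("\n")
body, next_index = read_block(lines, lines.index("example::"))
# body holds the four code lines with one level of indentation removed
```

Blank lines inside the block are kept, so a multi-paragraph example would
survive as long as its lines stay indented.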

> Yuck.  Inline code should be marked up inline.  See my code[...] or
> code<...> examples above.

code will be special cased to be both long and short.

> > Paragraphs are separated by new lines.
> 
> You mean blank lines (/\n{2,}/), I hope...

Sorry, my bad.
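
For the record, the blank-line rule is a one-liner. This sketch uses the
pattern quoted above directly; the function name split_paragraphs is
hypothetical.

```python
import re


def split_paragraphs(text):
    """Split text into paragraphs at runs of two or more newlines."""
    return re.split(r"\n{2,}", text.strip())


paragraphs = split_paragraphs("one\ntwo\n\nthree\n\n\nfour\n")
# paragraphs is ["one\ntwo", "three", "four"]
```

A single newline stays inside its paragraph; only blank lines separate.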

> > Another long tag, which is only valid in a class's docstring, is
> > 'instance-attrs' (unlike 'data', which would be class attributes).
> 
> If we call them "class attributes" colloquially, why not call them
> "class attributes" in the documentation?  Also, may I tentatively
> suggest just "attribute" for instance attributes, because they are far
> more common than class attributes?

Coherency.

Look at this example module:

'''
data name=factor:
	the factor to multiply our gurkles by.
'''

factor = 100

def eggs():
	'''return a well known string'''
	return 'eggs'

class spam:

	'''
	data name=gurkle:
		just a simple gurkle

	instance-data name=name:
		the instance's name
	'''

	def camelot(self):
		'''return the name many times'''
		return self.gurkle*factor*self.name

This way every namespace has data, which are things without their own
doc-strings defined in the namespace. instance-data is a bit weird anyway
in current day Python, since it is technically *not* a feature of the
class: every instance can have different attributes, so I have no problems
special casing the weird case.

> Good.  That means just about the only use for the code[] tag would be,
> well, code (as opposed to names).  Err, what about variables (including
> function parameters): is there special markup for them, or would you
> just use code[]?

There will be a special markup. code[] is just for that: (more or less)
valid snippets of Python.

More to come later!
--
Moshe Zadka <mzadka@geocities.com>. 
INTERNET: Learn what you know.
Share what you don't.