From jbkerr@sr.hp.com  Sat Nov  6 01:02:47 1999
From: jbkerr@sr.hp.com (James Kerr)
Date: Fri, 5 Nov 1999 17:02:47 -0800 (PST)
Subject: [Doc-SIG] a wish list
Message-ID: <199911060102.RAA02700@joplin.sr.hp.com>

  I've been following the messages in this group for a few weeks
now. It seems as though a lot of attention is being focused on 
the details of the markup language. This is certainly an important
issue, but maybe the *how* of the markup language will be easier
to decide once the *why* is understood.

  For whatever it's worth, I've outlined the ways I would like to
use Python documentation in an ideal world. These items are listed
from most important to least important. Maybe this will provide some
hints about the best way to implement markup.

  BTW, I don't know how much of this has been implemented in the
latest version of IDLE, so some of what follows may already be old
hat.  Also, I haven't much bothered to distinguish between usage in
a GUI environment, vs. an Emacs-like environment that is text-based
but mouse-aware, vs. a command-line environment.

* Search documentation by keyword or phrase, in any combination
  of Library Reference, FAQ, module descriptions, index, etc. The
  focus is on getting a quick answer to a specific question, not on
  broad topical navigation.

  Example: I'm a recent Python convert, I'm typing in my first script,
  and I want quick answers to questions like
      - how do I access command-line options?
      - how do I convert a string to an int?
      - what is the list of built-in operators?

  Preferred solution: type queries like these into a dialog box and
  get references to specific sections of documentation.

  Acceptable alternatives:
      - consult a permuted index
      - do a keyword query instead of a free-form query

  Comments:
    If a semantically-based search algorithm is too hard to write, a
    really good permuted index might be useful. All one-line class and
    function summaries could be part of this index, as could FAQ
    entries, annotations in the library reference, etc. All it would
    take is a small army of volunteers to go through the documentation
    and insert index entries at all relevant locations.
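
  A minimal sketch of how the permuted-index alternative could be
  prototyped. The function name permuted_index, the stop-word list and
  the two sample entries are assumptions made for illustration; they do
  not come from any existing documentation tool.

```python
# Hypothetical sketch of a permuted (keyword-in-context) index.
STOP_WORDS = ("a", "an", "the", "to", "of", "in", "do", "i", "is", "how", "what")

def permuted_index(entries, stop_words=STOP_WORDS):
    """Return sorted (keyword, rotated_line, reference) tuples."""
    rows = []
    for text, ref in entries:
        words = text.split()
        for i, word in enumerate(words):
            key = word.strip("?.,:;()").lower()
            if not key or key in stop_words:
                continue
            # Rotate the line so that it starts at the keyword.
            rotated = " ".join(words[i:] + ["/"] + words[:i]) if i else text
            rows.append((key, rotated, ref))
    rows.sort()
    return rows

# Sample entries are invented; real ones would come from the FAQ,
# one-line summaries, library reference annotations, and so on.
entries = [
    ("How do I access command-line options?", "FAQ (hypothetical entry)"),
    ("Convert a string to an int", "Library Reference (hypothetical entry)"),
]
for key, line, ref in permuted_index(entries):
    print("%-14s %-45s %s" % (key, line, ref))
```

  Every significant word becomes a lookup key, so "string", "int" and
  "convert" all lead to the same entry.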

* View all available classes, either alphabetically or hierarchically.

  Preferred solution: A presentation that used indentation to show
  parent/child relationships, with links from class names to both
  documentation and source code. Some allowances would have to be
  made for multiple inheritance.

  Other niceties (available only in a GUI environment):
    - pause the cursor over a class reference, and see a
      1-line summary of the class in a popup window or status line.
    - expand a class (via some kind of mouse or keyboard operation),
      and see a list of all functions defined in the class.
    - pause over a function reference, and see a summary of
      the function in a popup or status line.
    - jump to class or function documentation (or source code)
      from the listing.
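
  The indented presentation could be sketched roughly as follows,
  assuming the classes of interest are already imported. The helper
  name class_tree_lines and the Shape classes are invented for the
  example. A class with several parents would simply appear once
  under each parent, which is one possible allowance for multiple
  inheritance.

```python
def class_tree_lines(root, depth=0):
    """Return indented lines for root and its currently loaded subclasses."""
    doc = (root.__doc__ or "").strip().splitlines()
    summary = doc[0] if doc else ""
    lines = ["%s%s  -- %s" % ("    " * depth, root.__name__, summary)]
    # Children are listed alphabetically, one indentation level deeper.
    for sub in sorted(root.__subclasses__(), key=lambda c: c.__name__):
        lines.extend(class_tree_lines(sub, depth + 1))
    return lines

# Hypothetical classes, used only to exercise the function.
class Shape:
    """Base class for drawable shapes."""

class Polygon(Shape):
    """A closed figure with straight sides."""

class Circle(Shape):
    """A round shape."""

class Square(Polygon):
    """A polygon with four equal sides."""

print("\n".join(class_tree_lines(Shape)))
```

  The first docstring line stands in for the 1-line summary that a GUI
  could show in a popup or status line.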

* View all the methods that are available in a class.

  This is a little different from the previous item, since inheritance
  allows you to call functions that are defined in a superclass. 

  Preferred solution: both of the following:
    - a per-class view, that displays the inheritance tree from the
      selected class on upward. The functions defined in each class
      are displayed along with the class.
    - a summary view, that just shows the class you're interested in,
      and uses some kind of annotation to distinguish which methods
      are defined in that class, and which are defined in superclasses.
  
  Acceptable alternative: either of the above. 
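
  A rough sketch of the summary view, using only the standard inspect
  module. The function name methods_by_origin and the Base/Derived
  classes are assumptions made for the example.

```python
import inspect

def methods_by_origin(cls):
    """Map each public method name to the name of the class defining it."""
    origin = {}
    # Walk the MRO from the most distant ancestor down to cls, so that
    # an overriding definition replaces the inherited one.
    for klass in reversed(inspect.getmro(cls)):
        if klass is object:
            continue
        for name, value in vars(klass).items():
            if inspect.isfunction(value) and not name.startswith("_"):
                origin[name] = klass.__name__
    return origin

# Hypothetical classes, used only to exercise the function.
class Base:
    def open(self): pass
    def close(self): pass

class Derived(Base):
    def close(self): pass
    def flush(self): pass

for name, owner in sorted(methods_by_origin(Derived).items()):
    mark = "" if owner == "Derived" else "  (inherited from %s)" % owner
    print(name + mark)
```

  Grouping the same dictionary by owner instead of by method name
  would give the per-class view.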

* Maintain a list of bookmarks.

  Preferred solution: an easy-to-use tool that allows you to
    - bookmark locations precisely (i.e. down to line-in-a-document
      precision).
    - attach symbolic names to groups of bookmarks (for example,
      "Object Database Support")
    - adjust your bookmarks transparently when new documentation
      rolls out.

  Acceptable alternative: devise a scheme that allows a user to
  define his/her own bookmark files with a text editor.

  Comments:
    I often find myself using a small portion of online docs quite
    heavily when doing a project. It's kind of a pain to have to jump
    back and forth between documents (or parts of documents) to get
    information.  The ability to provide some kind of personalized
    view of documentation would be a real plus.

  
  Hope these challenges aren't too trivial to be interesting ;-)
In all seriousness, I think it's great that so much attention is
being given to the documentation effort, because it really could be
central to the success of Python.

-Jim

-- 
Jim Kerr
Agilent Technologies
1400 Fountaingrove Pkwy, MS 3USZ
Santa Rosa, CA 95403


From mhammond@skippinet.com.au  Sat Nov  6 02:25:33 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Sat, 6 Nov 1999 13:25:33 +1100
Subject: [Doc-SIG] a wish list
In-Reply-To: <199911060102.RAA02700@joplin.sr.hp.com>
Message-ID: <004a01bf27fe$346405d0$0501a8c0@bobcat>

James writes:
>   For whatever it's worth, I've outlined the ways I would like to
> use Python documentation in an ideal world. These items are listed
> from most important to least important. Maybe this will provide some
> hints about the best way to implement markup.

I can't argue with anything you have written - although what Python
needs (and this effort is no different) is fewer good ideas and more
code.

I'm afraid there is no army of coders just sitting around waiting for
good ideas to implement.  Why not tackle even one item on your wish list
yourself, and present some code?  Then any issues you have with the
current markup scheme are more likely to be listened to, as we then
have _something_ to help put them into context, rather than an abstract
concept of what may happen to be good if anything is ever
implemented...

Mark.


From irmina@ctv.es  Sat Nov  6 14:07:40 1999
From: irmina@ctv.es (Manuel Gutierrez Algaba)
Date: Sat, 6 Nov 1999 14:07:40 +0000 (GMT)
Subject: [Doc-SIG] a wish list
In-Reply-To: <004a01bf27fe$346405d0$0501a8c0@bobcat>
Message-ID: <Pine.LNX.3.95.991106135617.4151A-100000@localhost>

On Sat, 6 Nov 1999, Mark Hammond wrote:

> James writes:
> >   For whatever it's worth, I've outlined the ways I would like to
> > use Python documentation in an ideal world. These items are listed
> > from most important to least important. Maybe this will provide some
> > hints about the best way to implement markup.
> 
> I can't argue with anything you have written - although what Python
> needs (and this effort is no different) is fewer good ideas and more
> code.
> 
> I'm afraid there is no army of coders just sitting around waiting for
> good ideas to implement.  Why not tackle even one item on your wish list
> yourself, and present some code?  Then any issues you have with the
> current markup scheme are more likely to be listened to, as we then
> have _something_ to help put them into context, rather than an abstract
> concept of what may happen to be good if anything is ever
> implemented...
> 
I'm aware of the need for comprehensive information and search
engines, and I do know that *FEW* people who code get involved in
large projects that help others. Because of that I devised
a method of producing lots of information with little effort.
A bit of markup is all it takes.
I'm doing this for TeX, and in Python, at
http://www.ctv.es/USERS/irmina/TeEncontreX.html
(the project is GPL and sources are supplied in Sources!)
The same could be done for Python.
In fact I posted about it in comp.lang.python, but nobody replied. The
longer I live on the Internet, the surer I am of three facts:
- People rarely get involved.
- People who do get involved usually get involved in stupid projects
  (IRC bots, FTP clients,... always the same).
- Interesting people who do interesting projects do such big things
  that it's difficult for others to follow. Moreover, interesting and
  capable people don't get involved in easy things.

I'm 100% sure that my method of storing information is quite valuable,
easy,... But regretfully it seems not to be interesting to anybody.


Regards/Saludos
Manolo
-------------
My addresses / mis direcciones: 
a="www.ctv.es/USERS/irmina"
b=[("Lritaunas Peki Project", ""),
   ("Spanish users of LaTeX(en Espanyol)", "/pyttex.htm" ),
   ("page of drawing utility for tex ", "/texpython.htm" ),
   ("CrossWordsLand","/cruo/cruo.html")
   ]
for i in b:
  print i[0],":", a+i[1]

  Let us not look back in anger or forward in fear, but around us in awareness. -- James Thurber


From Manuel Gutierrez Algaba <irmina@ctv.es>  Sat Nov  6 14:46:20 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Sat, 6 Nov 1999 14:46:20 +0000 (GMT)
Subject: [Doc-SIG] a wish list Part II
In-Reply-To: <004a01bf27fe$346405d0$0501a8c0@bobcat>
Message-ID: <Pine.LNX.3.95.991106142457.714B-100000@localhost>

The method described in TeEncontreX is so extremely simple and
flexible that:
a) You can take (right now) FAQs and other documents and
attribute them just by placing in them commands like
 \newcommand{\indexsockets}{\index{sockets}}...
b) You can mix FAQs, articles, example code... and you can even
tell them apart, by attributing:
\newcommand{\indexarticle}{\index{article}}
c) You can build a kind of library of the routines available for
Python.
d) You can extract data from .py files. Ex:

class any_sockets_related_class:
    ...
    def cool_routine(self,...):
        """ \indexsockets \indexacoolthingX
        """

And parse it. 
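
A minimal sketch of what "parse it" could look like, assuming the
markers always have the form \index followed by letters. The regular
expression, the function name docstring_keywords and the class name
are assumptions made for this example, not code from TeEncontreX. The
docstring is written as a raw string so that the backslashes survive
untouched.

```python
import re

# Matches markers such as \indexsockets and captures the keyword part.
INDEX_MARK = re.compile(r"\\index([A-Za-z_]+)")

def docstring_keywords(obj):
    """Return the keywords named by \\indexXXX markers in obj's docstring."""
    return INDEX_MARK.findall(obj.__doc__ or "")

# Hypothetical class mirroring the example above.
class AnySocketsRelatedClass:
    def cool_routine(self):
        r""" \indexsockets \indexacoolthingX
        """

print(docstring_keywords(AnySocketsRelatedClass.cool_routine))
```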

Now, what we need to get the BEST, MOST powerful and cohesive
system of information is just some hundreds of
 "ATTRIBUTERS"
(people who attribute); no special skill or knowledge is needed.
And the effort per item is minimal (write down three or four words,
here and there). In total it's a huge effort, but it's a linear effort
and simply straightforward.
Besides, the attributed information can be rearranged in many ways,
by grouping the attributes...

The more I think about it, the more brilliant it seems to me.

Just a question for the reader: why is it not implemented?
The code is already done. It's simple, it's easy, it's high level...
It's a huge effort but, with 100 people working at it, or 1000
people working from time to time (once a week, half an hour),
we could get thousands of pieces of attributed (reusable, searchable,
efficient...) information.
I, myself, have done a lot of it in TeEncontreX, and it's me alone.
One person, 130 articles; 100 people, 13000 articles. And Python
stuff is much more readable than TeX stuff.

Regards/Saludos
Manolo


From fdrake@acm.org  Mon Nov  8 15:24:07 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 8 Nov 1999 10:24:07 -0500 (EST)
Subject: [Doc-SIG] a wish list
In-Reply-To: <004a01bf27fe$346405d0$0501a8c0@bobcat>
References: <199911060102.RAA02700@joplin.sr.hp.com>
 <004a01bf27fe$346405d0$0501a8c0@bobcat>
Message-ID: <14374.60183.988407.503816@weyr.cnri.reston.va.us>

Mark Hammond writes:
 > I'm afraid there is no army of coders just sitting around waiting for
 > good ideas to implement.  Why not tackle even one of your wishlist

Mark,
  James had asked me what he could do to help, and his posting is part 
of that; I *specifically* asked for suggestions regarding how to use
the documentation to help me make sure I cover as many reasonable uses 
of the documentation as I can as I make the conversion to XML.  My
thought is that this is *the* time to make sure our markup carries
over all interesting information (all of it, right?), and is
sufficiently well-structured that we don't need to reformat the
documents to add additional information to the structure.
  Thanks, James!


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From fdrake@acm.org  Mon Nov  8 18:54:11 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 8 Nov 1999 13:54:11 -0500 (EST)
Subject: [Doc-SIG] a wish list
In-Reply-To: <Pine.LNX.3.95.991106135617.4151A-100000@localhost>
References: <004a01bf27fe$346405d0$0501a8c0@bobcat>
 <Pine.LNX.3.95.991106135617.4151A-100000@localhost>
Message-ID: <14375.7251.289446.900869@weyr.cnri.reston.va.us>

Manuel Gutierrez Algaba writes:
 > I'm 100% sure that my method of storing information is quite valuable,
 > easy,... But regretfully it seems not to be interesting to anybody.

Manuel,
  I did actually take a moment to look at it, but I wasn't really sure 
what you were doing.
  In response to your more recent note here, I took another look.  I
downloaded the complete Unix package, and I'm still not quite sure
what's going on.  (The code is hard to read for those of us who don't
know Spanish; sorry.)
  Can you explain precisely what you're advocating be done?  I'm sure
it can be explained on a Web page if you'd rather add it to your
TeEncontreX site for everyone who looks there, or in a message here,
whichever makes more sense for you.
  Thanks for your input!


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From mhammond@skippinet.com.au  Mon Nov  8 21:48:50 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Tue, 9 Nov 1999 08:48:50 +1100
Subject: [Doc-SIG] a wish list
In-Reply-To: <14374.60183.988407.503816@weyr.cnri.reston.va.us>
Message-ID: <008f01bf2a33$0ba14400$0501a8c0@bobcat>

Ahh - OK - sorry about that.  I'm afraid I simply assumed it was yet
another "I want", rather than an "I will" mail.  My apologies, and I
hope you get even a few of these ideas implemented.

Mark.

> Mark Hammond writes:
>  > I'm afraid there is no army of coders just sitting around waiting for
>  > good ideas to implement.  Why not tackle even one of your wishlist
>
> Mark,
>   James had asked me what he could do to help, and his posting is part
> of that; I *specifically* asked for suggestions regarding how to use
> the documentation to help me make sure I cover as many reasonable uses
> of the documentation as I can as I make the conversion to XML.  My
> thought is that this is *the* time to make sure our markup carries
> over all interesting information (all of it, right?), and is
> sufficiently well-structured that we don't need to reformat the
> documents to add additional information to the structure.
>   Thanks, James!
>
>
>   -Fred
>
> --
> Fred L. Drake, Jr.	     <fdrake@acm.org>
> Corporation for National Research Initiatives
>


From irmina@ctv.es  Tue Nov  9 00:54:32 1999
From: irmina@ctv.es (Manuel Gutierrez Algaba)
Date: Tue, 9 Nov 1999 00:54:32 +0000 (GMT)
Subject: [Doc-SIG] a wish list
In-Reply-To: <14375.7251.289446.900869@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.3.95.991108234126.13522A-100000@localhost>

On Mon, 8 Nov 1999, Fred L. Drake, Jr. wrote:

> Manuel,
>   I did actually take a moment to look at it, but I wasn't really sure 
> what you were doing.
>   In response to your more recent note here, I took another look.  I
> downloaded the complete Unix package, and I'm still not quite sure
> what's going on.  (The code is hard to read for those of us who don't
> know Spanish; sorry.)
>   Can you explain precisely what you're advocating be done?  I'm sure
> it can be explained on a Web page if you'd rather add it to your
> TeEncontreX site for everyone who looks there, or in a message here,
> whichever makes more sense for you.

Well, sorry, I was definitely sure that AnalizaToo.py was quite
readable... Anyway, if you're interested, I can translate it.
But the core of my proposal is not the program but the data.
The code is just a "formatter" of the data. All this stuff about
data has a heavy theoretical basis.

Let's take a look at a typical article of Too.tex (my database):
....
\jiji
 1. How do I change the section headings such that the section number
 does not appear in boldface? Or make the section number and
 the section header to be unbolded?
\jaja
\indexlayout \indexsection
titlesec.sty or sectsty.sty.
\jiji     
...
Consider \jiji as the delimiter of an article, and \jaja as the
delimiter of the parts of an article.
\indexlayout says that the article is about "text layout".
As any concept may be directly related to many other concepts, I
mark this relationship thus:
\newcommand{\indexlayout}{\index{layout}\index{decorations}
\index{adornos}}

So I basically attribute the articles with keywords. Another
example:
\jiji
\indexnumeracion
\indexalphanumeric
\indexfootnote
 Can footnote markers be something other than arabic
numerals?

\jaja
Yes, \renewcommand{\thefootnote}{\alph{footnote}}
  or \Alph, or \roman, or \Roman, or \fnsymbol
This is a general prescription for changing the formatting of one of
LaTeX's counters: you redefine \thecounter.

\jiji   

The idea of keywords is not new. What is new is what
happens when we put many keywords on small pieces of information
and then glue all that information together.
As almost any piece of information is very rich (the example
above holds information about numbering, footnotes and arabic
numerals), the combined collection usually has many points of contact;
i.e., imagine that many articles may speak about footnotes (as a main
or a secondary subject).
So if we eventually want information about footnotes, we'll
have a large collection of related stuff. And that collection
will be rather representative of footnotes themselves. Let's take
a look at an example
(sorry again if you don't understand:
 "Articulo" = article; "Estos son los articulos disponibles" =
 these are the available articles;
 "adornos" = decorations = stuff to make things prettier):

 Estos son los articulos disponibles
   Articulo 11: adornos decorations secsty titlesec layout
   Articulo 32: hrule adornos decorations
   Articulo 39: altura decorations book.sty adornos baseline
   hbox height cuadro
   Articulo 48: adornos decorations rcs
   Articulo 72: adornos decorations final_linea end_of_line
   Articulo 73: adornos figure decorations
   Articulo 79: adornos margins decorations
   Articulo 112: space adornos decorations tabular
   Articulo 134: adornos decorations book cleardoublepage
   Articulo 138: adornos decorations caption
   Articulo 144: adornos decorations space textheight 

Just by looking at this, you can learn about the term "decoration".
It's something related to space, textheight, book.sty, hrule,
hbox... You might even get almost a definition
of it! This scheme also lets you refine your search (imagine
you are trying to get some kind of effect in LaTeX), just by
browsing to the article that is closest to your wishes (textheight,
titlesec), and lets you learn more about concepts related to "decoration".
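
A minimal sketch of how a listing like the one above could be
produced from \jiji-delimited text. This is not the AnalizaToo.py
code, which is not shown here; the names parse_articles,
articles_about and ALIASES are assumptions made for the example, and
the ALIASES table stands in for the \newcommand definitions.

```python
import re

# Matches attribution markers such as \indexlayout.
MARK = re.compile(r"\\index([A-Za-z_]+)")

# Hypothetical stand-in for
# \newcommand{\indexlayout}{\index{layout}\index{decorations}\index{adornos}}
ALIASES = {"layout": ["layout", "decorations", "adornos"]}

def parse_articles(text, aliases):
    """Split text on \\jiji and return one keyword set per article."""
    articles = []
    for chunk in text.split("\\jiji"):
        if not chunk.strip():
            continue
        keywords = set()
        for mark in MARK.findall(chunk):
            # A marker expands to all the concepts it is related to.
            keywords.update(aliases.get(mark, [mark]))
        articles.append(keywords)
    return articles

def articles_about(articles, keyword):
    """Return (article number, sorted keywords) for matching articles."""
    return [(n, sorted(kws)) for n, kws in enumerate(articles, 1)
            if keyword in kws]

# Shortened copies of the two sample articles quoted earlier.
SAMPLE = r"""
\jiji
 1. How do I change the section headings?
\jaja
\indexlayout \indexsection
titlesec.sty or sectsty.sty.
\jiji
\indexnumeracion
\indexalphanumeric
\indexfootnote
 Can footnote markers be something other than arabic numerals?
\jaja
Yes, \renewcommand{\thefootnote}{\alph{footnote}}
\jiji
"""

articles = parse_articles(SAMPLE, ALIASES)
for number, keywords in articles_about(articles, "decorations"):
    print("Article %d: %s" % (number, " ".join(keywords)))
```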

But this is just "one scheme" (the one provided by AnalizaToo.py).
With the very same Too.tex (database),
if it were big enough, you could say:
I'd like you to make a book about "decoration in LaTeX", and these
are the rules:
1 I want a description of the general concepts
2 I want it to go from the more general to the more specific items
3 The more general items are "book", "space",....

Please remember that currently Too.tex is a collection of USENET/mailing
list articles; but even so, you'd get a document whose articles
would be "sorted" by your rules. Remember too that we'd need
an article labelled with general_concepts and decoration.
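
One possible reading of rules 2 and 3, sketched under assumptions:
the reader supplies the list of "more general" keywords in order, and
articles about the topic are sorted by the most general keyword they
carry. The function name arrange is invented; the article numbers and
keywords are copied from the listing above.

```python
def arrange(articles, topic, general_first):
    """Select articles tagged with topic, ordered general-to-specific."""
    rank = {kw: i for i, kw in enumerate(general_first)}

    def generality(item):
        number, keywords = item
        ranks = [rank[k] for k in keywords if k in rank]
        # Articles with no "general" keyword sort after all the others.
        return (min(ranks) if ranks else len(rank), number)

    chosen = [(n, kws) for n, kws in articles if topic in kws]
    return [n for n, kws in sorted(chosen, key=generality)]

articles = [
    (11, {"adornos", "decorations", "secsty", "titlesec", "layout"}),
    (112, {"space", "adornos", "decorations", "tabular"}),
    (134, {"adornos", "decorations", "book", "cleardoublepage"}),
    (138, {"adornos", "decorations", "caption"}),
]
print(arrange(articles, "decorations", ["book", "space"]))
```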

Now imagine the Python docs (which are apparently very different
from USENET posts) (ref.tex):

...

It is also possible to create anonymous functions (functions not bound
to a name), for immediate use in expressions.  This uses lambda forms,
described in section \ref{lambda}.  Note that the lambda form is
merely a shorthand for a simplified function definition; a function
defined in a ``\keyword{def}'' statement can be passed around or
assigned to another name just like a function defined by a lambda
form.  The ``\keyword{def}'' form is actually more powerful since it
allows the execution of multiple statements.
\indexii{lambda}{form}     
...
Imagine that I attribute this with:
\indexanonymous \indexlambda \indexdefinition \indexdef

Imagine that I've attributed this too (tut.tex):

expression.  Semantically, they are just syntactic sugar for a normal
function definition.  Like nested function definitions, lambda forms
cannot reference variables from the containing scope, but this can be
overcome through the judicious use of default argument values, e.g.

\begin{verbatim}
def make_incrementor(n):
    return lambda x, incr=n: x+incr
\end{verbatim}        

\indexlambda \indexexample \indexvarscope

Well, with just these attributions we can have ALL these combinations:
- If we search by lambda, we'd have:
lambda example varscope
anonymous lambda definition def

So the reader would guess: ah, an example of lambda, and the definition
of lambda!
- If we search by def, we'd have:
...
anonymous lambda definition def
....
...
So he'd know all the ways of defining functions: def, lambda,
recursive, class messages...

So the information we wrote for 'lambda' would be reused for the
people who want information about defining functions.

- If we want info about variable scopes, then we have:
.....
lambda example varscope
.....
Just imagine all the stuff about scopes here.

You can easily see that all the existing information could be presented
from dozens of different points of view, and that examples, USENET posts,
definitions and tutorials may cooperate to give globally complete
information about any subject.
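
The three searches above can be sketched in a few lines. The chunk
labels and the function name search are assumptions made for the
example; the keyword sets are the two attributions given above.

```python
# The two attributed passages, keyed by a hypothetical label.
chunks = {
    "ref.tex (anonymous functions)": {"anonymous", "lambda", "definition", "def"},
    "tut.tex (lambda forms)": {"lambda", "example", "varscope"},
}

def search(chunks, keyword):
    """Return (source, other keywords) for each chunk tagged with keyword."""
    return sorted((src, sorted(kws - {keyword}))
                  for src, kws in chunks.items() if keyword in kws)

for keyword in ("lambda", "def", "varscope"):
    print(keyword)
    for src, others in search(chunks, keyword):
        print("   %s: %s" % (src, " ".join(others)))
```

The same small table answers all three questions, which is the reuse
described above.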

Now, let's think about XML: it basically reflects the structure
of information. I wonder whether it would be better to know first what
information we are talking about: if we had 10000 chunks of attributed
information (whose value is great), we could know which are the possible
structures (combinations of those chunks).

Sorry if you expected a description of the algorithm (AnalizaToo.py),
which is quite "irrelevant", I think; what really matters here is the
information.

Anyway, if you want further information or a translation, just say so.

I can't think of any easier, faster or more powerful way
of reusing existing "as is" information than this.

Imagine a very large set of chunks of information, sharing and
grouping... Is that XML? Is it better? Is XML a subset, a hard-wiring
of a certain scheme of some groups of chunks of information?
 
Regards/Saludos
Manolo


From david@hotjobs2000.com  Tue Nov  9 18:54:18 1999
From: david@hotjobs2000.com (David Winsen)
Date: Tue, 9 Nov 1999 10:54:18 -0800
Subject: [Doc-SIG] WEB DATABASE PROGRAMMER POSITION AVAILABLE
Message-ID: <000801bf2ae3$d455d180$2a2565d8@pacbell.net>

WEB DATABASE PROGRAMMER POSITION AVAILABLE

URGENT MESSAGE!

This e-mail is not intended to be un-solicited. We apologize if you
didn't want to receive this e-mail. Please reply to be removed.

From:  David Winsen - Senior Consultant - High Technology Executive
Search

We have an out-dated copy of your resume in our database or have viewed
your credentials on the internet.  HTES is an established national
Executive Search and Consulting Firm who has been serving the High Tech
Industries for over 25 years.

We have been confidentially retained by a Los Angeles, Ca. based Adult
Internet Fulfillment/Billing Company. Salary is $50k-$100k DOE.

We are confidentially pre-screening top candidates for the following
position: Web Database Programmer

Description

      Candidates will need to be extremely detail oriented and have a
      solid work ethic. Will be developing cutting edge software and
      e-commerce applications. They are an established and ever growing
      Internet Fulfillment/Billing Company. They offer a casual & unique
      work environment unlike any other, full benefits, & room for
      growth and advancement.

Requirements

      Candidates will need experience in Perl, Python, PHP, Javascript,
      UNIX, Linux or FreeBSD, MySQL a plus. BS Computer Science or
      equivalent. At least 3 years of Web experience is a plus.

If you are interested, please E-mail me in MS Word 95-98 a recent copy
of your resume and a cover letter with your specific information,
including your recent compensation package to Position-for: Web Database
Programmer

My personal E-mail is david@hotjobs2000.com or fax your resume to (310)
855-0840. If you have any questions about the position(s), please call
me at (310) 855-0406 and I will discuss them in detail.

We also have developed an interactive Website that you can view over
6000 national openings www.hotjobs2000.com.  This system is effective,
easy to use and new positions are posted daily.  We encourage you to use
it and nominate yourself for other positions you feel you are qualified
for.  We are looking forward to working with you now and in the future.

12.0pt"><o:p></o:p></SPAN></P></TD></TR></TBODY></TABLE>
<P class=3DMsoNormal=20
style=3D"TEXT-ALIGN: justify; mso-pagination: none; =
mso-layout-grid-align: none"><SPAN=20
style=3D"FONT-STYLE: normal; FONT-WEIGHT: normal; mso-font-kerning: =
1.0pt; mso-fareast-font-family: 'MS =
Mincho'">&nbsp;<o:p></o:p></SPAN></P>
<P class=3DMsoNormal=20
style=3D"TEXT-ALIGN: justify; mso-pagination: none; =
mso-layout-grid-align: none"><SPAN=20
style=3D"FONT-STYLE: normal; FONT-WEIGHT: normal; mso-font-kerning: =
1.0pt; mso-fareast-font-family: 'MS Mincho'">If=20
you are interested, please E-mail me in MS Word 95-98 a recent copy of =
your=20
resume and a cover letter with your specific information, including your =
recent=20
compensation package to </SPAN><SPAN=20
style=3D"FONT-STYLE: normal; FONT-WEIGHT: normal; mso-bidi-font-size: =
10.5pt; mso-font-kerning: 1.0pt; mso-fareast-font-family: 'MS =
Mincho'">Position-for:</SPAN><SPAN=20
style=3D"FONT-STYLE: normal; mso-bidi-font-size: 10.5pt; =
mso-font-kerning: 1.0pt; mso-fareast-font-family: 'MS Mincho'">=20
Web Database Programmer<o:p></o:p></SPAN></P>
<P class=3DMsoNormal=20
style=3D"TEXT-ALIGN: justify; mso-pagination: none; =
mso-layout-grid-align: none"><SPAN=20
style=3D"FONT-STYLE: normal; mso-bidi-font-size: 10.5pt; =
mso-font-kerning: 1.0pt; mso-fareast-font-family: 'MS =
Mincho'">&nbsp;<o:p></o:p></SPAN></P>
<P class=3DMsoNormal=20
style=3D"TEXT-ALIGN: justify; mso-pagination: none; =
mso-layout-grid-align: none"><SPAN=20
style=3D"FONT-STYLE: normal; FONT-WEIGHT: normal; mso-bidi-font-size: =
10.5pt; mso-font-kerning: 1.0pt; mso-fareast-font-family: 'MS =
Mincho'">My=20
personal E-mail is <U><SPAN style=3D"COLOR: =
blue">david@hotjobs2000.com</SPAN></U>=20
or fax your resume to (310) 855-0840. If you have any questions about =
the=20
position(s), please call me at (310) 855-0406 and I will discuss them in =

detail.<o:p></o:p></SPAN></P>
<P class=3DMsoNormal=20
style=3D"TEXT-ALIGN: justify; mso-pagination: none; =
mso-layout-grid-align: none"><SPAN=20
style=3D"FONT-STYLE: normal; FONT-WEIGHT: normal; mso-bidi-font-size: =
10.5pt; mso-font-kerning: 1.0pt; mso-fareast-font-family: 'MS =
Mincho'">&nbsp;<o:p></o:p></SPAN></P>
<P class=3DMsoNormal=20
style=3D"TEXT-ALIGN: justify; mso-pagination: none; =
mso-layout-grid-align: none"><SPAN=20
style=3D"FONT-STYLE: normal; FONT-WEIGHT: normal; mso-bidi-font-size: =
10.5pt; mso-font-kerning: 1.0pt; mso-fareast-font-family: 'MS =
Mincho'">We=20
also have developed an interactive Website that you can view over 6000 =
national=20
openings <U><SPAN style=3D"COLOR: =
blue">www.hotjobs2000.com</SPAN></U>.<SPAN=20
style=3D"mso-spacerun: yes">&nbsp; </SPAN>This system is effective, easy =
to use=20
and new positions are posted daily.<SPAN style=3D"mso-spacerun: =
yes">&nbsp;=20
</SPAN>We encourage you to use it and nominate yourself for other =
positions you=20
feel you are qualified for.<SPAN style=3D"mso-spacerun: yes">&nbsp; =
</SPAN>We are=20
looking forward to working with you now and in the future.</SPAN> </P>
<P class=3DMsoNormal=20
style=3D"TEXT-ALIGN: justify; mso-pagination: none; =
mso-layout-grid-align: =
none">&nbsp;<o:p></o:p></P></FONT></DIV></BODY></HTML>

------=_NextPart_000_0005_01BF2AA0.C56185E0--


From S.I.Reynolds@cs.bham.ac.uk  Wed Nov 10 18:05:50 1999
From: S.I.Reynolds@cs.bham.ac.uk (Stuart Reynolds)
Date: Wed, 10 Nov 1999 18:05:50 +0000
Subject: [Doc-SIG] PythonDoc - how to run
Message-ID: <3829B3FE.7A0A@cs.bham.ac.uk>

Hi,

I've just installed PythonDoc on my system, hoping to use it to produce
documentation for one of our projects. I'm having a bit of trouble
getting it to output anything:


% ls
MDP.py    MDP.pyc
% pythondoc MDP.py
Error: Couldn't import MDP (exceptions.ImportError: No module named MDP)

Same result with,
% pythondoc -d ../docs MDP.py
% pythondoc -d ../docs -i MDP.py
% pythondoc -d ../docs -i -v MDP.py
% pythondoc -d ../docs -i -s ../docs MDP.py

While from the directory above,
% pythondoc reps/MDP.py
% pythondoc -d docs reps/MDP.py

produces nothing (no output and no errors).

Any ideas? Or have I misread the README file?

Cheers

Stuart

PS. This is under Python 1.5.2 on Solaris.


From S.I.Reynolds@cs.bham.ac.uk  Thu Nov 11 13:35:41 1999
From: S.I.Reynolds@cs.bham.ac.uk (Stuart Reynolds)
Date: Thu, 11 Nov 1999 13:35:41 +0000
Subject: [Doc-SIG] PythonDoc - how to run
References: <3829B3FE.7A0A@cs.bham.ac.uk> <E11ls2D-0007Qc-00@lsls4p>
Message-ID: <382AC62D.1720@cs.bham.ac.uk>

This is a multi-part message in MIME format.

--------------7BC813E56
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Edward Welbourne wrote:
> 
> > % pythondoc MDP.py
> > Error: Couldn't import MDP (exceptions.ImportError: No module named MDP)
> 
> Hm.  Try adjusting your PYTHONPATH environment variable

Ha! Well spotted. It now has '.' in (I'd removed it by mistake). Ok
that's fixed the first problem but pythondoc still produces no
documents.
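
For anyone hitting the same ImportError: the fix above works because the
interpreter only searches the directories listed in sys.path, which is
seeded from PYTHONPATH. Below is a minimal sketch in present-day Python
(not the 1.5.2 used in this thread); the module name "mdp_demo" and the
helper can_import() are made up for illustration.

```python
import importlib
import os
import sys
import tempfile


def can_import(name):
    """Return True if `name` can be imported with the current sys.path."""
    importlib.invalidate_caches()  # make sure newly created files are seen
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


# Create a throwaway directory holding a stand-in module (hypothetical name).
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "mdp_demo.py"), "w") as f:
    f.write("VALUE = 42\n")

before = can_import("mdp_demo")   # directory is not on sys.path yet
sys.path.insert(0, workdir)       # same effect as listing it in PYTHONPATH
after = can_import("mdp_demo")

print("importable before:", before)
print("importable after: ", after)
```

Any tool that imports the module it documents depends on the same search
path as a plain "import" at the interactive prompt.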


[12:56]~/toolkit >echo $PYTHONPATH
/:/home/pg/sir/toolkit/:.
[12:56]~/toolkit >cd reps
[12:56]~/toolkit/reps >python
Python 1.5.2 (#1, Apr 20 1999, 19:24:22)  [GCC egcs-2.91.57 19980901
(egcs-1.1 re on sunos5
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import MDP
>>> import sys
>>> sys.path
['', '/', '/home/pg/sir/toolkit/', '.',
'/bham/ums/common/pd/packages/Python/lib/python1.5/',
'/bham/ums/common/pd/packages/Python/lib/python1.5/plat-sunos5',
'/bham/ums/common/pd/packages/Python/lib/python1.5/lib-tk',
'/bham/ums/solaris/pd/bin/../packages/Python-1.5.2/lib/python1.5/lib-dynload',
'/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages',
'/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/numeric',
'/bham/ums/solaris/pd/bin/../packages/Python-1.5.2/lib/python1.5/site-packages',
'/bham/ums/solaris/pd/bin/../packages/Python-1.5.2/lib/python1.5/site-packages/numeric']
>>> ^D
[12:58]~/toolkit/reps >pythondoc MDP.py
[12:58]~/toolkit/reps >pythondoc -d ./ MDP.py
[12:58]~/toolkit/reps >ls
#MDP.py#      MDP.py        MDP.pyc       __init__.py   __init__.pyc  
[12:58]~/toolkit/reps >pythondoc -d ./ -s ./ MDP.py
#MDP.py#      MDP.dtr       MDP.py        MDP.pyc       __init__.py  
__init__.pyc  

Note that I can output the doctree (MDP.dtr).

I've also just tried running pythondoc on the test file included in the
distribution.
This also produces no output.

Cheers

Stuart

--------------7BC813E56
Content-Type: text/plain; charset=us-ascii; name="out.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="out.txt"

Error: Couldn't import test.test_al (exceptions.ImportError: No module named al)
math module, testing with eps 1e-05
constants
acos
asin
atan
atan2
ceil
cos
cosh
exp
fabs
floor
fmod
frexp
hypot
ldexp
log
log10
modf
pow
sin
sinh
sqrt
tan
tanh
test
Warning: can't open /bham/ums/common/pd/packages/Python/lib/python1.5/test/output/test
1 test OK.
10 times sub 1.800 CPU seconds
10 times split 1.990 CPU seconds
10 times findall 2.010 CPU seconds
From: bwarsaw@cnri.reston.va.us
Date: Mon Feb 12 17:21:48 EST 1996
To: kss-submit@cnri.reston.va.us
MIME-Version: 1.0
Content-Type: multipart/knowbot;
    boundary="801spam999";
    version="0.1"

This is a multi-part message in MIME format.

--801spam999
Content-Type: multipart/knowbot-metadata;
    boundary="802spam999"


--802spam999
Content-Type: message/rfc822
KP-Metadata-Type: simple
KP-Access: read-only

KPMD-Interpreter: python
KPMD-Interpreter-Version: 1.3
KPMD-Owner-Name: Barry Warsaw
KPMD-Owner-Rendezvous: bwarsaw@cnri.reston.va.us
KPMD-Home-KSS: kss.cnri.reston.va.us
KPMD-Identifier: hdl://cnri.kss/my_first_knowbot
KPMD-Launch-Date: Mon Feb 12 16:39:03 EST 1996

--802spam999
Content-Type: text/isl
KP-Metadata-Type: complex
KP-Metadata-Key: connection
KP-Access: read-only
KP-Connection-Description: Barry's Big Bass Business
KP-Connection-Id: B4
KP-Connection-Direction: client

INTERFACE Seller-1;

TYPE Seller = OBJECT
    DOCUMENTATION "A simple Seller interface to test ILU"
    METHODS
            price():INTEGER,
    END;

--802spam999
Content-Type: message/external-body;
    access-type="URL";
    URL="hdl://cnri.kss/generic-knowbot"

Content-Type: text/isl
KP-Metadata-Type: complex
KP-Metadata-Key: generic-interface
KP-Access: read-only
KP-Connection-Description: Generic Interface for All Knowbots
KP-Connection-Id: generic-kp
KP-Connection-Direction: client


--802spam999--

--801spam999
Content-Type: multipart/knowbot-code;
    boundary="803spam999"


--803spam999
Content-Type: text/plain
KP-Module-Name: BuyerKP

class Buyer:
    def __setup__(self, maxprice):
        self._maxprice = maxprice

    def __main__(self, kos):
        """Entry point upon arrival at a new KOS."""
        broker = kos.broker()
        # B4 == Barry's Big Bass Business :-)
        seller = broker.lookup('Seller_1.Seller', 'B4')
        if seller:
            price = seller.price()
            print 'Seller wants $', price, '... '
            if price > self._maxprice:
                print 'too much!'
            else:
                print "I'll take it!"
        else:
            print 'no seller found here'

--803spam999--

--801spam999
Content-Type: multipart/knowbot-state;
    boundary="804spam999"
KP-Main-Module: main


--804spam999
Content-Type: text/plain
KP-Module-Name: main

# instantiate a buyer instance and put it in a magic place for the KOS
# to find.
__kp__ = Buyer()
__kp__.__setup__(500)

--804spam999--

--801spam999--
Traceback (innermost last):
  File "/bham/ums/common/pd/bin/pythondoc", line 4, in ?
    pythondoc.pythondoc.generate_pages(modules, formats)
  File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/pythondoc.py", line 256, in generate_pages
    docobject = docobjects.create_docobject(object)
  File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 479, in create_docobject
    object = _class_map[type(pyobject)](pyobject) #, name)
  File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 211, in __init__
    Composite.__init__(self, object, name)
  File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 154, in __init__
    Object.__init__(self, object, name)
  File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 40, in __init__
    self.subobjects()
  File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 95, in subobjects
    self.__subobjects = self.get_subobjects()
  File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 169, in get_subobjects
    items = self.get_allobjects()
  File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 243, in get_allobjects
    if module.__name__ != modulename:
AttributeError: 'None' object has no attribute '__name__'

--------------7BC813E56--


From fdrake@acm.org  Thu Nov 11 17:00:45 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 11 Nov 1999 12:00:45 -0500 (EST)
Subject: [Doc-SIG] Approaches to structuring module documentation
Message-ID: <14378.63037.571200.652453@weyr.cnri.reston.va.us>

--1wGfXqjHrK
Content-Type: text/plain; charset=us-ascii
Content-Description: message body text
Content-Transfer-Encoding: 7bit


  Well, now that things have quieted down a little (where?!), I'll
stir things up a little.
  Two broad approaches to structuring the documentation have been
presented:  One is the current document-centric model, where there are
a number of books/manuals/whatever that contain interesting
information, but need to be used as really large chunks.  Extracting
specific information is (apparently) difficult for humans (witness
the recent request for a random() function on the newsgroup by someone 
who said they looked in the index; just the wrong one); it's much
worse for applications.  The other approach, first proposed by Sean
McGrath, is to use a "microdocument" architecture, where each module
is represented in a separate structured document that is designed
specifically to handle that kind of information.
  First, I'll define some terms and comment on both approaches.

Terms
-----

  DOCUMENT-ORIENTED CONTENT:  Documents which are structured similarly
to the traditional presentation form; document-oriented DTDs feature
things like chapters, sections, titles, articles, etc.  This is what
David Megginson called "book" DTDs in "Structuring XML Documents."

  DOCUMENT-CENTRIC APPROACH: The human-read document is the primary
way to encode information, including module reference material.  A
"monumental" DTD would describe the document structure.  Supplemental
data files could be used for highly specialized information; these
could use alternate DTDs.

  MICRODOCUMENT APPROACH:  Multiple DTDs are used to encode
document-level information and module reference material.  Let's only
consider the case of one DTD to handle module reference material, and
a small number (1 or 2) of document-oriented DTDs; possibly one for
"sections" and one that could be used to compose sections and module
references into chapters and manuals.

Document-centric Approach
-------------------------

  This approach has the advantage of matching the current structure of 
the documentation.  The conversion isn't terribly difficult or even
time consuming given the state of the things in Doc/tools/sgmlconv/ in
the CVS repository.  There's clearly some work to do regarding DTD
specification and probably a bit of transformation, but a large part
of the coding and testing is done.
  The existing documents are tolerably organized for direct human use,
and incremental updates to the documents seem to work well.
  Documenting a module using the document-centric approach requires
little effort due to the simplicity of the existing markup, but it's
not always clear what things "go together."  This problem can be at
least partly solved by evolving the markup to support additional forms 
of linkages between information chunks, and keeping the processing
tools up to date with the markup changes.  This can be done before or
after a conversion to XML as it is largely orthogonal to syntax.


Microdocument Approach
----------------------

  Using a separate DTD to document modules offers advantages when it
comes time to extract information programmatically.  Creating skeleton 
module references from the current documentation would be harder and
would certainly require more code to be written, but the payoffs are
potentially very high.  To really make it work, a lot of attention
would have to be applied to the result of the first-stage conversion
to check the accuracy of the results, make the various bits of text
actually land in the right place (since everything is pretty much
thrown together now), and encode a lot of additional information about 
types, parameters, exceptions thrown, etc.  On the other hand, getting 
this information into the documents in the document-centric approach
also requires a lot of this work.
  An IDE could use the content provided by the module references very
effectively to provide help and smart name completion.  For
performance, the documentation would probably be loaded into some sort
of database so chunks of information could be retrieved very quickly,
and probably in some pre-digested form.  Inheritance diagrams can be
generated, and protocols/interfaces can be documented much more
clearly.
  The most significant drawback I can see is that the markup can very
easily become quite heavy, but this isn't unusual when there's a lot
of structured information to present.
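
  The IDE use case mentioned above (pulling chunks out of module
references for help and name completion) can be sketched roughly as
follows.  This is only an illustration in present-day Python: the
element names are taken from the sample mailbox.xml attached to this
message, while build_index() and complete() are hypothetical helpers,
and a plain dictionary stands in for the "database".

```python
import xml.etree.ElementTree as ET

# A cut-down module reference using the element names from mailbox.xml.
SAMPLE = """
<module-reference>
  <module-info>
    <module>mailbox</module>
    <synopsis>Read various mailbox formats.</synopsis>
  </module-info>
  <classdesc>
    <class>UnixMailbox</class>
    <constructor>
      <signature>
        <parameter name="fp" protocol="file">The mailbox file.</parameter>
      </signature>
    </constructor>
  </classdesc>
  <classdesc>
    <class>MHMailbox</class>
    <constructor>
      <signature>
        <parameter name="dirname" type="string">The mailbox directory.</parameter>
      </signature>
    </constructor>
  </classdesc>
</module-reference>
"""


def build_index(xml_text):
    """Map 'module.Class' to the list of constructor parameter names."""
    root = ET.fromstring(xml_text)
    module = root.findtext("module-info/module")
    index = {}
    for classdesc in root.findall("classdesc"):
        name = classdesc.findtext("class")
        params = [p.get("name")
                  for p in classdesc.findall("constructor/signature/parameter")]
        index["%s.%s" % (module, name)] = params
    return index


def complete(index, prefix):
    """Return the indexed names that start with `prefix`, sorted."""
    return sorted(name for name in index if name.startswith(prefix))


index = build_index(SAMPLE)
print(complete(index, "mailbox.M"))
print(index["mailbox.UnixMailbox"])
```

  The point is only that the structured markup makes this kind of
extraction a few lines of code; getting the same data out of the
current LaTeX sources would need format-specific parsing.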


Comparison
----------

  A wide variation in module documentation styles is possible using
the document-centric approach.  While most of the modules in the
Library Reference are presented in a fairly formulaic way, some are
not.  Note the chapters on the debugger and profiler, which really
don't use the styles used elsewhere in the Library Reference.  I'm not
sure if allowing this level of flexibility is good or bad; I could
make the case for both.  I can also see where allowing both could be a
good idea, but it may be reasonable to require a "standard" structure
for module documentation, regardless of the approach taken on the
whole, and then allow additional material to be provided using
document-oriented content.
  At any rate, last night I sat down with one module and the existing
documentation for it, and marked up a module reference for it using
the microdocument approach.  The markup is quite heavy compared to the 
current LaTeX file:

weyr(.../Doc/lib); wc libmailbox.tex mailbox.xml 
      53     251    1938 libmailbox.tex
     159     504    5364 mailbox.xml
     212     755    7302 total

  That's a 200% increase in line count and a 150% increase in file
size.  The latter isn't much of an issue, but the former is because it
seriously impacts readability.
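
  The percentages can be checked directly from the wc figures above.
The short sketch below (present-day Python; percent_increase() is just
an illustrative helper) works them out: the line count triples, which
is a 200% increase, the word count roughly doubles, and by these
numbers the byte count grows by about 177%, somewhat more than the 150%
quoted.

```python
def percent_increase(old, new):
    """Relative growth from old to new, expressed as a percentage."""
    return (new - old) / old * 100.0


# (libmailbox.tex, mailbox.xml) pairs taken from the wc output above.
counts = {
    "lines": (53, 159),
    "words": (251, 504),
    "bytes": (1938, 5364),
}

increases = {unit: percent_increase(old, new)
             for unit, (old, new) in counts.items()}

for unit, value in increases.items():
    print("%-5s +%.0f%%" % (unit, value))
```
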
  This explosion of markup is of most concern for authors; a lot of
markup is required to encode enough information to justify changing
the approach.  As more markup is required, it is increasingly
difficult to get contributions because it takes the authors more time
to document their work.  I'd like to maintain Python's standing as the 
best-documented free scripting language, and I'm not sure authors will 
be willing to use the more extensive markup.
  I'd also need a small (large?) army of volunteers to help convert
the generated skeleton module references to take advantage of the
ability to encode far more detail about modules than is currently
available.  Are there enough people sufficiently interested?  Doing
this one myself would require someone directly supporting the work;
the occasional evening would not get it done.


A Hybrid Approach
-----------------

  A hybrid approach could be taken in which the architecture is that
of the microdocument approach, but we support something similar to the 
current (document-centric) approach for the document-oriented content
components.  This would allow a slower migration and facilities such
as the debugger could be documented using the document structure
rather than the module structure.
  The payoffs for application of the documentation are approximately
the same as for the strict microdocument approach.  The most
significant change is probably that some modules (those documented
only in document-oriented components) may not be described in the help 
system, or at least not fully described.
  The issues of conversion are largely the same as for the
microdocument architecture since most modules would be documented in
that way.  The document-oriented DTD(s) may be a little different, but 
that's the only substantial technical difference I see in getting it
done.


Status
------

  I haven't ventured to write a DTD yet for either approach; there's
still a lot to decide before that gets done.  I also don't want to
write a bunch of DTDs that aren't going to be used!
  I think we do need to consider the two approaches in the immediate
future.  Dealing with the legacy conversion software is tolerable for
now, but it's getting worse over time.  Rich linking is difficult in
the HTML output, which seems to be the most-used format, but I think
that's something that a lot of people would like to see.
  If we elect to go with the document-centric approach, there's a bit
of DTD design to do, and a bit of tweaking in the conversion tools,
but we're a long way there.
  Adopting the microdocument approach offers the advantages of a very
high long-term payoff, which is appealing, but please consider my
comments and pleas above carefully.
  The hybrid approach can be considered as roughly the same as the
microdocument approach, as discussed above.

  Comments?


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


--1wGfXqjHrK
Content-Type: text/xml; charset=iso-8859-1
Content-Description: Sample module reference.
Content-Disposition: inline;
	filename="mailbox.xml"
Content-Transfer-Encoding: 7bit

<?xml version="1.0" encoding="iso-8859-1"?>
<module-reference>
  <module-info>
    <module>mailbox</module>
    <synopsis>Read various mailbox formats.</synopsis>
    <!-- possibly add "requires" or "imports" information here, as -->
    <!-- well as platform dependence, etc. -->
    </module-info>

  <overview>
    <para>This module defines a number of classes that allow easy and
      uniform access to mail messages in a mailbox.  Most of the
      supported mailbox formats come from the Unix world.</para>

    <para>None of the classes defined in this module lock the
      mailboxes that are accessed; this needs to be handled by
      application code.</para>
    </overview>

  <protocoldesc>
    <protocol>Mailbox</protocol>
    <method name="next">
      <signature>
        <return-value type="rfc822.Message">
          <!-- need a good way to distinguish between protocols and -->
          <!-- types, both visually and in the markup. -->
          The next message in the mailbox.  The message's <member
            of="rfc822.Message">fp</member> will be a
          <protocol>file</protocol> object, but not a real
          <type>file</type> object.  If no messages have been
          read, this will be the first message.  If all messages have
          been read, <constant>None</constant> will be returned.
          </return-value>
        </signature>
      </method>
    </protocoldesc>

  <classdesc>
    <class>UnixMailbox</class>
    <implements>
      <protocol>Mailbox</protocol>
      </implements>
    <description>
      Access a classic Unix-style mailbox, where all messages are
      contained in a single file and separated by <quote>From name
        time</quote> lines.
      </description>
    <constructor>
      <signature>
        <parameter name="fp" protocol="file">
          The file object <param>fp</param> points to the mailbox file.
          </parameter>
        </signature>
      <description>
        <para>Initialize the mailbox object and point to the first
          message in the mailbox.</para>
        </description>
      </constructor>
    </classdesc>

  <classdesc>
    <class>MmdfMailbox</class>
    <implements>
      <protocol>Mailbox</protocol>
      </implements>
    <description>
      <para>Access an <acronym>MMDF</acronym>-style mailbox, where all
        messages are contained in a single file and separated by lines
        consisting of four control-A characters.</para>
      </description>
    <constructor>
      <signature>
        <parameter name="fp" protocol="file">
          The file object <param>fp</param> points to the mailbox file.
          </parameter>
        </signature>
      <description>
        <para>Initialize the mailbox object and point to the first
          message in the mailbox.</para>
        </description>
      </constructor>
    </classdesc>

  <classdesc>
    <class>MHMailbox</class>
    <implements>
      <protocol>Mailbox</protocol>
      </implements>
    <description>
      <para>Access an <acronym>MH</acronym> mailbox, a directory with
        each message in a separate file with a numeric name.  Messages
        that are added to the mailbox after the instance is created
        are not accessible; a new instance is needed to access newly
        added messages.</para>
      </description>
    <constructor>
      <signature>
        <parameter name="dirname" type="string">
          The name of the mailbox directory.
          </parameter>
        </signature>
      <description>
        <para>Initialize the list of messages that can be loaded from
          the mailbox.</para>
        </description>
      </constructor>
    </classdesc>

  <classdesc>
    <class>Maildir</class>
    <implements>
      <protocol>Mailbox</protocol>
      </implements>
    <description>
      <para>Access a Qmail mail directory.  All new and current mail
        for the mailbox is made available.  Messages that are added to
        the mailbox after the instance is created are not accessible;
        a new instance is needed to access newly added messages.
        </para>
      </description>
    <constructor>
      <signature>
        <parameter name="dirname" type="string">
          The name of the mailbox directory.
          </parameter>
        </signature>
      <description>
        <para>The <param>dirname</param> parameter points to the
          mailbox directory.</para>
        </description>
      </constructor>
    </classdesc>

  <classdesc>
    <class>BabylMailbox</class>
    <implements>
      <protocol>Mailbox</protocol>
      </implements>
    <description>
      <para>Access a Babyl mailbox, which is similar to an
        <acronym>MMDF</acronym> mailbox.  Mail messages start with a
        line containing only <literal>'*** EOOH ***'</literal> and end 
        with a line containing only <literal>'\037\014'</literal>.
        </para>
      </description>
    <constructor>
      <signature>
        <parameter name="fp" protocol="file">
          A <protocol>file</protocol> object <param>fp</param> that
          points to the mailbox file.
          </parameter>
        </signature>
      <description>
        <para>Initialize the mailbox object and point to the first
          message in the mailbox.</para>
        </description>
      </constructor>
    </classdesc>
</module-reference>

--1wGfXqjHrK--


From Moshe Zadka <mzadka@geocities.com>  Fri Nov 12 07:29:09 1999
From: Moshe Zadka <mzadka@geocities.com> (Moshe Zadka)
Date: Fri, 12 Nov 1999 09:29:09 +0200 (IST)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <14378.63037.571200.652453@weyr.cnri.reston.va.us>
Message-ID: <Pine.SOL.3.96.991112090802.13456B-100000@sundial>

On Thu, 11 Nov 1999, Fred L. Drake, Jr. wrote:

>   Well, now that things have quieted down a little (where?!), I'll
> stir things up a little.

Very good.
<snipped some generic discussion>

>   MICRODOCUMENT APPROACH:  Multiple DTDs are used to encode
> document-level information and module reference material.  Let's only
> consider the case of one DTD to handle module reference material, and
> a small number (1 or 2) of document-oriented DTDs; possibly one for
> "sections" and one that could be used to compose sections and module
> references into chapters and manuals.

Well, no one who has read other mails of mine here will be surprised at
my whole-hearted embracing of this approach.

<snipped some things, among them discussion of how the whole library
reference is formulaic, except the debugger and profiler>
> I'm not
> sure if allowing this level of flexibility is good or bad; I could
> make the case for both. 

Here's a simple argument against it: in the TODO list, there are requests
for explanations of how to use <surprise!> both the profiler and the
debugger. Guido marked it "a library chapter isn't enough?". And he's
right, but having the structure so flexible tempted Guido to put the
chapters in the library reference, instead of as separate documents.

>   That's a 200% increase in line count and a 150% increase in file
> size.  The later isn't much of an issue, but the former is because it
> seriously impacts readability.

Ummmm...it really depends on how much semantic information you put in.

Here's the strongest argument for the microdocument approach: as someone
who uses both Perl and Python (though I much prefer the latter), I see the
enormous benefit of a program like perldoc, which could only be written
on a microdocument-based infrastructure. For those not familiar with this
program, I will paint a rosy picture of how beautiful the future could be
if we used microdocuments and wrote a "pydoc" preprocessor:

pydoc htmllib --> show the documentation of htmllib
pydoc string.reverse --> show the documentation of string.reverse
pydoc -q reverse --> show all FAQs which have the word reverse in them
pydoc -f reverse --> search for a function called "reverse"
.
.
.
(For those of you on WIMP interfaces, substitute a dialog, and a fancy 
window which formats the documents)
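
The first, second and fourth commands in that list could be sketched
roughly like this. It is only an illustration in present-day Python:
resolve(), lookup_doc() and find_functions() are hypothetical names,
and the text comes from docstrings rather than from microdocuments,
which is an assumption made to keep the example self-contained.

```python
import importlib
import inspect


def resolve(dotted_name):
    """Import the longest importable prefix, then walk the remaining attributes."""
    parts = dotted_name.split(".")
    for cut in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module(".".join(parts[:cut]))
        except ImportError:
            continue  # try a shorter prefix
        for attr in parts[cut:]:
            obj = getattr(obj, attr)  # AttributeError if the name is unknown
        return obj
    raise ImportError("cannot resolve %r" % dotted_name)


def lookup_doc(dotted_name):
    """Rough equivalent of 'pydoc name': return the documentation text."""
    return inspect.getdoc(resolve(dotted_name)) or "(no documentation found)"


def find_functions(module_name, word):
    """Rough equivalent of 'pydoc -f word', limited to a single module."""
    module = importlib.import_module(module_name)
    return sorted(name
                  for name, _ in inspect.getmembers(module, inspect.isroutine)
                  if word in name)


print(lookup_doc("string.capwords"))
print(find_functions("string", "cap"))
```

A version backed by microdocuments would replace inspect.getdoc() with
a lookup in the per-module documents, and could then also answer the
FAQ query, which docstrings alone cannot.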

We might need a PyML such that XML<PyML<SGML (that is, PyML is not an
application of XML, only of SGML) and a converter PyML->XPyML such that
XPyML is an application of XML. That way we could have whatever terseness
from SGML we care to implement, and all the power of XML at our back.
--
Moshe Zadka <mzadka@geocities.com>. 
INTERNET: Learn what you know.
Share what you don't.


From Manuel Gutierrez Algaba <irmina@ctv.es>  Fri Nov 12 17:12:17 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Fri, 12 Nov 1999 17:12:17 +0000 (GMT)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <14378.63037.571200.652453@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.3.95.991112155323.763A-100000@localhost>

A heavily technical document!! Far from my usual "way of thinking".

On Thu, 11 Nov 1999, Fred L. Drake, Jr. wrote:

> 
>   DOCUMENT-ORIENTED CONTENT:  Documents which are structured similarly

Is this the LaTeX one, or the "traditional" XML?

> 
>   DOCUMENT-CENTRIC APPROACH: The human-read document is the primary

Is this TeEncontreX's approach? Is "module reference material" the
"\indexpython" things?

>   MICRODOCUMENT APPROACH:  Multiple DTDs are used to encode
> document-level information and module reference material.  Let's only

What's this ?

> 
> Document-centric Approach
> -------------------------
<description of things very related to TeEncotreX, I think>

> 
> Microdocument Approach
> ---------------------- 
>   Using a separate DTD to document modules offers advantages when it
> comes time to extract information programmatically.  Creating skeleton 
> module references from the current documentation would be harder and
> would certainly require more code to be written, but the payoffs are
> potentially very high. 

To put it briefly: "A lot of work coding _details_". Just a comment:
Python is **much** better than C++, for example, because you
have no need to declare every type and every detail; you can even
have large parts of a Python program broken, parts that a C++
compiler would mark as erroneous.

> To really make it work, a lot of attention
> would have to be applied to the result of the first-stage conversion
> to check the accuracy of the results, make the various bits of text
> actually land in the right place (since everything is pretty much
> thrown together now), and encode a lot of additional information about 
> types, parameters, exceptions thrown, etc. 

More heavy work !

> On the other hand, getting 
> this information into the documents in the document-centric approach
> also requires a lot of this work.

But perhaps in a freer way. Let's say that the document-centric approach
(at least TeEncontreX)
seems more robust for mark-up. Anyone may mark things up wrongly, but you can
readjust or define similarities among different markings.

> Comparison
> ----------
>   That's a 200% increase in line count and a 150% increase in file
> size.  The latter isn't much of an issue, but the former is because it
> seriously impacts readability.
>   This explosion of markup is of most concern for authors; a lot of
> markup is required to encode enough information to justify changing
> the approach.  As more markup is required, it is increasingly
> difficult to get contributions because it takes the authors more time
> to document their work. 

The biggest problem I see here is that either you get very good documentation
(due to the huge amount of work) or you get nothing (the author
doesn't document).

It'd be wise to provide several levels of mark-up, so people
can mark up little by little, the important things first, and so on.

This is the "TeEncontreX" version of Mailbox, this should
work if you have AnalizaToo.py:

\newcommand{\indexmoduleinbox}{\index{module}\index{mail}\index{inbox}}
\newcommand{\indexshortdescription}{\index{description}}
\newcommand{\indexdescription}{\index{description}}
\newcommand{\indexreturnvalue}{\index{returnvalue}\index{protocol}}
\newcommand{\indexclassdefinition}{\index{classdefinition}\index{protocol}}
\newcommand{\indexMmdfMailbox}{\index{MmdfMailbox}\index{MMDF}}
\newcommand{\indexMHMailbox}{\index{MHMailbox}\index{MH}}
\newcommand{\indexMaildir}{\index{Mail}\index{dir}}
\newcommand{\indexBabylMailbox}{\index{Babyl}\index{BabylMailbox}}
\newcommand{\indexMMDF}{\index{MMDF}}

\jiji
mailbox
Read various mailbox formats.

\indexmoduleinbox \indexname \indexshortdescription
\jiji
This module defines a number of classes that allow easy and
 uniform access to mail messages in a mailbox.  Most of the
 supported mailbox formats come from the Unix world.

None of the classes defined in this module lock the
 mailboxes that are accessed; this needs to be handled by
 application code.

\indexdescription \indexmoduleinbox
\jiji
 The next message in the mailbox.  The message's 
 ("rfc822.Message") fp will be a  file object, but not a real
 file object.  If no messages have been  read, this will
 be the first message.  If all messages have
 been read, None will be returned.

\indexmoduleinbox  \indexreturnvalue  \indexrfc822
\jiji
UnixMailbox

Access a classic Unix-style mailbox, where all messages are
 contained in a single file and separated by "From name
 time lines".

The file object fp points to the mailbox file.

Initialize the mailbox object and point to the first
message in the mailbox.

\indexmoduleinbox \indexUnixMailBox \indexclassdefinition
\jiji
MmdfMailbox

Access an MMDF-style mailbox, where all
 messages are contained in a single file and separated by lines
 consisting of four control-A characters.

 The file object fp points to the mailbox file.

 Initialize the mailbox object and point to the first
 message in the mailbox.


\indexmoduleinbox \indexMmdfMailbox \indexclassdefinition
\jiji
MHMailbox

 Access an MH mailbox, a directory with
 each message in a separate file with a numeric name.  Messages
 that are added to the mailbox after the instance is created
 are not accessible; a new instance is needed to access newly
 added messages.

\indexmoduleinbox \indexMHMailbox \indexclassdefinition
\jiji
Maildir

Access a Qmail mail directory.  All new and current mail
 for the mailbox is made available.  Messages that are added to
 the mailbox after the instance is created are not accessible;
 a new instance is needed to access newly added messages.

The name of the mailbox directory.

Initialize the list of messages that can be loaded from
 the mailbox.
The dirname parameter points to the mailbox directory.

\indexmoduleinbox  \indexMaildir \indexclassdefinition
\jiji
BabylMailbox

Access a Babyl mailbox, which is similar to an
MMDF mailbox.  Mail messages start with a
 line containing only <literal>'*** EOOH ***'</literal> and end 
 with a line containing only <literal>'\037\014'</literal>.

 A file object fp that  points to the mailbox file.

Initialize the mailbox object and point to the first
 message in the mailbox.

\indexmoduleinbox  \indexBabylMailbox \indexclassdefinition \indexmmdf
\jiji

Just some comments:
- Thinking about it, I mentioned the need for an apropos utility
one year ago. If you think about it, this IS the apropos utility!!
- If TeEncontreX should bear one name, it'd be: pico-documentation.
Every chunk of info, delimited by \jiji, is independent of
the rest of the universe, and you can keep them in different files.
The only links to the outside world are the \newcommand definitions.
Every chunk of info is very, very small, although it could be
very big. You have absolute freedom in size, and you can
refine the info as much (or as little) as you want.

(VERY IMPORTANT POINT):
- Because of the tiny size of every chunk you can analyze typical
chunks to infer the "obvious" marking:

\jiji
BabylMailbox ( this would be marked as name)

(this would be marked as description)
Access a Babyl mailbox, which is similar to an
MMDF mailbox.  Mail messages start with a
 line containing only <literal>'*** EOOH ***'</literal> and end
 with a line containing only <literal>'\037\014'</literal>.

(these as params)
 A file object fp that  points to the mailbox file.

Initialize the mailbox object and point to the first
 message in the mailbox.
...
Or you can simply use "positional" marking within these very
small chunks.
You can have a bunch of small Python programs for intelligent
analysis of typical chunks; because the chunks contain so little, I guess
the programs would be easy to write.
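
The "positional" marking described above can be sketched roughly as follows, in present-day Python. This is only an illustration and not AnalizaToo.py: the function names are hypothetical, and the rule (first paragraph is the name, second the description, the rest are parameters, and `\index...` lines are index terms) is an assumption read off the BabylMailbox example.

```python
import re


def split_chunks(text):
    """Split a file into chunks; the chunks are separated by \\jiji lines."""
    return [chunk.strip() for chunk in text.split("\\jiji") if chunk.strip()]


def mark_chunk(chunk):
    """Label the paragraphs of one chunk purely by their position."""
    paragraphs = [" ".join(p.split()) for p in re.split(r"\n\s*\n", chunk)]
    index = [p for p in paragraphs if p.startswith("\\index")]
    body = [p for p in paragraphs if p and not p.startswith("\\index")]
    return {
        "name": body[0] if body else None,
        "description": body[1] if len(body) > 1 else None,
        "params": body[2:],
        "index": " ".join(index).split(),
    }


SAMPLE = r"""
\jiji
BabylMailbox

Access a Babyl mailbox, which is similar to an
MMDF mailbox.

A file object fp that points to the mailbox file.

\indexmoduleinbox \indexBabylMailbox \indexclassdefinition
\jiji
"""

for chunk in split_chunks(SAMPLE):
    print(mark_chunk(chunk))
```

Chunks that do not follow this layout (for example, one with no name line) would be mislabeled, which is where the "intelligent analysis" mentioned above would have to take over.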

( VERY IMPORTANT POINT):
- As they're very small you can include them in docstrings, or simply
as comments (anywhere).

(DEFINITE POINT):
- Using a mix of chunks of data and low-intelligence Python scripts
that decide based on structure, position, "hidden marks"...,
you can create XML code. So XML can be considered
a low-level form of chunks of info.
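
As a rough illustration of that last point, here is a sketch that turns one already-marked chunk into XML with the standard xml.etree.ElementTree module. The dictionary layout and the element names (chunk, name, description, index) are hypothetical; nothing in this thread fixes them.

```python
import xml.etree.ElementTree as ET


def chunk_to_xml(marked):
    """Serialize one marked chunk (a plain dict) as a small XML fragment."""
    root = ET.Element("chunk")
    ET.SubElement(root, "name").text = marked["name"]
    ET.SubElement(root, "description").text = marked["description"]
    for term in marked["index"]:
        ET.SubElement(root, "index").text = term
    return ET.tostring(root, encoding="unicode")


marked = {
    "name": "BabylMailbox",
    "description": "Access a Babyl mailbox.",
    "index": ["Babyl", "classdefinition"],
}
print(chunk_to_xml(marked))
```

The conversion is one-way and lossy in this form: any structure the script could not infer from the chunk simply does not appear in the XML.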

(Object Oriented point):
- Chunks are nothing less than objects with info and processes
related to them. Do you like objects? Or do you like the
Pascal-like syntax of XML?  Down with XML!!!

Regards/Saludos
Manolo
-------------
My addresses / mis direcciones: 
a="www.ctv.es/USERS/irmina"
b=[("Lritaunas Peki Project", ""),
   ("Spanish users of LaTeX(en Espanyol)", "/pyttex.htm" ),
   ("page of drawing utility for tex ", "/texpython.htm" ),
   ("CrossWordsLand","/cruo/cruo.html")
   ]
for i in b:
  print i[0],":", a+i[1]

  You have to run as fast as you can just to stay where you are. If you want to get anywhere, you'll have to run much faster. -- Lewis Carroll


From fdrake@acm.org  Fri Nov 12 18:07:57 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 12 Nov 1999 13:07:57 -0500 (EST)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <Pine.SOL.3.96.991112090802.13456B-100000@sundial>
References: <14378.63037.571200.652453@weyr.cnri.reston.va.us>
 <Pine.SOL.3.96.991112090802.13456B-100000@sundial>
Message-ID: <14380.22397.664057.212083@weyr.cnri.reston.va.us>

Moshe Zadka writes:
[re: formulaic v. more flexible module references]
 > Here's a simple argument against it: in the TODO list, there are requests
 > for explanations of how to use <surprise!> both the profiler and the
 > debugger. Guido marked it "a library chapter isn't enough?". And he's
 > right, but having the structure so flexible tempted Guido to put the
 > chapters in the library reference, instead of as separate documents.

  I'm not entirely sure what you're against: allowing general document 
structure in the library reference manual, or allowing less structured 
content to serve as module references.  But I understand the problem
you're referring to.
  My thought is that the profiler and debugger both need to be
documented in two ways: as modules (they are, and their interfaces are 
directly useful), and as user-support facilities (with more narrative
documentation).  The current problem is that these components are
conflated.  The narrative "how-to-use-it" documentation should be
removed from the Library Reference and made part of the User's
Manual, which simply hasn't been written (yet -- any takers?).

 > >   That's a 200% increase in line count and a 150% increase in file
 > > size.  The latter isn't much of an issue, but the former is because it
 > > seriously impacts readability.
 > 
 > Ummmm...it really depends on how much semantic information you put in.

  Yes, but there's a good bit I expect to be present regardless.
  I think there's a lot to be gained by being able to say "this method 
expects a pathname, an optional string, and an optional integer, and
returns a file-like object."  Saying it in natural language is easy
(if tedious, given the number of functions/methods about which we can
give that level of information), but saying it so tools can handle
it... requires a lot of markup.  ;-)
  We'll need to define a "vocabulary" that can encompass built-in
types, "protocols" (or interfaces, or whatever they can be called),
and actual classes.  Class names are easy, but protocols and built-in 
types need to be added.  Variations include being able to say "exactly 
this" or "this or a subclass," etc.  It probably makes sense to be
able to say "non-complex numbers," or "standard number types," or
"non-negative integers," etc.

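One way to picture such a vocabulary is as a set of small predicates. The sketch below uses present-day Python (the numbers module did not exist when this was written), and every name in it is hypothetical; it illustrates only the "exactly this" versus "this or a subclass" distinction and two of the numeric categories mentioned above.

```python
import numbers
from collections import OrderedDict


def exactly(cls):
    """Predicate: the value's type is cls itself, not a subclass."""
    return lambda value: type(value) is cls


def this_or_subclass(cls):
    """Predicate: the value is an instance of cls or of any subclass."""
    return lambda value: isinstance(value, cls)


def non_negative_integer(value):
    """Predicate for "non-negative integers"; bool is deliberately excluded."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def non_complex_number(value):
    """Predicate for "non-complex numbers": any real number type."""
    return isinstance(value, numbers.Real)


print(exactly(dict)(OrderedDict()), this_or_subclass(dict)(OrderedDict()))
print(non_negative_integer(3), non_negative_integer(-1))
print(non_complex_number(1.5), non_complex_number(1j))
```

A documentation tool would more likely store these as named terms in the markup than as code, but the terms would need definitions about this precise.
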
 > Here's the strongest argument for the microdocument approach: as someone
 > who uses both Perl and Python (though I much prefer the latter), I see the
 > enormous benefit of a program like perldoc, which could only be written
 > on a microdocument based infrastructure. For those not familiar with this

  Actually, my inclination would be to run "pydoc" off a back-end
database rather than directly off the XML.  The database could be
built once from the document sources and then contain data that's been 
as pre-digested as makes sense.  That would be a lot faster than using 
XML; the entries could be pickled objects or whatever makes sense.
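
A minimal sketch of that build-once, look-up-many idea, using only the standard pickle and xml.etree.ElementTree modules. The element names are modeled on the sample module reference posted in this thread; the dictionary layout and both function names are assumptions, not a description of any existing tool.

```python
import pickle
import xml.etree.ElementTree as ET

SOURCE = """<module-reference>
  <module-info>
    <module>mailbox</module>
    <synopsis>Read various mailbox formats.</synopsis>
  </module-info>
  <classdesc>
    <class>UnixMailbox</class>
    <description>Access a classic
      Unix-style mailbox.</description>
  </classdesc>
</module-reference>"""


def build_database(xml_text):
    """Parse the XML once and return a pickled name -> summary mapping."""
    root = ET.fromstring(xml_text)
    module = root.findtext("module-info/module")
    entries = {module: root.findtext("module-info/synopsis")}
    for classdesc in root.findall("classdesc"):
        name = classdesc.findtext("class")
        text = classdesc.findtext("description")
        entries[module + "." + name] = " ".join(text.split())
    return pickle.dumps(entries)


def lookup(database, name):
    """Answer a query from the pre-digested database, without touching XML."""
    return pickle.loads(database).get(name)


db = build_database(SOURCE)
print(lookup(db, "mailbox.UnixMailbox"))
```

Here the "database" is just a bytes object; a real tool would write it to disk once per documentation build and reload it on each query.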

 > We might need a PyML such that XML<PyML<SGML (that is, PyML is not an
 > application of XML, only of SGML) and a convertor PyML->XPyML such that
 > XPyML is an application of XML. That way we could have whatever terseness
 > from SGML we care to implement, and all the power of XML at our back.

  I think we can avoid this.  I'd opt simply to use XML and be done
with it before using SGML only as a way to author XML.  If anyone
really wants to do this, the tools are easy enough to build given
ESIS-generating SGML & XML parsers & the tools in
Doc/tools/sgmlconv/.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From Moshe Zadka <mzadka@geocities.com>  Fri Nov 12 20:03:59 1999
From: Moshe Zadka <mzadka@geocities.com> (Moshe Zadka)
Date: Fri, 12 Nov 1999 22:03:59 +0200 (IST)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <14380.22397.664057.212083@weyr.cnri.reston.va.us>
Message-ID: <Pine.SOL.3.96.991112214850.25205B-100000@sundial>

On Fri, 12 Nov 1999, Fred L. Drake, Jr. wrote:

>   I'm not entirely sure what you're against: allowing general document 
> structure in the library reference manual, or allowing less structured 
> content to serve as module references.  But I understand the problem
> you're referring to.

The former. The latter would be classified in the "documentation is like 
sex: when it is good it is very good, and when it is bad, it is better
than nothing" dept. I.e., it is better to have as much semantic
information as possible, but even without it, the docs are useful.

>   My thought is that the profiler and debugger both need to be
> documented in two ways: as modules
<snip>
>and as user-support facilities
<snip>

I agree. In fact, the documentation for pdb is horrible as module
documentation. It is quite good as a user's manual.

> The narrative "how-to-use-it" documentation should be
> removed from the Library Reference and made part of the User's
> Manual, which simply hasn't been written (yet -- any takers?).

Hmmmmm....define User's Manual. What do you want from it?

>   Yes, but there's a good bit I expect to be present regardless.
>   I think there's a lot to be gained by being able to say "this method 
> expects a pathname, an optional string, and an optional integer, and
> returns a file-like object."  Saying it in natural language is easy
> (if tedious, given the number of functions/methods about which we can
> give that level of information), but saying it so tools can handle
> it... requires a lot of markup.  ;-)

Again, it is a "real" problem, not an artifact of the solution: either you
have AI, or you patiently tell the computer what every word means, or you
live in a non-perfect world. Most solutions are a combination of all three
approaches: use a bit of smart in the processor, put some markup, and live
with the fact that some information will require a human to discover ;-)

<snipped discussion of vocabulary>
>   Actually, my inclination would be to run "pydoc" off a back-end
> database rather than directly off the XML.  The database could be
> built once from the document sources and then contain data that's been 
> as pre-digested as makes sense.  That would be a lot faster than using 
> XML; the entries could be pickled objects or whatever makes sense.

It doesn't matter: you'd still have to use the micro-document approach
for this to work. I just painted a rosy picture of what it would buy you.

>  > We might need a PyML such that XML<PyML<SGML (that is, PyML is not an
>  > application of XML, only of SGML) and a convertor PyML->XPyML such that
>  > XPyML is an application of XML. That way we could have whatever terseness
>  > from SGML we care to implement, and all the power of XML at our back.
> 
>   I think we can avoid this.  I'd opt simply to use XML and be done
> with it before using SGML only as a way to author XML.  If anyone
> really wants to do this, the tools are easy enough to build given
> ESIS-generating SGML & XML parsers & the tools in
> Doc/tools/sgmlconv/.

Oh, I forgot to spec that XPyML is a proper subset of PyML, so you can
author directly in that. PyML is just thin syntactic sugar, so you can use
some SGML minimizations, which are useful in practice. We can choose just
a few (e.g., I'm for <minimized/word/, mainly because "/" is a rare character
in Python code, and invalid in identifiers).

But I can live with straight XML, if that's the party line.
--
Moshe Zadka <mzadka@geocities.com>. 
INTERNET: Learn what you know.
Share what you don't.


From fdrake@acm.org  Fri Nov 12 21:01:25 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 12 Nov 1999 16:01:25 -0500 (EST)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <Pine.LNX.3.95.991112155323.763A-100000@localhost>
References: <14378.63037.571200.652453@weyr.cnri.reston.va.us>
 <Pine.LNX.3.95.991112155323.763A-100000@localhost>
Message-ID: <14380.32805.469825.418097@weyr.cnri.reston.va.us>

--FBmNV/Tzqn
Content-Type: text/plain; charset=us-ascii
Content-Description: message body text
Content-Transfer-Encoding: 7bit


Manuel Gutierrez Algaba writes:
 > Is this the LaTeX one ? or the "traditional" XML ?

  I would describe the current approach as document-centric.
"Document-oriented" is how I was referring to content which was
naturally organized in documents, as opposed to data-structure-like
constructions such as my sample module reference.
  The actual syntax wasn't specific to any of the three definitions.

 > >   DOCUMENT-CENTRIC APPROACH: The human-read document is the primary
 > 
 > Is this TeEncontreX'es ? Are "module reference material" the 
 > "\indexpython" things ?

  No, by this I meant the entire section documenting the module.

 >   MICRODOCUMENT APPROACH:  Multiple DTDs are used to encode
 > document-level information and module reference material.  Let's only
 > 
 > What's this ?

  I'm not sure what "this" refers to; the term "microdocument
approach"?  I'll be more specific:
  Using a microdocument approach would involve using at least 2 DTDs,
one for module references, and another for "everything else."  Each
module reference would be a document instance all by itself (in the
SGML/XML sense), not just a file that's part of something larger (like 
the current module sections; there's no meaningful way to process them
individually).  To get something like the current Library Reference,
another document (with another DTD) would specify how to put it
together: put this module, then this one, and now that section of
prose; in the next chapter, put ....  We could define separate DTDs to 
document Python modules, C APIs, and more book- or article-like
sections.   Another would be the "glue" that defines a "manual" or
"howto" document.

 > <description of things very related to TeEncontreX, I think>

  From your explanations and looking at TeEncontreX, I'd describe what
you're doing as "indexing": you're assigning terminology from a
controlled vocabulary to each entry in your document base, and using
that as a retrieval mechanism.  I think this is orthogonal to what I'm 
talking about.  Regardless of a move toward a microdocument approach
or document-centric approach, good indexing is critical to make the
information accessible.
  The way you're using it (with lots of small articles) makes it very
microdocument-flavored, aside from lumping all the documents in one
file.

 > To put it briefly: "A lot of work coding _details_". Just a comment:
 > Python is **much** better than C++, for example, because you
 > have no need to declare every type and every detail; you can even
 > have large parts of a Python program broken, parts that a C++
 > compiler would mark as erroneous.

  I agree.  I think things like type annotations should be completely
optional in the documentation.  However, I think there's a lot of
value in supporting annotations that say things like "this returns a
file-like object" that can be interpreted by programmer's tools (help
system in an IDE, pylint-style analyzers, etc.).  So it should be
possible to add interesting annotations, so a programmer can ask a
tool, "What are all the ways I can get a file object?"
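
To make that query concrete, here is a small sketch using the standard xml.etree.ElementTree module. The markup is hypothetical: a protocol attribute on return-value elements is an assumption made for this illustration (the attached sample only uses protocol on parameters), and the library and function elements are invented wrappers.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated documentation; not taken from any real document set.
DOCS = """<library>
  <function name="open"><return-value protocol="file"/></function>
  <function name="os.popen"><return-value protocol="file"/></function>
  <function name="len"><return-value type="integer"/></function>
</library>"""


def producers_of(xml_text, protocol):
    """Names of documented functions whose return value follows `protocol`."""
    root = ET.fromstring(xml_text)
    return [
        function.get("name")
        for function in root.iter("function")
        if any(rv.get("protocol") == protocol
               for rv in function.findall("return-value"))
    ]


print(producers_of(DOCS, "file"))
```

Functions without the annotation are simply not found, which fits the point that the annotations should stay optional.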

 > > To really make it work, a lot of attention
 > > would have to be applied to the result of the first-stage conversion
 > > to check the accuracy of the results, make the various bits of text
 > > actually land in the right place (since everything is pretty much
 > > thrown together now), and encode a lot of additional information about 
 > > types, parameters, exceptions thrown, etc. 
 > 
 > More heavy work !

  But, as you point out for TeEncontreX, it's linear to the volume of
information you have + what you want to get out of it.

 > The biggest problem I see here is that either you get very good documentation
 > (due to the huge amount of work) or you get nothing (the author
 > doesn't document).

  We get the latter one now!  ;(

 > It'd be wise to provide several levels of mark-up, so people
 > can mark up little by little, the important things first, and so on.

  This is another good reason to make a lot of the markup optional; my 
example probably didn't use "maximal" markup, but went a long way toward
it.  Let's try adjusting the assumed DTD a little, and cut out a fair
bit of the markup (even if it's useful).  The file is attached; here's 
the word count:

weyr(.../Doc/lib); wc libmailbox.tex mailbox.xml mailbox-min.xml 
      53     251    1938 libmailbox.tex
     159     504    5364 mailbox.xml
     118     370    3936 mailbox-min.xml

  Still large, but definitely better.  Good enough?  I don't know.
  I do expect that at least one tool will emerge that will take a
Python source file and spit out a skeleton documentation file that can 
be filled in.
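
A sketch of what such a skeleton generator might look like, using the present-day standard ast module. It is only an illustration under stated assumptions: the element names follow the attached mailbox-min.xml sample, the "TODO" placeholders and the function names are invented here, and only classes and their constructor parameters are handled.

```python
import ast
from xml.sax.saxutils import escape, quoteattr


def first_line(node):
    """First docstring line of a module or class, or a TODO placeholder."""
    doc = ast.get_docstring(node)
    return escape(doc.splitlines()[0]) if doc else "TODO"


def skeleton(source, module_name):
    """Return a skeleton module reference for the given Python source text."""
    tree = ast.parse(source)
    out = ["<module-reference>",
           "  <module-info>",
           "    <module>%s</module>" % escape(module_name),
           "    <synopsis>%s</synopsis>" % first_line(tree),
           "  </module-info>"]
    for node in tree.body:
        if not isinstance(node, ast.ClassDef):
            continue
        out += ["  <classdesc>",
                "    <class>%s</class>" % node.name,
                "    <description>%s</description>" % first_line(node),
                "    <constructor>"]
        for item in node.body:
            if isinstance(item, ast.FunctionDef) and item.name == "__init__":
                for arg in item.args.args[1:]:  # skip self
                    out.append("      <parameter name=%s/>" % quoteattr(arg.arg))
        out += ["    </constructor>", "  </classdesc>"]
    out.append("</module-reference>")
    return "\n".join(out)


SAMPLE = '''"""Read various mailbox formats."""

class UnixMailbox:
    """Access a classic Unix-style mailbox."""
    def __init__(self, fp):
        self.fp = fp
'''

print(skeleton(SAMPLE, "mailbox"))
```

The author would still have to fill in protocols, types, and longer descriptions by hand; the tool only saves typing the structure.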

 > This is the "TeEncontreX" version of Mailbox, this should
 > work if you have AnalizaToo.py:

  Cool; I'll run this through as soon as your package downloads again!
;-)
  Aha!  You didn't test this!  ;-)

 > Just some comments:
 > - Thinking about it, I mentioned the need for an apropos utility
 > one year ago. If you think about it, this IS the apropos utility!!

  Library science types would call this kind of data marking
"indexing". 
  Saludos, amigo!   (Hey, I'm learning Spanish!  Cool! ;)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


--FBmNV/Tzqn
Content-Type: text/xml; charset=iso-8859-1
Content-Description: More minimal sample module reference.
Content-Disposition: inline;
	filename="mailbox-min.xml"
Content-Transfer-Encoding: 7bit

<?xml version="1.0" encoding="iso-8859-1"?>
<module-reference>
  <module-info>
    <module>mailbox</module>
    <synopsis>Read various mailbox formats.</synopsis>
    </module-info>

  <overview>
    <para>This module defines a number of classes that allow easy and
      uniform access to mail messages in a mailbox.  Most of the
      supported mailbox formats come from the Unix world.</para>

    <para>None of the classes defined in this module lock the
      mailboxes that are accessed; this needs to be handled by
      application code.</para>
    </overview>

  <protocoldesc>
    <protocol>Mailbox</protocol>
    <method name="next">
      <return-value>
        A message object, or <constant>None</constant> if there
        aren't any more messages in the mailbox.
        </return-value>
      </method>
    </protocoldesc>

  <classdesc>
    <class>UnixMailbox</class>
    <protocol>Mailbox</protocol>
    <description>
      Access a classic Unix-style mailbox, where all messages are
      contained in a single file and separated by <quote>From name
        time</quote> lines.
      </description>
    <constructor>
      <parameter name="fp" protocol="file"/>
      <description>
        <para>Initialize the mailbox object and point to the first
          message in the mailbox.</para>
        </description>
      </constructor>
    </classdesc>

  <classdesc>
    <class>MmdfMailbox</class>
    <protocol>Mailbox</protocol>
    <description>
      <para>Access an <acronym>MMDF</acronym>-style mailbox, where all
        messages are contained in a single file and separated by lines
        consisting of four control-A characters.</para>
      </description>
    <constructor>
      <parameter name="fp" protocol="file"/>
      <description>
        <para>Initialize the mailbox object and point to the first
          message in the mailbox.</para>
        </description>
      </constructor>
    </classdesc>

  <classdesc>
    <class>MHMailbox</class>
    <protocol>Mailbox</protocol>
    <description>
      <para>Access an <acronym>MH</acronym> mailbox, a directory with
        each message in a separate file with a numeric name.  Messages
        that are added to the mailbox after the instance is created
        are not accessible; a new instance is needed to access newly
        added messages.</para>
      </description>
    <constructor>
      <parameter name="dirname" type="string"/>
      <description>
        <para>Initialize the list of messages that can be loaded from
          the mailbox.</para>
        </description>
      </constructor>
    </classdesc>

  <classdesc>
    <class>Maildir</class>
    <protocol>Mailbox</protocol>
    <description>
      <para>Access a Qmail mail directory.  All new and current mail
        for the mailbox is made available.  Messages that are added to
        the mailbox after the instance is created are not accessible;
        a new instance is needed to access newly added messages.
        </para>
      </description>
    <constructor>
      <parameter name="dirname" type="string"/>
      <description>
        <para>The <param>dirname</param> parameter points to the
          mailbox directory.</para>
        </description>
      </constructor>
    </classdesc>

  <classdesc>
    <class>BabylMailbox</class>
    <protocol>Mailbox</protocol>
    <description>
      <para>Access a Babyl mailbox, which is similar to an
        <acronym>MMDF</acronym> mailbox.  Mail messages start with a
        line containing only <literal>'*** EOOH ***'</literal> and end 
        with a line containing only <literal>'\037\014'</literal>.
        </para>
      </description>
    <constructor>
      <parameter name="fp" protocol="file"/>
      <description>
        <para>Initialize the mailbox object and point to the first
          message in the mailbox.</para>
        </description>
      </constructor>
    </classdesc>
</module-reference>

--FBmNV/Tzqn--


From fdrake@acm.org  Fri Nov 12 21:14:58 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 12 Nov 1999 16:14:58 -0500 (EST)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <Pine.SOL.3.96.991112214850.25205B-100000@sundial>
References: <14380.22397.664057.212083@weyr.cnri.reston.va.us>
 <Pine.SOL.3.96.991112214850.25205B-100000@sundial>
Message-ID: <14380.33618.749278.280311@weyr.cnri.reston.va.us>

Moshe Zadka writes:
 > Hmmmmm....define User's Manual. What do you want from it?

  How to work with the interpreter and related tools.  It wouldn't
teach the language, but would teach the environment and provide
reference material for things like the user interface (for PythonWin
or IDLE, or readline information for Unix).  Debuggers and profilers
generally fall into this category of information.

 > Again, it is a "real" problem, not an artifact of the solution: either you
 > have AI, or you patiently tell the computer what every word means, or you
 > live in a non-perfect world. Most solutions are a combination of all three
 > approaches: use a bit of smart in the processor, put some markup, and live
 > with the fact that some information will require a human to discover ;-)

  I agree.  I think we have too little useful markup now.

 > It doesn't matter: you'd still have to use the micro-document approach
 > for this to work. I just painted a rosy picture of what it would buy you.

  No, you can still use a non-microdocument architecture.  I failed to 
present the mega-database-dump model for a reason, though.  ;-)  It's
entirely possible to use the sort of markup I presented in my sample
module reference without using micro-documents.  It just gets very
painful.

 > But I can live with straight XML, if that's the party line.

  I don't think there's a "party line"; I just want to avoid
introducing new dialects and processing stages.  There's enough that
really needs doing on the content side of things that we don't need to 
create new problems.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From Moshe Zadka <mzadka@geocities.com>  Fri Nov 12 22:36:38 1999
From: Moshe Zadka <mzadka@geocities.com> (Moshe Zadka)
Date: Sat, 13 Nov 1999 00:36:38 +0200 (IST)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <14380.33618.749278.280311@weyr.cnri.reston.va.us>
Message-ID: <Pine.SOL.3.96.991113001934.26148A-100000@sundial>

On Fri, 12 Nov 1999, Fred L. Drake, Jr. wrote:

> Moshe Zadka writes:
>  > Hmmmmm....define User's Manual. What do you want from it?
> 
>   How to work with the interpreter and related tools.  It wouldn't
> teach the language, but would teach the environment and provide
> reference material for things like the user interface (for PythonWin
> or IDLE, or readline information for Unix).  Debuggers and profilers
> generally fall into this category of information.

Uh, OK. Would the things that are currently in the Python manual be part
of it? Obscure options like -X, -S or -i?

>  > But I can live with straight XML, if that's the party line.
> 
>   I don't think there's a "party line"; I just want to avoid
> introducing new dialects and processing stages.  There's enough that
> really needs doing on the content side of things that we don't need to 
> create new problems.

I thought we're trying to come up with a party line. Which would include,
among other things, a markup language...I just remarked I'm not adamant
on the SGML->XML stage, though I think it would improve the life of
documentation writers.

I think we should always look to Perl-land in that respect. They
managed to come up with a standard so good that *every* module from CPAN has
at least half-decent documentation, which is very accessible. POD is
very light on the eyes and easy to write, and I do believe XML plus a bit of
SGML minimization could approach that ease, but I doubt XML alone could do
it. Just consider

<function/reverse/

Vs. either

<function>reverse</function>

Or

<function name="reverse"/>

It all depends on how much of SGML we wish to take. Other alternatives
include:

<function>reverse</>

(That last one is my favourite because modifying current XML tools to
deal with it seems relatively easy: an empty closing tag is mapped to the
last tag on the stack)
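
The stack idea can be sketched as a small preprocessor that rewrites each empty end tag before the text reaches an ordinary XML parser. This is a rough illustration, not a modification of any existing XML tool: the function name is hypothetical, and the simple regular expression does not cope with comments, CDATA sections, or a ">" inside an attribute value.

```python
import re

TAG = re.compile(r"<[^<>]*>")


def expand_empty_end_tags(text):
    """Rewrite each `</>` as an explicit end tag for the innermost open element."""
    stack = []

    def fix(match):
        tag = match.group(0)
        if tag == "</>":
            if not stack:
                raise ValueError("'</>' with no open element")
            return "</%s>" % stack.pop()
        if tag.startswith("</"):
            if stack:
                stack.pop()  # ordinary end tag closes the innermost element
        elif not tag.startswith(("<?", "<!")) and not tag.endswith("/>"):
            words = tag[1:-1].split()
            if words:
                stack.append(words[0])  # start tag: remember the element name
        return tag

    return TAG.sub(fix, text)


print(expand_empty_end_tags("<function>reverse</>"))
print(expand_empty_end_tags("<para>See <function>reverse</> and <module>string</>.</>"))
```

Because the output is plain XML, everything downstream of this step can stay unchanged, which is the appeal of this particular minimization.
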
--
Moshe Zadka <mzadka@geocities.com>. 
INTERNET: Learn what you know.
Share what you don't.


From Manuel Gutierrez Algaba <irmina@ctv.es>  Sat Nov 13 01:11:18 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Sat, 13 Nov 1999 01:11:18 +0000 (GMT)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <14380.32805.469825.418097@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.3.95.991113003951.13180A-100000@localhost>

On Fri, 12 Nov 1999, Fred L. Drake, Jr. wrote:
<Basically, you say that TeEncontreX is indexing and that
is orthogonal to micro-documents, ....>
> 
>   Library science types would call this kind of data marking
> "indexing". 

I do admit that at the beginning they were simple indices, and they
still look like indexes now. But you say that "good indexing" is 
crucial. 

If you say it's orthogonal, that means it's independent, so
indexing could start today. But indexing means creating a 
vocabulary, a lexicon. This can be created on the fly at the 
same time the indexing goes on. But when you have such a lexicon,
you would be tempted to use that lexicon, or the relationships
among indexes, in your XML marks. So there's an implication:
indexes --> XML

Second, if you have created HTML pages or LaTeX from the TeEncontreX
index system, then you'll have a kind of Python documentation.
If both the micro-doc and document-centric approaches bear strongly
on the generation of Python docs, it's clear that TeEncontreX
is not so orthogonal. You can say that it's a very weird thing,
or unusual, but in these three fields:
- info storage
- info representation
- lexicon definition

TeEncontreX and XML have a good intersection of functionalities.

And it's not simple indexing. LaTeX indexing, for example,
doesn't alter the structure of the info. TeEncontreX (it means
"Te Encontre" --> "I found you") isolates the info to be indexed
from the rest. So if you take document-centric material and
apply the TeEncontreX method to it, you no longer have the same doc,
but the old doc divided into 100 or 200 parts, each with its own
attributes. It's like XML, but its structure is not based on marks; it's
based on the meanings of the indexes and how they are related to each other.


Another comment: indexes usually are one-dimensional. 
If you have an item described by many indexes you have something
multi-dimensional or an object.

And finally, whatever decision is made, let it be simple and
natural; remember that not everybody can speak XML.

Indexes may help you discover the Lexicon of Python and
the Knowledge Zones of Python, and with this you can
decide the size and depth of any micro-doc or document-centric approach.

So I see that indexes have an immediate pay-back (for the user)
and they help further XML (and non-XML) design. Why not start with
them?

Regards/Saludos
Manolo
-------------
My addresses / mis direcciones: 
a="www.ctv.es/USERS/irmina"
b=[("Lritaunas Peki Project", ""),
   ("Spanish users of LaTeX(en Espanyol)", "/pyttex.htm" ),
   ("page of drawing utility for tex ", "/texpython.htm" ),
   ("CrossWordsLand","/cruo/cruo.html")
   ]
for i in b:
  print i[0],":", a+i[1]

  Well, you know, no matter where you go, there you are. -- Buckaroo Banzai


From Manuel Gutierrez Algaba <irmina@ctv.es>  Sat Nov 13 20:21:20 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Sat, 13 Nov 1999 20:21:20 +0000 (GMT)
Subject: [Doc-SIG] More about indexing, Propaganda and Crazy Wishes
Message-ID: <Pine.LNX.3.95.991113192130.2692A-100000@localhost>

I think it's important to recall that:
"making life easy for the helpers makes many helpers join"
"the easier it is to help, the more helpers there will be"

So here are more general points:
The ideal world for the helper would be documentation that is occasional
(almost at random), anonymous and written in his own language.

This is possible. The idea came to me while thinking about TeEncontreX
(starting to be sick of that name? :) ). If you have downloaded
it, you'll have realised that the number of files (up to 700) is huge...
But as AnalizaToo.py basically creates a database, it would
not be difficult to use it as a CGI script and create the files
on the fly. But you can have the inverse: you can present
a random piece of a document (ideally a robot would extract it
at random from the doc, perhaps looking for unattributed pieces
of info) and then ask the user to attribute it the way he
wants.
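A minimal sketch of that "inverse" idea, assuming each piece of
documentation is stored as a dict with its text and a list of index
terms. The data layout and the function names are hypothetical and are
not taken from AnalizaToo.py:

```python
import random

def pick_unattributed(pieces, rng=random):
    """Return a random piece that has no index terms yet, or None."""
    candidates = [p for p in pieces if not p["terms"]]
    return rng.choice(candidates) if candidates else None

def attribute(piece, terms):
    """Record the index terms a helper typed in, normalised to lower case."""
    piece["terms"].extend(t.strip().lower() for t in terms if t.strip())

# Hypothetical sample data: one piece already indexed, one not.
pieces = [
    {"text": "time.time() returns seconds since the epoch.", "terms": ["time"]},
    {"text": "string.atoi converts a string to an int.", "terms": []},
]
chosen = pick_unattributed(pieces)
attribute(chosen, ["Conversion", " string "])
```

A CGI front end would only need to show `chosen["text"]` and pass the
submitted terms to `attribute`.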

You can handle markup in different languages. Thinking about Fred Drake,
I realise that he really deserves an English version of AnalizaToo.py,
but I don't want to do low-level file editing, so I have written
a script, available in ...well you know where. Its name is
translations.py. This very small, trivial thing may be used
with \indexl... stuff, ...

BTW, I've released a new version of TeEncontreX: 1.1. 30% more
data. And it's me alone!

If we can make things easy enough for people to contribute,
we can make them contribute massively and have all of Python documented
in a matter of weeks!

Make it easy, and it'll be easily done.


Regards/Saludos
Manolo
-------------
My addresses / mis direcciones: 
a="www.ctv.es/USERS/irmina"
b=[("Lritaunas Peki Project", ""),
   ("Spanish users of LaTeX(en Espanyol)", "/pyttex.htm" ),
   ("page of drawing utility for tex ", "/texpython.htm" ),
   ("CrossWordsLand","/cruo/cruo.html")
   ]
for i in b:
  print i[0],":", a+i[1]

  You can never tell which way the train went by looking at the tracks.


From Gerrit Holl <gerrit@nl.linux.org>  Tue Nov 16 16:07:28 1999
From: Gerrit Holl <gerrit@nl.linux.org> (Gerrit Holl)
Date: Tue, 16 Nov 1999 17:07:28 +0100
Subject: [Doc-SIG] Re: SMTP
In-Reply-To: <80rn4o$bnl$1@news.fsu.edu>
References: <80rn4o$bnl$1@news.fsu.edu>
Message-ID: <19991116170728.A30970@optiplex.palga.uucp>

Glenn Kidd wrote:
> Does anyone know where some good info on Python's smtplib?

Well, er... what about the default module index?

> I have looked at
> http://www.python.org but I was wondering if there was any more info out
> there.  Any help would be appreciated.

Isn't it enough?

body = '''From: me <friend@somewhere.org>
To: my brother <brother@somehost.com>
Subject: hello!
X-Mailer: a python script

This is the bodddddddddyyyyyyyyyy........
'''

regards,
Gerrit.

-- 
We are using Linux daily to UP our productivity - so UP yours!
(Adapted from Pat Paulsen by Joe Sloan)


From Gerrit Holl <gerrit@nl.linux.org>  Tue Nov 16 19:13:02 1999
From: Gerrit Holl <gerrit@nl.linux.org> (Gerrit Holl)
Date: Tue, 16 Nov 1999 20:13:02 +0100
Subject: [Doc-SIG] Re: SMTP
In-Reply-To: <19991116170728.A30970@optiplex.palga.uucp>
References: <80rn4o$bnl$1@news.fsu.edu> <19991116170728.A30970@optiplex.palga.uucp>
Message-ID: <19991116201302.A583@optiplex.palga.uucp>

Gerrit Holl wrote:
> Glenn Kidd wrote:
> > Does anyone know where some good info on Python's smtplib?
> 
> Well, er... what about the default module index?

[cut]

Please ignore...
At first my content was interesting for this list but then I changed my
content and I forgot to change the CC: also... Sorry!

-- 
"A word to the wise: a credentials dicksize war is usually a bad idea on the
net."
(David Parsons in c.o.l.development.system, about coding in C.)


From paul@prescod.net  Mon Nov 22 03:46:22 1999
From: paul@prescod.net (Paul Prescod)
Date: Mon, 22 Nov 1999 04:46:22 +0100
Subject: [Doc-SIG] Approaches to structuring module documentation
References: <14378.63037.571200.652453@weyr.cnri.reston.va.us>
Message-ID: <3838BC8E.B3CA2663@prescod.net>

Sorry for the delay on this message. I need a long plane flight to be
able to think about this issue properly.

"Fred L. Drake, Jr." wrote:
> 
>   Well, now that things have quieted down a little (where?!), I'll
> stir things up a little.
>   Two broad approaches to structuring the documentation have been
> presented:  One is the current document-centric model, where there are
> a number of books/manuals/whatever that contain interesting
> information, but need to be used as really large chunks.  Extracting
> specific information is (apparently) difficult for humans (witness
> the recent request for a random() function on the newsgroup by someone
> who said they looked in the index; just the wrong one); 

I'm preaching to the choir when I say that there are three issues here:

 * content
 * structure of the content
 * presentation

Okay, they are all related but they are still different. If somebody
can't find something, I would tend to try to fix that in the
presentation first, and then in the content and finally in the structure
if all else fails. Let's not jump right to the structure. Consider this
analogy: someone using a word processor cannot figure out how to bold
something so we decide to change the file format? Sure, there is a small
chance that the file format is to blame (e.g. it doesn't support bold!)
but it is much more likely just a UI problem.

Also, consider our choices as a graph with two axes:

 * specificity of markup
 * granularity ("library", "package", "module", "class") of file

If you think of it that way, then you realize that you could have a very
generic microdocument architecture (one HTML class per symbol) or a very
specific (PyBook) but ungranular (the WHOLE book) DTD. And of course the
other two options are also available.

>   This approach has the advantage of matching the current structure of
> the documentation.  The conversion isn't terribly difficult or even
> time consuming given the state of the things in Doc/tools/sgmlconv/ in
> the CVS repository.  There's clearly some work to do regarding DTD
> specification and probably a bit of transformation, but a large part
> of the coding and testing is done.

I believe that this advantage strongly overwhelms any benefits of going
to a more theoretically pure markup. It's taken roughly a year to get
our documents clean enough that they can even move to XML or something
similar. How long would it take to completely reorganize them? You,
Fred, have a job that only partially includes documentation maintenance
and the rest of us are not nearly so interested in re-writing DOCS as we
are in re-writing CODE. I fear that a move to Microdocuments would never
happen.

>   This explosion of markup is of most concern for authors; a lot of
> markup is required to encode enough information to justify changing
> the approach.  As more markup is required, it is increasingly
> difficult to get contributions because it takes the authors more time
> to document their work.  I'd like to maintain Python's standing as the
> best-documented free scripting language, and I'm not sure authors will
> be willing to use the more extensive markup.

That's a killer argument.

>   The hybrid approach can be considered as roughly the same as the
> microdocument approach, as discussed above.

I propose an incremental approach. Let's get to PyBook XML and THEN
re-evaluate PyBook in terms of microdocument. 

Here's an important issue: Perl and Java have achieved a relatively high
level of module documentation conformity by putting the microdocument
*in the code*. This appeals strongly to basic human nature. One file
instead of two. Scroll to the top to fix up the documentation, and so
forth. Python 2 should address this by having a first-class
documentation feature built into the grammar. I would advise that it
should NOT be XML. In fact it should probably be roughly JavaDOC or
POD-ish so that we aren't reinventing the wheel.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
Bart: Dad, do I really have to brush my teeth?
Homer: No, but at least wash your mouth out with soda.


From Manuel Gutierrez Algaba <irmina@ctv.es>  Mon Nov 22 18:49:22 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Mon, 22 Nov 1999 18:49:22 +0000 (GMT)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <3838BC8E.B3CA2663@prescod.net>
Message-ID: <Pine.LNX.3.95.991122160045.770C-100000@localhost>

On Mon, 22 Nov 1999, Paul Prescod wrote:

> Sorry for the delay on this message. I need a long plane flight to be
> able to think about this issue properly.
> 
>  * content
>  * structure of the content
>  * presentation
> 
> Okay, they are all related but they are still different. If somebody
> can't find something, I would tend to try to fix that in the
> presentation first, and then in the content and finally in the structure
> if all else fails. Let's not jump right to the structure. Consider this
> analogy: someone using a word processor cannot figure out how to bold
> something so we decide to change the file format? Sure, there is a small

I agree. The first issue is to supply content to the user, and the
structure of the content will be resolved when we have enough
content to build it.

TeEncontreX (www.ctv.es/USERS/irmina/TeEncontreX.html)
provides the least structure possible: it's pure marking of contents
and a bit of relationship between contents, and it's pure presentation.

I really can't understand why people aren't interested in it.
I think most people believe that the complex solution to a problem
is the best solution....

> Also, consider our choices as a graph with two axes:
> 
>  * specificity of markup
>  * granularity ("library", "package", "module", "class") of file

 specificity of markup
 |     I                     J
 |             X
 |
 |
 |     T
 |------------------------------------
    library package module class          granularity

T : TeEncontreX
J: javadoc
X: XML
I: emacs texinfo
Would it be like this? If so, it's clear that specific markup
makes things more difficult, and that granularity is not a problem.

> documentation feature built into the grammar. I would advise that it
> should NOT be XML. In fact it should probably be roughly JavaDOC or
> POD-ish so that we aren't reinventing the wheel.

I think we must invent the wheel, or at least improve it. I don't
like javadoc; it seems to me very low level (type-driven), useful
for Java, but Python deserves something higher level. The doc of
a language is related to the very nature of that language; Prolog
documentation is not the same as COBOL documentation. As I think Python
is Lisp with OO, it should have Lisp-ish docs... The question
is: what is Lisp doc like? :-)

BTW: Are you the Paul Prescod who wrote "Manual de XML" with
Charles F. Goldfarb. Prentice Hall and... ?

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina/TeEncontreX.html   /texpython.htm

  Nasrudin walked into a teahouse and declaimed, "The moon is more useful than the sun." "Why?", he was asked. "Because at night we need the light more."


From fdrake@acm.org  Mon Nov 22 18:37:08 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 22 Nov 1999 13:37:08 -0500 (EST)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <3838BC8E.B3CA2663@prescod.net>
References: <14378.63037.571200.652453@weyr.cnri.reston.va.us>
 <3838BC8E.B3CA2663@prescod.net>
Message-ID: <14393.36180.24259.143634@weyr.cnri.reston.va.us>

Paul Prescod writes:
 >  * content
 >  * structure of the content
 >  * presentation

  A reminder of what the axes really are is always nice!

 > Okay, they are all related but they are still different. If somebody
 > can't find something, I would tend to try to fix that in the
 > presentation first, and then in the content and finally in the structure
 > if all else fails. Let's not jump right to the structure. Consider this

 > Also, consider our choices as a graph with two axes:
 > 
 >  * specificity of markup
 >  * granularity ("library", "package", "module", "class") of file
 > 
 > If you think of it that way, then you realize that you could have a very
 > generic microdocument architecture (one HTML class per symbol) or a very
 > specific (PyBook) but ungranular (the WHOLE book) DTD. And of course the
 > other two options are also available.

  I'd describe the current markup as being highly specific, and I
think that makes authoring much easier in many ways (there's a limit
to what needs to be typed to mark something in an interesting way).
However, there are a bunch of marks that can be reasonably made, and
there are a few people out there who think documentation isn't
intrinsically interesting(!); this means they don't read the
documentation for the markup (which is incomplete anyway), and there's 
some resistance to having to type much to "mark" something.  This
leads me to think that less marking would be nice.

I said:
 >   This approach has the advantage of matching the current structure of
 > the documentation.  The conversion isn't terribly difficult or even
 > time consuming given the state of the things in Doc/tools/sgmlconv/ in
[...]

Paul Prescod writes:
 > I believe that this advantage strongly overwhelms any benefits of going
 > to a more theoretically pure markup. It's taken roughly a year to get
 > our documents clean enough that they can even move to XML or something

  I'm not convinced.  If what we end up with is little different from
what we have, I don't see why we need to convert at all.  There are
plenty of people who don't *like* LaTeX syntax, but those people won't 
be any happier with XML; I'd expect them to be less happy because
there's more characters in the syntax.  (On the other hand, the syntax 
is more clearly defined and involves fewer special characters, which
is one of the advantages Guido sees with XML or even a carefully
chosen SGML declaration.)

 > similar. How long would it take to completely reorganize them? You,
 > Fred, have a job that only partially includes documentation maintenance
 > and the rest of us are not nearly so interested in re-writing DOCS as we
 > are in re-writing CODE. I fear that a move to Microdocuments would never
 > happen.

 > That's a killer argument.

  That's been my biggest concern about it all.  When working with
this, I'm often in a quandary over how to get more detail out of the
documentation without ending up being the author of the whole ball of
wax.

 > I propose an incremental approach. Let's get to PyBook XML and THEN
 > re-evaluate PyBook in terms of microdocument. 

  Does an incremental approach really make sense?  I suspect we want
to avoid having to give module authors a new set of tools to do
(essentially) the same thing too often.  Regardless of the merits of
the new tools.  (Where "tools" can include things like markup
vocabularies and syntax.)  This is a problem because it leads to
increased resistance from potential authors.

 > Here's an important issue: Perl and Java have achieved a relatively high
 > level of module documentation conformity by putting the microdocument
 > *in the code*. This appeals strongly to basic human nature. One file

  After talking with Guido about these issues last week, I've been
looking into this more.  I've been discussing the benefits & failings
of POD with Greg Ward (of distutils fame), who was a Perl programmer
well before he learned Python.  Needless to say, he's a *huge* fan of
inline documentation (and lots of it).
  So I've been playing with a little tool to create documentation from 
a Python parse tree.  As with many things, it's been done before, but
with limited success (docco, gendoc, pythondoc).  I suspect the
success rate is probably tightly tied to it being declared "the right
way" by Guido.
  The script isn't near ready, but I'm aiming for being able to
generate documentation one module or one package at a time with at
least reasonable levels of internal linking among HTML files (other
formats can wait; I want a hypertext format first to make sure I get
the linking right).  Once I have this, I should be able to construct a 
system that allows the docs to be created using either some XML/SGML
language in a separate file or this POD-like/structured text inside
the source file.  Building a reference manual from those inputs would
be very similar to what we have now, and is more a matter of gluing
pieces together.
  Another advantage of using inline documentation in the sources is
that the source can be used as part of the markup; a lot of
information is already in the parse tree.  Using that information to
augment the explicit documentation may prove to be very valuable,
especially for people interested in including lots of specific details 
in the documentation.  The programmer should be able to declare that
this not be done, preferably at both global and local levels within a
package or module, since there are many situations in which the
specific structure of the code is downright misleading in terms
of describing the public interface.
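
A minimal sketch of the kind of extraction described above, assuming
today's standard `ast` module rather than whatever the 1999 tool was
built on. The function name `outline` is hypothetical, and skipping
underscore-prefixed names is only a stand-in for the "declare that
this not be done" switch:

```python
import ast

def outline(source):
    """Return (kind, name, args, summary) for each public top-level def/class."""
    entries = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef):
            kind = "function"
            args = [a.arg for a in node.args.args]   # names come from the tree
        elif isinstance(node, ast.ClassDef):
            kind = "class"
            args = []
        else:
            continue
        if node.name.startswith("_"):
            continue  # hypothetical opt-out convention: leading underscore
        doc = ast.get_docstring(node) or ""
        summary = doc.splitlines()[0] if doc else ""
        entries.append((kind, node.name, args, summary))
    return entries

example = '''
def parse(text, strict):
    """Parse text into tokens.

    More detail here.
    """
    return text.split()

def _helper():
    pass

class Reader:
    """Read tokens from a file."""
'''
entries = outline(example)
```

The argument names never appear in the docstrings here; they are
recovered from the parse tree, which is the "source as part of the
markup" point.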


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From paul@prescod.net  Tue Nov 23 14:38:04 1999
From: paul@prescod.net (Paul Prescod)
Date: Tue, 23 Nov 1999 15:38:04 +0100
Subject: [Doc-SIG] Approaches to structuring module documentation
References: <14378.63037.571200.652453@weyr.cnri.reston.va.us>
 <3838BC8E.B3CA2663@prescod.net> <14393.36180.24259.143634@weyr.cnri.reston.va.us>
Message-ID: <383AA6CC.FE4E1A8C@prescod.net>

"Fred L. Drake, Jr." wrote:
> 
>   I'm not convinced.  If what we end up with is little different from
> what we have, I don't see why we need to convert at all.  

I think we want to be able to slice and dice the documentation with
tools like Zope, 4XSLT, xt.exe and Internet Explorer 5.0

>   So I've been playing with a little tool to create documentation from
> a Python parse tree.  

Would the right compromise be to have very specific documentation inside
the Python and relatively generic documentation outside?

The reason I proposed that we might need a grammar change is because
when I last thought about this I couldn't figure out how to document
parameters without repeating their names. Maybe that's not such a
significant thing though. Also, I was trying to avoid using comments
because I wanted the same documentation to be available as docstrings.
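
For illustration only: later Python versions did gain a grammar feature
that can carry per-parameter text without repeating the names, namely
function annotations. A sketch assuming plain strings are used as
annotations (a choice made for this example, not something proposed in
this thread); `connect` and `parameter_docs` are hypothetical names:

```python
import inspect

def connect(host: "server name or IP address", port: "TCP port number" = 25):
    """Open a connection to a mail server."""
    return (host, port)

def parameter_docs(func):
    """Pair each annotated parameter name with its annotation text."""
    sig = inspect.signature(func)
    return [
        (name, param.annotation)
        for name, param in sig.parameters.items()
        if param.annotation is not inspect.Parameter.empty
    ]

docs = parameter_docs(connect)
```

The docstring stays available as `connect.__doc__`, so both kinds of
documentation can be read at run time.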

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
It's such a 
Bore
Being always
Poor
LANGSTON HUGHES
http://www.northshore.net/homepages/hope/engHughes.html


From paul@prescod.net  Tue Nov 23 14:36:22 1999
From: paul@prescod.net (Paul Prescod)
Date: Tue, 23 Nov 1999 15:36:22 +0100
Subject: [Doc-SIG] Approaches to structuring module documentation
References: <Pine.LNX.3.95.991122160045.770C-100000@localhost>
Message-ID: <383AA666.1BE107AD@prescod.net>

Manuel Gutierrez Algaba wrote:
> 
> I think we must invent the wheel, or at least improve it. I don't
> like javadoc, it seems to me very low level ( type-driven ), useful
> for java, but python deserves a higher level stuff.  

I don't follow. Perhaps you could give an example. Anyhow, we want to be
very careful not to stray too far into only accepting a solution that is
"best" and not the one that is good enough. 

> BTW: Are you the Paul Prescod who wrote "Manual de XML" with
> Charles F. GoldFarb . Prentice  Hall and... ?

Well, I will admit to being the co-author of the XML Handbook. I will
decide my relationship to "Manual de XML" when you tell me how good the
translation was. :)

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
It's such a 
Bore
Being always
Poor
LANGSTON HUGHES
http://www.northshore.net/homepages/hope/engHughes.html


From fdrake@acm.org  Tue Nov 23 16:51:48 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 23 Nov 1999 11:51:48 -0500 (EST)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <383AA6CC.FE4E1A8C@prescod.net>
References: <14378.63037.571200.652453@weyr.cnri.reston.va.us>
 <3838BC8E.B3CA2663@prescod.net>
 <14393.36180.24259.143634@weyr.cnri.reston.va.us>
 <383AA6CC.FE4E1A8C@prescod.net>
Message-ID: <14394.50724.408903.46036@weyr.cnri.reston.va.us>

Paul Prescod writes:
 > I think we want to be able to slice and dice the documentation with
 > tools like Zope, 4XSLT, xt.exe and Internet Explorer 5.0

  An excellent point.  It would be nice to be able to use something
other than PyDOM to manipulate the Python documentation!  ;-)

 > Would the right compromise be to have very specific documentation inside
 > the Python and relatively generic documentation outside?

  It's not such a clean division, I think.  I'm not doing anything
about extension modules, so I need to be able to provide documentation 
about those modules outside the source code.

 > The reason I proposed that we might need a grammar change is because
 > when I last thought about this I couldn't figure out how to document
 > parameters without repeating their names. Maybe that's not such a
 > significant thing though. Also, I was trying to avoid using comments
 > because I wanted the same documentation to be available as docstrings.

  I don't think it's really significant; we can't use an IDREF
attribute in SGML/XML without repeating the ID assigned to the target, 
and the locality of reference is a much greater problem there.  The
Python Tutorial and Guido's "Python Style Guide" essay both describe
some ways to format docstrings such that information extraction
isn't too hard; those guidelines can be augmented a bit and combined
with limited markup using something that looks like the paragraph-
level analysis from the old structured-text discussion and a more explicit
(but minimal) markup similar to POD's C<...> for inline constructs.
  This sort of in-source documentation, with a little intelligent
analysis, can be used to generate XML, HTML, or whatever fairly
easily.  If a good module-reference DTD can be created (even if part
of a macro-document DTD), that can provide for both the XML output
from the extraction tool and an authoring format for extension modules 
(or other modules if the author has reasons for not using the
in-source documentation; the doc author may work for a different
organization, etc.).
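
A rough sketch of that combination, assuming blank lines separate
paragraphs and that C<...> is the only inline construct. The function
names are hypothetical and the output is deliberately bare HTML:

```python
import html
import re

# POD-style inline code: C<...> with no nested angle brackets.
CODE = re.compile(r"C<([^<>]*)>")

def inline(text):
    """Escape text for HTML and turn C<...> spans into <code> elements."""
    parts = CODE.split(text)  # odd-numbered items are the C<...> contents
    out = []
    for i, part in enumerate(parts):
        escaped = html.escape(part, quote=False)
        out.append("<code>%s</code>" % escaped if i % 2 else escaped)
    return "".join(out)

def docstring_to_html(doc):
    """Paragraph-level analysis: one <p> per blank-line-separated block."""
    blocks = [b for b in re.split(r"\n\s*\n", doc.strip()) if b.strip()]
    paragraphs = []
    for block in blocks:
        text = " ".join(line.strip() for line in block.splitlines())
        paragraphs.append("<p>%s</p>" % inline(text))
    return "\n".join(paragraphs)

result = docstring_to_html(
    "Return C<len(s)> items.\n\nRaises C<ValueError> if s < 0."
)
```

Emitting XML against a module-reference DTD instead of HTML would
change only the two format strings.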


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From irmina@ctv.es  Tue Nov 23 18:52:27 1999
From: irmina@ctv.es (Manuel Gutierrez Algaba)
Date: Tue, 23 Nov 1999 18:52:27 +0000 (GMT)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <383AA666.1BE107AD@prescod.net>
Message-ID: <Pine.LNX.3.95.991123175008.3519A-100000@localhost>

On Tue, 23 Nov 1999, Paul Prescod wrote:

> Manuel Gutierrez Algaba wrote:
> > 
> > I think we must invent the wheel, or at least improve it. I don't
> > like javadoc, it seems to me very low level ( type-driven ), useful
> > for java, but python deserves a higher level stuff.  
> 
> I don't follow. Perhaps you could give an example. Anyhow, we want to be
> very careful not to stray too far into only accepting a solution that is
> "best" and not the one that is good enough. 

I have had these doubts when working with Python:
- How to do a specific thing? (How to get a random number?
How to get the time? Which module or which builtin function?)
- Whether Python can do: -1 * ( 3, 5). It can't, although it could
give: (-3, -5).
- How to inherit safely?
- How to use CGI or Unix-related stuff with Python?
- How much memory does a construction take? How is the memory released?
How to improve speed a bit?
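
For what it's worth, a few of these have short answers in the standard
library. A small sketch, written for current Python, with variable
names that are only illustrative. Note that -1 * (3, 5) is accepted,
but it means sequence repetition, so it yields an empty tuple rather
than (-3, -5):

```python
import random
import time

r = random.random()   # float in the half-open range [0.0, 1.0)
now = time.time()     # seconds since the epoch, as a float

repeated = -1 * (3, 5)                  # repetition count <= 0: empty tuple
scaled = tuple(-1 * x for x in (3, 5))  # element-wise product, spelled out
```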

Most of my doubts can be summed up in one: the need for "high level
information". I've never needed to know the signature of a function,
or if I needed it, I found it easily. I don't need javadoc,
nor pythondoc, not at all; I can read the .py files, which
are usually more compact than the HTML info of Java.

Lots of people, especially newbies, have the same problem: doubts,
but not about the parameters of a function, ... real BIG DOUBTS!
I don't care either about the genealogy of any class; if I care,
I just follow the code! Even TkInter code!

The funny (or sad) thing is that most of the info is available
out there,
hidden in nice layouts and documents, spread over USENET, FAQs,
modules, tutorial....

Imagine that you want to produce HTML code. Well, I'd go
to /usr/local/lib/pyt... and then I'd have a glance at htmllib;
well, it's undocumented! or at least the 1.5.2 version I have is,
but that seems to be a parser, not a writer of HTML... Well, it happens
that the StructuredText.py of pythondoc does exactly what I want.
How could I have known that? Simply by marking it with:
\indexhtmlgeneration or something similar.

Lots and lots of programs and modules perform auxiliary tasks, or
many of their tasks are reusable (you know, OO programming)
or they have small pieces of info (examples of UNIX CGI programming,
environment variables, tk, ...).

Well, if that's what we want, why don't we just do it? We only
have to MARK those small pieces of info. Fred calls this indexing.

Many people, me included, won't accept a method of doc that requires
a syntax or the loss of freedom when doing things (this includes
docstrings). But it's acceptable to write Python code, and then
put:
"\indexexamplelambda" or
"\indexsocket"
because that marking would be useful even for the programmer himself;
he might eventually even get accustomed to documenting things just by
writing indexes and ... but that'd be the second and third step. Anyway,
most people won't be angry about such a simple measure.
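
A minimal sketch of how such marks could be collected, assuming a mark
is a backslash, the word "index" and then the term, anywhere in a
source line. The regular expression, the function name and the sample
file are all hypothetical, not TeEncontreX's actual format:

```python
import re
from collections import defaultdict

# Hypothetical marker syntax: \index immediately followed by the term.
MARK = re.compile(r"\\index([A-Za-z_]\w*)")

def build_index(sources):
    """Map each index term to the (file name, line number) pairs marked with it."""
    index = defaultdict(list)
    for name, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for term in MARK.findall(line):
                index[term].append((name, lineno))
    return dict(index)

# Sample input: file name -> file contents.
sample = {
    "server.py": (
        "import socket\n"
        '"\\indexsocket"\n'
        "f = lambda x: x + 1  # \\indexexamplelambda\n"
    ),
}
index = build_index(sample)
```

Nothing here parses the Python itself; the marks are found by plain
text search, so they work equally well in comments, strings or prose.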


I suppose that this is like Parnassus's stuff: put a black background,
some .gif's, and PostGress database and you got a GREAT site. :-P

If Fred thinks that providing a "smart" way of showing how many
modules, functions and exceptions are in a piece of Python code is
enough,
I think that doesn't take us much further.

Here the Pop SuperStar is "Information" ( brute, massive, overwhelming
info... ). We need just a method of gluing as fast and as simple 
as possible "tons" of info. And that method is indexing ( as brute,
massive and overwhelming as "Information" is).

No traditional parser-driven or XML-driven or javadoc-driven approach
will bear the richness, complexity, diversity of origin/source,
 state,... of so much info.

BTW: Documenting the Python library would be the minor and least
interesting thing by far.

We have a Golden Chance here, we can have the best info system
of the Internet. Don't try to be The Big Hero, the Big guy who finds 
the smart solution. Here the Only One Hero is Information, the 
Information that should be at last available !

Indexing for the masses !! Masses of indexes !! Simple and effective!

Death or victory ! 

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina/TeEncontreX.html   /texpython.htm

  Just remember, wherever you go, there you are. -- Buckaroo Banzai


From fdrake@acm.org  Tue Nov 23 22:13:46 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 23 Nov 1999 17:13:46 -0500 (EST)
Subject: [Doc-SIG] Re: Documenting Python, Take 2
In-Reply-To: <199911232129.OAA03343@localhost.localdomain>
References: <14378.63037.571200.652453@weyr.cnri.reston.va.us>
 <199911232129.OAA03343@localhost.localdomain>
Message-ID: <14395.4506.538171.718001@weyr.cnri.reston.va.us>

uche.ogbuji@fourthought.com writes:
 > F1) A highly-structured format for archiving and manipulation of low-level 
 > documentation (what Fred Drake is calling "microdocuments").  Example: SGML
 > or XML schema.  This format must be semantically complete, easy to
 > manipulate in code, with a broad toolset available for manipulation.

Uche,
  I'd *love* to see a good definition of "semantically complete" for
Python!  ;-)

 > F2) An author-friendly format for low-level documentation.  F2 has to be 
 > structured enough for meaningful conversion to F1, but terse enough for use
 > in in-line documentation and adoption by authors for whom F1 would be too
 > much of a chore.  Example: javadoc, POD.

  I'm working on this one, along with an extraction tool.

 > F3) An author and maintainer-friendly format for general documentation,
 > such as the Python profiler and debugger docs as well as the User guide and
 > all that.  Example: Docbook, RTF.  Abundance of author and manipulation
 > tools is important for this format.

  Yes; I think SGML/XML is probably fine for this.


 > T1: A tool for conversion from F1 to F2 and back.

  I understand the need for F2-->F1, but why F1-->F2?  It certainly
could not be general unless F1 is heavier than I imagine.  Please
provide the rationale for the F1-->F2 requirement.

 > T2: A tool for interactively querying authors for documentation elements: 
 > basically a knowledge-acquisition tool from python module experts. (Maybe
 > you can guess what one of our recent contracts has been).

  This might be cool.  We could then go from a parse tree (.py file;
F2) to skeletal F1, and then augment using the interactive tool.  In
practice, perhaps you really point the tool to the source file and
skip storing the skeletal F1 to disk if you aren't going to intervene
with a text editor at that point.  Allowing either to be accepted by
the tool is probably a good idea.  That would allow both documentation 
creation and editing within the tool.

 > T3: A tool for generating user-friendly doc-strings into python modules
 > from the information in F1.

  This sounds the same as T1 to me; do you see F2 being used outside
of docstrings?  (I've been working under the rubric that I should pull 
as much as possible out of the code to get the best possible docs when 
the programmer doesn't provide any additional information.)

 > T4: A command line tool that can display user-friendly docs from a
 > database of F1 docs, similar to perldoc.

  Agreed.

 > T5: A tool for turning F1 and F3 into the familiar Python User Guide and 
 > Library Reference, preferably with richer linking.

  That's within my current plans (i.e., I haven't written it yet).
Rich hypertext is one of the most important benefits I see for
ditching the current tool chains -- doing much through LaTeX2HTML can
be quite painful!

 > T6: A tool for generating man-pages based on F1 Documentation.  This would 
 > address the insistent crowing of Tom Christiansen about Python's "man-page 
 > envy"

  Perhaps we could ask Tom to write this?  ;)  Since Tom's last
crusade against the Python documentation, I've had one user comment
that they'd like to see man pages for Python (paraphrased: it'd be
nice to have).  Tom's the only user to say that the rest doesn't count 
(regardless of how many words he took to say it).

 > In a separate message, I'll make a proposal based on this meta-proposal.

  I look forward to it!


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From fdrake@acm.org  Tue Nov 23 21:03:33 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 23 Nov 1999 16:03:33 -0500 (EST)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <Pine.LNX.3.95.991123175008.3519A-100000@localhost>
References: <383AA666.1BE107AD@prescod.net>
 <Pine.LNX.3.95.991123175008.3519A-100000@localhost>
Message-ID: <14395.293.372163.433259@weyr.cnri.reston.va.us>

Manuel Gutierrez Algaba writes:
 > Well, if that's what we want, why don't we just do it? We only 
 > have to MARK those small pieces of info. Fred calls this indexing.

  Please don't think I'm the only one!  I just applied the name to the 
situation.  Indexing is a good thing.

 > Many people, me included, won't accept a method of doc that requires
 > a syntax or the loss of freedom when doing things ( this includes
 > docstrings ). But it's acceptable to write python code, and then 
 > put:
 > "\indexexamplelambda" or 
 > "\indexsocket"
 > because that marking would be useful even for the programmer himself,
 > even if he eventually got accustomed to doc things just writing indexes and
 > ... but that'd be the second and third step. Anyway, most people
 > won't be angry against such a simple measure. 

  The "simple" part of this isn't the problem, though it does turn
into one.  This approach hinges on what's called a "controlled
vocabulary" (another one of those good Information Science words!).
Without some agreement on the terms that enter the index, many related 
things are not similarly indexed.  Achieving consistency in the case
of many author/indexers is very difficult without either a
well-defined controlled vocabulary or strong editorial oversight.  The
latter is (by far) easier to implement, and is what I've tried to
provide for the standard documentation.  (Compare this to a controlled 
vocabulary approach; think "Library of Congress Subject Headings," or
other large cataloging systems used in libraries.  Ever wonder why
programming language books appear in at least a couple of different
places in the computer science section of a good university library?)
Editorial control is tedious and can become difficult; but controlled
vocabularies are the child of committees!  (Which doesn't mean they're 
not useful, just that there's an *enormous* overhead to using them.)
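
  To make the difference concrete, here is a minimal sketch of indexing
against a controlled vocabulary.  Every term in the table, and the name
`build_index`, is a made-up assumption for illustration; nothing like
this exists in the documentation tools.  Variant spellings are folded
onto one preferred heading, and anything outside the vocabulary is set
aside for an editor rather than silently entering the index.

```python
# Hypothetical controlled vocabulary: each variant an author might type
# maps to the single preferred heading used in the index.
PREFERRED = {
    "socket": "sockets",
    "sockets": "sockets",
    "tcp socket": "sockets",
    "lambda": "lambda expressions",
    "lambda expressions": "lambda expressions",
    "anonymous function": "lambda expressions",
}

def build_index(entries):
    """Group (term, location) pairs under preferred headings.

    Terms missing from the vocabulary are returned separately so an
    editor can review them.
    """
    index = {}
    unknown = []
    for term, location in entries:
        # Normalize case and runs of whitespace before the lookup.
        key = " ".join(term.lower().split())
        heading = PREFERRED.get(key)
        if heading is None:
            unknown.append((term, location))
        else:
            index.setdefault(heading, []).append(location)
    return index, unknown
```

  The code is the easy part; agreeing on the contents of the table is
where the committee overhead comes in.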

 > If Fred thinks that providing a "smart" way of showing how many
 > modules, functions and exceptions are in a piece of python code is 
 > enough,
 > I think that with that we don't go much further.

  Actually, I don't think that's enough, or that it solves that
particular problem.
  The purpose of extracting information from the Python sources is not 
so much as to provide new information (though it may) as to ease the
burden on those authoring documentation.  (Which does *not* mean me!)
I'd like to see newly released modules from independent developers be
documented in a consistent way; making this easy is a necessity for it 
to happen.
  There are still several things which have to be done, including
index building.  One of the catches of index building is that building 
a really useful index (not just a comprehensive one) is fundamentally
a hard thing to do.  I recently spoke to someone who once managed half 
of the indexing team at the Encyclopaedia Britannica about this, and
found that it's not at all clear what actually needs to be done to
improve the situation.  A *large* index, especially when presented
"book style," is not particularly desirable.

 > No traditional parser-driven or XML-driven or javadoc-driven approach
 > will bear the richness, complexity, diversity of origin/source,
 >  state,... of so many info. 

  I agree:  No automatic method will replace good human indexing.

 > BTW: Documenting the python library would be the minor  and less
 > interesting thing by far. 

  I think the current library documentation is actually pretty good;
I'm interested in improving both the content and accessibility (via
indexing or any other approach).  The Doc-SIG has long had the mandate 
of moving Python out of the LaTeX prehistoric period into the 21st
century, however, which is one of the motivations for the work done to
move from LaTeX to SGML/XML/whatever-comes-next.  I know I've been
beating on Guido about this for 4 1/2 years!

 > We have a Golden Chance here, we can have the best info system
 > of the Internet. Don't try to be The Big Hero, the Big guy who finds 
 > the smart solution. Here the Only One Hero is Information, the 
 > Information that should be at last available !

  Perhaps we need another defined task for the SIG:  locate all the
resources that should be part of this all-encompassing Python
Documentation Web?  That's no small task!  Perhaps you'd like to start 
a list of the documents you think should be included in the indexing
effort, including current links to them.  A Web page that simply lists 
them would be a good start.

 > Death or victory ! 

  Don't do that!  While alive you can work to improve things, once
dead... well, *I've* never met anyone who came back.


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From uche.ogbuji@fourthought.com  Tue Nov 23 21:29:59 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 23 Nov 1999 14:29:59 -0700
Subject: [Doc-SIG] Documenting Python, Take 2
In-Reply-To: Your message of "Thu, 11 Nov 1999 12:00:45 EST."
 <14378.63037.571200.652453@weyr.cnri.reston.va.us>
Message-ID: <199911232129.OAA03343@localhost.localdomain>

Sorry it took so long to get around to this.

I think my earlier approach (let's call it a meta-proposal) to settling on a 
Python documentation system still applies, with some modifications.

My original posting is at

http://www.python.org/pipermail/doc-sig/1999-September/000726.html

But in light of the current discussion and concerns that have been raised, 
changes are in order.

Python Documentation Meta-Proposal
----------------------------------

I think in essence we must quickly decide on a set of documentation formats 
and enabling tools, and then answer the questions of how to get there from 
where we are.  A step-wise transition, as Paul suggests, is fine, but I think 
it is important for us all to have a vision of where we're going.

FORMATS:

F1) A highly-structured format for archiving and manipulation of low-level 
documentation (what Fred Drake is calling "microdocuments").  Example: SGML or 
XML schema.  This format must be semantically complete, easy to manipulate in 
code, with a broad toolset available for manipulation.

F2) An author-friendly format for low-level documentation.  F2 has to be 
structured enough for meaningful conversion to F1, but terse enough for use in 
in-line documentation and adoption by authors for whom F1 would be too much of 
a chore.  Example: javadoc, POD.

F3) An author and maintainer-friendly format for general documentation, such 
as the Python profiler and debugger docs as well as the User guide and all 
that.  Example: Docbook, RTF.  Abundance of author and manipulation tools is 
important for this format.


CUSTOM TOOLS:

T1: A tool for conversion from F1 to F2 and back.

T2: A tool for interactively querying authors for documentation elements: 
basically a knowledge-acquisition tool from python module experts. (Maybe you 
can guess what one of our recent contracts has been).

T3: A tool for generating user-friendly doc-strings into python modules from 
the information in F1.

T4: A command line tool that can display user-friendly docs from a database of 
F1 docs, similar to perldoc.

T5: A tool for turning F1 and F3 into the familiar Python User Guide and 
Library Reference, preferably with richer linking.

T6: A tool for generating man-pages based on F1 Documentation.  This would 
address the insistent crowing of Tom Christiansen about Python's "man-page 
envy"

In a separate message, I'll make a proposal based on this meta-proposal.


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


From janssen@parc.xerox.com  Wed Nov 24 03:00:55 1999
From: janssen@parc.xerox.com (Bill Janssen)
Date: Tue, 23 Nov 1999 19:00:55 PST
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: Your message of "Tue, 23 Nov 1999 13:03:33 PST."
 <14395.293.372163.433259@weyr.cnri.reston.va.us>
Message-ID: <99Nov23.190055pst."3638"@watson.parc.xerox.com>

Well, I might as well put in my two cents worth.

The GNU Info versions of the Python documentation are the most
important to me, as I can put those right into my Emacs and have them
at the tip of my fingers while programming.  Whatever solution is
found, I'd like to see that continued.

There's some logic to javadoc, I suppose, in that the most common
problem with documentation is that it goes out of date, because the
modifier just changes the code.  If the documentation is mixed with
the code, perhaps that probability is reduced (though I'm not aware of
any studies that show this to be true).  Though something like
Literate Programming might be a better system.  Perhaps adapting a
system like Noweb (http://www.eecs.harvard.edu/~nr/noweb/) would be of
help.  The home page lists a number of programming languages, but not
Python.

Bill


From fdrake@acm.org  Wed Nov 24 15:07:27 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 24 Nov 1999 10:07:27 -0500 (EST)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <99Nov23.190055pst."3638"@watson.parc.xerox.com>
References: <14395.293.372163.433259@weyr.cnri.reston.va.us>
 <99Nov23.190055pst."3638"@watson.parc.xerox.com>
Message-ID: <14395.65327.352843.663682@weyr.cnri.reston.va.us>

Bill Janssen writes:
 > Well, I might as well put in my two cents worth.

  Surely we can ask three or four cents worth from you; we've
certainly not hesitated sending comments to the ILU list at various
times!  ;-)

 > The GNU Info versions of the Python documentation are the most
 > important to me, as I can put those right into my Emacs and have them
 > at the tip of my fingers while programming.  Whatever solution is
 > found, I'd like to see that continued.

  I don't expect there will be any reduction in the set of output
formats.  Which is not to say that info will be the first one
produced, but it's a safe bet it'll stay around.  I suspect it'll be
far easier to maintain if it's no longer dependent on the HTML
rendering of the docs as well.

 > There's some logic to javadoc, I suppose, in that the most common
 > problem with documentation is that it goes out of date, because the
 > modifier just changes the code.  If the documentation is mixed with
 > the code, perhaps that probability is reduced (though I'm not aware of
 > any studies that show this to be true).  Though something like

  I think I've seen references to studies that showed it both ways, so 
I suspect that the specific set of programmers studied remains a
poorly understood variable (hey, we're not numbers, we're variables!).
  I hope that a moderate amount of what gets marked up in JavaDoc
comments would be generated automatically for Python, but I suspect
that will be very difficult and prone to be wrong if people make heavy 
use of Python's dynamic features.  But that's the case now as well,
and screwing with the standard library is a de-facto no-no.

 > Literate Programming might be a better system.  Perhaps adapting a
 > system like Noweb (http://www.eecs.harvard.edu/~nr/noweb/) would be of
 > help.  The home page lists a number of programming languages, but not
 > Python.

  Now John Skaller will probably suggest that we all adopt
Interscript.  ;-)
  I'm not entirely sure what we'd expect to get out of a literate
programming system that we can't get out of a JavaDoc/POD/Structured-
Text-derived system, at least as far as module reference material
goes.  I've not done enough literate programming to be a good judge of 
how well it really works for library code.  I'd love it if someone
would send me a pointer to a really nice example of literate
programming of a library that provided reference documentation,
introductory material, and examples of use.  (Esp. if there's both
online and typeset versions of the documentation to look at.)
  I would *not* expect this to be applied to the Python libraries,
however, since there are too many hands in there to get a major shift
in methodology.  Just getting decent docstrings will be hard enough
some days!


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From paul@prescod.net  Wed Nov 24 15:25:57 1999
From: paul@prescod.net (Paul Prescod)
Date: Wed, 24 Nov 1999 16:25:57 +0100
Subject: [Doc-SIG] Approaches to structuring module documentation
References: <Pine.LNX.3.95.991123175008.3519A-100000@localhost>
Message-ID: <383C0385.10EC2C0D@prescod.net>

Manuel Gutierrez Algaba wrote:
> 
> Most of my doubts can be summed up in one: The need for "high level
> information".

The lack of documentation for high level features (or the poor indexes
for them) cannot be solved by new documentation systems. It must be
solved by new *documentation* (or at least indexes). There is nothing in
the current system that precludes solving the problem you describe.

> I've never needed to know the signature of a function,
> or If i needed it , i found it seamlessly. I don't need javadoc,
> nor pythondoc, not at all certainly, I can read the .py , which
> are more compact than html info of java, usually.

But not hypertext navigable, not nicely formatted and not appropriate
for printing out.

> 
> "\indexexamplelambda" or 
> "\indexsocket"

> Indexing for the masses !! Masses of indexes !! Simple and effective!

I agree with a need for indexing, but I think it is a separate issue
with separate solutions. Those solutions are mostly unrelated to the
markup strategy problem. As you point out, the markup for indexing is
simple. As Fred points out, the name management is tricky!

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
"Like most religious texts, the XML 1.0 spec has proven itself 
internally-inconsistent, so we're going to have to invent some kind of 
exegetical method now to show how it's really all an allegory." - Anon


From gerrit@nl.linux.org  Wed Nov 24 19:27:51 1999
From: gerrit@nl.linux.org (Gerrit Holl)
Date: Wed, 24 Nov 1999 20:27:51 +0100
Subject: [Doc-SIG] Re: SMTP?
In-Reply-To: <19991124171120.1623.qmail@hotmail.com>; from b2blink@hotmail.com on Wed, Nov 24, 1999 at 06:11:20PM +0100
References: <19991124171120.1623.qmail@hotmail.com>
Message-ID: <19991124202751.A5717@stopcontact.palga.uucp>

Ulf Engström wrote:
> I've built a little mail thingy based on the 11.9.2 SMTP Example from the Python 
> Library Reference, but when I use it I get an empty mail with no sender 
> and no msg, even though I don't get any errors whatsoever... Do I have to 
> change something with the headers? If yes, what?

Hmm, questions about smtplib are asked *very* often. Is the documentation
clear enough?

-- 
"The move was on to 'Free the Lizard'"

  -- Jim Hamerly and Tom Paquin (Open Sources, 1999 O'Reilly and Associates)
  8:26pm  up  2:50,  9 users,  load average: 2.07, 1.98, 1.92


From uche.ogbuji@fourthought.com  Thu Nov 25 02:00:45 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Wed, 24 Nov 1999 19:00:45 -0700
Subject: [Doc-SIG] Documenting Python, Take 2 (RE-POST)
In-Reply-To: Your message of "Thu, 11 Nov 1999 12:00:45 EST."
 <14378.63037.571200.652453@weyr.cnri.reston.va.us>
Message-ID: <199911250200.TAA07515@localhost.localdomain>

Pardon me if you get this twice, but I had problems with mail at python.org 
yesterday


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


From uche.ogbuji@fourthought.com  Thu Nov 25 09:27:29 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Thu, 25 Nov 1999 02:27:29 -0700
Subject: [Doc-SIG] Documenting Python, Take 2
Message-ID: <199911250927.CAA08568@localhost.localdomain>

Based on my meta-proposal, here is my suggestion for the Zen of Python 
documentation.

My vote for format F1 is an XML schema.  Fred's question about "semantically 
complete" is duly noted.  Let's have this as a start:

Module name
Module description
Global Object References
Functions
	Description
	Parameters (name and description)
	Return value (description)
Classes
	Methods (see functions, maybe flag for initializer)
	Class-level object refs
etc.

Fred's example from his original message is a decent start.  Remember that F1 
needn't be terse.
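
As a rough sketch of what an F1 record along these lines could look like, 
the code below builds one with the standard xml.etree.ElementTree module 
(which postdates this message).  The element and attribute names 
(module, function, param, returns) and the function name module_to_xml 
are placeholders invented here; the real schema is still to be decided.

```python
import xml.etree.ElementTree as ET

def module_to_xml(name, description, functions):
    """Build an XML string describing one module.

    `functions` is a list of dicts with keys "name", "description",
    "params" (a list of (name, description) pairs) and "returns".
    """
    module = ET.Element("module", {"name": name})
    ET.SubElement(module, "description").text = description
    for func in functions:
        f = ET.SubElement(module, "function", {"name": func["name"]})
        ET.SubElement(f, "description").text = func["description"]
        for pname, pdesc in func["params"]:
            ET.SubElement(f, "param", {"name": pname}).text = pdesc
        ET.SubElement(f, "returns").text = func["returns"]
    return ET.tostring(module, encoding="unicode")
```

Classes would nest the same function element for their methods, with an 
extra flag for the initializer.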


My vote for F2 is a modification of javadoc.  It's very well known and very 
successful.  Offhand, we should be able to use

@version
@author
@param
@return
@exception
@see

without modification.  "@see" would be _very_ nice, wouldn't it?

We would need some additions, such as @module.
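
To give a feel for it, here is a hedged sketch of a docstring carrying 
javadoc-style tags and a small extractor for them.  The names 
parse_tagged_docstring, connect, and disconnect are hypothetical, and the 
tag syntax shown is only one possible reading of "a modification of 
javadoc", not a settled format.

```python
import re

# A tag line starts with "@name", optionally followed by text.
TAG_LINE = re.compile(r"^@(\w+)\s*(.*)$")

def parse_tagged_docstring(doc):
    """Split a docstring into a summary and a list of (tag, text) pairs."""
    summary = []
    tags = []
    for raw in doc.strip().splitlines():
        line = raw.strip()
        match = TAG_LINE.match(line)
        if match:
            tags.append([match.group(1), match.group(2)])
        elif tags:
            # Continuation line of the most recent tag.
            tags[-1][1] = (tags[-1][1] + " " + line).strip()
        else:
            summary.append(line)
    return " ".join(s for s in summary if s), [tuple(t) for t in tags]

def connect(host, port):
    """Open a connection to a server.

    @param host name or address of the server
    @param port TCP port number
    @return a connected socket-like object
    @see disconnect
    """
```

Calling parse_tagged_docstring(connect.__doc__) yields the summary 
sentence plus one pair per tag, which is the raw material a 
javadoc-to-XML converter would need.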


My vote for F3 is docbook.  There are tools to turn docbook into HTML, GNU 
info, *roff (man pages), ps, pdf, etc.  There is an O'Reilly book out on it, 
an emacs mode, etc.


I would volunteer to write a python-javadoc to XML converter.  Note: I'm not 
saying I won't help unless my suggestions are accepted.  I'm just volunteering 
for a known quantity that I know I can handle.

FourThought already has an internal tool for querying an author for 
documentation automatically.  We could adapt this to the new DTD that is 
determined, and donate it to the cause.


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


From uche.ogbuji@fourthought.com  Thu Nov 25 09:55:25 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Thu, 25 Nov 1999 02:55:25 -0700
Subject: [Doc-SIG] Re: Documenting Python, Take 2
In-Reply-To: Your message of "Tue, 23 Nov 1999 17:13:46 EST."
 <14395.4506.538171.718001@weyr.cnri.reston.va.us>
Message-ID: <199911250955.CAA08660@localhost.localdomain>

> uche.ogbuji@fourthought.com writes:
>  > F1) A highly-structured format for archiving and manipulation of low-level 
>  > documentation (what Fred Drake is calling "microdocuments").  Example: SGML
>  > or XML schema.  This format must be semantically complete, easy to
>  > manipulate in code, with a broad toolset available for manipulation.
> 
> Uche,
>   I'd *love* to see a good definition of "semantically complete" for
> Python!  ;-)

OK, I overstated it.  A "semantically-complete" format would likely have to 
document all the nonterminals of the Python grammar.  Let's say "reasonably 
complete".

>  > F2) An author-friendly format for low-level documentation.  F2 has to be 
>  > structured enough for meaningful conversion to F1, but terse enough for use
>  > in in-line documentation and adoption by authors for whom F1 would be too
>  > much of a chore.  Example: javadoc, POD.
> 
>   I'm working on this one, along with an extraction tool.

What does it look like?

>  > F3) An author and maintainer-friendly format for general documentation,
>  > such as the Python profiler and debugger docs as well as the User guide and
>  > all that.  Example: Docbook, RTF.  Abundance of author and manipulation
>  > tools is important for this format.
> 
>   Yes; I think SGML/XML is probably fine for this.

Repeating myself, I vote Docbook.

>  > T1: A tool for conversion from F1 to F2 and back.
> 
>   I understand the need for F2-->F1, but why F1-->F2?  It certainly
> could not be general unless F1 is heavier than I imagine.  Please
> provide the rationale for the F1-->F2 requirement.

My thinking was that some users would appreciate the concise form in their 
distro for quick reference without weighing down their modules (and memory 
foot-print) with the heavyweight F1.

>  > T2: A tool for interactively querying authors for documentation elements: 
>  > basically a knowledge-acquisition tool from python module experts. (Maybe
>  > you can guess what one of our recent contracts has been).
> 
>   This might be cool.  We could then go from a parse tree (.py file;
> F2) to skeletal F1, and then augment using the interactive tool.  In
> practice, perhaps you really point the tool to the source file and
> skip storing the skeletal F1 to disk if you aren't going to intervene
> with a text editor at that point.  Allowing either to be accepted by
> the tool is probably a good idea.  That would allow both documentation 
> creation and editing within the tool.

Well, I hadn't thought of the parse-tree angle, though that would be cool.  
The internal tool FourThought has in this vein is merely a menu-driven 
approach.  The author selects "add new function", "modify class", and all 
that.  It's up to him or her to determine the elements to be documented.  We 
have plans for a web-i-fied version of the tool.

>  > T3: A tool for generating user-friendly doc-strings into python modules
>  > from the information in F1.
> 
>   This sounds the same as T1 to me; do you see F2 being used outside
> of docstrings?  (I've been working under the rubric that I should pull 
> as much as possible out of the code to get the best possible docs when 
> the programmer doesn't provide any additional information.)

No.  F2 in my mind is not user-friendly.  I'm talking about something that 
converts "@param" to "parameter" with some salsa and bean dip tossed in to 
make it all palatable.
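
A minimal sketch of that conversion, assuming the tags have already been 
extracted as (tag, text) pairs: the label table and the name 
render_friendly are invented for illustration, and the output layout is 
just one guess at "palatable".

```python
# Hypothetical mapping from terse tags to reader-facing labels.
LABELS = {
    "param": "Parameters",
    "return": "Returns",
    "exception": "Raises",
    "see": "See also",
    "author": "Author",
    "version": "Version",
}

def render_friendly(summary, tags):
    """Render a summary plus (tag, text) pairs as a plain-text docstring."""
    lines = [summary]
    order = []      # labels in order of first appearance
    grouped = {}
    for tag, text in tags:
        label = LABELS.get(tag, tag.capitalize())
        if label not in grouped:
            grouped[label] = []
            order.append(label)
        grouped[label].append(text)
    for label in order:
        lines.append("")
        lines.append(label + ":")
        for text in grouped[label]:
            lines.append("    " + text)
    return "\n".join(lines)
```

Repeated tags such as @param are gathered under a single heading, so the 
reader sees one "Parameters:" block rather than a column of @-signs.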

>  > T6: A tool for generating man-pages based on F1 Documentation.  This would 
>  > address the insistent crowing of Tom Christiansen about Python's "man-page 
>  > envy"
> 
>   Perhaps we could ask Tom to write this?  ;)  Since Tom's last
> crusade against the Python documentation, I've had one user comment
> that they'd like to see man pages for Python (paraphrased: it'd be
> nice to have).  Tom's the only user to say that the rest doesn't count 
> (regardless of how many words he took to say it).

Then let's just leave off that bit.  Of course, if we use Docbook, making man 
pages should be no great endeavor.


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


From fredrik@pythonware.com  Thu Nov 25 11:38:44 1999
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Thu, 25 Nov 1999 12:38:44 +0100
Subject: [Doc-SIG] Re: SMTP?
References: <19991124171120.1623.qmail@hotmail.com> <19991124202751.A5717@stopcontact.palga.uucp>
Message-ID: <00cc01bf3739$a1d83d30$f29b12c2@secret.pythonware.com>

Gerrit Holl <gerrit.holl@pobox.com> wrote:
> Ulf Engström wrote:
> > I've built a little mail thingy based on the 11.9.2 SMTP Example from the Python
> > Library Reference, but when I use it I get an empty mail with no sender
> > and no msg, even though I don't get any errors whatsoever... Do I have to
> > change something with the headers? If yes, what?
>
> Hmm, questions about smtplib are asked *very* often. Is the documentation
> clear enough?

no.

the library is pretty low-level, and the documentation
doesn't add much (it points to the relevant RFC's, but
nobody seems to be following those links).

I once contributed a (IMHO) better example, which
1) actually imported all modules that were used in
the example, 2) used more reasonable python con-
structs (raw_input instead of that prompt hack, etc),
and 3) showed how to add the basic headers to the
message body.

as far as I can tell, only (1) made it into the docs...
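
for point 3, a rough sketch in present-day python (not the example
that was contributed back then): the basic headers go at the top of
the text handed to sendmail, followed by an empty line and then the
body.  a message sent without them may well show up looking empty.
the names build_message and send, and the "localhost" default, are
assumptions for illustration only.

```python
import smtplib

def build_message(sender, recipients, subject, body):
    """Return message text: basic headers, a blank line, then the body."""
    headers = [
        "From: " + sender,
        "To: " + ", ".join(recipients),
        "Subject: " + subject,
    ]
    # The empty line marks the end of the headers and the start of the body.
    return "\r\n".join(headers) + "\r\n\r\n" + body

def send(sender, recipients, subject, body, host="localhost"):
    """Send the message through the SMTP server at `host` (assumed local)."""
    text = build_message(sender, recipients, subject, body)
    server = smtplib.SMTP(host)
    try:
        server.sendmail(sender, recipients, text)
    finally:
        server.quit()
```

the envelope addresses passed to sendmail are separate from the
From:/To: header lines, which is why both have to be supplied.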

</F>

<!-- (the eff-bot guide to) python network programming
http://www.pythonware.com/people/fredrik/networkbook.htm
-->


From Manuel Gutierrez Algaba <irmina@ctv.es>  Fri Nov 26 19:51:06 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Fri, 26 Nov 1999 19:51:06 +0000 (GMT)
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To: <14395.293.372163.433259@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.3.95.991126171327.3232A-100000@localhost>

I've decided to start a trial project for my proposal, as I 
think it's intrinsically good. I've placed it at:

http://www.ctv.es/USERS/irmina/SantisimaInquisicion/index.html

Basically I'll place there the instructions on how to attribute,
and the documents that should preferably be documented ...

I'll maintain this project for two weeks, so:

- if people don't get involved in the project and send attributions,
then I won't maintain it any longer
- if people send thousands and thousands of attributions, then I'll
pass it to someone at python.org, because then it'd be a rather
official thing. 

I'll only maintain it if it has moderate success. 

I'd like to ask a couple of things of you:

- Announce this project as semi-official in comp.lang.python
or/and in the python.org announcement page

- Declare this aim of collecting info as interest of the SIG, so
even after death of SantaInquisicion the idea will be alive. 

- Support it in any way you may think. 

( the definitive support would be Guido's; what does 
he think about the idea? )

I think the project will fail ( people are basically lazy ), 
but it will be a good lesson/precedent for the future.

Anyway, the idea is worth a try!

Look at SantisimaInquisicion


Regards/Saludos
Manolo
www.ctv.es/USERS/irmina/TeEncontreX.html   /texpython.htm
www.ctv.es/USERS/irmina/SantisimaInquisicion/index.html

  Do your part to help preserve life on Earth -- by trying to preserve your own.


From da@ski.org  Fri Nov 26 18:58:55 1999
From: da@ski.org (David Ascher)
Date: Fri, 26 Nov 1999 10:58:55 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To: <Pine.LNX.3.95.991126171327.3232A-100000@localhost>
Message-ID: <Pine.WNT.4.04.9911261056140.198-100000@rigoletto.ski.org>

On Fri, 26 Nov 1999, Manuel Gutierrez Algaba wrote:

> 
> I've decided to start a trial project for my proposal, as I 
> think it's intrinsically good. I've placed it at:
> 
> http://www.ctv.es/USERS/irmina/SantisimaInquisicion/index.html
> 
> Basically I'll place there the instructions on how to attribute,

Can you explain how to use this website?  I've looked at it and at 
TeEncontreX, and all I seem to do is to click between pages describing the
system, but I can't find any real *DOC*.

You'll have a hard time getting folks to do anything if you don't give a
specific example of what they'll get out of it...

How about you do the markup for a given module, and show how it looks?

--david


From irmina@ctv.es  Fri Nov 26 20:20:09 1999
From: irmina@ctv.es (Manuel Gutierrez Algaba)
Date: Fri, 26 Nov 1999 20:20:09 +0000 (GMT)
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To: <Pine.WNT.4.04.9911261056140.198-100000@rigoletto.ski.org>
Message-ID: <Pine.LNX.3.95.991126201741.8550A-100000@localhost>

On Fri, 26 Nov 1999, David Ascher wrote:

> 
> On Fri, 26 Nov 1999, Manuel Gutierrez Algaba wrote:
> 
> > 
> > I've decided to start a trial project for my proposal, as I 
> > think it's intrinsically good. I've placed it at:
> > 
> > http://www.ctv.es/USERS/irmina/SantisimaInquisicion/index.html
> > 
> > Basically I'll place there the instructions on how to attribute,
> 
> Can you explain how to use this website?  I've looked at it and at 
> TeEncontreX, and all I seem to do is to click between pages describing the
> system, but I can't find any real *DOC*.
> 
> You'll have a hard time getting folks to do anything if you don't give a
> specific example of what they'll get out of it...
> 
> How about you do the markup for a given module, and show how it looks?

http://www.ctv.es/USERS/irmina/SantisimaInquisicion/AutoDeFe.html

There you can see clearly two examples,... please let me know 
if this is not enough.

Anyway, it's funny to have a DOC project that is undocumented :)

It's so simple that it's difficult to explain. :)

Anyway, if this is not ENOUGH, feel free to insist and to 
complain bitterly. If there's enough interest I'll explain it
moooree!

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina/TeEncontreX.html   /texpython.htm

  Once you've tried to change the world you find it's a whole bunch easier to change your mind.


From da@ski.org  Fri Nov 26 22:16:52 1999
From: da@ski.org (David Ascher)
Date: Fri, 26 Nov 1999 14:16:52 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To: <Pine.LNX.3.95.991126201741.8550A-100000@localhost>
Message-ID: <Pine.WNT.4.04.9911261116460.198-100000@rigoletto.ski.org>

On Fri, 26 Nov 1999, Manuel Gutierrez Algaba wrote:

> http://www.ctv.es/USERS/irmina/SantisimaInquisicion/AutoDeFe.html

> There you can see clearly two examples,... please let me know 
> if this is not enough.

I understand the concept of markup.  Your markup is basically TeX.  Fine
for TeX documents, but why in the world do you expect Python users who are
used to 

  def foo():
     print 'a'

to suddenly like

  \newcommand{\indexCextension}{\index{Cextension}\index{extension}}

and why do you think they would even consider adding it to their code?
What is the benefit to them?  You need to show the *end-result* of
indexing, which means hyperlinked TOC's, pretty HTML pages, etc.

Warning: rant ahead.

Generally, I think that the DOC-sig spends too much time arguing about
specific markups and trendy technology (sorry, I'm getting really
frustrated at the XML LPHBTSP; that's 'alphabet soup' without vowels), and
not enough on the marketing aspect.  

*If the problem is to encourage average Python coders to mark up their
docs*, then you need to make it simple *and Python-like*.  Define a
Pythonic syntax (e.g. what Jim Fulton uses in the StructuredText.py
module), provide a CGI script with a "PUT" button which takes a
marked-up .py file, creates a hyperlinked TOC for that module, snazzy HTML
pages and whatnot, and automatically adds said module to some centralized
repository of 'cool documented modules', and folks *will* learn the
markup.  Especially if you provide a few modules which show examples of
the markup and show how trivial it is.
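
A minimal sketch of the "take a .py file, produce a hyperlinked TOC" step,
assuming modern Python and its standard `ast` module (neither existed in
this form when the message was written). The function name `module_toc` and
the sample module are hypothetical, and the output is a bare HTML list
rather than anything snazzy.

```python
import ast
import html

def module_toc(source):
    """Return an HTML list linking to each documented top-level def/class."""
    items = []
    for node in ast.parse(source).body:
        if not isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            continue
        doc = ast.get_docstring(node)
        if not doc:
            continue  # nothing to show for undocumented names
        summary = doc.splitlines()[0]
        items.append('<li><a href="#%s">%s</a>: %s</li>'
                     % (node.name, html.escape(node.name), html.escape(summary)))
    return "<ul>\n%s\n</ul>" % "\n".join(items)

# Hypothetical input module, given as a string for the sake of the example.
SAMPLE = '''
def foo():
    """Print the letter a."""
    print("a")

def undocumented():
    pass
'''

toc = module_toc(SAMPLE)
```

Only the first docstring line is used as the entry text, so the scheme
rewards authors who write a one-line summary first.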

On the other hand, if you put up a page which makes Python code look more
like TeX or XML, why in the world do you expect people to bother?

POD, Javadoc, and Autoduck work because they do 90% of the job with about
4 minutes of learning.  That is *all* you can expect of 95% of the
programmers out there.  Go for the biggest bang for the buck.

Enough with the rant.  Back to normal DOC-SIG business.

--david


From uche.ogbuji@fourthought.com  Sat Nov 27 01:55:01 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Fri, 26 Nov 1999 18:55:01 -0700
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: Your message of "Fri, 26 Nov 1999 14:16:52 PST."
 <Pine.WNT.4.04.9911261116460.198-100000@rigoletto.ski.org>
Message-ID: <199911270155.SAA00967@localhost.localdomain>

David,

I confess I don't quite get it, which is too bad because I respect your 
opinions and would like to know precisely what you're getting at.

Maybe the problem is that I'm a doc-sig newbie.  I just showed up here a 
couple of weeks ago to mention that we had a few internal tools at FourThought 
for python documentation and I wondered if anyone was interested.

As such, I don't have any sense that the doc-sig has failed or is failing in
any way.  The way I see it, Python has a decent amount and quality of
documentation.  True, it is not as good as Java's or Perl's, but it is about
as much as can be 
expected given its age and market profile.  Fred Drake has done a phenomenal 
job.  My understanding is that we're all discussing a way to push it to the 
next level.  If possible, it means leap-frogging Perl and Java, but mostly it 
just means seeking the best solution.  I don't see any need for haste or panic.

I don't understand the reasoning that python docs should look like Python.  
I'm not as familiar with POD, but you also give Javadoc as an example, and it 
looks _nothing_ like Java.  Also note that several people have been advocating 
a Javadoc-like system, including myself.  So where is the terrible divergence?

XML advocates here are mostly suggesting it for the "library" format of python 
documentation, not the "author" format.  So why does it matter if you think 
authors bear such distaste for XML and TeX?  They won't have to deal with it.  
The reality, though, is that it's easier to go from XML or TeX to any of the 
many formats Python users want than it would be from Jim-Fulton-David-Ascher 
pythonic documentation format.  Would you volunteer to write the tools to go 
from JFDA to *roff for man pages, postscript, PDF, HTML and GNU info?  I doubt 
it, and even if you would, I'd advise against re-inventing the wheel that the 
Linux Documentation Project has so admirably crafted.

The way I see it, your key argument with Manuel's proposals would be that he 
plans to inflict TeX on Python authors.  I agree that that is a bad thing, and 
I also wouldn't want to inflict XML on Python Authors.  I don't think that's 
an alien sentiment here, but your rant makes it sound that way.

So, what am I missing?


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


From da@ski.org  Sat Nov 27 05:13:31 1999
From: da@ski.org (David Ascher)
Date: Fri, 26 Nov 1999 21:13:31 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <199911270155.SAA00967@localhost.localdomain>
Message-ID: <Pine.WNT.4.05.9911261910450.58-100000@david.ski.org>

On Fri, 26 Nov 1999 uche.ogbuji@fourthought.com wrote:

> David,
> 
> I confess I don't quite get it, which is too bad because I respect your 
> opinions and would like to know precisely what you're getting at.

It's all right -- I wasn't being especially clear.  That's the nature of a
rant, though, so at least you can't get me on the 'truth in advertising'
laws. =)

Here's my take on things, hopefully more rationally thought out and more
clearly expressed:

1) The current Python documentation is, in my opinion, just fine.  I think
   that moving from LaTeX to something more modern is a great idea, and I
   think that Fred is doing a beautiful job for what is a thankless task.
   I have no problem of course with the discussion on that topic.

2) IMHO, the single most problematic aspect of Python documentation is
   the lack of a standard way for programmers to document their code 
   inside the .py file, unlike e.g. POD, and that is a shame.  If nothing
   else, I think that lacking this standard is one of the reasons for the
   lack of docstrings.  If there was a good standard, then the use of
   docstrings already made in IDLE and Pythonwin (which is wonderful)
   could be made even deeper and richer, leading to a snowballing effect.

   There is at least one proposal to index in-code Python docstrings
   with TeX-like commands.  In my opinion, anything that full of
   backslashes and braces will never fly in the Python community.

3) Programmers in general, smart programmers especially, try to "think
   out" all of the possible uses for something before they start to design
   it.  That's why God Invented Managers and deadlines.  We need one or
   the other.

> The way I see it, your key argument with Manuel's proposals would be that he 
> plans to inflict TeX on Python authors.  I agree that that is a bad thing, and 
> I also wouldn't want to inflict XML on Python Authors.  I don't think that's 
> an alien sentiment here, but your rant makes it sound that way.

> So, what am I missing?

Possibly nothing.  I posted in a moment of emotion, which is never a good
idea.  I apologize for the rant.  I suspect that what I was really
reacting to was a combination of:

  - puzzlement as to the motivations for Manuel's proposal,
  - a personal frustration with seeing design-by-committee lead to inaction,
  - a strong visceral reaction against TeX-style markup in Python.

I don't mind XML markup so much, btw, and I suppose no one else in the
HTML age minds much as long as they don't have to mess w/ anything beyond
<foo>...</foo>.  As soon as you allow anything beyond the trivial, you
lose 50% of the audience.  KISS (Keep It Simple, Stupid) rules for these
sorts of things.

Here's what I'd most like to see in the area of in-code doc (I have no
constructive opinion on the 'large' documents markup issue): a definition
for a set of entities (if that is the right word) to use in docstrings for
modules, classes, functions and methods.  After two weeks of discussion
*at most*, Fred brings it to Guido, Guido gets his old melted-wax seal and
stamps his approval on it, and then we advertise the heck out of it.  The
code will follow.

Straw Proposal 0.1 [da]:

  """
  <AUTHOR>David Ascher</AUTHOR>
  <VERSION>1.0</VERSION>
  <DATE>20/10/96</DATE>
  <DESCRIPTION>This is a module with one function in it.</DESCRIPTION>
  <URI>...</URI>
  """

  def len(input):
    """\
    <DESCRIPTION> Returns the length of the input sequence </DESCRIPTION>
    <ARGUMENT name="input" type="sequence">
        The input sequence
    </ARGUMENT>
    <RETURNTYPE> IntType </RETURNTYPE>
    <INDEXWITH>len, length, ln</INDEXWITH>
    """
    ...
  
FWIW, I'm really not sure that the above is significantly easier to parse
in the long run than a more Pythonic:

  """
  Author: David Ascher
  Date: 10/25/99
  ...
  """

  def len(input):
      """\
      Description:
          Returns the length of the input sequence.
  
      Arguments:
         input (sequence) -- The input sequence
    
      Return Type: IntType

      See Also: len, length, ln
      """

provided we make explicit exactly the format (just like Python syntax is
formalized and parseable) and keep the fancy stuff (embedded URLs,
hyperlinks, etc.) to a single escape code (e.g. like what Mark Hammond was
pushing for months ago -- see June 1998 archives). That said, what I
really care most about is a final decision, not the specific markup used.  
That's why I think that what we need is a Guido Stamp Of Approval.
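
As a rough illustration of how little it takes to read the "Pythonic"
form, here is a sketch that splits such a docstring into fields. It
assumes modern Python; the function name `parse_fields` and the rule
"an unindented line containing a colon starts a field" are assumptions
made for this example, not part of the proposal.

```python
import inspect

def parse_fields(docstring):
    """Split a 'Key: value' style docstring into a dict of fields."""
    fields = {}
    current = None
    for line in inspect.cleandoc(docstring).splitlines():
        if not line.strip():
            continue
        if not line[0].isspace() and ":" in line:
            # An unindented "Key: value" line opens a new field.
            key, _, value = line.partition(":")
            current = key.strip()
            fields[current] = value.strip()
        elif current is not None:
            # Any other line continues the field opened last.
            fields[current] = (fields[current] + " " + line.strip()).strip()
    return fields

DOC = """\
Description:
    Returns the length of the input sequence.

Arguments:
   input (sequence) -- The input sequence

Return Type: IntType

See Also: len, length, ln
"""

fields = parse_fields(DOC)
```

A real format would still have to say what happens to a colon inside
running text, which is exactly the kind of detail that has to be made
explicit.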

--david 'ranted out' ascher

PS: I'll pay for a new melted-wax seal if Guido lost the old one. =)


From pf@artcom-gmbh.de  Sat Nov 27 09:30:45 1999
From: pf@artcom-gmbh.de (Peter Funk)
Date: Sat, 27 Nov 1999 10:30:45 +0100 (MET)
Subject: What is important (was Re: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion)
In-Reply-To: <Pine.WNT.4.04.9911261116460.198-100000@rigoletto.ski.org> from David Ascher at "Nov 26, 1999  2:16:52 pm"
Message-ID: <m11reBp-000CxiC@artcom0.artcom-gmbh.de>

Hi!

David Ascher wrote:
[...]
> Generally, I think that the DOC-sig spends too much time arguing about
> specific markups and trendy technology (sorry, I'm getting really
> frustrated at the XML LPHBTSP (that's 'alphabet soup' without vowels), and
> not enough with the marketing aspect.  
> 
> *If the problem is to encourage average Python coders to markup their
> docs*, then you need to make it simple *and Python-like*.  Define a
> Pythonic syntax (e.g what Jim Fulton uses in the StructuredText.py
> module), provide a CGI script which has a "PUT" button which will take a
> marked-up .py file, creates a hyperlinked TOC for that module, snazzy HTML
> pages and whatnot, automatically add said module to some centralized
> repository of 'cool documented modules', and folks *will* learn the
> markup.  Especially if you provide a few modules which show examples of
> the markup and show how trivial it is.
> 
> On the other hand, if you put up a page which makes Python code look more
> like TeX or XML, why in the world do you expect people to bother?

I agree with David.  I joined the list only recently (a few
weeks ago).  As someone who still has much to learn, I think that the
current documentation for Python and the module library is very good.
Unfortunately some very important parts are still missing: something
like Fredrik Lundh's "An Introduction to Tkinter" (although it is still
somewhat incomplete in places) would be a _VERY_ useful addition
to the library documentation.  

Spending time on something like this seems far more important to me than
this discussion about the topics introduced in Chapter 8 (Future
Directions) of Documenting Python.

Regards, Peter
-- 
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen)


From Manuel Gutierrez Algaba <irmina@ctv.es>  Sat Nov 27 11:27:37 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Sat, 27 Nov 1999 11:27:37 +0000 (GMT)
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To: <Pine.WNT.4.04.9911261116460.198-100000@rigoletto.ski.org>
Message-ID: <Pine.LNX.3.95.991127104702.736A-100000@localhost>

On Fri, 26 Nov 1999, David Ascher wrote:

> On Fri, 26 Nov 1999, Manuel Gutierrez Algaba wrote:
> 
> > http://www.ctv.es/USERS/irmina/SantisimaInquisicion/AutoDeFe.html
> 
> > There you can see clearly two examples,... please let me know 
> > if this is not enough.
> 
> I understand the concept of markup.  Your markup is basically TeX.  Fine
> for TeX documents, but why in the world do you expect Python users who are
> used to 

It happens that TeX documents are the easiest structure for representing
things, apart from raw .txt.  A basically-TeX markup is just
the easiest markup. 

> 
>   def foo():
>      print 'a'

This example is too simple; it carries almost no info at all.
Perhaps: \indexbasicdeffunction
(\index{basic}\index{def}\index{function} )
After doing this, anyone searching for basic things, or for
definitions, or for functions may find the piece of code you've written.
Your WORK is PROFITABLE now. That piece of CODE IS VALUABLE for 
the rest of the community.

> 
> to suddenly like
> 
>   \newcommand{\indexCextension}{\index{Cextension}\index{extension}}
> 
> and why do you think they would even consider adding it to their code?
The answer is the same as for: "why do you document?"
- To understand what you are doing: if you put \indexCextension,
in that very word you're summarizing the whole functionality
of a piece of code.
- You can have your internal vocabulary inside
your program, so you can browse by different concepts and how they've
been implemented, although this is still fairly advanced.
- People can understand the information you've provided, or merge it
into their databases. That is, documentation is the thing people use
to understand what others have done.

> What is the benefit to them?  You need to show the *end-result* of
> indexing, which means hyperlinked TOC's, pretty HTML pages, etc.

The *end-result* is the Sacramental.html stuff and the rest. But that's
*ONLY* one representation, more than enough, I think, for most 
cases. It can be prettier, but I've shown the most basic one.

> 
> Warning: rant ahead.

Warning: more SantisimaInquisicion propaganda ahead !!!

> 
> Generally, I think that the DOC-sig spends too much time arguing about
> specific markups and trendy technology (sorry, I'm getting really
> frustrated at the XML LPHBTSP (that's 'alphabet soup' without vowels), and
> not enough with the marketing aspect.  

Me too, that's why I'm proposing the most simple stuff, I think.

> *If the problem is to encourage average Python coders to markup their
> docs*, then you need to make it simple *and Python-like*.  Define a

The question is whether stupid markup is going to encourage anything.
Imagine this:

# name   inter
# param list
# param list
def inter(a,b):
    res = []
    for i in a:
        if i in b:
            res.append(i)
    return res

Do you think many people are encouraged to behave C-ish with Python
code? And what kind of info does that provide? There's an inter function,
and it has two parameters, but that SAYS NOTHING AT ALL about 
the function itself, so it's useless for anybody searching for
a list intersection function!

> Pythonic syntax (e.g what Jim Fulton uses in the StructuredText.py

Pythonic syntax for a C-ish idea; Python deals with functionality
and not with type rubbish.

> markup.  Especially if you provide a few modules which show examples of
> the markup and show how trivial it is.

I've provided the examples. If people don't do it, it's just because
they're too lazy; the marking system I propose can't be easier...

> On the other hand, if you put up a page which makes Python code look more
> like TeX or XML, why in the world do you expect people to bother?

Regretfully, a bit of collateral damage has to be done
to the code. Even so, TeEncontreX's damage is not much: a few 
words or lines, here and there. In fact, TeEncontreX markup is the
one that needs the least writing (by far) and has the freest form (by
far). It's rather painful to be strict; it sounds Java-ish to me.
Python is (by far) the least strict language I know. In fact, the
tight syntax/indentation of Python is the only (unnoticed) strict thing;
afterwards you enjoy the full power of overloading, eval(), map...
Python is not Java; it's much, much more different than it seems at
first sight. Python deserves something strict in the syntax (\jiji
\indexlkjljk), flexible in use (you can use the indexes you
want, and you can place them wherever and whenever you like!) and 
that deals with the real problem: the problem with DOC is to know
what the CODE is DOING, not its params and functions. The important
thing is the WHAT; leave the HOW to Java or to C. 


Regards/Saludos
Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
 /SantisimaInquisicion/index.html 

  That, that is, is. That, that is not, is not. That, that is, is not that, that is not. That, that is not, is not that, that is.


From Manuel Gutierrez Algaba <irmina@ctv.es>  Sat Nov 27 11:27:48 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Sat, 27 Nov 1999 11:27:48 +0000 (GMT)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <199911270155.SAA00967@localhost.localdomain>
Message-ID: <Pine.LNX.3.95.991127111515.736B-100000@localhost>

On Fri, 26 Nov 1999 uche.ogbuji@fourthought.com wrote:
> The way I see it, your key argument with Manuel's proposals would be that he 
> plans to inflict TeX on Python authors.  I agree that that is a bad thing, and 
> I also wouldn't want to inflict XML on Python Authors.  I don't think that's 
> an alien sentiment here, but your rant makes it sound that way.

TeX is not TeEncontreX. TeX is something 100 billion times more
complex; TeEncontreX is just three things:
- definition of an index (\newcommand)
- delimitation of the space in which the index(es) are used (\jiji \jaja)
- use of the indexes (\indeslkjk)

If this is too complex, or if this is TeX, then I'm missing something
very BIG. 
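
The first and third of those three things can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: it is
not the TeEncontreX implementation, the name `build_index` is hypothetical,
it ignores the \jiji / \jaja delimiters entirely, and it simply records the
line number on which each index command is used.

```python
import re

# \newcommand{\indexNAME}{\index{a}\index{b}...} defines an index command.
DEFINITION = re.compile(
    r"\\newcommand\{\\(index\w+)\}\{((?:\\index\{[^}]*\})+)\}")
KEYWORD = re.compile(r"\\index\{([^}]*)\}")
# A bare \indexNAME elsewhere in the text is a use of that command.
USE = re.compile(r"\\(index\w+)")

def build_index(text):
    """Map each keyword to the line numbers where its command is used."""
    definitions = {}
    for name, body in DEFINITION.findall(text):
        definitions[name] = KEYWORD.findall(body)
    index = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if DEFINITION.search(line):
            continue  # a definition line is not itself a use
        for name in USE.findall(line):
            for keyword in definitions.get(name, []):
                index.setdefault(keyword, []).append(lineno)
    return index

# Hypothetical marked-up source; the function is made up for the example.
SAMPLE = r"""\newcommand{\indexCextension}{\index{Cextension}\index{extension}}
def init_module():
    # \indexCextension
    pass
"""

index = build_index(SAMPLE)
```

With this input, both keywords point at line 3, the line that carries
\indexCextension.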

Anyway, I don't want to inflict anything to anybody. For me it's 
enough to leave in the annals of python the documentation system
of the future. This system ( or something similar ) will succeed 
sooner or later ( probably with somebody more powerful than me)
and I'm  more than happy to state here : "I was the first". This is 
indexing but in a non-indexing way: massive and 
inter-index-collaborative.

If I can make this clear enough, or anyone can clearly understand it
and want to spend the time, we'll get some years ahead the rest.
That's all. If not, well, pity!


Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
 /SantisimaInquisicion/index.html 

  That, that is, is. That, that is not, is not. That, that is, is not that, that is not. That, that is not, is not that, that is.


From sean@digitome.com  Sat Nov 27 10:52:28 1999
From: sean@digitome.com (Sean McGrath)
Date: Sat, 27 Nov 1999 10:52:28 +0000
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To: <Pine.LNX.3.95.991127104702.736A-100000@localhost>
References: <Pine.WNT.4.04.9911261116460.198-100000@rigoletto.ski.org>
Message-ID: <3.0.6.32.19991127105228.0093c4b0@gpo.iol.ie>

[Manuel Gutierrez Algaba]
>
>It happens that TeX documents are the easiest structure to represent
>things, apart from raw .txt  A basically TeX markup is just 
>the easiest markup. 
>
Completely subjective and unsubstantiated statements!

>> 
>>   def foo():
>>      print 'a'
>
>This example is too simple , it gots almost no info at all , 
>perhaps : \indexbasicdeffunction
>(\index{basic}\index{def}\index{function} )
>after doing this , anyone searching  for basic things, or for
>definitions of for functions may find the piece of code you've done,
>your WORK is PROFITABLE now. That piece of CODE IS VALUABLE for 
>the rest of community.

In my opinion, there is *NO CHANCE* that any developer would
voluntarily add the index stuff in this example. Isn't it
redundant? A text parsing tool that knows the syntax of
Python can work out that foo is a function. There is no need
for a programmer to spell it out in a second
syntactic form, right?
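
To make the point concrete, here is a sketch of such a tool, assuming
modern Python and its standard `ast` module (an assumption; it postdates
this thread). The function name `index_definitions` is hypothetical.

```python
import ast

def index_definitions(source):
    """Return sorted (name, kind) pairs for every def and class in source."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entries.append((node.name, "function"))
        elif isinstance(node, ast.ClassDef):
            entries.append((node.name, "class"))
    return sorted(entries)

# The example from the thread, written with print() so that it parses
# under Python 3.
entries = index_definitions("def foo():\n    print('a')\n")
```

The parser recovers the "def" and "function" entries for free; what it
cannot recover is what the code is for, which is the part that still needs
a human.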


From mal@lemburg.com  Sat Nov 27 12:39:58 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 27 Nov 1999 13:39:58 +0100
Subject: [Doc-SIG] On David Ascher's Rant
References: <Pine.WNT.4.05.9911261910450.58-100000@david.ski.org>
Message-ID: <383FD11E.D8ADE8D0@lemburg.com>

David Ascher wrote:
> 
> 1) The current Python documentation is, in my opinion, just fine.  I think
>    that moving from LaTeX to something more modern is a great idea, and I
>    think that Fred is doing a beautiful job for what is a thankless task.
>    I have no problem of course with the discussion on that topic.

I second that.
 
> 2) IMHO, the single most problematic aspect of Python documentation is
>    the lack of a standard way for programmers to document their code
>    inside the .py file, unlike e.g. POD., and that is a shame.  If nothing
>    else, I think that lacking this standard is one of the reasons for the
>    lack of docstrings.  If there was a good standard, then the use of
>    docstrings already made in IDLE and Pythonwin (which is wonderful)
>    could be made even deeper and richer, leading to a snowballing effect.
> 
>    There is at least one proposal to index in-code Python docstrings
>    with TeX-like commands.  In my opinion, anything that full of
>    backslashes and braces will never fly in the Python community.

I don't think people will start to write TeX in their docstrings...
after all, not everyone can read plain TeX, and many will get pretty
confused by all those backslashes and curly brackets.

IMHO, a clean plain text approach goes much further; together
with some conventions on how to format this text and intelligent
tools to extract the information encoded by those conventions,
it will certainly make writing docstrings much more popular.

BTW, in case someone cares, the format I use for docstrings and
function/method signature goes as follows:

def normlist(jlist,
                   
             StringType=types.StringType):

    """ Return a normalized joinlist.

        All tuples in the joinlist are turned into real strings.  The
        resulting list is an equivalent copy of the joinlist only
        consisting of strings.
        
    """
    ...

1. Localizations are split from the true input arguments by
   an empty line or a comment line

2. The first line in the docstring includes a short description
   of what the function does.

3. The remaining lines are used for more detailed descriptions.

Additional markup e.g. for cross referencing would be nice
but shouldn't look awkward. One way to do this would be:

a. use .method() for methods of the same class
b. use Class.method() for methods of other classes
c. use *name for referencing defined names in the current context,
   e.g. class names, parameter names, module names, etc.
d. methods/functions which don't have docstrings shouldn't go
   into the automatic documentation output (this feature is often
   forgotten: you may not want to document certain parts of
   your module for some reason)

and so on...
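
Rules 2 and 3 and point d are simple enough to sketch with the standard
`inspect` module, assuming modern Python. The helper name `split_docstring`
and the function `hidden` are hypothetical, and `normlist` is reduced here
to its docstring.

```python
import inspect

def split_docstring(obj):
    """Return (summary, details) for obj, or None if it has no docstring."""
    doc = inspect.getdoc(obj)
    if not doc:
        return None  # point d: undocumented objects are left out
    # Rule 2: the first line is the short description.
    # Rule 3: everything after it is the detailed description.
    summary, _, details = doc.partition("\n")
    return summary.strip(), details.strip()

def normlist(jlist):
    """ Return a normalized joinlist.

        All tuples in the joinlist are turned into real strings.  The
        resulting list is an equivalent copy of the joinlist only
        consisting of strings.

    """
    # Body omitted; only the docstring matters for this example.

def hidden():
    pass

summary, details = split_docstring(normlist)
```

`inspect.getdoc` already strips the uniform indentation, so the
convention costs the author nothing beyond putting the summary first.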

> 3) Programmers in general, smart programmers especially, try to "think
>    out" all of the possible uses for something before they start to design
>    it.  That's why God Invented Managers and deadlines.  We need one or
>    the other.

Right. And it's even worse in the Python community: they first try
to prove NP-completeness rather than think about good reasonable
approaches for the common case.

> Straw Proposal 0.1 [da]:
> 
>   """
>   <AUTHOR>David Ascher</AUTHOR>
>   <VERSION>1.0</VERSION>
>   <DATE>20/10/96</DATE>
>   <DESCRIPTION>This is a module with one function in it.</DESCRIPTION>
>   <URI>...</URI>
>   """
> 
>   def len(input):
>     """\
>     <DESCRIPTION> Returns the length of the input sequence </DESCRIPTION>
>     <ARGUMENT name="input" type=sequence>
>         The input sequence
>     </ARGUMENT>
>     <RETURNTYPE> IntType </RETURNTYPE>
>     <INDEXWITH>len, lenth, ln</INDEXWIDTH>
>     """
>     ...

Are you serious about the above ??? No one is going to write that
in his docstrings...
 
> FWIW, I'm really not sure that the above is significantly easier to parse
> in the long run than a more Pythonic:
> 
>   """
>   Author: David Ascher
>   Date: 10/25/99
>   ...
>   """
> 
>   def len(input):
>       """\
>       Description:
>           Returns the length of the input sequence.
> 
>       Arguments:
>          input (sequence) -- The input sequence
> 
>       Return Type: IntType
> 
>       See Also: len, length, ln
>       """
> 
> provided we make explicit exactly the format (just like Python syntax is
> formalized and parseable) and keep the fancy stuff (embedded URLs,
> hyperlinks, etc.) to a single escape code (e.g. like what Mark Hammond was
> pushing for months ago -- see June 1998 archives). That said, what I
> really care most about is a final decision, not the specific markup used.
> That's why I think that what we need is a Guido Stamp Of Approval.

Looks fine, but there is one catch: not everyone is going to
write his docstrings in English...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    34 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From da@ski.org  Sat Nov 27 17:17:37 1999
From: da@ski.org (David Ascher)
Date: Sat, 27 Nov 1999 09:17:37 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <383FD11E.D8ADE8D0@lemburg.com>
Message-ID: <Pine.WNT.4.05.9911270912070.186-100000@david.ski.org>

On Sat, 27 Nov 1999, M.-A. Lemburg wrote:

> BTW, in case someone cares, the format I use for docstrings and
> function/method signature goes as follows:
> 
> def normlist(jlist,
>                    
>              StringType=types.StringType):
> 
>     """ Return a normalized joinlist.
> 
>         All tuples in the joinlist are turned into real strings.  The
>         resulting list is a equivalent copy of the joinlist only
>         consisting of strings.
>         
>     """
>     ...
> 
> 1. Localizations are split from the true input arguments by
>    an empty line or a comment line

What's a localization?  Do you really mean L10N stuff?  FWIW, I think that
using whitespace in the non-docstring source as a significant delimiter
limits things, as it means that the encoding is not readable from the
parse tree.

> > Straw Proposal 0.1 [da]:
> > 
> >   """
> >   <AUTHOR>David Ascher</AUTHOR>
> >   <VERSION>1.0</VERSION>
> >   <DATE>20/10/96</DATE>
> >   <DESCRIPTION>This is a module with one function in it.</DESCRIPTION>
> >   <URI>...</URI>
> >   """

> Are you serious about the above ??? Noone is going to write that
> in his docstrings...

It's not my favorite, but Uche mentioned that XML-ish syntax is much
easier to parse.  While I don't really grant that point (or rather I think
that the hill needs to be climbed once for all), I want to emphasize:

   What I really care most about is a final decision, not the specific
   markup used. 
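
For comparison, a sketch of reading the XML-ish form with a stock parser,
assuming modern Python's `xml.etree.ElementTree`. The wrapper element
<DOC> and the function name `parse_xmlish` are assumptions made for this
example; the elided <URI> field is left out rather than guessed.

```python
import xml.etree.ElementTree as ET

MODULE_DOC = """
<AUTHOR>David Ascher</AUTHOR>
<VERSION>1.0</VERSION>
<DATE>20/10/96</DATE>
<DESCRIPTION>This is a module with one function in it.</DESCRIPTION>
"""

def parse_xmlish(docstring):
    """Return {tag: text} for the top-level elements of the docstring."""
    # The docstring holds several top-level elements, so wrap them in a
    # hypothetical <DOC> root to get a single well-formed document.
    root = ET.fromstring("<DOC>%s</DOC>" % docstring)
    return {child.tag: (child.text or "").strip() for child in root}

fields = parse_xmlish(MODULE_DOC)
```

The parser is free, but a stray "<" or "&" in the description makes the
whole docstring unparseable, so the author pays for it in escaping.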

> Looks fine, but there is one catch: not everyone is going to
> write his docstrings in English...

So add another keyword in the module docstring:

  Language: Francais-France

--david


From Manuel Gutierrez Algaba <irmina@ctv.es>  Sat Nov 27 19:14:45 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Sat, 27 Nov 1999 19:14:45 +0000 (GMT)
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To: <3.0.6.32.19991127105228.0093c4b0@gpo.iol.ie>
Message-ID: <Pine.LNX.3.95.991127185014.759A-100000@localhost>

On Sat, 27 Nov 1999, Sean McGrath wrote:

> [Manuel Gutierrez Algaba]
> >
> >It happens that TeX documents are the easiest structure to represent
> >things, apart from raw .txt  A basically TeX markup is just 
> >the easiest markup. 
> >
> Completely subjective and unsubstantiated statements!
> 
> >> 
> >>   def foo():
> >>      print 'a'
> >
> >This example is too simple , it gots almost no info at all , 
> >perhaps : \indexbasicdeffunction
> >(\index{basic}\index{def}\index{function} )
> >after doing this , anyone searching  for basic things, or for
> >definitions of for functions may find the piece of code you've done,
> >your WORK is PROFITABLE now. That piece of CODE IS VALUABLE for 
> >the rest of community.
> 
> In my opinion, there is *NO CHANCE* that any developer would
> voluntarily add th index stuff in this example isn't it
> redundant? A text parsing tool that knows the syntax of
> Python can work out that foo is a function. There is no need
> for a programmer to spell it out in a second
> syntactic form right?

Ufffffff! This example *ONLY* tried to show David Ascher that 
*anything* can be documented/reused/made usable with it!

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
 /SantisimaInquisicion/index.html 

  Disease can be cured; fate is incurable. -- Chinese proverb


From Manuel Gutierrez Algaba <irmina@ctv.es>  Sat Nov 27 19:14:57 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Sat, 27 Nov 1999 19:14:57 +0000 (GMT)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <383FD11E.D8ADE8D0@lemburg.com>
Message-ID: <Pine.LNX.3.95.991127185135.759B-100000@localhost>

On Sat, 27 Nov 1999, M.-A. Lemburg wrote:
> > 1) The current Python documentation is, in my opinion, just fine.  I think

It's fine if you read all of it, but NOT FOR SEARCHING. FOR SEARCHING
IT IS NOT FINE.
> >    There is at least one proposal to index in-code Python docstrings
> >    with TeX-like commands.  In my opinion, anything that full of
> >    backslashes and braces will never fly in the Python community.
> 
> I don't think people will start to write TeX in their docstrings...
> after all not everyone can read plain TeX and will get pretty
> confused about all those backslashes and curly brackets.

OK, I gather from your comments that the syntax I proposed is quite bad.
Instead of \newcommand{\indexalfa}{\index{alfa}} and \indexalfa,
why not
<@indexalfa,alfa>  and <#alfa> ?

It's the SAME! SantisimaInquisicion/TeEncontreX is NOT, I say,
is NOT, TeX. It was TeX some billion years ago!

> 
> IMHO, a clean plain text approach goes much further; together
> with some conventions on how to format this text and intelligent
> tools to extract the information encoded by those conventions
> will certainly make the writing docstrings much more popular.

Two big problems: tight conventions and intelligent tools. 
That seems like hard stuff to me, both to use and to program.

> BTW, in case someone cares, the format I use for docstrings and
> function/method signature goes as follows:
...
>         All tuples in the joinlist are turned into real strings.  The
>         resulting list is a equivalent copy of the joinlist only
>         consisting of strings.
>         
>     """

My method can be used for USENET posts, FAQs, .py files, and *anything*
in ASCII form. Yours seems to be just a signature-teller, which is fine BTW,
but it's not the idea I'm proposing.

I'm just proposing to focus on the Semantics, on the Meaning, on
What (a function, module, post, whatever...) does.

> > 3) Programmers in general, smart programmers especially, try to "think
> >    out" all of the possible uses for something before they start to design
> >    it.  That's why God Invented Managers and deadlines.  We need one or
> >    the other.
> 
> Right. And it's even worse in the Python community: they first try
> to prove NP-completeness rather than think about good reasonable
> approaches for the common case.

If you spent just half an hour attributing your own code or a
FAQ with the \indexblabla stuff, you'd be astoundingly surprised
at:
- how fast it is
- how powerful/flexible it is
- how much it can help others understand what you've done.

It seems to me you don't even want to try to understand my proposal.
It's damned simple and direct, but of course, if you don't make 
the effort to think/understand ... then ...!

> 
> Looks fine, but there is one catch: not everyone is going to
> write his docstrings in English...

My system, by default, can handle any kind of language...

A pity that you don't try to understand it. In 10 minutes
you'd see how the whole thing works!

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
 /SantisimaInquisicion/index.html 

  Disease can be cured; fate is incurable. -- Chinese proverb


From Manuel Gutierrez Algaba <irmina@ctv.es>  Sat Nov 27 19:15:15 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Sat, 27 Nov 1999 19:15:15 +0000 (GMT)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <Pine.WNT.4.05.9911270912070.186-100000@david.ski.org>
Message-ID: <Pine.LNX.3.95.991127190330.759C-100000@localhost>

On Sat, 27 Nov 1999, David Ascher wrote:

"""
\jaja
\jiji
> > > 
> > >   """
> > >   <AUTHOR>David Ascher</AUTHOR>
> > >   <VERSION>1.0</VERSION>
> > >   <DATE>20/10/96</DATE>
> > >   <DESCRIPTION>This is a module with one function in it.</DESCRIPTION>
> > >   <URI>...</URI>
> > >   """
> 
>    What I really care most about is a final decision, not the specific
>    markup used. 
\indexfinaldecision \indexmarkupdiscussion \indexexampleXML

\jiji
"""


My proposal is so flexible that it could live with any other marking.

And USENET posts and emails can be sorted in source, so they'd be
reusable. That's why I say it's a kind of XML: we can reuse the
info if we can handle it. Please try to understand; I'm proposing
something far more powerful than javadoc. It's two orders of
magnitude higher level!

And we can get sets of compatible indexes. The biggest problem
comes when we get too many indexes, but by then it'll be a
success, not a real problem.

Just imagine having all of comp.lang.python attributed and reusable.
I think somebody has made a book by doing this. I propose we make
the book ourselves, little by little, not by chapter but by
concepts, by families of concepts... a kind of book,
but upside-down: the indexes are the chapters.

And please forget the javadoc-ish stuff; I'm not talking about that.
This is much better!

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
 /SantisimaInquisicion/index.html 

  Disease can be cured; fate is incurable. -- Chinese proverb


From irmina@ctv.es  Sat Nov 27 19:17:44 1999
From: irmina@ctv.es (Manuel Gutierrez Algaba)
Date: Sat, 27 Nov 1999 19:17:44 +0000 (GMT)
Subject: [Doc-SIG] Success of TeEncontreX
Message-ID: <Pine.LNX.3.95.991127191559.1601A-100000@localhost>

TeEncontreX is registered at freshmeat.net, where it has got
769 hits (web page) and 148 downloads in a month.

This makes me think the idea is not bad at all. But the effort
in the case of SantisimaInquisicion will be useless unless many
people support it.

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
 /SantisimaInquisicion/index.html 

  What we Are is God's gift to us. What we Become is our gift to God.


From da@ski.org  Sat Nov 27 22:45:00 1999
From: da@ski.org (David Ascher)
Date: Sat, 27 Nov 1999 14:45:00 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <Pine.LNX.3.95.991127190330.759C-100000@localhost>
Message-ID: <Pine.WNT.4.05.9911271435470.61-100000@david.ski.org>

On Sat, 27 Nov 1999, Manuel Gutierrez Algaba wrote:

> My proposal is so flexible that it could live with any other marking.

Manuel, I am somewhat at a loss as to what your proposal is.  Can you
describe it more precisely, without beliefs such as "it's more powerful"
or "This is much better" but rather with a precise definition of exactly
what it is you're proposing?  Just looking at the website doesn't really
help me at least.  

Are you proposing:

  1) a markup syntax (e.g. \newcommand vs <NEWCOMMAND> vs ...)?
  2) a set of tags (e.g. function, extensionmodule, usenetpost, ...)?
  3) something else?

I gather that all documents can be indexed with your system, and that you
do not intend to propose a specific set of indexing 'keywords'.  That
seems to fly in the face of decades if not hundreds of years of prior art
which has shown the success of domain-specific keyword lists. Can you try
again, please, without hyperbolae?

--david


From Manuel Gutierrez Algaba <irmina@ctv.es>  Sun Nov 28 12:28:51 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Sun, 28 Nov 1999 12:28:51 +0000 (GMT)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <Pine.WNT.4.05.9911271435470.61-100000@david.ski.org>
Message-ID: <Pine.LNX.3.95.991128111401.730A-100000@localhost>

On Sat, 27 Nov 1999, David Ascher wrote:

> On Sat, 27 Nov 1999, Manuel Gutierrez Algaba wrote:
> 
> > My proposal is so flexible that it could live with any other marking.
> 
> Manuel, I am somewhat at a loss as to what your proposal is.  Can you
> describe it more precisely, without beliefs such as "it's more powerful"
> or "This is much better" but rather with a precise definition of exactly
> what it is you're proposing?  Just looking at the website doesn't really
> help me at least.  
> 
> Are you proposing:
> 
>   1) a markup syntax (e.g. \newcommand vs <NEWCOMMAND> vs ...)?
>   2) a set of tags (e.g. function, extensionmodule, usenetpost, ...)?
>   3) something else?
> 
> I gather that all documents can be indexed with your system, and that you
> do not intend to propose a specific set of indexing 'keywords'.  That
> seems to fly in the face of decades if not hundreds of years of prior art
> which shown the success of domain-specific keyword lists. Can you try
> again, please, without hyperbolae?

Hyperbolae are needed when you can't explain anything and
try to win hearts instead of minds. In this email I won't
mention "simple", "easier", "better" or "best". I'll explain
the idea plainly.

The current situation in the Python world is this:
- We have humans
- We have doc
- We have code

There's a relationship between humans and doc:
Humans want to pull information out of docs; in fact, they need the
information mainly to produce code. Because of that, code holds
some kind of frozen information (what the coder used to build
it).
There's a relationship between code and doc:
Code can be seen as a thing that may help to produce new code or
to understand other code; the code itself is a special kind of
doc.

It's obvious that code can produce new code, but for that we usually
need the essence (the doc) of what that code does. We need that
essence because IMO code is just one implementation of a piece of
information. Even in a high-level programming language, code carries
personal preferences or solutions. Those preferences are things like:

def inter(a,b):
   result = []
   for i in a:
      if i in b:
         result.append(i)
   return result

There are almost 20 versions of this code, using filter, map,
default params, dicts... For one very specific "idea"/info
(the intersection of two lists) we can have 20 versions.
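
To make the point concrete, here are three of those versions written for
a current Python. This is only a sketch: the names inter_loop,
inter_filter and inter_dict are made up for illustration, and the loop
is assumed to append the matching element.

```python
def inter_loop(a, b):
    """Explicit loop: keep each item of a that also occurs in b."""
    result = []
    for item in a:
        if item in b:
            result.append(item)
    return result


def inter_filter(a, b):
    """The same idea expressed with filter()."""
    return list(filter(lambda item: item in b, a))


def inter_dict(a, b):
    """The same idea with a dict as a membership table (items must be hashable)."""
    seen = dict.fromkeys(b)
    return [item for item in a if item in seen]


print(inter_loop([1, 2, 3, 4], [2, 4, 6]))
```

All three carry the same idea (intersection of two lists) and differ
only in personal preference.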

If I document this function properly a la "javadoc", what I really have
is info about "one implementation" of an idea. But when reusing
code we need two things: to look for implementations of the idea we
have, and to adapt one to our code. The idea we are looking for is far
more important than the implementation itself. The implementation
may be adapted, and in the case of Python the implementation
of an idea (params, the code itself) is not so obscure as to need strong
doc support (think about assembler or Prolog or Perl).

Perhaps you now agree with me that:
- code is basically "frozen implementations" of ideas/info
- OO programming involves frequent reuse of code, frequent reuse
of ideas
- it's important to deal with ideas
- we can leave details of implementation for later

Once we've identified our main target, ideas
(how to reuse and handle them),
the question is how to do it.

Well, I say: If this code :

class....
    bla...
    bla....
    bla....

is the implementation of an idea, let's mark it explicitly with:
<#idea_A>

Once we've marked it, we see whether that answers our question:
"how to reuse and handle?"

Can I reuse and handle the idea of the code that has been marked
with <#idea_A>?
This involves:
- is idea_A the idea I'm looking for?
   If yes: then it works
   If not, it may still be a similar idea (sorry if I tend to overuse
Latin words; similar = very close)

Then the question becomes "when is one idea similar to another?"
It seems that similar (very close) implies that ideas fall
into groups.

Let's think about it with an example about sockets:

socket, asynchronous, buffering, internet, CGI, server, telnet port,
RFC, timeout...

Apparently some ideas involve more basic ideas (telnet port, for
example), but it seems that basically those ideas are widely general
and single-meaning. Is there any field whose ideas are not widely general
or single-meaning? Well, fortunately, we're not talking about
philosophy but about technical ideas.
Can you clearly identify ideas in the code? Most of the time, I guess.
But even in the cases where they're not clear, or are not very
general ideas, one of two things may be happening:
- you're considering implementation information, not the idea that
the code implements
- your code solves an uncommon idea, but even so, that idea
will be related to some more common idea.

If you've got this far, you see I'm talking about relationships
among ideas, not about code. Fred and Paul worry about those
relationships: there can be many of them, and they may not be clear.

But even so, we won't know until we have the ideas, until we have
the problem. Then, even if that problem can't be properly solved,
we'll have a library of ideas, spread over FAQ, USENET, code, HOWTO,
..., ready to be searched/compared/handled. A library of ideas
simply means a library of doc/code ready to be used to generate
new code.

Are people going to mark their code? Is it worth the effort?

I guess they aren't. In fact, you can express a
broad idea with five lines of doc or with a single <#idea>; this
holds for general ideas. But it requires a change in people's
minds: concise, direct, high-level.

It doesn't matter how this is done!
\indexbla is the same as <#bla>

It doesn't matter if somebody writes \indexsocket or \indexport
or \indexcomms. We can relate them to each other afterwards.

\indexsocket \indextelnet \indexexpect

is the same as:

\indextelnetexpect

We can have tools that unite different notations.
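
A minimal sketch of such a tool, in a current Python: it collects both
notations from this thread (\indexfoo and <#foo>) into one index of
ideas. The marker grammar is an assumption (idea names made of letters,
digits and underscores only), and build_idea_index and the sample
documents are invented for illustration; this is not how TeEncontreX
works.

```python
import re
from collections import defaultdict

# Two notations from this thread: \indexsocket and <#socket>.
# Assumption: an idea name is a run of letters, digits and underscores.
MARKER = re.compile(r"\\index(\w+)|<#(\w+)>")


def build_idea_index(documents):
    """Map each idea name to the sorted names of the documents that carry it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for tex_style, angle_style in MARKER.findall(text):
            # Exactly one of the two groups is non-empty for each match.
            index[tex_style or angle_style].add(name)
    return {idea: sorted(names) for idea, names in index.items()}


docs = {
    "faq.txt": r"How do I open a connection? \indexsocket \indextelnet",
    "client.py": "# <#socket> <#timeout>",
}
print(build_idea_index(docs))
```

Relating \indextelnetexpect to \indextelnet plus \indexexpect would need
a further table of equivalences, which this sketch does not attempt.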

See this as a pyramid: the higher you are, the less space there is.

Is this "indexing"? No. I use indexes and it looks like indexing,
but it's a kind of Plato-python-world-building. Libraries of ideas,
not libraries of implementations.

I don't think it's a good idea to impose a "limited set of
keywords". Let people express themselves freely, because in the field
we're working in (high-level ideas) there's not too much space left.
Freedom at this level means "expression", not "confusion".

Of course, "my idea" covers web pages, ftp, pages of written
books... anything. This is like modern art: people like XVIII century
paintings because they can understand them, but modern art is richer
in concepts and information. Are people ready? If they're ready, the
place is here: the Python world.

But I'm rather pessimistic; people are lazy and crude.

I can't give a list of all the possible ideas, nor give the
relationships... look at TeEncontreX (the most basic implementation
of that "pyramid"): it's easy to handle/search, I think. If
it were 10 times bigger it wouldn't be much harder to handle, but
it'd carry 10 times more info!!

My idea is a kind of "inverse video" of Yahoo, and improved, of
course.

Sorry for the hyperbolae :) :P

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
 /SantisimaInquisicion/index.html 

  Life can be so tragic -- you're here today and here tomorrow.


From mal@lemburg.com  Sun Nov 28 22:27:23 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 28 Nov 1999 23:27:23 +0100
Subject: [Doc-SIG] On David Ascher's Rant
References: <Pine.WNT.4.05.9911270912070.186-100000@david.ski.org>
Message-ID: <3841AC4B.3A29C921@lemburg.com>

David Ascher wrote:
> 
> On Sat, 27 Nov 1999, M.-A. Lemburg wrote:
> 
> > BTW, in case someone cares, the format I use for docstrings and
> > function/method signature goes as follows:
> >
> > def normlist(jlist,
> >
> >              StringType=types.StringType):
> >
> >     """ Return a normalized joinlist.
> >
> >         All tuples in the joinlist are turned into real strings.  The
> >         resulting list is a equivalent copy of the joinlist only
> >         consisting of strings.
> >
> >     """
> >     ...
> >
> > 1. Localizations are split from the true input arguments by
> >    an empty line or a comment line
> 
> What's a localization?  Do you really mean L10N stuff?  FWIW, I think that
> using whitespace in the non-docstring source as a significant delimiter
> limits things, as it means that the encoding is not readable from the
> parse tree.

No, I meant the StringType=types.StringType part: it localizes
symbols which would otherwise be looked up in the global name-
space. I often do this to speed up routines which deal with
static APIs like string.split and string.join, e.g.

def f(x,

      split=string.split,join=string.join):

    ...
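
For readers on a current Python, where the string module no longer has
split and join functions, the same trick can be sketched with the str
methods. normalize_spaces is a made-up example, and whether the local
binding still buys any speed is something to measure rather than assume.

```python
def normalize_spaces(text,

                     split=str.split, join=str.join):
    """Collapse runs of whitespace in text into single spaces."""
    # split and join are bound once, when the def statement runs, so the
    # body finds them as local names instead of looking them up per call.
    return join(" ", split(text))


print(normalize_spaces("a   b \t c"))
```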


The l10n stuff is something which will appear in Python 1.6 --
hopefully that is ;-)

> > > Straw Proposal 0.1 [da]:
> > >
> > >   """
> > >   <AUTHOR>David Ascher</AUTHOR>
> > >   <VERSION>1.0</VERSION>
> > >   <DATE>20/10/96</DATE>
> > >   <DESCRIPTION>This is a module with one function in it.</DESCRIPTION>
> > >   <URI>...</URI>
> > >   """
> 
> > Are you serious about the above ??? Noone is going to write that
> > in his docstrings...
> 
> It's not my favorite, but Uche mentioned that XML-ish syntax is much
> easier to parse.  While I don't really grant that point (or rather I think
> that the hill needs to be climbed once for all), I want to emphasize:
> 
>    What I really care most about is a final decision, not the specific
>    markup used.

I guess doc strings are just as personal to the programmer
as indentation or naming styles: you won't get everybody to agree
on one way to do it.

Besides, I don't think this is really needed: as long as the
programmer can provide routines to parse his code everything
should be fine. This could e.g. be implemented by subclassing
a reader implementation which then passes the parsed tokens
to other code processing them for some other use.

Of course, you could provide a few standard markup schemes,
e.g. an XML one and a StructuredText one.

> > Looks fine, but there is one catch: not everyone is going to
> > write his docstrings in English...
> 
> So add another keyword in the module doctring:
> 
>   Language: Francais-France

I was referring to "Language:" being English :-) E.g. my
doc strings in German would look quite silly if I would
insert some English markers in there...

But one could of course simply define a few sets of these
markers which then get chosen by a command line option --
one for each language. Or perhaps simply look for all of them.

Well, anyway, these are just some ideas. I'm not going to
code anything or proceed discussing these things. Fred
is doing a great job and I'll continue to document my
code by hand. Perfect for me ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    33 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Sun Nov 28 22:16:14 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 28 Nov 1999 23:16:14 +0100
Subject: [Doc-SIG] On David Ascher's Rant
References: <Pine.LNX.3.95.991127185135.759B-100000@localhost>
Message-ID: <3841A9AE.9D00FEB8@lemburg.com>

Manuel Gutierrez Algaba wrote:
> 
> On Sat, 27 Nov 1999, M.-A. Lemburg wrote:
> > > 1) The current Python documentation is, in my opinion, just fine.  I think
> 
> It's fine if you read it all of it, NOT FOR SEARCHING, FOR SEARCHING
> IS NOT FINE.

Hmm, not sure I can follow you here: there is a very nice index
which helps you pin-point most details and if you use the PDF
version you even get full-text search at no extra cost.

> > >    There is at least one proposal to index in-code Python docstrings
> > >    with TeX-like commands.  In my opinion, anything that full of
> > >    backslashes and braces will never fly in the Python community.
> >
> > I don't think people will start to write TeX in their docstrings...
> > after all not everyone can read plain TeX and will get pretty
> > confused about all those backslashes and curly brackets.
> 
> Ok, I think the syntax I proposed is quite bad ( from your comments),
> instead of \newcommand{\indexalfa}{\index{alfa}} and \indexalfa
> why not ?
> <@indexalfa,alfa>  and <#alfa>

This would probably make things a little less TeX-like.
 
> It's the SAME ! SantisimaInquisicion/TeEncontreX is NOT, I say,
> is NOT, TeX. It was TeX some billion years ago!

Well, it sure looks a lot like TeX. Believe me, I've written
LaTeX and TeX for many years -- I know that people don't like
it. Even I had my troubles with it at first. The syntax simply
isn't compatible with human reading habits and this is basically
what doc strings are all about: online help.
 
> > IMHO, a clean plain text approach goes much further; together
> > with some conventions on how to format this text and intelligent
> > tools to extract the information encoded by those conventions
> > will certainly make the writing docstrings much more popular.
> 
> Two big problems: tight conventions and intelligent tools.
> It seems to me hard stuff, for use and for programm.

The conventions need not be too tight. I've been using the
ones I mentioned for some time now and incorporated some of
it in my doc.py tool (which you can find on my Python Pages).
Works fine... for me at least.

> > BTW, in case someone cares, the format I use for docstrings and
> > function/method signature goes as follows:
> ...
> >         All tuples in the joinlist are turned into real strings.  The
> >         resulting list is a equivalent copy of the joinlist only
> >         consisting of strings.
> >
> >     """
> 
> My method can be used for USENET post, FAQ, .py, and *anything*
> in ASCII form. Yours seem just a signature-teller, that is fine BTW,
> but it's not the idea I'm proposing,

Right. The intention is to extract data from python scripts,
nothing more.
 
> I'm just proposing to focus in the Semantic in the Meaning, in the
> What ( a function, module, post, whatever...) does.
> 
> > > 3) Programmers in general, smart programmers especially, try to "think
> > >    out" all of the possible uses for something before they start to design
> > >    it.  That's why God Invented Managers and deadlines.  We need one or
> > >    the other.
> >
> > Right. And it's even worse in the Python community: they first try
> > to prove NP-completeness rather than think about good reasonable
> > approaches for the common case.
> 
> If you spent half an hour, just, attributing your own code or a
> FAQ with the \indexblabla stuff, you'd be ashtoundingly surprised
> of :
> - how fast is it
> - how powerful/flexible
> - how much can it help others understand what you've done.
> 
> It seems to me you don't want to even try to understand my proposal.
> It's damned simple and direct, but of course, if you don't make
> the try of thinking/understanding ... then ...!

Of course I have tried to get the idea... from what I understood
I can say that I don't like the syntax you use. The system
itself may have its merits, but the syntax is a bummer, IMHO.
 
Something about the general idea of automatic documentation:
I have tried to proceed in that direction a few years ago to
document my Python code. From that experience I can say that
automatic documentation -- for me at least -- only serves as
aid in finding APIs etc. fast during the programming phase.
It is not useable as final documentation. All my packages
include HTML documentation which is carefully crafted to include
all those things which I intend to publish and deliberately
leave parts undocumented or only partially documented. This
is not easily possible using automatic documentation or
other literate programming approaches. The written docs simply
are different because they focus on a different intent (and
sometimes even a different audience).

> > Looks fine, but there is one catch: not everyone is going to
> > write his docstrings in English...
> 
> My system, by default , can handle any kind of language...

Hmm, what language do "jiji" and "jaja" come from ? \newcommand
also sounds very English ;-)

How about making up some more programming compatible
tags to delimit code from docs, e.g. #doc and #/doc...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    33 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From d@pobox.com  Sun Nov 28 23:02:17 1999
From: d@pobox.com (David Arnold)
Date: Mon, 29 Nov 1999 09:02:17 +1000
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: Your message of "Fri, 26 Nov 1999 21:13:31 PST."
 <Pine.WNT.4.05.9911261910450.58-100000@david.ski.org>
Message-ID: <199911282302.JAA13867@piglet.dstc.edu.au>

-->"David" == David Ascher <da@ski.org> writes:

  David> IMHO, the single most problematic aspect of Python
  David> documentation is the lack of a standard way for programmers
  David> to document their code inside the .py file, unlike e.g. POD.,
  David> and that is a shame.

agreed.

  David> There is at least one proposal to index in-code Python
  David> docstrings with TeX-like commands.  In my opinion, anything
  David> that full of backslashes and braces will never fly in the
  David> Python community.

are you referring to gendoc/settext from a few years ago?

  David> Straw Proposal 0.1 [da]:

some feedback:

- i believe that the use of "special" string variables is more
  immediately useful, and maybe more "pythonic", than XML in a
  docstring for module-level stuff like this.

  eg.

  __author__ = "David Ascher"
  __version__ = "$Revision$"[11:-2]
  __date__ = "$Date$"

- my personal preference would be for the RFC 822-style "tag: value"
  format over XML.  it's about equal for parsing, but significantly
  better for humans to read

- similarly, i think i'd prefer a simple, punctuation character-based
  markup over XML for method/function comments.  i really find XML
  annoying to read.
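
as a rough illustration of the "tag: value" option: a current Python can
read such a block with the standard email package, which already parses
RFC 822-style headers. the tag names below are taken from the straw
proposal; parse_tagged_docstring is a made-up helper, not an existing
tool.

```python
from email.parser import Parser

MODULE_DOC = """\
Author: David Ascher
Version: 1.0
Description: This is a module with one function in it.
"""


def parse_tagged_docstring(doc):
    """Return the "tag: value" lines of doc as a plain dict."""
    # headersonly=True stops after the header block and ignores any body.
    message = Parser().parsestr(doc, headersonly=True)
    return dict(message.items())


print(parse_tagged_docstring(MODULE_DOC))
```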


  David> That said, what I really care most about is a final decision,
  David> not the specific markup used.

agreed.  we've been dithering for years, with software being developed
to support various proposals, but never really being given the Guido
Stamp Of Approval(tm).


  David> PS: I'll pay for a new melted-wax seal if Guido lost the old
  David> one. =)

i'll chip in a coupla bucks too ;-)


d


From da@ski.org  Mon Nov 29 00:48:54 1999
From: da@ski.org (David Ascher)
Date: Sun, 28 Nov 1999 16:48:54 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] Re: Docstrings [was: On David Ascher's Rant]
In-Reply-To: <199911282302.JAA13867@piglet.dstc.edu.au>
Message-ID: <Pine.WNT.4.05.9911281532040.189-100000@david.ski.org>

(Sorry, but I felt the need to change the Subject line.  My wife looked at
my inbox and said "you're ranting?")

On Mon, 29 Nov 1999, David Arnold wrote:
>   David> There is at least one proposal to index in-code Python
>   David> docstrings with TeX-like commands.  In my opinion, anything
>   David> that full of backslashes and braces will never fly in the
>   David> Python community.
> 
> are you refering to gendoc/settext from a few years ago?

No, I was referring to Manuel's proposal, which I obviously misunderstood.
I don't recall the gendoc/settext proposal.

> - i believe that the use of "special" string variables is more
>   immediately useful, and maybe more "pythonic", than XML in a
>   docstring for module-level stuff like this.

>   eg.
> 
>   __author__ = "David Ascher"
>   __version__ = "$Revision$[11:-2]
>   __date__ = "$Date$

> - my personal preference would be for the RFC 822-style "tag: value"
>   format over XML.  it's about equal for parsing, but significantly
>   better for humans to read

I think it's important to note that these will still be in strings, and
that one should not confuse them with code.  

Some reactions to this weekend's posts:

1) Manuel's proposal is interesting, but IMHO much broader in scope than
   what I feel is needed and doable.  Indexing ideas is a
   larger-than-encyclopedic endeavor, and I could but won't argue its
   impracticality on statistical grounds alone.  Suffice it to say that I
   agree with Manuel that people are lazy and that it won't work. 

   More positively, Manuel, do you agree that the kind of markup that I
   advocate (a la POD/javadoc) is a subset of yours (in other words, that
   "Author" and "Argument 1" are 'trivial ideas', hence belong to the set
   of ideas)? And that if this 'minimal' markup is used, then you can use
   the tags too along with the other, higher-level notions that you
   propose?

2) So far, folks seem to like a 'lightweight' structure for docstrings.
   (note that by lightweight I do not mean ambiguous or vague [*]).
   MAL wants to allow multiple formats as long as the person writing the
   docstring writes a parser for his/her specific format.  That's fine
   with me as long as there is a format which we can assume is readable by
   default without *having* to write such a parser.  Uche has mentioned
   that he 1) agrees that XML shouldn't be imposed on Python Authors, and
   that 2) XML is easier to parse than StructuredText.  While I grant him
   both, I'd like his reaction to this specific point.  

   I would like to claim that we can define a format which is
    - easily learned
    - easily parsed and debugged
    - rich enough
    - extensible enough
    - pleasing to the eye

   I will make a concrete proposal in a separate message titled "docstring
   grammar".

3) The i18n issue is IMHO a red herring.  We can allow a 'keyword
   renaming' facility so that I can start a module with:

	import docstring
	docstring.set_language('Francais', 'Quebec')

   and then I can use Quebecois keywords in the doc, as long as there was
   a table mapping the default (US-English) keywords to the Quebecois
   keywords. Would that be OK with you, Marc-Andre?  After all, 'import'
   is an English word, and no one complains about that.  I once programmed
   in a version of Basic where the keywords were translated in French, and
   I can testify that it was a massive failure.
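
   The table behind such a renaming facility could be as small as the
   sketch below. It is purely hypothetical: the docstring module above
   does not exist, the French entries are invented, and
   canonical_keyword is a made-up name.

```python
# Hypothetical translation table: localized keyword -> default keyword.
# The French entries are invented for illustration only.
KEYWORD_TABLES = {
    ("Francais", "Quebec"): {
        "Auteur": "Author",
        "Arguments": "Arguments",
        "Date": "Date",
    },
}


def canonical_keyword(keyword, language=None):
    """Translate a localized keyword to the default one; unknown ones pass through."""
    table = KEYWORD_TABLES.get(language, {})
    return table.get(keyword, keyword)


print(canonical_keyword("Auteur", ("Francais", "Quebec")))
```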

--david 

[*]: I have found StructuredText as implemented in StructuredText.py to be
     somewhat vague and non-trivial to use to produce exactly formatted
     doc.  This is probably due to its bigger aims than what I have in
     mind for this.   


From da@ski.org  Mon Nov 29 00:57:03 1999
From: da@ski.org (David Ascher)
Date: Sun, 28 Nov 1999 16:57:03 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
Message-ID: <Pine.WNT.4.05.9911281649360.202-100000@david.ski.org>

Proposed format for docstrings:

  The whitespace at the beginning of a docstring is ignored.

  Paragraphs are separated by one or more blank lines.

  For compatibility with Guido, IDLE and Pythonwin (and increasing the
  likelihood that the proposal will be accepted by GvR), the
  docstrings of callables must follow the following convention
  established in Python's builtins:

       >>> print len.__doc__
       len(object) -> integer

       Return the number of items of a sequence or mapping.

    In other words, the first paragraph must fit on a line, repeat the
    name of the callable, with a 'wordy' signature, the ' -> ' string,
    and the type of the return value.  The second paragraph must be a
    one-sentence description of the callable.  It is also allowed to
    have those two bits separated by a " -- " string:

      >>> print [].pop.__doc__
      L.pop([index]) -> item -- remove and return item at index (default last)

    and functions which don't return anything can omit the " -> foo"
    bit:

      L.append(object) -- append object to end
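
  One possible reading of that first-line convention, as a regular
  expression in a current Python. This is a sketch under my own
  assumptions about spacing, not a definitive grammar;
  parse_signature_line is a made-up name.

```python
import re

# name(args), then an optional " -> type", then an optional " -- summary".
SIGNATURE = re.compile(
    r"^(?P<name>[\w.]+)\((?P<args>.*?)\)"
    r"(?:\s+->\s+(?P<returns>.+?))?"
    r"(?:\s+--\s+(?P<summary>.+))?$"
)


def parse_signature_line(line):
    """Return the parts of a signature line as a dict, or None if it does not fit."""
    match = SIGNATURE.match(line.strip())
    return match.groupdict() if match else None


print(parse_signature_line("len(object) -> integer"))
```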

  Each paragraph is either 'text' or a 'keyword-tagged block'.  

  A keyword is a case-sensitive element of [a-zA-Z_]+ followed by two 
  colons (with optional whitespace between the keyword and the colons,
  but no whitespace allowed between the two colons).

  A paragraph which doesn't start with a keyword is 'text'.  

  Characters between # signs and the end of the line are stripped by
  the docstring parser.

  A 'keyword-tagged block' is nested much like Python code.  Just like
  in Python, the block can either be on the same line as the keyword
  if it is one-line long (I'll refer to such blocks as 'text' blocks
  even though they aren't in visual paragraphs), or needs to be
  indented relative to the keyword.

    Examples:

      Author:: Guido van Rossum   # comments are stripped

      Date_of_release :: 1/1/1999  # The key is "Date_of_release" and the
                                   # whitespace before the : is stripped

      Contributors::               # The value is a block of lines.

          John Doe

          Ronald Reagan

          Francois Mitterand
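
    To test whether rules like these parse easily, here is a rough
    sketch in a current Python. It handles only top-level keywords with
    a one-line value or an indented block of lines, with no nesting,
    and it applies the # rule literally, so a # inside a value is also
    cut. parse_keyword_blocks and the sample text are illustrative.

```python
import re

# A keyword at the left margin, optional blanks, two colons, optional value.
KEYWORD = re.compile(r"^([A-Za-z_]+)[ \t]*::[ \t]*(.*)$")


def strip_comment(line):
    """Drop everything from the first # to the end of the line."""
    return line.split("#", 1)[0].rstrip()


def parse_keyword_blocks(docstring):
    """Return {keyword: [value lines]} for top-level keyword-tagged blocks."""
    blocks = {}
    current = None
    for raw in docstring.splitlines():
        line = strip_comment(raw)
        if not line.strip():
            continue  # blank lines only separate paragraphs
        match = KEYWORD.match(line)
        if match:  # a keyword at the left margin opens a new block
            current = match.group(1)
            blocks[current] = [match.group(2)] if match.group(2) else []
        elif current is not None and line[0] in " \t":
            blocks[current].append(line.strip())  # indented: part of the block
        else:
            current = None  # an unindented text paragraph closes the block
    return blocks


SAMPLE = """\
Author:: Guido van Rossum   # comments are stripped
Date_of_release :: 1/1/1999

Contributors::

    John Doe

    Ronald Reagan
"""
print(parse_keyword_blocks(SAMPLE))
```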
 
    Some keywords can have special parsing rules, as the block of text
    which the keyword designates is well-specified by the rules above.
    The first example of such a keyword-specific parsing rule is for
    Arguments:

      Arguments::
   
        self -- instance
        input (sequence) -- the sequence which is being processed

     (the specific syntax of Arguments:: is left for a later discussion).
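
     Until that discussion happens, the two example lines can at least
     be read by a guess like the one below (current Python). The
     "name (type) -- description" shape is inferred from the example
     only; parse_arguments is a made-up name.

```python
import re

# One argument per line: name, optional "(type)", " -- ", description.
ARGUMENT = re.compile(r"^(\w+)(?:\s*\(([^)]*)\))?\s+--\s+(.*)$")


def parse_arguments(lines):
    """Return a list of (name, type or None, description) tuples."""
    parsed = []
    for line in lines:
        match = ARGUMENT.match(line.strip())
        if match is None:
            # Flag the offending line early, to help localize syntax errors.
            raise ValueError("not an argument line: %r" % line)
        parsed.append(match.groups())
    return parsed


print(parse_arguments([
    "self -- instance",
    "input (sequence) -- the sequence which is being processed",
]))
```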

     Other candidates which can impose specific parsing rules are:
     ReturnType, Date, Version, etc.

  Text blocks can be followed by indented blocks as well -- those are
  'children' blocks of the outdented block.

  'text' blocks which start with * or - are tagged as 'bullet items'
  for rendering.  The bullet marker has to be consistent within a
  given level of indentation.

    Example:

       * this is one bullet
  
          - this is a sub-bullet

          - this is another sub-bullet

       * this is another bullet

  In text blocks, some strings are recognized as links:

     .foo in the docstring of a class will refer to the foo attribute
     of the class.  In the docstring of a method, it will refer to the
     foo attribute of the method's class.  In the docstring of a
     module it will refer to a function or class defined in that
     module

     foo.bar will refer to the bar attribute of foo, which will be
     looked up in the following namespaces in order: (to be determined)
  
     URL notation is automatically recognized.

     [foo] refers to the keyword 'foo' in the section 'References' of
     the current docstring.  [..] links cannot span multiple lines or
     contain whitespace (as keywords can't).  In other words, if a
     [ is not matched by a ] on the same line or before a whitespace
     character is hit, then it is a syntax error.

     References::

       foo:: My Dissertation, University Press, 1902
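
     A sketch of the lookup in a current Python, assuming the
     References block has already been read into a dict. It reports
     unknown keys but does not flag an unmatched [ as a syntax error.
     resolve_references is a made-up name.

```python
import re

# [foo] with no whitespace and no nested brackets inside.
LINK = re.compile(r"\[([^\s\[\]]+)\]")


def resolve_references(text, references):
    """Return (key, citation) pairs for each [key] link found in text."""
    resolved = []
    for key in LINK.findall(text):
        if key not in references:
            raise KeyError("no entry %r in References" % key)
        resolved.append((key, references[key]))
    return resolved


REFERENCES = {"foo": "My Dissertation, University Press, 1902"}
print(resolve_references("See [foo] for the full derivation.", REFERENCES))
```

     Note that a signature line such as L.pop([index]) also matches this
     pattern, so a parser would have to skip the signature paragraph or
     treat it separately.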

  The set of keywords which are 'officially sanctioned' is:

    For module docstrings:

      [see Trove discussion for a good starting set -- this discussion
      has been had!]

    For class docstrings:

      [To be determined]

    For method docstrings:

      [To be determined]

    For function docstrings:

      [To be determined]

 
Miscellaneous Thoughts:

  I chose double-colon notation for keywords so that one can have text
  paragraphs which match the 'word:' notation without having them be
  interpreted as keywords.

  Does this proposal make docstrings whitespace-heavy?  The
  requirement to break each paragraph with a line of whitespace
  means that a lot of lines are blank, especially when doing
  'bulleted lists'.

  The above was (quickly) written with parsing in mind.  Is it really
  easily parseable?  If not, what needs to be changed so that it is
  parseable?

  I also wanted to make sure that syntax errors could be flagged early and
  'localized' for aid in debugging.  I'm not sure that I did that
  carefully enough.

  Are there normal uses in docstrings where one wants to turn off the
  automatic link detection?

  Is there value in having string interpolation?  David Arnold mentioned

       __version__ = "$Revision$[11:-2]
       __date__ = "$Date$

    which raises some issues.  I don't think that having [11:-2]
    evaluated by the docstring parser is a wise idea.  However, I can
    imagine that the module author could do:

       __version__ = "$Revision$"[11:-2]

    in the Python code, and then

       Version:: %(__version__)s
 
    in the docstring and that such a simple string interpolation
    mechanism could have value.  I'm not sure it's worth the
    complication though.  What dictionary would be used to do the
    interpolation?

Hopefully constructively, 

--david

PS: It goes without saying that while I railed against design by
committee, I am of course hopeful for feedback, for technical reasons
(dummy, you forgot special cases X, Y and Z!) and because I realize that a
standards proposal needs at least broad agreement if not consensus to be
effective in the long run.  The sharper-eyed will note that I stacked the
deck in my favor in the above proposal by including what Guido does
naturally as valid in the proposed grammar.


From jack@oratrix.nl  Mon Nov 29 09:41:38 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Mon, 29 Nov 1999 10:41:38 +0100
Subject: [Doc-SIG] docstring grammar
In-Reply-To: Message by David Ascher <da@ski.org> ,
 Sun, 28 Nov 1999 16:57:03 -0800 (Pacific Standard Time) ,
 <Pine.WNT.4.05.9911281649360.202-100000@david.ski.org>
Message-ID: <19991129094139.05ADF370CF2@snelboot.oratrix.nl>

Very nice proposal!

>   I chose double-colon notation for keywords so that one can have text
>   paragraphs which match the 'word:' notation without having them be
>   interpreted as keywords.

If you can get rid of this, and use a single colon instead, I would be 100%
happy. As most of the keywords are fixed (only in the References section could 
I find user-defined keywords) this should be doable. And it would make the 
document that little bit more readable.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From tony@lsl.co.uk  Mon Nov 29 09:50:49 1999
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Mon, 29 Nov 1999 09:50:49 -0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <Pine.WNT.4.05.9911281649360.202-100000@david.ski.org>
Message-ID: <000701bf3a4f$375a85d0$f0c809c0@lslp7o.lsl.co.uk>

I would *love* to see a standard for doc strings, and although I've often
objected to specific proposals in the past, by now I'd take almost anything.
Well, no, that's NEVER true, but David's proposal doesn't cause *too* many
knee-jerk reactions...

David Ascher wrote:
>   Paragraphs are separated by one or more blank lines.

As you say later on, I think this does cause some over-use of whitespace...

>   Characters between # signs and the end of the line are stripped by
>   the docstring parser.

This is a Bad Thing - I have quite often needed to discuss things in doc
strings which include use of the "#" character - not least if I'm parsing a
little language that uses "#" as its comment character! So losing stuff thus
would be difficult. Either (a) why do we need comments in doc strings, or
(b) provide a way to escape the "#" character.

(Also, if one were using Tim Peters' "test using the doc string as template"
thingy, one needs to be able to put generic Python code in the doc strings,
and that means that stopping comment characters from going through to the
ultimate documentation may be a bad thing.)

>   A 'keyword-tagged block' is nested much like Python code.  Just like
>   in Python, the block can either be on the same line as the keyword
>   if it is one-line long

I *like* this.

>       Contributors::               # The value is a block of lines.
>
>           John Doe
>
>           Ronald Reagan
>
>           Francois Mitterand

but the above gets oververbose. I suppose one could instead use a list
syntax:

	Contributors::
		- John Doe
		- Ronald Reagan
		- Francois Mitterand

since I don't see the ambiguity in allowing the omission of the vertical
whitespace here, *if* one allows that some care would be needed with
hyphenation! (i.e., one can't allow one's hyphens to start a line, which is
awkward but probably not too bad). Another possibility might be to allow
"Python list" syntax - I started off disliking this, but over the last few
minutes it has grown on me:

	Contributors::
		[ John Doe,
		  Ronald Reagan,
		  Francois Mitterand ]

(again, hijacking Python's syntax).

>   Text blocks can be followed by indented blocks as well -- those are
>   'children' blocks of the outdented block.

And this solves my "I want a list item to have multiple paragraphs" problem,
which has been a bugbear of mine in the past with other proposals... The exact
indentation of a second paragraph in a list item (whether aligned with the
bullet or the text) would need addressing later, but I don't much care
(provided it is with the text, of course).

>   'text' blocks which start with * or - are tagged as 'bullet items'
>   for rendering.  The bullet marker has to be consistent within a
>   given level of indentation.
>
>     Example:
>
>        * this is one bullet
>
>           - this is a sub-bullet
>
>           - this is another sub-bullet
>
>        * this is another bullet

Again, sometimes I'd like to allow the blank lines to be missing. Another
way to do this is to have a "special" character to introduce the bullet
items - so maybe instead:

	Example:
		@* this is one bullet
		   @- this is a sub-bullet

but that's horrible in its own way - maybe the white space is just what we
have to live with (I certainly WOULD live with it if it was the only thing
standing in the way of adopting the proposal!).

No, on thinking about it, I would vote for either:

	1) use of white space as David proposes
	   (pro: utter simplicity,
	    con: doesn't quite look as nice as I'd like)
	2) allow Python list syntax
	   (pro: emphasises this is for short lists,
	    con: a bit odd)
	3) detect bullet characters at the "start of line"
	   (pro: still fairly simple,
	    con: one has to take care about, e.g., dashes in text)
	   Ah - I just realised that negative numbers at the start of a line
	   probably kill that one...

Could we do numbered/lettered/named lists by, for instance:

	*1 This list item is numbered, and one expects all items
	   at this indentation in this list to be numbered

	   -a Ditto for "lettered" items in this list

	       @fred   And this sub-list has item names

         -2 This may well get flagged as a mistake

	*B Unless we're allowing the author to do odd things
	   if they like...

(is that simple enough?)

>   Is there value in having string interpolation?  David Arnold mentioned
>
>        __version__ = "$Revision$[11:-2]
>        __date__ = "$Date$

There's also a semi-convention I've seen where a module's doc string is also
used as its documentation for Unix commands, and one substitutes in
sys.argv[0] - i.e., the command used to invoke the script - as a string into
the "Usage:" line. It's a rather hacky trick, and perhaps not to worry about
too much.

> The sharper-eyed will note that I stacked the
> deck in my favor in the above proposal by including what Guido does
> naturally as valid in the proposed grammar.

Yea, go for it!

desperately hoping this will get off the ground, but with no time to do
anything more than comment on it, Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.demon.co.uk/
2 wheels good + 2 wheels good = 4 wheels good?
3 wheels good + 2 wheels good = 5 wheels better?
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)


From Edward Welbourne <eddyw@lsl.co.uk>  Mon Nov 29 11:42:48 1999
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Mon, 29 Nov 1999 11:42:48 +0000
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <383FD11E.D8ADE8D0@lemburg.com>
References: <Pine.WNT.4.05.9911261910450.58-100000@david.ski.org>
 <383FD11E.D8ADE8D0@lemburg.com>
Message-ID: <E11sPCi-00056h-00@lsls4p>

>> 1) ... moving from LaTeX ... is a great idea ... Fred is doing a
>>    beautiful job ... thankless task.
> I second that.
Yup, one problem with silent majorities is a failure to say Thank You.

> I don't think people will start to write TeX in their docstrings...
Nope.  Even those of us with fond memories of it.

> IMHO, a clean plain text approach ...
and in my downright intemperate and opinionated arrogance there can be
no decent documentation format *but* plain text.  Everything else just
leads to mess, confusion and demands for `extensions' to support things
that ... oh sod it, you're all smart enough to understand the analogy:
the road to Hell is paved with good intentions.

> BTW, in case someone cares
Yay, another doc format ;*}
It contributes to the pool of fragments that'll go into the final recipe
and I bet Marc-Andre can write a trivial tool that parses *his* format
into whatever we settle on, so won't mind a bit if it doesn't look like
it.  Each of us can handle our own transitions just as soon as we agree
on a common target ...

> ... they first try to prove NP-completeness rather than ...
no, it's worse than that - we try to work out what would be needed for
Turing completeness while trying to keep it straightforward but are so
busy thinking about whether we can prove NP-complete that we end up
digressing indefinitely.

> Are you serious about the above ???

Of course he was - it was a perfectly serious straw man (and so well
designed for knocking down that you'd done it before you needed to).
He's showing you how bad it would all look if we did things that way.

I actually *like* HTML and used it in my doc strings for a while, but it
just looked wrong and ugly and it was cumbersome and just plain *not*
the right answer.

Why do folk have to say so much during the weekend when I'm not looking ?
Eventually I'll catch up with this proposal of David's that Tony says is
further down the list ...

Oh, and for the record, I refuse to accept David's apology for the rant.
I can't accept an apology for something and
 a) be grateful for
 b) admire
it at the same time,

	Eddy.


From mal@lemburg.com  Mon Nov 29 12:06:34 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Nov 1999 13:06:34 +0100
Subject: [Doc-SIG] docstring grammar
References: <Pine.WNT.4.05.9911281649360.202-100000@david.ski.org>
Message-ID: <38426C4A.ACC76AB5@lemburg.com>

David Ascher wrote:
> 
> Proposed format for docstrings:
> ...
>   Is there value in having string interpolation?  David Arnold mentioned
> 
>        __version__ = "$Revision$[11:-2]
>        __date__ = "$Date$
> 
>     which raises some issues.  I don't think that having [11:-2]
>     evaluated by the docstring parser is a wise idea.  However, I can
>     imagine that the module author could do:
> 
>        __version__ = "$Revision$"[11:-2]
> 
>     in the Python code, and then
> 
>        Version:: %(__version__)s
> 
>     in the docstring and that such a simple string interpolation
>     mechanism could have value.  I'm not sure it's worth the
>     complication though.  What dictionary would be used to do the
>     interpolation?

This raises the question of whether to parse or evaluate the
loaded module. Evaluation has the benefit of providing "automatic"
context, i.e. the symbols defined in the global namespace
are exactly the ones relevant for class definitions, etc. It
probably makes construction of interdependence graphs a lot easier
to write. On the downside you have unwanted side effects due to
loading different modules.

Some notes on the proposal:

* Mentioning the function/method signature is ok, but sometimes
  not needed since e.g. the byte code has enough information to
  deduce the signature from it. This is not true for builtin
  functions, which is probably the reason for all builtin doc
  strings to include the signature.

* I would extend the reference scheme to a lookup in the module
  globals in case the local one (in the Reference section) fails.
  You could then write e.g. "For details see the [string] module."
  and the doc tool would then generate some hyperlink to the
  string module provided the string module is loaded into the
  global namespace.

* Standard symbols like __version__ could be included and used
  by the doc tool per default without the user specifying
  any special "Version:: %(__version__)s" % globals() tags.
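
The lookup order suggested in the second note could be sketched as
follows.  This is only an illustration: resolve_reference and its return
format are hypothetical, and how the doc tool turns the result into a
hyperlink is left out.

```python
import string


def resolve_reference(keyword, references, module_globals):
    """Resolve a [keyword] link: References section first, then globals.

    Returns a (kind, target) tuple, or None if the keyword is unknown.
    """
    if keyword in references:
        return ("reference", references[keyword])
    if keyword in module_globals:
        return ("global", module_globals[keyword])
    return None


# The References section of the current docstring, already parsed.
references = {"foo": "My Dissertation, University Press, 1902"}
# Stand-in for the globals of a module that did "import string".
module_globals = {"string": string}

local_hit = resolve_reference("foo", references, module_globals)
global_hit = resolve_reference("string", references, module_globals)
```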

BTW, for some code which does online formatting of the
doc strings, have a look at my hack.py script. It includes
a function called docs() which prints out all the information
it can find on the given target object.

Here's an example:

>>> docs(string.upper)
upper :
    upper(s) -> string
    
    Return a copy of the string s converted to uppercase.


>>> docs(string.zfill)
zfill(x, width) :
    zfill(x, width) -> string
    
    Pad a numeric string x with zeros on the left, to fill a field
    of the specified width.  The string x is never truncated.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    32 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From Edward Welbourne <eddyw@lsl.co.uk>  Mon Nov 29 12:48:16 1999
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Mon, 29 Nov 1999 12:48:16 +0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <38426C4A.ACC76AB5@lemburg.com>
References: <Pine.WNT.4.05.9911281649360.202-100000@david.ski.org>
 <38426C4A.ACC76AB5@lemburg.com>
Message-ID: <E11sQE4-00059O-00@lsls4p>

MAL said:
* I would extend the reference scheme to a lookup in the module
  globals in case the local one (in the Reference section) fails.
  You could then write e.g. "For details see the [string] module."
  and the doc tool would then generate some hyperlink to the
  string module provided the string module is loaded into the
  global namespace.

We have, it occurs to me, another important namespace: unimported
modules.  Thus the string module doesn't import re, I assume, but may
wish to refer to it (e.g. to say `this function is a cheap variant of
the eponymous one in re') in its doc-strings.  Fortunately, we also have
a handy name to hang this namespace off (which can't coincide with a
name in either of our namespaces): import.  Thus: `this function is a
cheap variant of import.re.search' could be sensible in doc strings.

Note, however, that some bypassing of this may be achieved using the
[blah] notation (which is good).

I have a problem with too much vertical white space, but I believe the
perturbations Tibs suggested (and which match what's in gendoc /
pythondoc - if my memory isn't disserving me again - so must be
feasible) suffice to deal with that.  I can make my editor window more
than a hundred columns wide if I want, and know that code lines jutting
past that are too long; but I still only get 55 lines in sight at the
same time, and real code often involves wanting to see more than that.
This situation gets badly exacerbated by being obliged to throw in
gratuitous blank lines (though not as much as by my tendency to
verbosity).  But, like Tibs, I can live with the vspace if I must.

What happened to gendoc / pythondoc ?

	Eddy.


From mhammond@skippinet.com.au  Mon Nov 29 12:57:41 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Mon, 29 Nov 1999 23:57:41 +1100
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <E11sPCi-00056h-00@lsls4p>
Message-ID: <004101bf3a69$535a09d0$0501a8c0@bobcat>

> >> 1) ... moving from LaTeX ... is a great idea ... Fred is doing a
> >>    beautiful job ... thankless task.
> > I second that.
> Yup, one problem with silent majorities is a failure to say Thank
> You.

Me too - thanks Fred!  The doc is excellent and a thankless task!

Mark.


From Manuel Gutierrez Algaba <irmina@ctv.es>  Mon Nov 29 16:30:00 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Mon, 29 Nov 1999 16:30:00 +0000 (GMT)
Subject: [Doc-SIG] Lisp oriented docstrings (Manolo's strikes back. )
Message-ID: <Pine.LNX.3.95.991129155935.797A-100000@localhost>

Pity, pity and pity that you don't like the "python-encyclopedia"
idea, anyway... 
\indexauthor is a single idea, unattributed, so, David, it is not
the same as: "Author: Walt Disney"

But anyway, If you really want to mess with lowlevel stuff and that
still is interfering for higher things then I'll propose the definitive
answer to this nasty stuff of Authors,params and so on and so on.

The problem seems to be how to establish the low level stuff: spaces,
syntax, colons,... Really nasty in my opinion. Python is not java,
and because of that we have the glorious list and the glorious eval.
Let's use them! 

def function_A( list_A, list_B ):
     """
	author('Guido van Rossum').date('1/1/1999').\
	contributors(["John Doe"]).santi(['\indexpollo',
        '\indexrojo'], radius = 6).arg(1, []).arg(2,[])
     """
Advantages of this approach: 
- Needs no parsing! Just eval(function_A.__doc__)
- No low level details
- Automatic i18n
- You can use default arguments ( author ) for all the 
functions of a module.
- If you know python you know how to write the docs
- All the flexibility and power of python 
- Nice syntax ( if you like python, of course !)
- Absolute extensibility and freedom of use ( you can make
default what you want, and omit what you want ).

author, date .... will return an object, lets name it :
Bandurria. The Bandurria objects get more and more attributions
and it can be affected by global switches when generating the 
final doc. 

def author(name):
   # module-level entry point: start a new Bandurria and record the author
   b = Bandurria()
   b.author(name)
   return b

the same with date ...

class Bandurria:
     def author(self, name):
          self.name = name
          return self 

the same with date ...
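
Put together, the idea might look like the sketch below.  It is a guess at
the intended design, not a tested implementation: the fields dictionary,
the restricted eval namespace, and the reduced set of methods are all
assumptions made for illustration.  Note that eval runs whatever code the
docstring contains, so this only makes sense for trusted sources.

```python
class Bandurria:
    """Collects documentation attributes through chained method calls."""

    def __init__(self):
        self.fields = {}

    def author(self, name):
        self.fields["author"] = name
        return self

    def date(self, value):
        self.fields["date"] = value
        return self

    def contributors(self, names):
        self.fields["contributors"] = list(names)
        return self


def author(name):
    # Module-level entry point so a docstring can start with author(...).
    return Bandurria().author(name)


def function_a(list_a, list_b):
    """author('Guido van Rossum').date('1/1/1999').contributors(['John Doe'])"""


# Evaluate the docstring with only the entry point visible to it.
doc = eval(function_a.__doc__, {"__builtins__": {}, "author": author})
```

Each method returns self, which is what lets the calls be chained in a
single expression inside the docstring.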


I hope you understand it the first time, if so, let's approve it
and let's face the real interesting thing: SantisimaInquisicion


Yes, MA Lemburg, Jaja is Spanish and Jiji too. They're the sounds
of laughter! But that's low level stuff, not interesting at all!

The eagle seems small when flying high, but in fact when it stands
on the ground it is a really big animal.

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
 /SantisimaInquisicion/index.html 

  If something has not yet gone wrong then it would ultimately have been beneficial for it to go wrong.


From Manuel Gutierrez Algaba <irmina@ctv.es>  Mon Nov 29 16:30:13 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Mon, 29 Nov 1999 16:30:13 +0000 (GMT)
Subject: [Doc-SIG] Re: Docstrings [was: On David Ascher's Rant]
In-Reply-To: <Pine.WNT.4.05.9911281532040.189-100000@david.ski.org>
Message-ID: <Pine.LNX.3.95.991129162443.1837A-100000@localhost>

On Sun, 28 Nov 1999, David Ascher wrote:
> Some reactions to this weekend's posts:
> 
> 1) Manuel's proposal is interesting, but IMHO much broader in scope than
>    what I feel is needed and doable.  Indexing ideas is a
>    larger-than-encyclopedic endeavor, and I could but won't argue its
>    impracticality on statistical grounds alone.  Suffice it to say that I
>    agree with Manuel that people are lazy and that it won't work. 

But, if we can do it, then that'll be great!!!

>    More positively, Manuel, do you agree that the kind of markup that I
>    advocate (a la POD/javadoc) is a subset of yours (in other words that
>    "Author" and "Argument 1" are 'trivial ideas', hence belong to the set
>    of ideas? And that if this 'minimal' markup is used, then you can use
>    the tags too along with the other, higher-level notions that you
>    propose?

That's not my original idea, but if we can handle indexes with 
attributions, simple ideas with attributions, then yes!
The simplest syntax is just indexes, then indexes with attributes,
then indexes with complex attributes, then XML-ish stuff

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
 /SantisimaInquisicion/index.html 

  If something has not yet gone wrong then it would ultimately have been beneficial for it to go wrong.


From Edward Welbourne <eddyw@lsl.co.uk>  Mon Nov 29 17:46:32 1999
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Mon, 29 Nov 1999 17:46:32 +0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <Pine.WNT.4.05.9911281649360.202-100000@david.ski.org>
References: <Pine.WNT.4.05.9911281649360.202-100000@david.ski.org>
Message-ID: <E11sUsh-0005yR-00@lsls4p>

Manuel: if David includes `Keyword::' in his bits and pieces, would

Keyword::
     indexing
     keyword
     data retrieval
     searching

(within a doc-string) contain the information you've been wanting to
take out of your

\indexaboutindexing
\indexaboutkeyword
\indexretrieval
\indexsearching

etc. (with apologies for not having followed your system well enough to
mimic the names you'd actually use) ?  What I've understood of your
scheme appears to tell me the answer Yes.  If so, I guess you could just
slurp the Keyword slice out of a namespace-tree generated from
doc-strings and, I suspect, happiness would abound and confusion abate.

I know you have bits that define an indexing command that expands to
several indexing commands, which this lacks: but could the same effect
be arrived at by turning your set of indexing command definitions into
an `expert system' that expands some keywords ?

And ... to folk who know about the state of the craft of indexing: is
there a better way to go with this ?  After all, I'm pretty much just
borrowing from one of HTML's META tags here ...

Now, back to the spec itself:

> For compatibility with Guido, IDLE ... 
>        len(object) -> integer

i.e.
docstring-startline: archetypical-call [ '->' return ] ['--' summary ]

Quite apart from compatibility - this is a *good* approach.
I guess that could be why Guido does it ...
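
The start-line rule above can be approximated with one regular expression.
This is a rough sketch under the assumption that the whole signature sits
on a single line; parse_startline and the group names are hypothetical.

```python
import re

# archetypical-call [ '->' return ] [ '--' summary ]
_STARTLINE = re.compile(
    r"^(?P<call>[A-Za-z_][\w.]*\(.*?\))"   # e.g. len(object)
    r"\s*(?:->\s*(?P<returns>.+?))?"        # optional '-> integer'
    r"\s*(?:--\s*(?P<summary>.+))?$"        # optional '-- summary text'
)


def parse_startline(line):
    """Split a docstring start line into call, return and summary parts.

    Returns a dict with keys 'call', 'returns' and 'summary' (missing
    parts are None), or None if the line does not look like a call.
    """
    match = _STARTLINE.match(line.strip())
    return match.groupdict() if match else None
```

A line that does not parse can then simply be treated as the first text
paragraph of the docstring.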

>  Each paragraph is either 'text' or a 'keyword-tagged block'.  
Sounds good.  Flesh and skeleton.

I'm with Tibs on the #-comment stuff - particularly the liberty to
simply embed a piece of python code in a doc string.

>   A 'keyword-tagged block' is nested much like Python code.
Yes, thank you very much, beautiful - this will give us scope for nested
sub-structures in the keyword-tagged data: in particular, get rid of
that Date_of_release ... use

Author:: David Ascher
Release::
    Date:: 1999/11/28
    Name:: post-gendoc-0.1
    Stability:: draft
etc.

I was initially confused about : or :: because your examples began with
the first keyword I'd thought of, namely Example, and only used one :
with that one, going on to :: for the rest - then I noticed that you
weren't offering it as an example keyword but using it to introduce your
list of examples.  While I would far sooner have only one :, those of us
advocating this need to watch for the danger that the parser will get
similarly confused between the author's use of `Example:' in the manner
of English idiom and in its keyword sense (and, of course, it isn't the
only word to worry about).  (The flip-side is: I can see myself getting
irritated by the need to say Example:: as a keyword immediately after
I've ended a paragraph with the word example ...)

Note: this keyword representation is isomorphic to XML via `the usual'
equivalences between (pythonic) indentation-structuring and the
begin-end style of structuring that C and XML use.

keyword: single-liner
->
<keyword>single-liner</keyword>

keyword: indent block dedent
->
<keyword>
block (possibly transformed down a bit itself)
</keyword>
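
A small sketch of that equivalence, assuming the double-colon spelling
used in the Release example and indentation by spaces only.  The function
name to_xml is hypothetical, and keywords are used as element names
without any further checking beyond being identifiers.

```python
from xml.sax.saxutils import escape


def to_xml(text):
    """Convert indentation-nested 'Keyword:: value' lines to XML text."""
    out = []
    stack = []  # (indent, keyword) for every block that is still open
    for raw in text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        # A line at the same or a shallower indent closes open blocks.
        while stack and stack[-1][0] >= indent:
            close_indent, close_key = stack.pop()
            out.append(" " * close_indent + "</%s>" % close_key)
        keyword, sep, value = raw.strip().partition("::")
        keyword, value = keyword.strip(), value.strip()
        if not sep or not keyword.isidentifier():
            # Plain text line: pass it through, escaped.
            out.append(" " * indent + escape(raw.strip()))
        elif value:
            # keyword:: single-liner
            out.append("%s<%s>%s</%s>"
                       % (" " * indent, keyword, escape(value), keyword))
        else:
            # keyword:: followed by an indented block
            out.append("%s<%s>" % (" " * indent, keyword))
            stack.append((indent, keyword))
    while stack:
        close_indent, close_key = stack.pop()
        out.append(" " * close_indent + "</%s>" % close_key)
    return "\n".join(out)
```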

>     Some keywords can have special parsing rules, 
coo, context-sensitive parsing ;^)
Good idea.  Lets some things only be keywords where they need to be ...

>   The above was (quickly) written with parsing in mind.  Is it really
>   easily parseable?  If not, what needs to be changed so that it is
>   parseable?
Well, the bulleting (and descriptive list stuff) has been explored
already in pythondoc / gendoc, so clearly it's all `within scope'.
Heh.  And between David and Tibs, surely we have the parsing technology ...

On the subject of vertical space ... I'd guess the parser won't need a
blank line between 
    * the end of a paragraph and

    * the start of its first indented subordinate ?

Though, indeed, I do want to take out the other blank line here, and I
thought gendoc managed that ...

>   Is there value in having string interpolation?
Yes.  Definitely.
I hadn't realised it was possible until you mentioned it, now I'm sure
it's Needed.

> Hopefully constructively, 
having had some time to think on it, I'd say Thoroughly so.

Hierarchical namespaces,
Context-sensitive parsing,
Mappable to XML but written like python,
Scope for indexing, and for arbitrary extension within sub-namespaces,
Conformance to the only important standard (Guido's de facto habits ;^)
Proposed by someone who knows how to write parsers ...
No need for the run-time system to bother with any of it
	(all hidden inside the doc string)

Thank you David,

	Eddy.
--
PS - David: you do realise, though, that the committee won't keep up the
momentum on this unless you ruthlessly play Gdo until he joins in ...


From da@ski.org  Mon Nov 29 17:51:59 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 09:51:59 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <E11sUsh-0005yR-00@lsls4p>
Message-ID: <Pine.WNT.4.04.9911290944390.263-100000@rigoletto.ski.org>

On Mon, 29 Nov 1999, Edward Welbourne wrote:

> I'm with Tibs on the #-comment stuff - particularly the liberty to
> simply embed a piece of python code in a doc string.

Agreed.  I am removing that bit about ignoring #'ed text from my proposal.

> I was initially confused about : or :: because your examples began with
> the first keyword I'd thought of, namely Example, and only used one :
> with that one, going on to :: for the rest - then I noticed that you
> weren't offering it as an example keyword but using it to introduce your
> list of examples.  While I would far sooner have only one :, those of us
> advocating this need to watch for the danger that the parser will get
> similarly confused between the author's use of `Example:' in the manner
> of English idiom and in its keyword sense.

After a little thought, I'm tempted to remove the :: requirement as well.
In my proposal, I think that using the : after Example was a mistake in
style.  If it was a heading then it should just be text w/o a colon. If it
was supposed to be more of a sentence then it should have been spelled
out, as in:

   For example, we can have:

The *intent* was, however, to avoid the 'danger' you note above.  I'm
still open to go either way, "safe" or "comfortable".

I forgot two markups:  *this* is bold and _this_ is italic.  Bold and
italic markups must begin and end within a paragraph (I'd say 'within a
sentence' but I don't want to complicate the parser with a sentence type).
No space allowed between *'s and _'s and their contents.
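
As a rough illustration of the two markups, here is a sketch that renders
one paragraph to HTML.  The patterns are assumptions: the rule that a
marker must not touch a word character on its outer side is added here so
that names like __version__ and expressions like 2*3*4 are left alone, and
render_inline is a hypothetical name.

```python
import re

# *bold*: no space just inside the markers, no word character just outside.
_BOLD = re.compile(r"(?<!\w)\*([^\s*](?:[^*]*[^\s*])?)\*(?!\w)")
# _italic_: same shape with underscores.
_ITALIC = re.compile(r"(?<!\w)_([^\s_](?:[^_]*[^\s_])?)_(?!\w)")


def render_inline(paragraph):
    """Render *bold* and _italic_ spans of one paragraph as HTML tags."""
    paragraph = _BOLD.sub(r"<b>\1</b>", paragraph)
    return _ITALIC.sub(r"<i>\1</i>", paragraph)
```

Since the function is applied to one paragraph at a time, a span can never
run past the end of its paragraph, and a bullet such as "* item" is left
untouched because of the space after the marker.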

> On the subject of vertical space ... I'd guess the parser won't need a
> blank line between 
>     * the end of a paragraph and
> 
>     * the start of its first indented subordinate ?
> 
> Though, indeed, I do want to take out the other blank line here, and I
> thought gendoc managed that ...

By all means, we should borrow from gendoc if it's already solved those
issues.  I admit to not having looked deeply into gendoc.  I'll look into
this some more a bit later.

> Proposed by someone who knows how to write parsers ...

Uh?  Me?  No way.  You must be confusing me with someone else!

--david


From da@ski.org  Mon Nov 29 18:31:05 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 10:31:05 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <000701bf3a4f$375a85d0$f0c809c0@lslp7o.lsl.co.uk>
Message-ID: <Pine.WNT.4.04.9911291018130.263-100000@rigoletto.ski.org>

On Mon, 29 Nov 1999, Tony J Ibbs (Tibs) wrote:

> >   Characters between # signs and the end of the line are stripped by
> >   the docstring parser.
> 
> This is a Bad Thing - I have quite often needed to discuss things in doc

As I mentioned in another email, yes, you're right.

> (Also, if one were using Tim Peters' "test using the doc string as template"
> thingy, one needs to be able to put generic Python code in the doc strings,
> and that means that stopping comment characters from going through to the
> ultimate documentation may be a bad thing.)

This raises a deeper issue: introducing Python code in a docstring.  Such
text cannot be parsed like text because linebreaks, indentation etc. are
important.  Here's one idea which I like -- introduce a new keyword which
is the equivalent of HTML's <PRE> tag:

  Code:

    def foo(): ...
       return ...

In other words, Python code is just another kind of text, but the
processing rules applied to that block are different. The only restriction
is that the text in a Code: block *cannot* be outdented more than the
first line in the block.  The rendering in HTML would omit the label
"Code:" and instead change font to the monospace font or whatnot.

One related comment:  multiple instances of a given keyword can occur
within a docstring.

> [... on the issue of how to 'shorten' lists... ]
>
> No, on thinking about it, I would vote for either:
> 
> 	1) use of white space as David proposes
> 	   (pro: utter simplicity,
> 	    con: doesn't quite look as nice as I'd like)
> 	2) allow Python list syntax
> 	   (pro: emphasises this is for short lists,
> 	    con: a bit odd)
> 	3) detect bullet characters at the "start of line"
> 	   (pro: still fairly simple,
> 	    con: one has to take care about, e.g., dashes in text)
> 	   Ah - I just realised that negative numbers at the start of a line
> 	   probably kill that one...

How about another keyword?

  List:
     * foo
     * bar
     * spam

Again, such keywords would not be rendered in 'output formats' (HTML, PS,
etc.).

> There's also a semi-convention I've seen where a module's doc string is also
> used as its documentation for Unix commands, and one substitutes in
> sys.argv[0] - i.e., the command used to invoke the script - as a string into
> the "Usage:" line. It's a rather hacky trick, and perhaps not to worry about
> too much.

I'd rather leave that to the coder who does the if __name__ == '__main__'
code.  sys.argv is a runtime-built construct, and I think docstrings
should be dependent on compile-time information only.

--david


From mal@lemburg.com  Mon Nov 29 18:29:08 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Nov 1999 19:29:08 +0100
Subject: [Doc-SIG] docstring grammar
References: <Pine.WNT.4.04.9911290944390.263-100000@rigoletto.ski.org>
Message-ID: <3842C5F4.91B2C4BB@lemburg.com>

David Ascher wrote:
> 
> > I was initially confused about : or :: because your examples began with
> > the first keyword I'd thought of, namely Example, and only used one :
> > with that one, going on to :: for the rest - then I noticed that you
> > weren't offering it as an example keyword but using it to introduce your
> > list of examples.  While I would far sooner have only one :, those of us
> > advocating this need to watch for the danger that the parser will get
> > similarly confused between the author's use of `Example:' in the manner
> > of English idiom and in its keyword sense.
> 
> After a little thought, I'm tempted to remove the :: requirement as well.
> In my proposal, I think that using the : after Example was a mistake in
> style.  If it was a heading then it should just be text w/o a colon. If it
> was supposed to be more of a sentence then it should have been spelled
> out, as in:
> 
>    For example, we can have:
> 
> The *intent* was, however, to avoid the 'danger' you note above.  I'm
> still open to go either way, "safe" or "comfortable".

I'd suggest using '^ *[a-zA-Z_]+[a-zA-Z_0-9]*: *' as RE for
keywords, i.e. keywords are Python identifiers immediately followed
by a colon starting a line of a doc string. That should avoid
most complications, I guess.

	For example: blablablba
and
	...long sentence..., for
	example :

would not be parsed as keywords, while

	Example: a=1;b=2

does fit the above definition (I don't see a problem with including
examples in the parsed sections, BTW... examples are often much
more intuitive to understand than complex definitions).
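
A minimal sketch of how that RE behaves on the three sample lines, written
in present-day Python. The helper name keyword_of is made up for
illustration; only the pattern itself comes from the message.

```python
import re

# The pattern is copied verbatim from the message; everything else is illustrative.
KEYWORD_RE = re.compile(r'^ *[a-zA-Z_]+[a-zA-Z_0-9]*: *')

def keyword_of(line):
    """Return the keyword that starts `line`, or None if there is none."""
    match = KEYWORD_RE.match(line)
    if match is None:
        return None
    # Drop the surrounding blanks and the trailing colon.
    return match.group(0).strip().rstrip(':')

for line in ['For example: blablablba', 'example :', 'Example: a=1;b=2']:
    print(repr(line), '->', keyword_of(line))
```

Note that the pattern only allows blanks before the keyword, so a
tab-indented line would not match unless tabs are expanded first.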

Something else:

How would the following be handled:

Arguments: file -- a file like object
	   mode -- file mode indicator as defined in [__builtin__.open]
Arguments: buffersize -- optional buffer size in bytes

that is, what happens if a keyword appears twice ? In the above
case an error should be raised, but sometimes this may be
useful:

Example:
	first multi-line example

Example:
	second multi-line example

Hmm, perhaps these two examples should be wrapped using bullets:

Examples:
	- first example spanning multiple lines
	- second example

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    32 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From da@ski.org  Mon Nov 29 18:48:45 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 10:48:45 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <38426C4A.ACC76AB5@lemburg.com>
Message-ID: <Pine.WNT.4.04.9911291033580.263-100000@rigoletto.ski.org>

On Mon, 29 Nov 1999, M.-A. Lemburg wrote:

> This raises the question of whether to parse or evaluate the
> loaded module. Evaluation has the benefit of providing "automatic"
> context, i.e. the symbols defined in the global namespace
> are exactly the ones relevant for class definitions, etc. It
> probably makes construction of interdependence graphs a lot easier
> to write. On the downside you have unwanted side effects due to
> loading different modules.

Good point. Too many modules "do things" on import, some exceedingly
expensive. I have written modules where the import never ends, by design
=).  I'm afraid that parsing is all we can do safely with the Python code.
That does make interpolation much more delicate.  Maybe we can do
everything but string interpolation w/ parsing, and then defer string
interpolation until and if the module can be evaluated safely.  Somehow
we'd need to indicate to the docstring processor whether that evaluation
is safe or not.
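
A rough sketch of the parse-only route, assuming present-day Python and its
standard ast module (which postdates this thread). The function name
docstrings_without_import and the sample source are invented for the
example; the point is only that docstrings can be collected without running
any module code.

```python
import ast

def docstrings_without_import(source):
    """Map dotted names to docstrings by parsing only; nothing is executed."""
    found = {}

    def visit(node, prefix):
        # Only direct class/function children are followed, so definitions
        # hidden inside if/try blocks are not seen by this sketch.
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef,
                                  ast.ClassDef)):
                name = prefix + child.name
                found[name] = ast.get_docstring(child)
                visit(child, name + '.')

    tree = ast.parse(source)
    found['<module>'] = ast.get_docstring(tree)
    visit(tree, '')
    return found

SOURCE = '''
"""Module docstring."""
import time
time.sleep(10 ** 6)      # an import-time side effect; never run here

class Spam:
    """A class."""
    def eggs(self):
        """A method."""
'''

for name, doc in docstrings_without_import(SOURCE).items():
    print(name, '->', doc)
```

String interpolation is exactly what this approach cannot do, since the
values of the module's globals are never computed.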

> Some notes on the proposal:
>
> · Mentioning the function/method signature is ok, but sometimes
>   not needed since e.g. the byte code has enough information to
>   deduce the signature from it. This is not true for builtin
>   function which is probably the reason for all builtin doc
>   strings to include the signature.

Right.  It's not true for builtins, extension module functions, and I'm
not sure how easy it is for JPython code.  I have no problem with somehow
making it easy to omit those in cases where the information can be
obtained through the bytecode.

> · I would extend the reference scheme to a lookup in the module
>   globals in case the local one (in the Reference section) fails.
>   You could then write e.g. "For details see the [string] module."
>   and the doc tool would then generate some hyperlink to the
>   string module provided the string module is loaded into the
>   global namespace.

Sounds good to me!

> · Standard symbols like __version__ could be included and used
>   by the doc tool per default without the user specifying
>   any special "Version:: %(__version__)s" % globals() tags.

Fine.  I think that falls somewhat outside of the 'docstring' proposal,
but I agree with it.

--david

PS: Marc-Andre, how do you get these nice bullet characters in your
    emails? What character is that? =)


From da@ski.org  Mon Nov 29 18:54:47 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 10:54:47 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <3842C5F4.91B2C4BB@lemburg.com>
Message-ID: <Pine.WNT.4.04.9911291050200.263-100000@rigoletto.ski.org>

On Mon, 29 Nov 1999, M.-A. Lemburg wrote:

> How would the following be handled:
> 
> Arguments: file -- a file like object
> 	   mode -- file mode indicator as defined in [__builtin__.open]

That, btw, is illegal -- the block must either be a single-line block or
an indented block.

> Arguments: buffersize -- optional buffer size in bytes
> 
> that is, what happens if a keyword appears twice ? In the above
> case an error should be raised, but sometimes this may be
> useful:

Agreed -- I made a similar point in another email which waved 'hi!' to
yours as they crossed somewhere over the Atlantic. =)

> Example:
> 	first multi-line example
> 
> Example:
> 	second multi-line example
> 
> Hmm, perhaps these two examples should be wrapped using bullets:
> 
> Examples:
> 	- first example spanning multiple lines
> 	- second example

Depends on the case.  In a long docstring, one might want to have several
sections, each with Examples: subsections.

I propose that part of the definition of a keyword is (along with any
special parsing rules) whether it can be duplicated in a docstring.
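
One way this could look, as a sketch only: a table that records, per
keyword, whether repeats are allowed, plus a check over the keywords found
in one docstring. The table contents and the name check_duplicates are
hypothetical; unknown keywords are treated as non-repeatable here, which is
an arbitrary choice.

```python
# Hypothetical keyword table: name -> may it appear more than once?
KEYWORDS = {'Arguments': False, 'Returns': False, 'Example': True}

def check_duplicates(keywords_seen):
    """Return the non-repeatable keywords that occur more than once."""
    counts = {}
    for name in keywords_seen:
        counts[name] = counts.get(name, 0) + 1
    return sorted(name for name, n in counts.items()
                  if n > 1 and not KEYWORDS.get(name, False))

# 'Arguments' twice is reported; 'Example' twice is accepted.
print(check_duplicates(['Arguments', 'Example', 'Arguments', 'Example']))
```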

--david


From friedrich@pythonpros.com  Mon Nov 29 19:10:19 1999
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Mon, 29 Nov 1999 13:10:19 -0600
Subject: [Doc-SIG] docstring grammar
References: <Pine.WNT.4.04.9911291033580.263-100000@rigoletto.ski.org>
Message-ID: <002301bf3a9d$63542a80$f25728a1@UNITEDSPACEALLIANCE.COM>

Some people on this list should remember the development days of gendoc and
its cleaner successor pythondoc written by Dan Larsson (gosh I hope I'm not
the only one)! This thread rehashes much of what has already been discussed.
We pleaded back then for ideas/opinions/hacked code to help improve the
working code Dan wrote but got little response. I'm glad to see folks
thinking along these lines again. Please take a look at pythondoc and use it
as a starting point for a full-featured documentation generator. It uses the
structured text approach for doc string parsing, and has options for either
parsing the source or importing the module to gather metadata (the latter is
necessary to document C modules).
-Robin Friedrich
See:
http://starship.python.net/crew/danilo/


From da@ski.org  Mon Nov 29 19:22:45 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 11:22:45 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <002301bf3a9d$63542a80$f25728a1@UNITEDSPACEALLIANCE.COM>
Message-ID: <Pine.WNT.4.04.9911291109070.263-100000@rigoletto.ski.org>

On Mon, 29 Nov 1999, Robin Friedrich wrote:

> Some people on this list should remember the development days of gendoc and
> its cleaner successor pythondoc written by Dan Larsson (gosh I hope I'm not
> the only one)! 

Yes, I remember it.  Thanks for the reminder and pointer, Robin!

> We pleaded back then for ideas/opinions/hacked code to help improve the
> working code Dan wrote but got little response. 

FWIW, I think that one problem gendoc/pythondoc had in terms of strategy
was that it was billed as a 'tool'.  I think that if we establish a
'blessed standard' then any standard-compliant tool has a guaranteed user
base, and has a far greater likelihood of long-term success.  Also, once
the format is documented, then folks who don't like gendoc or for whatever
reason want to do it 'their own way' can still do it in a compatible way.

I'll start digging in gendoc to see the differences between its format and
what I've been discussing.  I'd love to leverage it to build a reference
implementation.

Dan Larsson, are you reading this discussion?  We could use your
experience here!

--david


From friedrich@pythonpros.com  Mon Nov 29 19:44:23 1999
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Mon, 29 Nov 1999 13:44:23 -0600
Subject: [Doc-SIG] docstring grammar
References: <Pine.WNT.4.04.9911291109070.263-100000@rigoletto.ski.org>
Message-ID: <002d01bf3aa2$23e9c8a0$f25728a1@UNITEDSPACEALLIANCE.COM>

http://www.python.org/sigs/doc-sig/status.html

Contains an old summary of the formatting rules for Structured Text use in
doc strings.

Oddly Dan's subscription to this list is disabled, probably from an old
address. The latest address I have for him is Daniel.Larsson@telia.com



From mal@lemburg.com  Mon Nov 29 20:55:35 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Nov 1999 21:55:35 +0100
Subject: [Doc-SIG] docstring grammar
References: <Pine.WNT.4.04.9911291033580.263-100000@rigoletto.ski.org>
Message-ID: <3842E847.DE11400D@lemburg.com>

David Ascher wrote:
> 
> On Mon, 29 Nov 1999, M.-A. Lemburg wrote:
> 
> > This raises the question of whether to parse or evaluate the
> > loaded module. Evaluation has the benefit of providing "automatic"
> > context, i.e. the symbols defined in the global namespace
> > are exactly the ones relevant for class definitions, etc. It
> > probably makes construction of interdependence graphs a lot easier
> > to write. On the downside you have unwanted side effects due to
> > loading different modules.
> 
> Good point. Too many modules "do things" on import, some exceedingly
> expensive. I have written modules where the import never ends, by design
> =).  I'm afraid that parsing is all we can do safely with the Python code.
> That does make interpolation much more delicate.  Maybe we can do
> everything but string interpolation w/ parsing, and then defer string
> interpolation until and if the module can be evaluated safely.  Somehow
> we'd need to indicate to the docstring processor whether that evaluation
> is safe or not.

I think gendoc did this with a command line switch... well the
early versions did (I think under a different name though, or
perhaps the name is different now ?).
 
> > Some notes on the proposal:
> >
> > · Mentioning the function/method signature is ok, but sometimes
> >   not needed since e.g. the byte code has enough information to
> >   deduce the signature from it. This is not true for builtin
> >   function which is probably the reason for all builtin doc
> >   strings to include the signature.
> 
> Right.  It's not true for builtins, extension module functions, and I'm
> not sure how easy it is for JPython code.  I have no problem with somehow
> making it easy to omit those in cases where the information can be
> obtained through the bytecode.

There's code in hack.py for the extraction and also a more
generic module by Fredrik Lundh for building signature strings.

> > · I would extend the reference scheme to a lookup in the module
> >   globals in case the local one (in the Reference section) fails.
> >   You could then write e.g. "For details see the [string] module."
> >   and the doc tool would then generate some hyperlink to the
> >   string module provided the string module is loaded into the
> >   global namespace.
> 
> Sounds good to me!

Without too much parsing overhead this only works for
the evaluation technique though. Would be nice to have...
even if it doesn't work for some reason (the doc tool could
then just produce some different markup for the reference
string, e.g. put it in italics).
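
A small sketch of that lookup order: the local reference table first, then
the module globals, and italics when neither resolves. The HTML output, the
"name.html" link scheme and the function name link_references are all
assumptions made for the example.

```python
import re

REFERENCE_RE = re.compile(r'\[([A-Za-z_][A-Za-z_0-9.]*)\]')

def link_references(text, local_refs, module_globals):
    """Replace [name] with a link, or with italics if it cannot be resolved."""
    def replace(match):
        name = match.group(1)
        if name in local_refs:                       # the Reference section
            return '<a href="%s">%s</a>' % (local_refs[name], name)
        if name.split('.')[0] in module_globals:     # fallback: module globals
            return '<a href="%s.html">%s</a>' % (name, name)
        return '<i>%s</i>' % name                    # unresolved: different markup
    return REFERENCE_RE.sub(replace, text)

print(link_references('For details see the [string] module and [nowhere].',
                      {}, {'string': None}))
```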
 
> > · Standard symbols like __version__ could be included and used
> >   by the doc tool per default without the user specifying
> >   any special "Version:: %(__version__)s" % globals() tags.
> 
> Fine.  I think that falls somewhat outside of the 'docstring' proposal,
> but I agree with it.

True. It's something I've added to my hack.py formatting
functions and I thought it would be nice to have... (it also
encourages people to use __version__).
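
For illustration, a sketch of how a doc tool might add such tags by default.
This is not the code from hack.py; the table of standard symbols is
hypothetical (only __version__ is named in this thread) and default_tags is
an invented name.

```python
# Hypothetical table of "standard symbols"; only __version__ is named in the thread.
STANDARD_SYMBOLS = (('Version', '__version__'), ('Author', '__author__'))

def default_tags(module_globals):
    """Build 'Keyword:: value' lines for the standard symbols a module defines."""
    lines = []
    for keyword, symbol in STANDARD_SYMBOLS:
        if symbol in module_globals:
            # Builds e.g. 'Version:: %(__version__)s', then interpolates it.
            template = '%s:: %%(%s)s' % (keyword, symbol)
            lines.append(template % module_globals)
    return lines

print(default_tags({'__version__': '1.0.2', '__name__': 'spam'}))
```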
 
> --david
> 
> PS: Marc-Andre, how do you get these nice bullet characters in your
>     emails? What character is that? =)

It's chr(183) in Latin-1: the famous center dot ;-) I've tweaked
my keyboard setup to have it handy...

-- 
Marc-Andre Lemburg


From da@ski.org  Mon Nov 29 23:28:28 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 15:28:28 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <E11sWVf-00065b-00@lsls4p>
Message-ID: <Pine.WNT.4.04.9911291127240.263-100000@rigoletto.ski.org>

On Mon, 29 Nov 1999, Edward Welbourne wrote:

> As I remember, gendoc used *emphasis* and **strong**, which does
> adequately (and may be in use by some of us), though I can see a case
> against the doubling.

Fine with me.

> to be honest, equating it to PRE I don't like; Code deserves to be a
> keyword which switches the context-sensitive parsing to expecting python
> code...

All good points, and fine with me.  

--david


From Daniel.Larsson@telia.com  Mon Nov 29 23:48:48 1999
From: Daniel.Larsson@telia.com (Daniel Larsson)
Date: Tue, 30 Nov 1999 00:48:48 +0100
Subject: [Doc-SIG] docstring grammar
References: <Pine.WNT.4.04.9911291109070.263-100000@rigoletto.ski.org> <002d01bf3aa2$23e9c8a0$f25728a1@UNITEDSPACEALLIANCE.COM>
Message-ID: <002901bf3ac4$4a3a50c0$3a1e54c3@danilo>

Hmm, I think I had an old email address on the list, and since my latest
employment hasn't enabled me to do much Python programming :-(, I sort of
forgot to fix the problem. I'll fix that. There is an archive for the list,
right? So I can catch up on what you all are talking about.

Daniel Larsson



From fdrake@acm.org  Tue Nov 30 00:09:36 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 29 Nov 1999 19:09:36 -0500 (EST)
Subject: [Doc-SIG] Re: SMTP?
In-Reply-To: <00cc01bf3739$a1d83d30$f29b12c2@secret.pythonware.com>
References: <19991124171120.1623.qmail@hotmail.com>
 <19991124202751.A5717@stopcontact.palga.uucp>
 <00cc01bf3739$a1d83d30$f29b12c2@secret.pythonware.com>
Message-ID: <14403.5568.677456.595721@weyr.cnri.reston.va.us>

Fredrik Lundh writes:
 > I once contributed a (IMHO) better example, which
 > 1) actually imported all modules that were used in
 > the example, 2) used more reasonable python con-
 > structs (raw_input instead of that prompt hack, etc),
 > and 3) showed how to add the basic headers to the
 > message body.
 > 
 > as far as I can tell, only (1) made it into the docs...

  I don't recall the specific patch (though there have been patches to 
that example), so I probably just missed it.  I've just checked in
some changes (based on your comments here) to the maintenance branch,
so the next version should be better.
  Sorry for not getting your patch integrated!


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From mhammond@skippinet.com.au  Tue Nov 30 03:29:44 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Tue, 30 Nov 1999 14:29:44 +1100
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <Pine.WNT.4.04.9911290944390.263-100000@rigoletto.ski.org>
Message-ID: <006b01bf3ae3$25b6fe00$0501a8c0@bobcat>

> After a little thought, I'm tempted to remove the ::
> requirement as well.

I agree this would be a good thing.  I originally intended to reply in
context to all the good suggestions - however, I don't look like
finding time until after Christmas :-(

So here is my 2c worth, mainly echoing comments from others:

Drop the absolute requirement for the whitespace, especially with
bulleted lists.  People will generally not be editing these strings in
a word-processor, so will have control over the line breaks.

Thus:
* Any line starting with a word followed by a colon can be considered
a keyword.  If you don't want this, just make sure it's not the first
word on the line.
* A star or dash starting a line can be considered a new list item.
Again, if it is truly a hyphen or whatever else, just adjust your line
wrap slightly so it is no longer the first word.
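
The two rules above can be sketched as a line classifier. The names are
invented, and one detail is an added assumption rather than part of the
suggestion: a bullet must be followed by whitespace, so that a line starting
with a negative number or with *emphasis* is not taken for a list item.

```python
import re

KEYWORD_RE = re.compile(r'^\s*([A-Za-z_][A-Za-z_0-9]*):')
# Assumption: a bullet is '*' or '-' followed by at least one blank.
BULLET_RE = re.compile(r'^\s*[*-]\s+')

def classify(line):
    """Classify a docstring line as 'keyword', 'item' or plain 'text'."""
    if KEYWORD_RE.match(line):
        return 'keyword'
    if BULLET_RE.match(line):
        return 'item'
    return 'text'

for line in ['Example: foo', '* foo', '- bar', '-5 is negative', 'For example: x']:
    print(repr(line), '->', classify(line))
```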

Other random thoughts:
* The [blah] notation is good, but needs to be well defined.  eg,
"[module.function]" when used in the context of a package should use
the same "module scoping" that Python itself uses.  However, the use
of brackets may conflict with people who use inline code (rather than
an example "block") - maybe something like "@" could be used?
@module.function@ would be reasonable.

* IMO, importing the module to extract this information is fine.  For
the 1% of cases where it is not and the author of the module needs to
use the tool, we could offer a hack - eg "sys.doc_building" will be
defined when the tool is running, so they could fine tune their code
appropriately.  For the vast majority of cases, I guess that importing
would be just fine and make the tool simpler, thereby giving more
chance of it one day existing :-)  Indeed, do it the simple way, and
the first person who needs the parse-only option can help code it :-)

* Example/test code should be clearly identifiable.  Tim Peters'
docstring tester could also be hacked to work with this format.
Further, it should be possible to have lots of discrete sample code,
each with their own discussion - eg:
"""
The following code shows how to do this:
Example:
  def foo():
    etc

/Example:
The following code shows how to do that:
Example:
  def bar():
    etc

As a final note:  The tool should be written with distinct "generate"
and "collate" phases, simply to resolve the cross-references.  It is
unreasonable to expect that all cross-references will be capable of
being resolved in a single pass.  Not sure exactly what this means
from an implementation POV, but it is important.
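
One possible reading of the two phases, as a sketch: a first pass that only
records where each documented name will live, and a second pass that
rewrites [name] references once every target is known. The function names,
the anchor scheme and the HTML output are assumptions for illustration.

```python
import re

REF_RE = re.compile(r'\[([A-Za-z_][A-Za-z_0-9.]*)\]')

def generate(docs):
    """Pass 1: record an anchor for every documented name."""
    return {name: '#' + name.replace('.', '-') for name in docs}

def collate(docs, anchors):
    """Pass 2: rewrite [name] references now that every anchor is known."""
    def resolve(match):
        name = match.group(1)
        if name in anchors:
            return '<a href="%s">%s</a>' % (anchors[name], name)
        return match.group(0)           # leave unresolved references untouched
    return {name: REF_RE.sub(resolve, text) for name, text in docs.items()}

docs = {
    'spam.foo': 'Calls [spam.bar] to do the work.',
    'spam.bar': 'Helper for [spam.foo]; see also [eggs.missing].',
}
pages = collate(docs, generate(docs))
print(pages['spam.foo'])
```

The forward reference from spam.foo to spam.bar is what a single pass could
not resolve if the docstrings were processed in order.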

That's about it.  I really like this, and feel it is both powerful
and extensible enough to grow with us.  All we need now is the tool
:-)

Mark.


From da@ski.org  Tue Nov 30 07:04:18 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 23:04:18 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <006b01bf3ae3$25b6fe00$0501a8c0@bobcat>
Message-ID: <Pine.WNT.4.05.9911292257290.197-100000@david.ski.org>

On Tue, 30 Nov 1999, Mark Hammond wrote:

> * The [blah] notation is good, but needs to be well defined.  eg,
> "[module.function]" when used in the context of a package should use
> the same "module scoping" that Python itself uses.  However, the use
> of brackets may conflict with people who use inline code (rather than
> an example "block") - maybe something like "@" could be used?
> @module.function@ would be reasonable.

I personally would prefer to keep [] for references and introduce @..@ (or
some other delimiter) for inline code, mostly because [] is so common in
journals as a way of indicating bibliographic references.  I do *not* like
StructuredText's use of quotes to do inline code markup.

> * IMO, importing the module to extract this information is fine.  For
> the 1% of cases where it is not and the author of the module needs to
> use the tool, we could offer a hack - eg "sys.doc_building" will be
> defined when the tool is running, so they could fine tune their code
> appropriately.  For the vast majority of cases, I guess that importing
> would be just fine and make the tool simpler, thereby giving more
> chance of it one day existing :-)  Indeed, do it the simple way, and
> the first person who needs the parse-only option can help code it :-)

I see.  So the workaround for those scripts which can't be imported is to
start them with:

import sys
if getattr(sys, 'doc_building', 0): sys.exit()

Not too bad.

> * Example/test code should be clearly identifiable.  Tim Peters'
> docstring tester could also be hacked to work with this format.

I need to go back and look at Tim's code again.

> Further, it should be possible to have lots of discrete sample code,
> each with their own discussion - eg:
> """
> The following code shows how to do this:
> Example:
>   def foo():
>     etc
> 
> /Example:
> The following code shows how to do that:
> Example:
>   def bar():
>     etc

That would be written (with the current proposal):

  The following code shows how to do this:
    Example:
      def foo():
        etc
 
  The following code shows how to do that:
    Example:
      def bar():
        etc

Is that ok w/ you?

--david


From tim_one@email.msn.com  Tue Nov 30 07:50:59 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 30 Nov 1999 02:50:59 -0500
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <Pine.WNT.4.05.9911292257290.197-100000@david.ski.org>
Message-ID: <000b01bf3b07$a465ad40$c92d153f@tim>

[MarkH]
> * Example/test code should be clearly identifiable.  Tim Peters'
> docstring tester could also be hacked to work with this format.

[DavidA]
> I need to go back and look at Tim's code again.

I already did <wink>.  Tim's code looks for:

   ^\s*>>>

and then sucks up everything following until the next all-whitespace line or
end of docstring (whichever comes first).

That is, I figured the contents of an interactive shell window didn't need
any markup beyond the leading PS1 Python already sticks there.  Given that
doctest.py is meant to be usable with near-zero effort, it wouldn't do to
require more markup than that.

Luckily, it almost fits your definition of a paragraph already.  It
shouldn't be any real effort to declare that ">>>" introduces a
structureless code paragraph extending until the next all-whitespace etc --
given that it's a format for Python docstrings, Python's own output deserves
some special treatment <wink>.
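
The rule just described can be sketched as follows. This is not doctest's
actual code, only an illustration of "start at a >>> line, stop at the next
all-whitespace line or the end of the docstring"; the function name
interactive_blocks is invented.

```python
import re

def interactive_blocks(docstring):
    """Return runs of lines that start at a '>>>' line and end at a blank line."""
    blocks, current = [], None
    for line in docstring.split('\n'):
        if current is None:
            if re.match(r'^\s*>>>', line):
                current = [line]        # a new code paragraph begins
        elif line.strip() == '':
            blocks.append('\n'.join(current))
            current = None              # all-whitespace line ends the paragraph
        else:
            current.append(line)
    if current is not None:             # paragraph ran to the end of the docstring
        blocks.append('\n'.join(current))
    return blocks

DOC = 'Adds numbers.\n\n    >>> add(1, 2)\n    3\n\nMore text.\n    >>> add(0, 0)\n    0'
for block in interactive_blocks(DOC):
    print(block)
    print('--')
```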

As to whether doctest should be fiddled to try to interpret some other form
of markup too, I don't think so.  The markup it inherits from the Python
shell is both sufficient and pleasant for its users.  Any other kind of
embedded sample code almost certainly isn't intended to be auto-verified, so
doctest *should* ignore it.

Nothing you're likely to do with docstrings is going to create problems for
doctest, so the only question is whether doctest's conventions create
problems for docstring markup.  I think they do now, but "shouldn't":
anyone pasting in an interactive session, whether for use with doctest or
for some other purpose, is going to want it treated as a code block.

full-speed-ahead-ly y'rs  - tim


From mhammond@skippinet.com.au  Tue Nov 30 08:38:25 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Tue, 30 Nov 1999 19:38:25 +1100
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <Pine.WNT.4.05.9911292257290.197-100000@david.ski.org>
Message-ID: <007b01bf3b0e$45cdcf40$0501a8c0@bobcat>

> I personally would prefer to keep [] for references and
> introduce @..@ (or
> some other delimiter) for inline code, mostly because [] is
> so common in
> journals as a way of indicating bibliographic references.  I

Fair enough.

> I see.  So the workaround for those scripts which can't be
> imported is to
> start them with:
>
> import sys
> if getattr(sys, 'doc_building', 0): sys.exit()
>
> Not too bad.

I more had in mind:

if sys.doc_building:
  # Normally critical we do this.
  dont_do_something_really_expensive()

We don't need to execute the bulk of the code, just import the module
and get a few of the symbols.

> That would be written (with the current proposal):
>
>   The following code shows how to do this:
>     Example:
>       def foo():
>         etc
>
>   The following code shows how to do that:
>     Example:
>       def bar():
>         etc
>
> Is that ok w/ you?

Perfect.

Mark.


From mhammond@skippinet.com.au  Tue Nov 30 08:45:18 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Tue, 30 Nov 1999 19:45:18 +1100
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <007b01bf3b0e$45cdcf40$0501a8c0@bobcat>
Message-ID: <007c01bf3b0f$4339c170$0501a8c0@bobcat>

> I more had in mind:
> 
> if sys.doc_building:
>   # Normally critical we do this.
>   dont_do_something_really_expensive()

Sheesh - I obviously meant:

if not sys.doc_building:
  do_something_really_expensive()

But I'm sure you got my drift :-)
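
Spelled out a little more defensively, as a sketch: sys.doc_building is the
hypothetical flag from this thread, not a real attribute of the sys module,
so the check has to tolerate its absence during a normal import. The helper
names are invented.

```python
import sys

def doc_building():
    """True only while a (hypothetical) doc tool has set sys.doc_building."""
    return bool(getattr(sys, 'doc_building', False))

def expensive_setup():
    # Stand-in for whatever costly work the module does at import time.
    return 'connected'

connection = None
if not doc_building():
    connection = expensive_setup()
```

A doc tool following this convention would set sys.doc_building before
importing the module and remove it afterwards.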

Mark.


From mal@lemburg.com  Mon Nov 29 21:59:23 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Nov 1999 22:59:23 +0100
Subject: [Doc-SIG] docstring grammar
References: <Pine.WNT.4.04.9911291033580.263-100000@rigoletto.ski.org>
Message-ID: <3842F73B.CB64BC8B@lemburg.com>

David Ascher wrote:
> 
> > Some notes on the proposal:
> >
> > · Mentioning the function/method signature is ok, but sometimes
> >   not needed since e.g. the byte code has enough information to
> >   deduce the signature from it. This is not true for builtin
> >   function which is probably the reason for all builtin doc
> >   strings to include the signature.
> 
> Right.  It's not true for builtins, extension module functions, and I'm
> not sure how easy it is for JPython code.  I have no problem with somehow
> making it easy to omit those in cases where the information can be
> obtained through the bytecode.

Perhaps we could use a convention: if the first line starts
with a Python identifier followed by '(' and the identifier
matches the name of the doc string owning object (function or
method), then no byte code lookup is done. Otherwise such
a lookup causes a new first line to be prepended to the
processed doc string (with '-> ?' return value).

This should cover most cases.
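
A sketch of that convention in present-day Python, where the argument names
are read from the function's code object. The name with_signature is
invented, and the sketch ignores default values, *args and **kwargs.

```python
import re

def with_signature(func):
    """Return func's docstring, prepending a deduced signature if needed."""
    doc = func.__doc__ or ''
    first_line = doc.lstrip().split('\n', 1)[0]
    match = re.match(r'([A-Za-z_][A-Za-z_0-9]*)\(', first_line)
    if match and match.group(1) == func.__name__:
        return doc                      # the author already wrote the signature
    code = func.__code__                # only Python functions have this
    args = code.co_varnames[:code.co_argcount]
    signature = '%s(%s) -> ?' % (func.__name__, ', '.join(args))
    return signature + '\n\n' + doc

def area(width, height):
    """Return the area of a rectangle."""
    return width * height

def scale(x, factor):
    """scale(x, factor) -> number

    Multiply x by factor."""
    return x * factor

print(with_signature(area))
```

Builtins and extension functions have no code object, which is why their
docstrings would still need the signature written by hand.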

-- 
Marc-Andre Lemburg


From tony@lsl.co.uk  Tue Nov 30 10:31:43 1999
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 30 Nov 1999 10:31:43 -0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <Pine.WNT.4.04.9911290944390.263-100000@rigoletto.ski.org>
Message-ID: <001301bf3b1e$1875bc50$f0c809c0@lslp7o.lsl.co.uk>

All of the following are minor nit-pickings, because it all looks VERY GOOD.
(Personally, I'm not too worried about the tool as-such, I just want the
grammar defined so I can use it!).

David Ascher wrote:
> I forgot two markups:  *this* is bold and _this_ is italic.  Bold and
> italic markups must begin and end within a paragraph (I'd say 'within a
> sentence' but I don't want to complicate the parser with a sentence type).
> No space allowed between *'s and _'s and their contents.

And I hope it's also possible to nest them arbitrarily, with some "sensible"
effect (yes, this *is* useful in English text, and I would not want to lose
it in documentation!). [Technically, that's a viewer problem, but I want the
grammar to *say* this can be done, so the software writers have an onus on
them to cope with it.]
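
As a rough illustration of what coping with simple nesting could involve,
here is a sketch that maps *bold* and _italic_ to HTML within one paragraph.
It handles properly nested markup on a single line only; overlapping
markers, and identifiers containing underscores such as __version__, would
need extra rules. The names are invented.

```python
import re

# No space is allowed between the markers and their contents.
BOLD_RE = re.compile(r'\*(\S(?:.*?\S)?)\*')
ITALIC_RE = re.compile(r'_(\S(?:.*?\S)?)_')

def to_html(paragraph):
    """Convert *bold* and _italic_ markup, allowing one inside the other."""
    html = BOLD_RE.sub(r'<b>\1</b>', paragraph)
    return ITALIC_RE.sub(r'<i>\1</i>', html)

print(to_html('this is *very _important_ text*'))
print(to_html('_an *odd* choice_'))
```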

Marc-Andre Lemburg wrote:
> I'd suggest using '^ *[a-zA-Z_]+[a-zA-Z_0-9]*: *' as RE for
> keywords, i.e. keywords are Python identifiers immediately followed
> by a colon starting a line of a doc string. That should avoid
> most complications, I guess.

Sounds sensible to me - the advantages outweigh the disadvantages.

On Tim Peters' test texts - I think this is actually an important enough
idea that it might warrant its own keyword - perhaps "TestScript" (no, I
know that's clumsy) - thus giving subliminal encouragement to the concept
(hmm - must use it someday, he said guiltily). This would also allow us to
distinguish odd chunks of code which are NOT test scripts (a new ability,
since at the moment the tester will try to use all >>> text?), which I think
could sometimes be useful...

David Ascher wrote:
> How about another keyword?
>
>  List:
>     * foo
>     * bar
>     * spam

I would vote against that, firstly on the grounds that it doesn't read well,
and secondly that it is probably the sort of thing that people wouldn't do
(!). As with what others think, I believe we can hack lists without the
keyword (is this now the consensus?).

In another message, David continued:
> I propose that part of the definition of a keyword is (along with any
> special parsing rules) whether it can be duplicated in a docstring.

Hmm - then I think we're going to need some serious support in "The Standard
Editors" to give a hint about whether something can be included more than
once, since I have a sneaky feeling we're getting quite a lot of keywords
(is it about 7 things that humans remember easily?). On the other hand,
modulo the clever peoples' time, I rate that as "not a problem".

NB: how picky is the tool going to be about getting the indentation exactly
right? I'm not fussed by it being very picky, but I know I'm odd that way.

Mark Hammond votes for doing lists by detecting the bullets (good), but I'd
like to reserve more than two characters (hyphen and asterisk are OK, but I
do sometimes use 3 level lists, and would like another one - on the other
hand, I'm not sure what other than @ and he wants that for something else...
hmm - if we're not worried by hyphen confusing us with negative numbers,
maybe plus would be sensible).

I also tend to agree with Mark Hammond and David Ascher that [ and ] are very
valuable AS TEXT. The use of @..@ is visually very obvious to me, which is
presumably a good thing in context, so I also vote for that (gosh, I've just
voted positively for something delimited by the same character at start and
end - obviously the start of the slippery road to hell).

Whilst I don't know owt about parsing (well, more precisely, parse trees
scare me), I don't see any of the proposals so far as giving any great
problems with extracting information from the text.

David (Ascher) - is it time to re-release your initial "docstring grammar"
email with the comments you're happy with edited in? I *really* don't have
time to do it, or I already would...

Tibs
--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.demon.co.uk/
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
[I've read it twice. I've thought it over. I'm sending it anyway.]


From Edward Welbourne <eddyw@lsl.co.uk>  Tue Nov 30 12:44:47 1999
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Tue, 30 Nov 1999 12:44:47 +0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <Pine.WNT.4.04.9911291050200.263-100000@rigoletto.ski.org>
References: <3842C5F4.91B2C4BB@lemburg.com>
 <Pine.WNT.4.04.9911291050200.263-100000@rigoletto.ski.org>
Message-ID: <E11smeF-0000fC-00@lsls4p>

David Ascher wrote:
> I propose that part of the definition of a keyword is (along with any
> special parsing rules) whether it can be duplicated in a docstring.

FAPP we can approach both this and the `context-sensitive' stuff from
the same point of view as SGML: precisely because

Blah:
    something legitimate
    in a Blah block

maps directly to

<BLAH>
something legitimate
within a BLAH
</BLAH>

so all the kinds of rule that a DTD could have imposed on BLAH are
sensible things to impose on Blah.  In particular, rather than `whether
it can be duplicated in a docstring' we have a nested tree structure in
our hands, so we can ask whether it can be duplicated as a child of its
parent.  Suppose Date to be unique:

Author: David Ascher
Release:
    Date: 1999/Nov/28
    Name: proto-post-gendoc:0.2
    Media: e-mail
Bugs:
    Report:
        Date: 1999/Nov/29

        As initially specified, the denotation for italic conflicts with
        python identifiers where these genuinely start and end in
        underscore.

        Status: resolved, adopting gendoc's approach

    Report:
        Date: ...

in which Date is `unique' but shows up many times in one doc-string.  I
would suggest that a tag is either unique in all contexts that allow for
it, or in none (so we don't have ickiness in which *some* tags allow
several Date subordinates - that kind of stuff makes it harder for folk
to remember what's unique and what isn't).  The right layer

A note on tags: we seem to be headed for `python identifier followed by
a colon'.  I'd like to argue for RFC 822 headers - that is,
specifically, to allow hyphens, so as to allow

Bugs:
    Reports-to: doc-sig@python.org
    Report: ... as above ...

and, indeed, to change Bugs: to Known-bugs:

Of course we could use _, but hyphen comes more naturally to text and
the parser for our keywords (unlike that for python identifiers) doesn't
have to worry about subtraction as `something we might be doing here' to
confuse with recognising the keyword.


For the sake of a coarse reprise of where I think we are:

Within docstrings, paragraphs, `text fields', descriptive and bulleted
lists are marked up using pretty much what gendoc used, though we seem
to be making some tweaks.  The main addition of David's proposal is a
structured data format entirely analogous to a *ML's begin-end
structure, but transformed to indent/dedent format - in exactly the same
way that one transforms the begin-end structure of C or Pascal into
python code.  This gets us all the desiderata that XML would provide,
but it does it in a pythonic format.

The typical block allows (depending on the keyword which introduced it)
an assortment of keywords to be used to introduce sub-blocks; it may
also allow paragraphs and/or lists within it.  The docstring is a block
which is willing to hold all `outer' structural groups (i.e. top-level
keywords, with their blocks, and paragraphs).  A paragraph is a block
which (possibly along with the blocks started by some keywords) may have
sub-blocks which are list items.

We can effectively write the rules for all this as a DTD and parse it
into a form which can be manipulated *as if* it had been obtained by
parsing a lump of XML - in particular, it should be trivial to perform
XSL-ish tree transformations to convert it to whatever DTD The Manual
wants as its input; while leaving ample scope for the inventive
toolwright to perform sophisticated information massaging on docstrings,
and not obliging us to use all that ugly XML taggery in the source.
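
A rough sketch of that indent/dedent-to-tree idea, assuming tags follow
the RFC 822 style argued for above and that xml.etree.ElementTree (from
later Python standard libraries) is available. The names parse_blocks and
duplicated_children are invented for this illustration; the uniqueness
check is per parent, as suggested for Date.

```python
import re
import xml.etree.ElementTree as ET

# Keyword line: an RFC 822 style name (hyphens allowed), a colon, optional text.
TAG_RE = re.compile(r'^([A-Za-z][A-Za-z0-9_-]*):\s*(.*)$')


def parse_blocks(text, root_tag='docstring'):
    """Turn indent/dedent keyword blocks into an ElementTree element."""
    root = ET.Element(root_tag)
    stack = [(-1, root)]                 # (indent, element) of open blocks
    for raw in text.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(' '))
        while indent <= stack[-1][0]:    # a dedent closes open blocks
            stack.pop()
        parent = stack[-1][1]
        match = TAG_RE.match(raw.strip())
        if match:
            child = ET.SubElement(parent, match.group(1))
            child.text = match.group(2) or None
            stack.append((indent, child))
        else:
            # plain text belongs to the enclosing block
            parent.text = ((parent.text or '') + ' ' + raw.strip()).strip()
    return root


def duplicated_children(element, unique_tags):
    """Yield (parent_tag, tag) where a 'unique' tag repeats under one parent."""
    for parent in element.iter():
        seen = set()
        for child in parent:
            if child.tag in unique_tags and child.tag in seen:
                yield parent.tag, child.tag
            seen.add(child.tag)


sample = """\
Author: David Ascher
Release:
    Date: 1999/Nov/28
    Media: e-mail
Known-bugs:
    Report:
        Date: 1999/Nov/29
        Status: resolved, adopting gendoc's approach
    Report:
        Date: 1999/Nov/30
"""

tree = parse_blocks(sample)
print(ET.tostring(tree, encoding='unicode'))
print(list(duplicated_children(tree, {'Date'})))
```

Date shows up three times in the sample but never twice under one parent,
so the check reports nothing for it.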


We need a moderately short list (of order a dozen) of `top-level' tags:
subordinate to each we may introduce a few others (context sensitivity)
but simplicity demands vocabulary restraint and re-use.  The top level
seems to run to:

In all docstrings:
   Author(s), Release, Contributors
   Example(s), Test-script, Code
   Warning

In docstrings of callables:
   Argument(s), Return, Raises

In docstrings of classes:
   Supports/Implements/Mimics... (one synonym)
   Subclassing (for folk using this class as a base - what to override)
   Attributes, Methods (each supporting Private and Public as subordinates)

In docstrings of modules:
   Contents

so 7 universally-applicable keywords, (up to) the rest of a dozen in
each of the specific contexts for docstrings.  I would reckon we can
keep to about another dozen keywords spread around as subordinates of
the above (Date, Private, Public, Expect (for Test-script), Required &
Optional (for arguments), ...).


On test-scripts (in the manner of Tim Peters) we may not need a
Test-script keyword at all: simply using >>> is how the tool recognises
it, and there's nothing to stop the docstring parser recognising this as
a special indent mark that transforms to target XML *as if* it had come
from a block introduced by Test-script:.
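
For illustration only: the doctest module in later Python standard
libraries recognises >>> prompts without any keyword, which is the
behaviour described above. The docstring below is a made-up sample.

```python
import doctest

docstring = '''
Return the sum of two values.

>>> add(2, 3)
5
>>> add('a', 'b')
'ab'
'''

# DocTestParser picks out every >>> prompt; no Test-script: keyword needed.
examples = doctest.DocTestParser().get_examples(docstring)
for example in examples:
    print(repr(example.source), '->', repr(example.want))
```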

	Eddy.


From Manuel Gutierrez Algaba <irmina@ctv.es>  Tue Nov 30 15:31:09 1999
From: Manuel Gutierrez Algaba <irmina@ctv.es> (Manuel Gutierrez Algaba)
Date: Tue, 30 Nov 1999 15:31:09 +0000 (GMT)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <E11sUsh-0005yR-00@lsls4p>
Message-ID: <Pine.LNX.3.95.991129233455.810A-100000@localhost>

On Mon, 29 Nov 1999, Edward Welbourne wrote:

> Manuel: if David includes `Keyword::' in his bits and pieces, would
> 
> Keyword::
>      indexing
>      keyword
>      data retrieval
>      searching
> 
> (within a doc-string) contain the information you've been wanting to
> take out of your
> 
> \indexaboutindexing
> \indexaboutkeyword
> \indexretrieval
> \indexsearching

No. They're completely different things. In fact there's consensus,
David insists on "bullets" for args and javadoc-ish things, and 
I insist on Encyclopedia-Higher-Level-python-stuff. My system,
currently, is for marking/sorting "general" info.

> I know you have bits that define an indexing command that expands to
> several indexing commands, which this lacks: but could the same effect
> be arrived at by turning your set of indexing command definitions into
> an `expert system' that expands some keywords ?

Yes, expert systems are fine, but the greatest difficulty with my 
proposal is that people *must* input hundreds of items of attributed
info if we want to do anything useful. An expert system is phase 2,
when we have to group indexes and extract info.

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
 /SantisimaInquisicion/index.html 

  Everything in this book may be wrong. -- Messiah's Handbook : Reminders for the Advanced Soul


From Edward Welbourne <eddyw@lsl.co.uk>  Tue Nov 30 14:35:21 1999
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Tue, 30 Nov 1999 14:35:21 +0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <006b01bf3ae3$25b6fe00$0501a8c0@bobcat>
References: <Pine.WNT.4.04.9911290944390.263-100000@rigoletto.ski.org>
 <006b01bf3ae3$25b6fe00$0501a8c0@bobcat>
Message-ID: <E11soNF-0001fY-00@lsls4p>

> Thus:
> * Any line starting with a word followed by a colon can be considered
> a keyword.  If you don't want this, just make sure it's not the first
> word on the line.

Not happy.  A paragraph of text which precedes an example may be relied
upon to end in `for example:', in which the last contiguous block of
non-space characters is of length 8; if I modify an earlier part of the
paragraph, I'm going to ask my authoring tool (python-mode.el) to
reformat the paragraph, without necessarily being aware of a gotcha
waiting for me at the paragraph's end; my margins will be within 72
characters of one another, giving a roughly 1 in 9 chance that
`example:' ends up being alone on the last line ... gotcha.

A cure for this would just be to do keyword-recognition case
sensitively, and Capitalise keywords; otherwise, we have to insist on
either a dedent or a blank line preceding any keyword.  Which offends
folk worse: case sensitivity or needing a dedent/vspace ?
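
A small sketch of the case-sensitive cure, assuming (purely for
illustration) that real keywords are Capitalised and sit alone on their
line. Both REs are guesses at a possible rule, not an agreed grammar.

```python
import re
from fractions import Fraction

# Assumption for illustration: real keywords are Capitalised, prose is not.
ANY_CASE_RE = re.compile(r'^ *[A-Za-z_][A-Za-z0-9_-]*: *$')
CAPITALISED_RE = re.compile(r'^ *[A-Z][A-Za-z0-9_-]*: *$')

# A paragraph after an unlucky reflow: its last word sits alone on a line.
reflowed = [
    "Nested blocks are written with indentation, as in the following",
    "example:",
]

loose = [line for line in reflowed if ANY_CASE_RE.match(line)]
strict = [line for line in reflowed if CAPITALISED_RE.match(line)]
print(loose)
print(strict)

# The rough odds quoted above: an 8-character word in a 72-column margin.
print(Fraction(len("example:"), 72))
```

The case-insensitive rule mistakes the stray "example:" for a keyword; the
Capitalised rule does not, though it would still accept "Example:".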


> * A star or dash starting a line can be considered a new list item.
> Again, if it is truly a hyphen or whatever else, just adjust your line
> wrap slightly so it is no longer the first word.

Alternatively, all lists use the same `item-introducer' character and
follow it with an optional character indicating what bullet to use.
Thus one might have (taking ~ as the introducer for the illustration)

  ~ outermost list, first item
  ~ outer second which may contain a subordinate
    ~ which is indented so it can use the same introducer without
      confusion
    ~ and output formatters can choose different symbols
      in place of the star for successive nesting layers
    ~ by the way, should further lines line up with the text or the
      bullet ?  my reckoning is with the text ...
  ~ outer third, whose subordinate might want Roman numerals
    ~i so it indicates them thus
    ~i and can choose to leave the engine to sort out numbering
    ~iii but can effectively assert that one item (referred to
         elsewhere) has a particular number
    ~i without having to mention numbers for the rest
    ~i and of course 
       ~1 we can use the other numbering styles
       ~2 including alphabetic, upper or lower, using ~A or ~a.
       ~1 with use of first in series taken as `work out right number'
       ~7 but I think the tool should complain if you get later
          positions wrong: it's an assertion, and it indicates that this
          item is going to be referred to from other text as item 7 - I
          need to be told I got it wrong !  Obviously I've deleted a few
          items before this one without realising what's happening below ...
  ~ outer fourth
    ~o must the bullets in a given list all match ?
       ~. should stand for mid-dot, and star is likewise easy using *
    ~o I think so, anyway
      ~- dash is obvious and now unambiguous, as are + and =
    ~o mind you, o requires care: if it's the first item in a list, that
       list is going to use o as its bullet; but if it appears in a list
       which began with a ~a then we have to read it as item fifteen.
      ~ and if we're insisting on all items in a list having the same
        bullet, does it make sense to allow items after the first to
        just use an unadorned star meaning re-use of first item's
        symbol, thus saving us lots of editing when we want to change
        the symbol in use by a list, or shuffle an item from a sub-list
        out into its parent list (or etc.)
      ~ of course, ~ needn't be the bullet-introducer, we could use
        pretty much any punctuator as long as it doesn't obviously
        clash; candidate egs: #, @, $, %, &, * and even |
  ~ outer fifth
    ~ as for descriptive lists, I'd go with the old gendoc form, which

      uses double dash -- which just feels so natural, but

      needs vspace -- to separate items, given that -- might be used
      within an item on a later-than-first line.  I can live with this.
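
A sketch of how a parser might pick out such items, assuming ~ as the
introducer, an optional style marker straight after it, and nesting read
from indentation alone. The name parse_items is invented here, and the
sketch identifies structure only; it does not check or assign numbers.

```python
import re

# "~" introducer, optional style marker (i, 1, a, A, o, -, ...), then text.
ITEM_RE = re.compile(r'^( *)~(\S*) +(.*)$')


def parse_items(text):
    """Return (depth, style, text) triples; continuation lines are folded in."""
    items = []
    indents = []                     # indentation of each open nesting level
    for line in text.splitlines():
        match = ITEM_RE.match(line)
        if not match:
            if items and line.strip():
                depth, style, body = items[-1]
                items[-1] = (depth, style, body + ' ' + line.strip())
            continue
        indent = len(match.group(1))
        while indents and indent < indents[-1]:
            indents.pop()            # back out to the enclosing list
        if not indents or indent > indents[-1]:
            indents.append(indent)   # a deeper indent opens a sub-list
        items.append((len(indents), match.group(2), match.group(3)))
    return items


sample = """\
~ outermost list, first item
~ outer second which may contain a subordinate
  ~ which is indented so it can use the same
    introducer without confusion
~ outer third
  ~i first roman item
  ~iii asserted to be number three
    ~1 arabic numbering
"""

for item in parse_items(sample):
    print(item)
```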

> Other random thoughts:
> * The [blah] notation is good, but needs to be well defined.  eg,
> "[module.function]" when used in the context of a package should use
> the same "module scoping" that Python itself uses.

The thing that saves [this] from being problematic is that the format in
which it was introduced presumed that one was going to use a brief
mnemonic as [this] word and end the docstring with a chunk which
explains the cross-references (new keyword: Xrefs ?) and, in particular,
tells the doc-string-reader which [tokens] actually have a translation,
the rest being left as typed; thus, if this paragraph appeared in a
docstring which says how to translate [this] (giving an xref and -
optionally - a text to use (default `this') in place of [this]), the
digested form would duly replace [this] but leave [tokens] as it is.

To further simplify life, I'd understood the [this] keys that are
translatable to insist on [nowhitespace] to save the parser most of its
`this might be an xref' pending decisions - which is why the Xrefs
section needs to at least have the option of specifying the text to be
used in place of [this] as well as the Xref to point it at.  What we're
doing is citation, which is widely done with [].

No need for [this] to be a [module.function] or anything like - the
Xrefs section provides the translation.

Xrefs:
   [gendoc] http://www.python.org/contrib/gendoc/
   [this] http://www.python.org/lists/doc-sig/hideous?with=data&as=you+will The present message
   [copy] string.copy the standard string copy function
   [etc] location sub sti tute

[sorry, all exhibited xrefs are bogus - illustrative only]
I'm sure that's only a minor paraphrase of a spec I saw a while ago on
this list ...

Of course, Xrefs might better be called Bibliography.

We can use as `location' some pythonic reference that can be resolved in
the ways that the suggested module.function semantics point to: indeed,
I would take this as what to try first, falling back on recognising
other stuff as URLs and similar.

> ... However, the use
> of brackets may conflict with people who use inline code (rather than
> an example "block" - maybe something like "@" could be used?
> @module.function@ would be reasonable.

With the above, can we evade this ?
The fact that [citations] are so widely used argues for the [form]; and
the fact that [anything with space in it] isn't a citation should make
all the `ordinary text' and `python denotations' [usages] unproblematic,
while leaving untranslated ones as [literal] uses of [ and ].  If
nothing else, I find my eye latches onto [cite] better than @cite@ ...
and bear in mind that @ has some other magic uses,

parser error - unclosed citation at line 137:
      Sender: eddyw@lsl.co.uk

All told, we seem to have a fairly good spec ... save for some
nitpickery ;^>

Tibs said:
> David (Ascher) - is it time to re-release your initial "docstring
> grammar"
and I confess that's something I'd like to see too.
After all, we have to have someone to play Gdo ...

	Eddy.


From Edward Welbourne <eddyw@lsl.co.uk>  Tue Nov 30 15:32:00 1999
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Tue, 30 Nov 1999 15:32:00 +0000
Subject: [Doc-SIG] docstring grammar (erratum)
In-Reply-To: <E11soNF-0001fY-00@lsls4p>
References: <Pine.WNT.4.04.9911290944390.263-100000@rigoletto.ski.org>
 <006b01bf3ae3$25b6fe00$0501a8c0@bobcat> <E11soNF-0001fY-00@lsls4p>
Message-ID: <E11spG4-0001u0-00@lsls4p>

I said:
    ~ and output formatters can choose different symbols
      in place of the star for successive nesting layers
and
      ~ and if we're insisting on all items in a list having the same
        bullet, does it make sense to allow items after the first to
        just use an unadorned star meaning re-use of first item's
        symbol, thus saving us lots of editing when we want to change
        the symbol in use by a list, or shuffle an item from a sub-list
        out into its parent list (or etc.)

but `unadorned star' should be `unadorned twiddle' - I missed a
conversion after being persuaded that *'s font role prohibits its use
as, for instance, *o or *1, which would match `begin italic': hence the
use of ~ and remarks about other candidates.  Likewise, in the first,
the presumption was that * is the default symbol, but I don't imagine
we'd be using ~ as a bullet much (well, we could), so that snippet
should have vanished.  The output formatters choose symbols as
appropriate: the parser just identifies the list structure and which
bits are subordinate to which others.

	Eddy.


From mal@lemburg.com  Tue Nov 30 16:58:36 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 30 Nov 1999 17:58:36 +0100
Subject: [Doc-SIG] docstring grammar
References: <Pine.WNT.4.04.9911290944390.263-100000@rigoletto.ski.org>
 <006b01bf3ae3$25b6fe00$0501a8c0@bobcat> <E11soNF-0001fY-00@lsls4p>
Message-ID: <3844023C.41B12CDD@lemburg.com>

Edward Welbourne wrote:
> 
> > Thus:
> > * Any line starting with a word followed by a colon can be considered
> > a keyword.  If you don't want this, just make sure it's not the first
> > word on the line.
> 
> Not happy.  A paragraph of text which precedes an example may be relied
> upon to end in `for example:', in which the last contiguous block of
> non-space characters is of length 8; if I modify an earlier part of the
> paragraph, I'm going to ask my authoring tool (python-mode.el) to
> reformat the paragraph, without necessarily being aware of a gotcha
> waiting for me at the paragraph's end; my margins will be within 72
> characters of one another, giving a roughly 1 in 9 chance that
> `example:' ends up being alone on the last line ... gotcha.
> 
> A cure for this would just be to do keyword-recognition case
> sensitively, and Capitalise keywords; otherwise, we have to insist on
> either a dedent or a blank line preceding any keyword.  Which offends
> folk worse: case sensitivity or needing a dedent/vspace ?

Why not just raise an exception ? I don't think that the
usage of "some text:" is common in doc strings except for
maybe examples which should then be adapted to use the new
"Example:" keyword.

Here's an example docstring... the format looks pretty nice,
IMHO.

"""
foo(bar,rab,oof) -> integer -- single line description

Longer description spanning
multiple lines

Arguments:
    bar -- some string
    rab -- another string
    oof -- an integer       

Returns:
    42 in most cases

History:
    19991130 MAL -- Added oof argument
    19991101 MAL -- Created

"""

Not sure if this is already somewhere in the proposal, but
I would like to see '--' as indicator of a single line
text block. This would be useful in vertically compressing
the docstrings somewhat (and it is already being used in the
signature line for such a purpose).
 
> > * A star or dash starting a line can be considered a new list item.
> > Again, if it is truly a hyphen or whatever else, just adjust your line
> > wrap slightly so it is no longer the first word.
> 
> Alternatively, all lists use the same `item-introducer' character and
> follow it with an optional character indicating what bullet to use.
> Thus one might have (taking ~ as the introducer for the illustration)
> 
> ...

Let's leave this to some list parser (are we starting to head
for NP-completeness again ;-).

> > Other random thoughts:
> > * The [blah] notation is good, but needs to be well defined.  eg,
> > "[module.function]" when used in the context of a package should use
> > the same "module scoping" that Python itself uses.

Right. It should ideally perform the same lookup as Python would
in the global namespace. The resulting object could then either
be handled recursively by the doc tool or simply stored by reference
for later use (e.g. via the file name of a module or the id of an
object).
 
> The thing that saves [this] from being problematic is that the format in
> which it was introduced presumed that one was going to use a brief
> mnemonic as [this] word and end the docstring with a chunk which
> explains the cross-references (new keyword: Xrefs ?) and, in particular,
> tells the doc-string-reader which [tokens] actually have a translation,
> the rest being left as typed; thus, if this paragraph appeared in a
> docstring which says how to translate [this] (giving an xref and -
> optionally - a text to use (default `this') in place of [this]), the
> digested form would duly replace [this] but leave [tokens] as it is.
> 
> To further simplify life, I'd understood the [this] keys that are
> translatable to insist on [nowhitespace] to save the parser most of its
> `this might be an xref' pending decisions - which is why the Xrefs
> section needs to at least have the option of specifying the text to be
> used in place of [this] as well as the Xref to point it at.  What we're
> doing is citation, which is widely done with [].
> 
> No need for [this] to be a [module.function] or anything like - the
> Xrefs section provides the translation.
> 
> Xrefs:
>    [gendoc] http://www.python.org/contrib/gendoc/
>    [this] http://www.python.org/lists/doc-sig/hideous?with=data&as=you+will The present message
>    [copy] string.copy the standard string copy function
>    [etc] location sub sti tute
> 
> [sorry, all exhibited xrefs are bogus - illustrative only]
> I'm sure that's only a minor paraphrase of a spec I saw a while ago on
> this list ...
> 
> Of course, Xrefs might better be called Bibliography.

Or perhaps "References:" as in David's proposal ?!

> We can use as `location' some pythonic reference that can be resolved in
> the ways that the suggested module.function semantics point to: indeed,
> I would take this as what to try first, falling back on recognising
> other stuff as URLs and similar.
> 
> > ... However, the use
> > of brackets may conflict with people who use inline code (rather than
> > an example "block" - maybe something like "@" could be used?
> > @module.function@ would be reasonable.
> 
> With the above, can we evade this ?
> The fact that [citations] are so widely used argues for the [form]; and
> the fact that [anything with space in it] isn't a citation should make
> all the `ordinary text' and `python denotations' [usages] unproblematic,
> while leaving untranslated ones as [literal] uses of [ and ].  If
> nothing else, I find my eye latches onto [cite] better than @cite@ ...
> and bear in mind that @ has some other magic uses,
> 
> parser error - unclosed citation at line 137:
>       Sender: eddyw@lsl.co.uk
> 
> All told, we seem to have a fairly good spec ... save for some
> nitpickery ;^>

Since [] is only used for lists in Python, we could
define the RE '\[[a-zA-Z0-9_.]+\]' for our purposes and
raise an exception in case the enclosed reference cannot
be mapped to a symbol in the global namespace (note: no
whitespace, no commas) which either evaluates to a function,
method, module or reference object.
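
A minimal sketch of that RE plus a lookup, assuming the namespace is
handed over as a plain dict. The function name resolve and the use of
LookupError are choices made for this illustration only, and the check
that the object is a function, method or module is left out.

```python
import os
import re

# The RE proposed above: no whitespace and no commas inside the brackets.
REFERENCE_RE = re.compile(r'\[[a-zA-Z0-9_.]+\]')


def resolve(reference, namespace):
    """Look up a dotted name such as '[os.path.join]' in a namespace dict.

    Raises LookupError when the name cannot be mapped to an object.
    """
    parts = reference.strip('[]').split('.')
    try:
        target = namespace[parts[0]]
        for attribute in parts[1:]:
            target = getattr(target, attribute)
    except (KeyError, AttributeError):
        raise LookupError('unresolved reference: ' + reference)
    return target


text = "see [os.path.join], but a list such as [a, b] is left alone"
found = REFERENCE_RE.findall(text)
print(found)
print(resolve(found[0], {'os': os}))
```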

Doc strings like "...use [None]*10 as argument..." will fail,
but are easily avoided by inserting some extra whitespace, e.g.
"...use [ None ] * 10 as argument...".

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    31 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From da@ski.org  Tue Nov 30 17:27:43 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 09:27:43 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <3844023C.41B12CDD@lemburg.com>
Message-ID: <Pine.WNT.4.04.9911300919130.217-100000@rigoletto.ski.org>

Mark Hammond:

> Thus: * Any line starting with a word followed by a colon can be
> considered a keyword.  If you don't want this, just make sure it's not
> the first word on the line.

I agree with Edward on this one -- this is too fragile.

I consider the whitespace issue to be real only in the context of lists,
and I think that gendoc has shown that it's solvable within the context of
lists.  I stand by the keyword notation I presented:  either

   Keyword:
     text block
     spanning one or more lines 

or

   Keyword: one-line block

as long as they are both in separate paragraphs.

> Not sure if this is already somewhere in the proposal, but
> I would like to see '--' as indicator of a single line
> text block. This would be useful in vertically compressing
> the docstrings somewhat (and it is already being used in the
> signature line for such a purpose).

Isn't that just redundant with the : notation?  Note that I don't mind a
little redundancy, but it's unpythonic.  

> > > * A star or dash starting a line can be considered a new list item.
> > > Again, if it is truly a hyphen or whatever else, just adjust your line
> > > wrap slightly so it is no longer the first word.
> > 
> > Alternatively, all lists use the same `item-introducer' character and
> > follow it with an optional character indicating what bullet to use.
> > Thus one might have (taking ~ as the introducer for the illustration)
> > 
> > ...
> 
> Let's leave this to some list parser (are we starting to head
> for NP-completeness again ;-).

Absolutely!

Mark:
> Other random thoughts:
> * The [blah] notation is good, but needs to be well defined.  eg,

MAL:

> Right. It should ideally perform the same lookup as Python would
> in the global namespace. The resulting object could then either
> be handled recursively by the doc tool or simply stored by reference
> for later use (e.g. via the file name of a module or the id of an
> object).

Edward:
> The thing that saves [this] from being problematic is that the format in
> which it was introduced presumed that one was going to use a brief
> mnemonic as [this] word and end the docstring with a chunk which
> explains the cross-references (new keyword: Xrefs ?) 

I think that both are needed.  I believe that the namespaces looked up
should be:
  1) the local namespace of the docstring -- i.e., the set of keywords
     defined in the "References" keyword block in the current docstring.
  2) the global namespace of the docstrings -- i.e. the set of keywords 
     defined in the "References" keyword block in the MODULE docstring.
  3) The global Python namespace for that module
  4) Some namespace corresponding to builtins & unimported modules, yet
     ill-defined.

The point of 2) is that I often want to introduce references that I use in
a given module at the level of a docstring, but then want to refer to
those documents in specific function docstrings.
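
The lookup order above could be sketched with collections.ChainMap from
later Python standard libraries. The function name reference_namespace
and the sample dicts are invented for illustration, and the builtins
module stands in for the still ill-defined fourth namespace.

```python
import builtins
from collections import ChainMap


def reference_namespace(local_refs, module_refs, module_globals):
    """Chain the four namespaces in the order listed above."""
    return ChainMap(local_refs, module_refs, module_globals, vars(builtins))


# Made-up sample data: each value just says where the name was found.
local_refs = {'spec': 'References block of this docstring'}
module_refs = {'spec': 'References block of the module docstring',
               'gendoc': 'References block of the module docstring'}
module_globals = {'parse': 'global name in the module'}

namespace = reference_namespace(local_refs, module_refs, module_globals)
for name in ('spec', 'gendoc', 'parse', 'len'):
    print(name, '->', namespace[name])
```

A name defined in both References blocks resolves to the local one, which
is the point of looking at the docstring's own block first.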

(Good thing we don't have to worry about garbage collection with these
circular references =)

> Since [] is only used for lists in Python, we could
> define the RE '\[[a-zA-Z0-9_.]+\]' for our purposes and
> raise an exception in case the enclosed reference cannot
> be mapped to a symbol in the global namespace (note: no
> whitespace, no commas) which either evaluates to a function,
> method, module or reference object.
> 
> Doc strings like "...use [None]*10 as argument..." will fail,
> but are easily avoided by inserting some extra whitespace, e.g.
> "...use [ None ] * 10 as argument...".

I like that bit, especially since the 'complete' tagging of that example
would wrap [None]*10 in whatever inline code markup is chosen.

--david


From da@ski.org  Tue Nov 30 17:28:35 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 09:28:35 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <E11soNF-0001fY-00@lsls4p>
Message-ID: <Pine.WNT.4.04.9911300927570.217-100000@rigoletto.ski.org>

On Tue, 30 Nov 1999, Edward Welbourne wrote:

> Tibs said:
> > David (Ascher) - is it time to re-release your initial "docstring
> > grammar"
> and I confess that's something I'd like to see too.
> After all, we have to have someone to play Gdo ...

I must have missed Tibs' posting.  I agree, and I'll try to do that ASAP.

--david


From da@ski.org  Tue Nov 30 17:35:30 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 09:35:30 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <007b01bf3b0e$45cdcf40$0501a8c0@bobcat>
Message-ID: <Pine.WNT.4.04.9911300933110.263-100000@rigoletto.ski.org>

On Tue, 30 Nov 1999, Mark Hammond wrote:

> I more had in mind:
> 
> if sys.doc_building:
>   # Normally critical we do this.
>   dont_do_something_really_expensive()
>
> We don't need to execute the bulk of the code, just import the module
> and get a few of the symbols.

But lots of modules currently do everything in the leftmost column
(they're called "scripts" =).  Some of them never end (they're called
"daemons" =).  I don't want to force someone to take their 'global' code
and put it in a function just to get around the docstring tool.  Anyway,
the point is moot, as one or the other solution will work, depending on
the script.
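
For scripts where the flag approach does fit, a sketch might look like
this. sys.doc_building is the hypothetical attribute from Mark's message,
not something Python provides, and main() is a stand-in for the script's
real work.

```python
import sys


def doc_building():
    """True when a (hypothetical) doc tool has set sys.doc_building."""
    return getattr(sys, 'doc_building', False)


def main():
    # Stand-in for whatever the script normally does at import time.
    return 'expensive work done'


# A script that does everything in the leftmost column can skip its work
# while the docstring tool merely imports it to collect symbols.
result = None if doc_building() else main()
print(result)
```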

--david


From Edward Welbourne <eddyw@lsl.co.uk>  Tue Nov 30 17:34:28 1999
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Tue, 30 Nov 1999 17:34:28 +0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <3844023C.41B12CDD@lemburg.com>
References: <Pine.WNT.4.04.9911290944390.263-100000@rigoletto.ski.org>
 <006b01bf3ae3$25b6fe00$0501a8c0@bobcat> <E11soNF-0001fY-00@lsls4p>
 <3844023C.41B12CDD@lemburg.com>
Message-ID: <E11srAa-0002Kr-00@lsls4p>

> Since [] is only used for lists in Python, we could
> define the RE '\[[a-zA-Z0-9_.]+\]' for our purposes and
> raise an exception in case the enclosed reference cannot
> be mapped to a symbol in the global namespace (note: no
> whitespace, no commas) which either evaluates to a function,
> method, module or reference object.

umm ... hang on, two things seem stirred up here.  The proposal I
remember from ages ago and tried to echo has [token] and the token
doesn't have to be intelligible to the python engine: elsewhere in the
doc string, we'll have

References:
   [token] reference text

which the parsed docstring uses to decode each use of [token] that
appeared in the docstring.  Here, reference would normally be something
recognised by the python engine (and would be the thing I understand you
to be putting in [brackets]), but the Reference-handler might also cope
with it being, e.g., an URL.  The text that ends the reference becomes
the text of the `anchor' generated: 

-> ... and tried to echo has <a href="reference">text</a> and the token ...

note non-appearance of [token] in the digested form: but if `text' had
been omitted from the Reference spec, [token] is the default text
(e.g. when what you're doing really is a citation and that's just how
you want it to appear).  Then, for any uses of [None] that appear in your
doc string, meaning `the list with one entry, None', it suffices that your
References section doesn't have an entry for [None] - the parsed
docstring will then just say [None] (and not even attempt to wrap an
anchor round it).

The only real relevance to forbidding [spaces within] the citation token
is to ensure that where authors use [square brackets] for parenthetical
remarks or as list denotations, the parser hasn't got to do the piece of
jiggery-pokery that marks it as `maybe a xref' and obliges it to come
back later to settle the maybe once it knows.  This cost will remain for
[None], but it'll be well-defined that the parser marks it as a maybe,
discovers that it isn't and settles on it being just text, not a reference.
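
The resolution rule described above can be sketched roughly as follows.
This is only an illustration in present-day Python, not anyone's actual
implementation: the function name `resolve_refs` and the shape of the
`references` mapping (token -> (target, text)) are assumptions, and no
HTML escaping is attempted.

```python
import re

# Token pattern taken from the quoted proposal: no whitespace, no commas.
TOKEN_RE = re.compile(r'\[([a-zA-Z0-9_.]+)\]')

def resolve_refs(body, references):
    """Replace [token] with an anchor when `references` knows the token.

    `references` (hypothetical shape) maps token -> (target, text).
    If text is None, "[token]" itself becomes the anchor text.
    """
    def substitute(match):
        token = match.group(1)
        if token not in references:
            # The "maybe a xref" turned out to be plain text, e.g. [None].
            return match.group(0)
        target, text = references[token]
        return '<a href="%s">%s</a>' % (target, text or match.group(0))
    return TOKEN_RE.sub(substitute, body)

refs = {"token": ("reference", "text")}
print(resolve_refs("echo has [token] and [None]*5", refs))
```

Anything in brackets with no entry in the mapping passes through
untouched, which is the behaviour wanted for [None].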

Now, it seems to me that what you were describing was slightly different ...
am I merely confused ?

	Eddy.


From da@ski.org  Tue Nov 30 18:02:30 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 10:02:30 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <000b01bf3b07$a465ad40$c92d153f@tim>
Message-ID: <Pine.WNT.4.04.9911300959380.263-100000@rigoletto.ski.org>

On Tue, 30 Nov 1999, Tim Peters wrote:

> Luckily, it almost fits your definition of a paragraph already.  It
> shouldn't be any real effort to declare that ">>>" introduces a
> structureless code paragraph extending until the next all-whitespace etc --
> given that it's a format for Python docstrings, Python's own output deserves
> some special treatment <wink>.

The only question I suppose is whether one should require a keyword (Test:
or other) to keep the top-level syntax trivial, or special-case the
recognition of >>>-beginning paragraphs.

I'm leaning toward the former, as it can evolve to the latter if there is
sufficient call for it from the user base, and I think it does keep the
code simpler.  But I'm willing to be swayed.
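
For comparison, special-casing >>>-beginning paragraphs might look
something like the sketch below.  It is written in present-day Python
and rests on assumptions: the name `split_paragraphs` and the
"text"/"code" tags are invented for illustration, and a paragraph is
taken to end at the next all-whitespace line, as in Tim's description.

```python
import inspect
import re
import textwrap

def split_paragraphs(docstring):
    """Split a docstring into paragraphs and tag each one.

    A paragraph whose first line starts with ">>>" is tagged "code";
    everything else is tagged "text".
    """
    paragraphs = []
    # cleandoc() removes the docstring's common indentation first.
    for chunk in re.split(r'\n\s*\n', inspect.cleandoc(docstring)):
        if not chunk.strip():
            continue
        lines = textwrap.dedent(chunk).splitlines()
        kind = "code" if lines[0].startswith(">>>") else "text"
        paragraphs.append((kind, lines))
    return paragraphs

doc = "len(object) -> integer\n\n    Return the number of items.\n\n        >>> len([1, 2, 3])\n        3\n    "
for kind, lines in split_paragraphs(doc):
    print(kind, lines)
```

The keyword variant would only change the test on the first line, so
the two options differ little in code size in this sketch.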

--david


From uche.ogbuji@fourthought.com  Tue Nov 30 18:07:51 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 30 Nov 1999 11:07:51 -0700
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: Your message of "Sat, 27 Nov 1999 09:17:37 PST."
 <Pine.WNT.4.05.9911270912070.186-100000@david.ski.org>
Message-ID: <199911301807.LAA01801@localhost.localdomain>

> > Are you serious about the above ??? No one is going to write that
> > in his docstrings...
> 
> It's not my favorite, but Uche mentioned that XML-ish syntax is much
> easier to parse.  While I don't really grant that point (or rather I think
> that the hill needs to be climbed once for all), I want to emphasize:

Huh?  Where? What? When? WHO?

I'm sure I _explicitly_ said that XML in doc-strings is a bad idea.


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


From da@ski.org  Tue Nov 30 18:12:03 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 10:12:03 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <199911301807.LAA01801@localhost.localdomain>
Message-ID: <Pine.WNT.4.04.9911301007330.263-100000@rigoletto.ski.org>

On Tue, 30 Nov 1999 uche.ogbuji@fourthought.com wrote:

> > It's not my favorite, but Uche mentioned that XML-ish syntax is much
> > easier to parse.  While I don't really grant that point (or rather I think
> > that the hill needs to be climbed once for all), I want to emphasize:
> 
> Huh?  Where? What? When? WHO?
> 
> I'm sure I _explicitly_ said that XML in doc-strings is a bad idea.

Indeed.  I was referring to the bit where you said:

> The reality, though, is that it's easier to go from XML or TeX to any
> of the many formats Python users want than it would be from
> Jim-Fulton-David-Ascher pythonic documentation format.  

I apologize if I misunderstood or misquoted you.

I believe that the docstring syntax being discussed would be mappable to
some form of XML, which I think is what we all agree is a good idea, so
that the docstrings can be used to build library docs.  Please jump in if
something is apparently agreed to which would make this hard!

--david


From friedrich@pythonpros.com  Tue Nov 30 18:24:50 1999
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Tue, 30 Nov 1999 12:24:50 -0600
Subject: [Doc-SIG] docstring grammar
References: <Pine.WNT.4.04.9911290944390.263-100000@rigoletto.ski.org>           <006b01bf3ae3$25b6fe00$0501a8c0@bobcat> <E11soNF-0001fY-00@lsls4p>           <3844023C.41B12CDD@lemburg.com> <E11srAa-0002Kr-00@lsls4p>
Message-ID: <004901bf3b60$30f49de0$f25728a1@UNITEDSPACEALLIANCE.COM>

Ed is correct.

Gendoc solved the HREF problem with:
"...An addition was made to support hypertext references. Hypertext
references are marked with double quotes in the body of the doc string. At
the end of the doc string will be a matching line starting with two dots '..
' and a space followed by the same quoted text and then followed by the
mapping (URL). This is patterned after the footnote notion in setext but is
easier on the eyes. For example, "Pythonland" will be marked as a
hyper-reference to Python.org. If no matching trailing reference is found
then nothing is done."

Which might be modified with current thinking to yield:
"""
Mark refs with [brackets], and at the end of the doc string place the
annotations, a la bibliography, one per line. Key "brackets" is placed in the
local namespace and used by other (lower) doc strings. In the gendoc
implementation, if the key doesn't match anything stored in the ref mapping
no markup is done, so that things like [None]*5 are safe and no exception
need be raised.

[brackets] -> http://www.howto.python.org/rtfm.html
"""
-Robin


From friedrich@pythonpros.com  Tue Nov 30 19:22:55 1999
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Tue, 30 Nov 1999 13:22:55 -0600
Subject: [Doc-SIG] docstring grammar
References: <Pine.WNT.4.04.9911300959380.263-100000@rigoletto.ski.org>
Message-ID: <005301bf3b68$4df4f540$f25728a1@UNITEDSPACEALLIANCE.COM>

----- Original Message -----
From: David Ascher <da@ski.org>
To: <doc-sig@python.org>
Sent: Tuesday, November 30, 1999 12:02 PM
Subject: RE: [Doc-SIG] docstring grammar


> On Tue, 30 Nov 1999, Tim Peters wrote:
>
> > Luckily, it almost fits your definition of a paragraph already.  It
> > shouldn't be any real effort to declare that ">>>" introduces a
> > structureless code paragraph extending until the next all-whitespace etc --
> > given that it's a format for Python docstrings, Python's own output deserves
> > some special treatment <wink>.
>
> The only question I suppose is whether one should require a keyword (Test:
> or other) to keep the top-level syntax trivial, or special-case the
> recognition of >>>-beginning paragraphs.
>
> I'm leaning for the former, as it can evolve to the latter if there is
> sufficient call for it from the user base, and I think it does keep the
> code simpler.  But I'm willing to be swayed.
>
> --david
<sway>
I would rather minimize the invention (and consequential memorization) of
special keywords. Parsing them is not made quite as trivial as it seems
(especially when alternate languages are involved). Structured text had the
favorable trait of being very easy to remember. Parsers are built using
formal definition of special case rules anyway. Where special casing based
on context becomes non-obvious to remember is where I would draw the line
and resort to literal keywords.
</sway>
-Robin


From fdrake@acm.org  Tue Nov 30 19:27:08 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 30 Nov 1999 14:27:08 -0500 (EST)
Subject: [Doc-SIG] Party!
Message-ID: <14404.9484.505714.922927@weyr.cnri.reston.va.us>

  Well, I turn my back for a few days of turkey-feasting and
kid-chasing, and what do I find when I turn back around?
  Great party on the list!  I'll try and actually read this before I
write too many posts.  ;-)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From da@ski.org  Tue Nov 30 19:58:05 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 11:58:05 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <004901bf3b60$30f49de0$f25728a1@UNITEDSPACEALLIANCE.COM>
Message-ID: <Pine.WNT.4.04.9911301148330.263-100000@rigoletto.ski.org>

On Tue, 30 Nov 1999, Robin Friedrich wrote:

> """
> Marking refs with [brackets], and at the end of the doc string place the
> annotations ala bibliography one per line. Key "brackets" is placed in the
> local namespace and used by other (lower) doc strings. In the gendoc
> implementation if the key doesn't match anything stored in the ref mapping
> no markup is done, so that things like [None]*5 are safe and no exception
> need be raised.
> 
> [brackets] -> http://www.howto.python.org/rtfm.html
> """

Nicely said.  I'd like to point out that the transformation I had in mind
is in fact, given the above and an HTML output:

[brackets] -> <a href="http://www.howto.python.org/rtfm.html">brackets</a>

In other words the keyword is kept until the rendering stage. I suppose
that it might be necessary to allow the reference to define a different
bit of text to render instead of the keyword.

So given:

  """
  ...
  References:

     PythonDotOrg: 
       Text: "Python's Main Website"
       Link: http://www.python.org
  """

we could have:

[PythonDotOrg] -> <a href="http://www.python.org">Python's Main Website</a>

Or not.  Luckily I think that issue can be left to the 'bibliography
engine', just like the bullet processing can be left to the 'list engine'.

--david

PS: I would suggest that the 'if no key exists, no markup is done'
    behavior be modifiable at runtime to 'a warning is emitted', as I
    think that this sort of silent behavior is problematic given the
    presence of typos in the world.
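
A rough sketch of that switchable behaviour, in present-day Python.
Everything named here is an assumption made for illustration: the
function `render_html`, the `warn_missing` flag, and the dictionary
layout (key -> {"Text": ..., "Link": ...}) modelled on the References
example above.  No HTML escaping is done.

```python
import re
import warnings

REF_RE = re.compile(r'\[([^\[\]]+)\]')

def render_html(text, refs, warn_missing=False):
    """Render [key] as an anchor; optionally warn about unknown keys."""
    def substitute(match):
        key = match.group(1)
        entry = refs.get(key)
        if entry is None:
            if warn_missing:
                warnings.warn("no reference entry for [%s]" % key)
            return match.group(0)  # leave the text alone
        # Fall back to the key itself when no Text is given.
        label = entry.get("Text", key)
        return '<a href="%s">%s</a>' % (entry["Link"], label)
    return REF_RE.sub(substitute, text)

refs = {"PythonDotOrg": {"Text": "Python's Main Website",
                         "Link": "http://www.python.org"}}
print(render_html("See [PythonDotOrg] or [PythonDotOgr].", refs,
                  warn_missing=True))
```

With the flag off, the misspelled key passes through silently; with it
on, the typo is at least reported.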


From fdrake@acm.org  Tue Nov 30 20:28:18 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 30 Nov 1999 15:28:18 -0500 (EST)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <004101bf3a69$535a09d0$0501a8c0@bobcat>
References: <E11sPCi-00056h-00@lsls4p>
 <004101bf3a69$535a09d0$0501a8c0@bobcat>
Message-ID: <14404.13154.994319.599412@weyr.cnri.reston.va.us>

Mark Hammond writes:
 > Me too - thanks Fred!  The doc is excellent and a thankless task!

  You're welcome!

  (Wow, all that talk about how thankless the job is, and this is the
first thank-you in the thread!  One free doc download for Mark!  ;)


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives


From skip@mojam.com (Skip Montanaro)  Tue Nov 30 20:39:17 1999
From: skip@mojam.com (Skip Montanaro)
Date: Tue, 30 Nov 1999 14:39:17 -0600 (CST)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <Pine.WNT.4.04.9911301148330.263-100000@rigoletto.ski.org>
References: <004901bf3b60$30f49de0$f25728a1@UNITEDSPACEALLIANCE.COM>
 <Pine.WNT.4.04.9911301148330.263-100000@rigoletto.ski.org>
Message-ID: <14404.13813.670091.131230@dolphin.mojam.com>

In David's original proposal he wrote:

    For compatibility with Guido, IDLE and Pythonwin (and increasing the
    likelihood that the proposal will be accepted by GvR), the docstrings of
    callables must follow the following convention established in Python's
    builtins:
    
         >>> print len.__doc__
         len(object) -> integer
    
         Return the number of items of a sequence or mapping.
    
    In other words, the first paragraph must fit on a line, repeat the name
    of the callable, with a 'wordy' signature, the ' -> ' string, and the
    type of the return value.

Chiming in rather late.  Perhaps this was already discussed, but I didn't
see it in the immediate followups to David's original proposal...

The one complaint I have with the wordy signature is that it partially types
the function.  It specifies a return type, but not the input parameter
types.  Why go only halfway?  I suggest you either use type names for
parameters and return value or annotate the parameter names with types:

    len(o:sequence) -> IntType

There should be a couple of shorthands, for instance, using "sequence",
"mapping" or "number" to represent objects that exhibit the given behavior,
or "object" to represent an arbitrary (untyped) parameter or return value.
Otherwise, I'd suggest the types be the names defined by the types module.
Of course, I'm ignoring the types of the elements of aggregate types.  I'll
let someone smarter make a more concrete proposal in this regard.
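
To give a feel for how parseable such a line is, here is a small sketch
in present-day Python.  The function name `parse_signature`, the regular
expression, and the choice to default untyped parameters to "object"
are all assumptions for illustration; aggregate element types are
ignored, as above.

```python
import re

# name(args) -> returntype, all on one line
SIG_RE = re.compile(r'^\s*(\w+)\((.*)\)\s*->\s*(\w+)\s*$')

def parse_signature(line):
    """Return (name, [(param, type), ...], return_type), or None."""
    match = SIG_RE.match(line)
    if match is None:
        return None
    name, arglist, returns = match.groups()
    params = []
    for arg in filter(None, (a.strip() for a in arglist.split(","))):
        pname, _, ptype = arg.partition(":")
        # An unannotated parameter is treated as an arbitrary object.
        params.append((pname.strip(), ptype.strip() or "object"))
    return name, params, returns

print(parse_signature("len(o:sequence) -> IntType"))
```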

Why worry about this?  Well, people have been asking over and over for type
information.  This looks parseable to me, doesn't change the language, yet
could be used by a type inferencer, "safer" compiler or other type-oriented
tools.

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...


From friedrich@pythonpros.com  Tue Nov 30 20:38:45 1999
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Tue, 30 Nov 1999 14:38:45 -0600
Subject: [Doc-SIG] docstring grammar
References: <Pine.WNT.4.04.9911301148330.263-100000@rigoletto.ski.org>
Message-ID: <006901bf3b72$f2ca06a0$f25728a1@UNITEDSPACEALLIANCE.COM>

From: David Ascher <da@ski.org>
> On Tue, 30 Nov 1999, Robin Friedrich wrote:
>
> > """
> > Marking refs with [brackets], and at the end of the doc string place the
> > annotations ala bibliography one per line. Key "brackets" is placed in the
> > local namespace and used by other (lower) doc strings. In the gendoc
> > implementation if the key doesn't match anything stored in the ref mapping
> > no markup is done, so that things like [None]*5 are safe and no exception
> > need be raised.
> >
> > [brackets] -> http://www.howto.python.org/rtfm.html
> > """
>
> Nicely said.  I'd like to point out that the transformation I had in mind
> is in fact, given the above and an HTML output:
>
> [brackets] -> <a href="http://www.howto.python.org/rtfm.html">brackets</a>

grumble grumble...see below.
>
> In other words the keyword is kept until the rendering stage. I suppose
> that it might be necessary to allow the reference to define a different
> bit of text to render instead of the keyword.

Why? keywords are arbitrary strings. (may include spaces, etc.)
>
> So given:
>
>   """
>   ...
>   References:
>
>      PythonDotOrg:
>        Text: "Python's Main Website"
>        Link: http://www.python.org
>   """
>
> we could have:
>
> [PythonDotOrg] -> <a href="http://www.python.org">Python's Main Website</a>
>
> Or not.  Luckily I think that issue can be left to the 'bibliography
> engine', just like the bullet processing can be left to the 'list engine'.

Yes. However I really don't like the idea of HTML finding its way into the
doc string. The BiblioEngine would be told the information of the reference
and, along with what rendering mode she is in, emit the appropriate output
format, be it HTML, XML, PDF, etc.
>
> --david
>
> PS: I would suggest that the 'if no key exists, no markup is done'
>     behavior be modifiable at runtime to 'a warning is emitted', as I
>     think that this sort of silent behavior is problematic given the
>     presence of typos in the world.

Agreed.


From da@ski.org  Tue Nov 30 20:56:10 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 12:56:10 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <006901bf3b72$f2ca06a0$f25728a1@UNITEDSPACEALLIANCE.COM>
Message-ID: <Pine.WNT.4.04.9911301251060.263-100000@rigoletto.ski.org>

On Tue, 30 Nov 1999, Robin Friedrich wrote:

> > Nicely said.  I'd like to point out that the transformation I had in mind
> > is in fact, given the above and an HTML output:
> >
> > [brackets] -> <a href="http://www.howto.python.org/rtfm.html">brackets</a>
> 
> grumble grumble...see below.
> >
> > In other words the keyword is kept until the rendering stage. I suppose
> > that it might be necessary to allow the reference to define a different
> > bit of text to render instead of the keyword.
> 
> Why? keywords are arbitrary strings. (may include spaces, etc.)

We should watch our language =).  Keywords in my proposal are things
before :'s which lead a paragraph and cannot contain whitespace. Maybe we
don't need that restriction on things in []'s.

> >   References:
> >
> >      PythonDotOrg:
> >        Text: "Python's Main Website"
> >        Link: http://www.python.org

> Yes. However I really don't like the idea of HTML finding its way into
> the doc string. The BiblioEngine would be told the information of the reference
> and, along with what rendering mode she is in, emit the appropriate output
> format, be it HTML, XML, PDF, etc.

I don't recall putting HTML in the docstring.  Just a URL.


From da@ski.org  Tue Nov 30 21:01:23 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 13:01:23 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <14404.13813.670091.131230@dolphin.mojam.com>
Message-ID: <Pine.WNT.4.04.9911301257070.263-100000@rigoletto.ski.org>

On Tue, 30 Nov 1999, Skip Montanaro wrote:

> The one complaint I have with the wordy signature is that it partially types
> the function.  It specifies a return type, but not the input parameter
> types.  Why go only halfway?  I suggest you either use type names for
> parameters and return value or annotate the parameter names with types:
> 
>     len(o:sequence) -> IntType

I propose to defer this discussion.  I think it's a fine idea in general,
but raises a whole bunch of issues, and mixes with other threads like
typing etc.  Furthermore, the current uses of this first line (popups in
IDLE and Pythonwin) might suffer from a significant lengthening of said
line.  Getting the type information in the docstring is however a worthy
goal, but perhaps best left for a subsection:

  Arguments:
     o (sequence) -- an arbitrary sequence object
     
I'd like to finalize the top-level structure, get it in front of GvR's
eyeballs, and then we can tackle each subtopic (so far: list processing,
reference handling, signature, mandatory keywords, keyword registration
process, multilingual keyword support, etc.) at a later date.

--david


From friedrich@pythonpros.com  Tue Nov 30 21:33:42 1999
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Tue, 30 Nov 1999 15:33:42 -0600
Subject: [Doc-SIG] docstring grammar
References: <Pine.WNT.4.04.9911301251060.263-100000@rigoletto.ski.org>
Message-ID: <008501bf3b7a$93855160$f25728a1@UNITEDSPACEALLIANCE.COM>

My bad.
----- Original Message -----
From: David Ascher <da@ski.org>
> > > [brackets] -> <a href="http://www.howto.python.org/rtfm.html">brackets</a>

I was interpreting the above as a doc string rewrite of my
[brackets] -> http://www.howto.python.org/rtfm.html
*in* the doc string.  Sorry.

> > Why? keywords are arbitrary strings. (may include spaces, etc.)
>
> We should watch our language =).  Keywords in my proposal are things
> before :'s which lead a paragraph and cannot contain whitespace. Maybe we
> don't need that restriction on things in []'s.
>
> > >   References:
> > >
> > >      PythonDotOrg:
> > >        Text: "Python's Main Website"
> > >        Link: http://www.python.org

Hmmm.  Gosh we need a glossary quick! Yup, we had different notions of
"keyword".
Do you really want arbitrary DAkeywords (stuff before colons) usable for
internal/external references?  Since this confused me, I might conclude that
it would confuse others as well.
I would have placed the following in my doc string and been satisfied...
""".....
    For further information visit:
        [Python Language Web Site] is the main source for Python itself.
        [Starship Python] houses a number of Python user resources.

[Python Language Web Site] -> http://www.python.org
[Starship Python] -> http://starship.python.net
"""
Intuitively I don't think of the word "visit" as a keyword that can be
referenced, while anything in brackets seems fair game. What other features
did you have in mind?
Dejavu'ly yours,
Robin


From uche.ogbuji@fourthought.com  Tue Nov 30 22:01:04 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 30 Nov 1999 15:01:04 -0700
Subject: [Doc-SIG] docstring grammar
In-Reply-To: Your message of "Sun, 28 Nov 1999 16:57:03 PST."
 <Pine.WNT.4.05.9911281649360.202-100000@david.ski.org>
Message-ID: <199911302201.PAA02300@localhost.localdomain>

> Proposed format for docstrings:
> 
>   The whitespace at the beginning of a docstring is ignored.
> 
>   Paragraphs are separated by one or more blank lines.
> 
>   For compatibility with Guido, IDLE and Pythonwin (and increasing the
>   likelihood that the proposal will be accepted by GvR), the
>   docstrings of callables must follow the following convention
>   established in Python's builtins:
> 
>        >>> print len.__doc__
>        len(object) -> integer
> 
>        Return the number of items of a sequence or mapping.

The only thing I'd _maybe_ suggest in order to allow some structure is to 
eliminate the non-keyword sections:

        >>> print len.__doc__
        sig:: len(object) -> integer
 
        desc:: Return the number of items of a sequence or mapping.

I know this loses a bit from the point of view of the user's readability, but 
it would provide some structure which increases the author's flexibility, and 
makes conversion to "library format" easier.

Otherwise, your proposal seems a good start.

> Miscellaneous Thoughts:
> 
>   I chose double-colon notation for keywords so that one can have text
>   paragraphs which match the 'word:' notation without having them be
>   interpreted as keywords.

There are other conventions that would work, but '::' is as good as any.

>   Does this proposal make docstrings whitespace-heavy -- the
>   requirement to break each paragraph with a line of whitespace
>   means that a lot of lines are blank, especially when doing
>   'bulleted lists'

I would suggest dropping the requirement, which can be done if everything is 
keyword-modified.

>   The above was (quickly) written with parsing in mind.  Is it really
>   easily parseable?  If not, what needs to be changed so that it is
>   parseable?

I see no major parsing problems.  Bullets might be a bit of a bore, but 
nothing to kill progress.

>   Are there normal uses in docstrings where one wants to turn off the
>   automatic link detection?

I think we can come up with a basic escaping mechanism for this.  Maybe by 
preceding not-to-be-processed URLs and link keywords with '!'.
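
One possible reading of the '!' escape, sketched in present-day Python.
The function name `mark_links` and the `<ref:...>` placeholder output
are invented for illustration only; URLs are not handled, just
bracketed link keywords.

```python
import re

# An optional '!' immediately before the bracket turns processing off.
LINK_RE = re.compile(r'(!?)\[([A-Za-z0-9_.]+)\]')

def mark_links(text, known):
    """Mark known [key] uses; '![key]' is kept as literal '[key]'."""
    def substitute(match):
        bang, key = match.groups()
        if bang:
            return "[%s]" % key        # escaped: drop the '!' only
        if key in known:
            return "<ref:%s>" % key    # hypothetical placeholder form
        return match.group(0)          # unknown key: leave untouched
    return LINK_RE.sub(substitute, text)

print(mark_links("see [foo], but not ![foo]", {"foo"}))
```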

>   Is there value in having string interpolation?  David Arnold mentioned
> 
>        __version__ = "$Revision$"[11:-2]
>        __date__ = "$Date$"

I'd say leave this to a later version.

> PS: It goes without saying that while I railed against design by
> committee, I am of course hopeful for feedback, for technical reasons
> (dummy, you forgot special cases X, Y and Z!) and because I realize that a
> standards proposal needs at least broad agreement if not consensus to be
> effective in the long run.  The sharper-eyed will note that I stacked the
> deck in my favor in the above proposal by including what Guido does
> naturally as valid in the proposed grammar.

Damn the politics.  Full speed ahead.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org


From da@ski.org  Tue Nov 30 22:06:39 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 14:06:39 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <008501bf3b7a$93855160$f25728a1@UNITEDSPACEALLIANCE.COM>
Message-ID: <Pine.WNT.4.04.9911301344390.263-100000@rigoletto.ski.org>

On Tue, 30 Nov 1999, Robin Friedrich wrote:

> Hmmm.  Gosh we need a glossary quick! Yup, we had different notions of
> "keyword".

  A keyword is a case-sensitive string which:
      - starts a paragraph
      - matches  '^ *[a-zA-Z_]+[\-a-zA-Z_0-9]*: +' 
        (Python identifiers with the addition of hyphens and which end
        with a : and one or more spaces)

As (I think it was) Tibs mentioned, it's syntactic sugar for XML
notation, with the same aim of making a 'labeled' hierarchy.  Maybe the
word 'Label' is better.

  Foo:
    this is the body of foo
    which spans multiple lines

is isomorphic to

  <Foo>
  this is the body of foo
  which spans multiple lines
  </Foo>
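
The mapping can be sketched in present-day Python as below.  This is
an illustration under assumptions, not the proposal's parser: the
function name `labeled_paragraph_to_xml` is made up, and the pattern
is relaxed slightly from the one given above so that a label followed
directly by end-of-line (as in the "Foo:" example) also matches.

```python
import re
import textwrap
from xml.sax.saxutils import escape

# The keyword pattern above, with "(?: +|$)" in place of " +" so that
# a bare "Foo:" line is accepted too (an assumption, see lead-in).
LABEL_RE = re.compile(r'^ *([a-zA-Z_]+[\-a-zA-Z_0-9]*):(?: +|$)')

def labeled_paragraph_to_xml(paragraph):
    """Turn 'Label: body' into '<Label>body</Label>', else return None."""
    lines = paragraph.splitlines()
    if not lines:
        return None
    match = LABEL_RE.match(lines[0])
    if match is None:
        return None
    label = match.group(1)
    rest = lines[0][match.end():]
    body_lines = [rest] if rest else []
    body_lines += textwrap.dedent("\n".join(lines[1:])).splitlines()
    body = "\n".join(escape(line) for line in body_lines)
    return "<%s>\n%s\n</%s>" % (label, body, label)

print(labeled_paragraph_to_xml(
    "  Foo:\n    this is the body of foo\n    which spans multiple lines"))
```

A line such as "For further information visit:" is not a label under
this pattern, because the text before the colon contains spaces.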

> Do you really want arbitrary DAkeywords (stuff before colons) usable for
> internal/external references?  Since this confused me, I might conclude that
> it would confuse others as well.

No.  I intend only the DAKeywords listed in a special "References:"
section to be available as the targets of references (see below).

> I would have placed the following in my doc string and been satisfied...
> """.....
>     For further information visit:
>         [Python Language Web Site] is the main source for Python itself.
>         [Starship Python] houses a number of Python user resources.
> 
> [Python Language Web Site] -> http://www.python.org
> [Starship Python] -> http://starship.python.net
> """

This is, I would assume, harder to parse -- you must have some implicit
rules in there regarding which [Starship Python] is a 'mention of
something else' and which is a 'this is the thing I mentioned'.  Is it the
sequential order, the 0-indent?

My vision for the same semantics as above was:

 """.....
      For further information visit:
         [PythonLanguageWebSite] is the main source for Python itself.
         [StarshipPython] houses a number of Python user resources.

      References:
         PythonLanguageWebSite:  http://www.python.org
         StarshipPython: http://starship.python.net
 """

Which leaves open the question of how we can have 'space-enabled' labels
for references which can't have spaces in them.  

One idea is to tag the [] markup with a ="stringlabel":

         [PythonLanguageWebSite="The Python.org website"] is the main
          source for Python itself.

Another possibility hinted at previously is to enrich the References
section:

     References:
        PythonLanguageWebSite:
          Label: The Python.org website
          Link: http://www.python.org

either of which, when rendered, would 'do the right thing'.  I only expect
this to be an issue when referring to URLs.  Python modules, classes and
functions already have perfectly good names.  For things which are more
like *real* bibliographic references, I'd be just as happy with the
conventional [keyword] notation seen in many CS papers.

     See [ascher29] for the source of the algorithm.

     References:
       ascher29: My famous Ph.D. Dissertation, Foo University, 2029.

which would get rendered just the way it looks on your screen even in a
printed format.

> Intuitively I don't think of the word "visit" as a keyword that can be
> referenced, while anything in brackets seems fair game. What other features
> did you have in mind?

I don't understand the above paragraph.  The word 'visit' isn't a
DAKeyword because it wasn't starting a paragraph.

--david

PS: I'm working on updating the proposal, but I have other pressing
    deadlines (such as getting the JPython tutorial ready for IPC8!), so
    it may not be ready for a couple of days.


From uche.ogbuji@fourthought.com  Tue Nov 30 22:27:16 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 30 Nov 1999 15:27:16 -0700
Subject: [Doc-SIG] docstring grammar
In-Reply-To: Your message of "Mon, 29 Nov 1999 09:50:49 GMT."
 <000701bf3a4f$375a85d0$f0c809c0@lslp7o.lsl.co.uk>
Message-ID: <199911302227.PAA02358@localhost.localdomain>

> David Ascher wrote:
> >   Paragraphs are separated by one or more blank lines.
> 
> As you say later on, I think this does cause some over-use of whitespace...

Agreed.  Let's kill them.

> >   Characters between # signs and the end of the line are stripped by
> >   the docstring parser.
> 
> This is a Bad Thing - I have quite often needed to discuss things in doc
> strings which include use of the "#" character - not least if I'm parsing a
> little language that uses "#" as its comment character! So losing stuff thus
> would be difficult. Either (a) why do we need comments in doc strings, or
> (b) provide a way to escape the "#" character.

I forgot to mention this in my original reply.  I also think that this is a 
bad idea.  I don't think we need meta-comments for the doc-strings.  I don't 
like the idea even if we find a way to escape '#'.

> but the above gets oververbose. I suppose one could instead use a list
> syntax:
> 
> 	Contributors::
> 		- John Doe
> 		- Ronald Reagan
> 		- Francois Mitterand

Yes, and this goes with what David had in his proposal about bullets.

> since I don't see the ambiguity in allowing the omission of the vertical
> whitespace here, *if* one allows that some care would be needed with
> hyphenation! (i.e., one can't allow one's hyphens to start a line, which is
> awkward but probably not too bad). Another possibility might be to allow
> "Python list" syntax - I started off disliking this, but over the last few
> minutes it has grown on me:
> 
> 	Contributors::
> 		[ John Doe,
> 		  Ronald Reagan,
> 		  Francois Mitterand ]
> 
> (again, hijacking Python's syntax).

Again as long as we don't go having meta-compilation in the first version of 
the system.

> No, on thinking about it, I would vote for either:
> 
> 	1) use of white space as David proposes
> 	   (pro: utter simplicity,
> 	    con: doesn't quite look as nice as I'd like)
> 	2) allow Python list syntax
> 	   (pro: emphasises this is for short lists,
> 	    con: a bit odd)
> 	3) detect bullet characters at the "start of line"
> 	   (pro: still fairly simple,
> 	    con: one has to take care about, e.g., dashes in text)
> 	   Ah - I just realised that negative numbers at the start of a line
> 	   probably kill that one...

This one is also a bit ugly, but how about a hybrid:

List
	[
	* item 1
	* item 2
		[
		* sub-item 1
		* sub-item 2
		]
	* item 3
	]
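
A quick sketch of how little code the hybrid form needs, in present-day
Python.  The function name `parse_hybrid_list` and the nested-list
result are assumptions for illustration; each "[" opens a list inside
the current one, each "]" closes it, and "* " lines are items.

```python
def parse_hybrid_list(text):
    """Parse bracket-delimited bullet lists into nested Python lists."""
    root = []
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "[":
            nested = []
            stack[-1].append(nested)
            stack.append(nested)
        elif line == "]":
            if len(stack) == 1:
                raise ValueError("unbalanced ']'")
            stack.pop()
        elif line.startswith("* "):
            stack[-1].append(line[2:])
        else:
            raise ValueError("unexpected line: %r" % raw)
    if len(stack) != 1:
        raise ValueError("unbalanced '['")
    return root

example = "[\n* item 1\n* item 2\n\t[\n\t* sub-item 1\n\t* sub-item 2\n\t]\n* item 3\n]"
print(parse_hybrid_list(example))
```

Because the brackets carry the nesting, indentation is ignored here,
and a line starting with a negative number is simply an error rather
than an accidental bullet.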


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org