From jbkerr@sr.hp.com Sat Nov 6 01:02:47 1999
From: jbkerr@sr.hp.com (James Kerr)
Date: Fri, 5 Nov 1999 17:02:47 -0800 (PST)
Subject: [Doc-SIG] a wish list
Message-ID: <199911060102.RAA02700@joplin.sr.hp.com>

I've been following the messages in this group for a few weeks now.
It seems as though a lot of attention is being focused on the details
of the markup language. This is certainly an important issue, but maybe
the *how* of the markup language will be easier to decide once the *why*
is understood.

For whatever it's worth, I've outlined the ways I would like to use
Python documentation in an ideal world. These items are listed from most
important to least important. Maybe this will provide some hints about
the best way to implement markup.

BTW, I don't know how much of this has been implemented in the latest
version of IDLE, so some of what follows may already be old hat. Also, I
haven't much bothered to distinguish between usage in a GUI environment,
vs. an Emacs-like environment that is text-based but mouse-aware, vs. a
command-line environment.

* Search documentation by keyword or phrase, in any combination of
  Library Reference, FAQ, module descriptions, index, etc. The focus is
  on getting a quick answer to a specific question, not on broad topical
  navigation.

  Example: I'm a recent Python convert, I'm typing in my first script,
  and I want quick answers to questions like

  - how do I access command-line options?
  - how do I convert a string to an int?
  - what is the list of built-in operators?

  Preferred solution: type queries like these into a dialog box and get
  references to specific sections of documentation.

  Acceptable alternatives:

  - consult a permuted index
  - do a keyword query instead of a free-form query

  Comments: If a semantically-based search algorithm is too hard to
  write, a really good permuted index might be useful. All one-line
  class and function summaries could be part of this index, as could
  FAQ entries, annotations in the library reference, etc.
  All it would take is a small army of volunteers to go through the
  documentation and insert index entries at all relevant locations.

* View all available classes, either alphabetically or hierarchically.

  Preferred solution: A presentation that used indentation to show
  parent/child relationships, with links from class names to both
  documentation and source code. Some allowances would have to be made
  for multiple inheritance.

  Other niceties (available only in a GUI environment):

  - pause the cursor over a class reference, and see a 1-line summary
    of the class in a popup window or status line.
  - expand a class (via some kind of mouse or keyboard operation), and
    see a list of all functions defined in the class.
  - pause over a function reference, and see a summary of the function
    in a popup or status line.
  - jump to class or function documentation (or source code) from the
    listing.

* View all the methods that are available in a class. This is a little
  different from the previous item, since inheritance allows you to
  call functions that are defined in a superclass.

  Preferred solution: both of the following:

  - a per-class view that displays the inheritance tree from the
    selected class on upward. The functions defined in each class are
    displayed along with the class.
  - a summary view that just shows the class you're interested in, and
    uses some kind of annotation to distinguish which methods are
    defined in that class, and which are defined in superclasses.

  Acceptable alternative: either of the above.

* Maintain a list of bookmarks.

  Preferred solution: an easy-to-use tool that allows you to

  - bookmark locations precisely (i.e. down to line-in-a-document
    precision).
  - attach symbolic names to groups of bookmarks (for example, "Object
    Database Support")
  - adjust your bookmarks transparently when new documentation rolls
    out.

  Acceptable alternative: devise a scheme that allows a user to define
  his/her own bookmark files with a text editor.
  Comments: I often find myself using a small portion of online docs
  quite heavily when doing a project. It's kind of a pain to have to
  jump back and forth between documents (or parts of documents) to get
  information. The ability to provide some kind of personalized view of
  documentation would be a real plus.

Hope these challenges aren't too trivial to be interesting ;-)

In all seriousness, I think it's great that so much attention is being
given to the documentation effort, because it really could be central
to the success of Python.

-Jim

--
Jim Kerr
Agilent Technologies
1400 Fountaingrove Pkwy, MS 3USZ
Santa Rosa, CA 95403

From mhammond@skippinet.com.au Sat Nov 6 02:25:33 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Sat, 6 Nov 1999 13:25:33 +1100
Subject: [Doc-SIG] a wish list
In-Reply-To: <199911060102.RAA02700@joplin.sr.hp.com>
Message-ID: <004a01bf27fe$346405d0$0501a8c0@bobcat>

James writes:
> For whatever it's worth, I've outlined the ways I would like to
> use Python documentation in an ideal world. These items are listed
> from most important to least important. Maybe this will provide some
> hints about the best way to implement markup.

I can't argue with anything you have written - although what Python
needs (and this effort is no different) is fewer good ideas, and more
code.

I'm afraid there is no army of coders just sitting around waiting for
good ideas to implement. Why not tackle even one item on your wishlist
yourself, and present some code? Then any issues you have with the
current markup scheme are more likely to be listened to, as we then
have _something_ to help put it into context, rather than an abstract
concept about what may happen to be good if anything is ever
implemented...

Mark.
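
As an illustration of the third wish-list item above (viewing every method available in a class, annotated with the class that defines it), here is a minimal sketch in present-day Python. It is not code from anyone on the list; the function name `methods_by_defining_class` and the `Base`/`Derived` classes are hypothetical, and it only looks at plain functions (static and class methods are ignored).

```python
import inspect


def methods_by_defining_class(cls):
    """Return {method name: name of the class in the MRO that defines it}."""
    result = {}
    # Walk from the most general class to the most specific one, so that
    # an override in a subclass replaces the entry of its superclass.
    for klass in reversed(inspect.getmro(cls)):
        if klass is object:
            continue
        for name, value in vars(klass).items():
            if inspect.isfunction(value):
                result[name] = klass.__name__
    return result


# Hypothetical classes used only to demonstrate the summary view.
class Base:
    def open(self): ...
    def close(self): ...


class Derived(Base):
    def close(self): ...
    def flush(self): ...


for name, owner in sorted(methods_by_defining_class(Derived).items()):
    marker = "" if owner == "Derived" else "  (inherited from %s)" % owner
    print(name + marker)
```

The same mapping could feed either of the two views James describes: grouping by owner gives the per-class view, while the annotated flat listing printed here corresponds to the summary view.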
From irmina@ctv.es Sat Nov 6 14:07:40 1999
From: irmina@ctv.es (Manuel Gutierrez Algaba)
Date: Sat, 6 Nov 1999 14:07:40 +0000 (GMT)
Subject: [Doc-SIG] a wish list
In-Reply-To: <004a01bf27fe$346405d0$0501a8c0@bobcat>
Message-ID:

On Sat, 6 Nov 1999, Mark Hammond wrote:

> James writes:
> > For whatever it's worth, I've outlined the ways I would like to
> > use Python documentation in an ideal world. These items are listed
> > from most important to least important. Maybe this will provide some
> > hints about the best way to implement markup.
>
> I can't argue with anything you have written - although what Python
> needs (and this effort is no different) is fewer good ideas, and more
> code.
>
> I'm afraid there is no army of coders just sitting around waiting for
> good ideas to implement. Why not tackle even one item on your wishlist
> yourself, and present some code? Then any issues you have with the
> current markup scheme are more likely to be listened to, as we then
> have _something_ to help put it into context, rather than an abstract
> concept about what may happen to be good if anything is ever
> implemented...
>

I'm aware of the need for comprehensive information and search
engines, and I do know that *FEW* people who code get involved in
large projects that help others.

Because of that I devised a method of producing lots of information
with little effort. A bit of markup is all it takes. I'm doing this for
TeX, and in Python, at
http://www.ctv.es/USERS/irmina/TeEncontreX.html
(the project is GPL and sources are supplied in Sources!)

It could be done similarly for Python. In fact I posted about it in
comp.lang.python, but nobody replied.

The longer I live on the Internet, the surer I am of three facts:
- People rarely get involved
- People who get involved usually get involved in stupid projects
  (irc-bots, ftp clients,... always the same)
- Interesting people who do interesting projects do such big things
  that it's difficult for the others to follow.
Moreover, interesting/capable people don't get involved in easy things.

I'm 100% sure that my method of storing information is quite valuable,
easy,... But regretfully it seems not to be interesting to anybody.

Regards/Saludos
Manolo
-------------
My addresses / mis direcciones:
a="www.ctv.es/USERS/irmina"
b=[("Lritaunas Peki Project", ""),
("Spanish users of LaTeX(en Espanyol)", "/pyttex.htm" ),
("page of drawing utility for tex ", "/texpython.htm" ),
("CrossWordsLand","/cruo/cruo.html") ]
for i in b: print i[0],":", a+i[1]

Let us not look back in anger or forward in fear, but around us in
awareness.
-- James Thurber

From Manuel Gutierrez Algaba Sat Nov 6 14:46:20 1999
From: Manuel Gutierrez Algaba (Manuel Gutierrez Algaba)
Date: Sat, 6 Nov 1999 14:46:20 +0000 (GMT)
Subject: [Doc-SIG] a wish list Part II
In-Reply-To: <004a01bf27fe$346405d0$0501a8c0@bobcat>
Message-ID:

The method described in TeEncontreX is so extremely simple and
flexible that:

a) You can take (right now) FAQs and other documents and attribute
   them just by placing into them commands like
   \newcommand{\indexsockets}{\index{sockets}}...
b) You can mix FAQs, articles, example code... and you can even
   distinguish them! Attributing:
   \newcommand{\indexarticle}{\index{article}}
c) You can build a kind of library of available routines for Python
d) You can extract data from .py files. Ex:

   class any_sockets_related_class:
       ...
       def cool_routine(self,...):
           """ \indexsockets \indexacoolthingX """

   And parse it.

Now, what we need to get the BEST and MOST powerful and cohesive
system of information is just some hundreds of "ATTRIBUTERS" (people
who attribute); no special skill or knowledge is needed. And the effort
per item is minimal (write down three or four words, here and there).
In total it's a huge effort, but it's a linear effort and simply
straightforward.

Besides, the attributed information can be rearranged in many ways, by
grouping the attributes...

The more I think about it the more brilliant it seems to me.
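
A minimal sketch of step d) above, in present-day Python, may help make the idea concrete. It is not the TeEncontreX code: the function name `keywords_by_object`, the rule that the keyword is whatever follows `\index`, and the sample class are all assumptions made for illustration.

```python
import ast
import re

# Matches attribution commands such as \indexsockets. Treating the text
# after "\index" as the keyword is an assumption about the convention.
INDEX_COMMAND = re.compile(r"\\index([A-Za-z_]+)")


def keywords_by_object(source):
    """Map each class or function name to the keywords in its docstring."""
    found = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            doc = ast.get_docstring(node) or ""
            tags = INDEX_COMMAND.findall(doc)
            if tags:
                found[node.name] = tags
    return found


# Hypothetical module text; the docstring is raw so that the backslashes
# survive parsing unchanged.
EXAMPLE = '''
class AnySocketsRelatedClass:
    def cool_routine(self):
        r""" \\indexsockets \\indexacoolthingX """
'''

print(keywords_by_object(EXAMPLE))
```

Because the source is parsed rather than imported, a scan like this does not execute the module being indexed.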
Just a question for the reader: Why is it not implemented? The code is
already done. It's simple, it's easy, it's high level...

It's a huge effort but, with 100 people working at it, or 1000 people
working from time to time (once a week, half an hour), we could get
thousands of pieces of attributed (reusable, searchable, efficient...)
information.

I, myself, have done a lot of it in TeEncontreX, and it's me alone.
One person, 130 articles; 100 people, 13000 articles. And Python stuff
is much more readable than TeX stuff.

Regards/Saludos
Manolo
-------------
My addresses / mis direcciones:
a="www.ctv.es/USERS/irmina"
b=[("Lritaunas Peki Project", ""),
("Spanish users of LaTeX(en Espanyol)", "/pyttex.htm" ),
("page of drawing utility for tex ", "/texpython.htm" ),
("CrossWordsLand","/cruo/cruo.html") ]
for i in b: print i[0],":", a+i[1]

Waste not fresh tears over old griefs.
-- Euripides

From fdrake@acm.org Mon Nov 8 15:24:07 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 8 Nov 1999 10:24:07 -0500 (EST)
Subject: [Doc-SIG] a wish list
In-Reply-To: <004a01bf27fe$346405d0$0501a8c0@bobcat>
References: <199911060102.RAA02700@joplin.sr.hp.com> <004a01bf27fe$346405d0$0501a8c0@bobcat>
Message-ID: <14374.60183.988407.503816@weyr.cnri.reston.va.us>

Mark Hammond writes:
 > I'm afraid there is no army of coders just sitting around waiting for
 > good ideas to implement. Why not tackle even one item on your wishlist

Mark,
  James had asked me what he could do to help, and his posting is part
of that; I *specifically* asked for suggestions regarding how to use
the documentation to help me make sure I cover as many reasonable uses
of the documentation as I can as I make the conversion to XML. My
thought is that this is *the* time to make sure our markup carries
over all interesting information (all of it, right?), and is
sufficiently well-structured that we don't need to reformat the
documents to add additional information to the structure.
  Thanks, James!

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From fdrake@acm.org Mon Nov 8 18:54:11 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 8 Nov 1999 13:54:11 -0500 (EST)
Subject: [Doc-SIG] a wish list
In-Reply-To:
References: <004a01bf27fe$346405d0$0501a8c0@bobcat>
Message-ID: <14375.7251.289446.900869@weyr.cnri.reston.va.us>

Manuel Gutierrez Algaba writes:
 > I'm 100% sure that my method of storing information is quite valuable,
 > easy,... But regretfully it seems not to be interesting to anybody.

Manuel,
  I did actually take a moment to look at it, but I wasn't really sure
what you were doing.
  In response to your more recent note here, I took another look. I
downloaded the complete Unix package, and I'm still not quite sure
what's going on. (The code is hard to read for those of us who don't
know Spanish; sorry.)
  Can you explain precisely what you're advocating be done? I'm sure
it can be explained on a Web page if you'd rather add it to your
TeEncontreX site for everyone who looks there, or in a message here,
whichever makes more sense for you.
  Thanks for your input!

  -Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

From mhammond@skippinet.com.au Mon Nov 8 21:48:50 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Tue, 9 Nov 1999 08:48:50 +1100
Subject: [Doc-SIG] a wish list
In-Reply-To: <14374.60183.988407.503816@weyr.cnri.reston.va.us>
Message-ID: <008f01bf2a33$0ba14400$0501a8c0@bobcat>

Ahh - OK - sorry about that. I'm afraid I simply assumed it was yet
another "I want", rather than an "I will" mail.

My apologies, and I hope you get even a few of these ideas
implemented.

Mark.

> Mark Hammond writes:
> > I'm afraid there is no army of coders just sitting around waiting for
> > good ideas to implement.
> > Why not tackle even one item on your wishlist
>
> Mark,
>   James had asked me what he could do to help, and his posting is part
> of that; I *specifically* asked for suggestions regarding how to use
> the documentation to help me make sure I cover as many reasonable uses
> of the documentation as I can as I make the conversion to XML. My
> thought is that this is *the* time to make sure our markup carries
> over all interesting information (all of it, right?), and is
> sufficiently well-structured that we don't need to reformat the
> documents to add additional information to the structure.
>   Thanks, James!
>
>
>   -Fred
>
> --
> Fred L. Drake, Jr.
> Corporation for National Research Initiatives
>

From irmina@ctv.es Tue Nov 9 00:54:32 1999
From: irmina@ctv.es (Manuel Gutierrez Algaba)
Date: Tue, 9 Nov 1999 00:54:32 +0000 (GMT)
Subject: [Doc-SIG] a wish list
In-Reply-To: <14375.7251.289446.900869@weyr.cnri.reston.va.us>
Message-ID:

On Mon, 8 Nov 1999, Fred L. Drake, Jr. wrote:

> Manuel,
>   I did actually take a moment to look at it, but I wasn't really sure
> what you were doing.
>   In response to your more recent note here, I took another look. I
> downloaded the complete Unix package, and I'm still not quite sure
> what's going on. (The code is hard to read for those of us who don't
> know Spanish; sorry.)
>   Can you explain precisely what you're advocating be done? I'm sure
> it can be explained on a Web page if you'd rather add it to your
> TeEncontreX site for everyone who looks there, or in a message here,
> whichever makes more sense for you.

Well, sorry, I was quite sure that AnalizaToo.py was readable...
Anyway, if you're interested, I can translate it.

Anyway, the core of my proposal is not the program but the data. The
code is just a "formatter" of the data. All this stuff about data has a
heavy theoretical base.

Let's take a look at a typical article of Too.tex (my database):

....
\jiji
1.
How do I change the section headings such that the section number
does not appear in boldface? Or make the section number and
the section header to be unbolded?
\jaja
\indexlayout
\indexsection
titlesec.sty or sectsty.sty.
\jiji
...

Consider \jiji as the delimiter of an article, and \jaja the delimiter
of the parts of an article. \indexlayout says that the article is about
"text layout". As any concept may be related directly to many other
concepts, I mark this relationship thus:

\newcommand{\indexlayout}{\index{layout}\index{decorations}
\index{adornos}}

So I basically attribute the articles with keywords. Another example:

\jiji
\indexnumeracion
\indexalphanumeric
\indexfootnote
Can footnote markers be something other than arabic numerals?
\jaja
Yes,
\renewcommand{\thefootnote}{\alph{footnote}}
or \Alph, or \roman, or \Roman, or \fnsymbol
This is a general prescription for changing the formatting of one of
LaTeX's counters: you redefine \thecounter.
\jiji

The idea of keywords is not new. What is new is what happens when we
put many keywords on small pieces of information and then glue all
that information together. As almost any piece of information is very
rich (the example above holds information about numbering, footnotes
and arabic numerals), the union usually has many points of contact;
i.e., many articles may speak about footnotes (as a main or secondary
subject). So if we eventually want information about footnotes, we'll
have a large collection of related stuff. And that collection will say
a good deal about footnotes themselves.

Let's take a look at an example: (sorry again, if you don't understand...
Articulo = article; "estos son los articulos disponibles" = "these are
the available articles"; adornos = decorations = stuff to make things
prettier)

Estos son los articulos disponibles
Articulo 11: adornos decorations secsty titlesec layout
Articulo 32: hrule adornos decorations
Articulo 39: altura decorations book.sty adornos baseline hbox height cuadro
Articulo 48: adornos decorations rcs
Articulo 72: adornos decorations final_linea end_of_line
Articulo 73: adornos figure decorations
Articulo 79: adornos margins decorations
Articulo 112: space adornos decorations tabular
Articulo 134: adornos decorations book cleardoublepage
Articulo 138: adornos decorations caption
Articulo 144: adornos decorations space textheight

Just by looking at this, you can learn about the term "decoration".
It's something related to space, textheight, book.sty, hrule, hbox...
You might almost get a definition of it!

Well, this scheme lets you refine your search (imagine you are trying
to get some kind of effect in LaTeX), just by browsing to the article
that is closest to your wishes (textheight, titlesec), and lets you
learn more about concepts related to "decoration".

But this is just "one scheme" (the one provided by AnalizaToo.py).
With the very same Too.tex (database), if it were big enough, you
could say: I'd like you to make a book about "decoration in LaTeX", and
these are the rules:

1 I want a description of general concepts
2 I want it from the more general to the specific item
3 The more general is "book", "space",....

Please remember that currently Too.tex is a collection of
USENET/mailing list articles, but even so, you'd get a document whose
articles would be "sorted" by your rules. Remember too that we'd need
an article labelled with general_concepts and decoration.

Imagine, now, the Python doc (that apparently is very different from
USENET posts) (ref.tex):

...
It is also possible to create anonymous functions (functions not bound
to a name), for immediate use in expressions. This uses lambda forms,
described in section \ref{lambda}. Note that the lambda form is
merely a shorthand for a simplified function definition; a function
defined in a ``\keyword{def}'' statement can be passed around or
assigned to another name just like a function defined by a lambda
form. The ``\keyword{def}'' form is actually more powerful since it
allows the execution of multiple statements.
\indexii{lambda}{form}
...

Imagine that I attribute this with:

\indexanonymous \indexlambda \indexdefinition \indexdef

Imagine that I've attributed this too (tut.tex):

expression. Semantically, they are just syntactic sugar for a normal
function definition. Like nested function definitions, lambda forms
cannot reference variables from the containing scope, but this can be
overcome through the judicious use of default argument values, e.g.

\begin{verbatim}
def make_incrementor(n):
    return lambda x, incr=n: x+incr
\end{verbatim}
\indexlambda \indexexample \indexvarscope

Well, simply with these attributions we can have all these
combinations:

- If we want to search by lambda, we'd have:

  lambda example varscope
  anonymous lambda definition def

  So the reader would guess: ah, an example of lambda, and the
  definition of lambda!

- If we want to search by def, we'd have:

  ...
  anonymous lambda definition def
  ....
  ...

  So he'd know all the ways of defining functions: def, lambda,
  recursive, class messages... So the information we wrote for
  'lambda' would be reused for the people who want information about
  defining functions.

- If we want info about var scopes, then we have:

  .....
  lambda example varscope
  .....

  Just imagine all the stuff about scopes here.
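
The lookups described above amount to an inverted index from keyword to attributed chunk. The following is a minimal sketch in present-day Python under stated assumptions: it is not AnalizaToo.py, and the chunk identifiers (`ref.tex/lambda`, `tut.tex/lambda`) and function names are hypothetical; only the keyword sets come from the two attributed excerpts.

```python
from collections import defaultdict

# Hypothetical chunk identifiers mapped to the attributes proposed above.
CHUNKS = {
    "ref.tex/lambda": ["anonymous", "lambda", "definition", "def"],
    "tut.tex/lambda": ["lambda", "example", "varscope"],
}


def build_index(chunks):
    """Return a mapping from keyword to the sorted chunk ids carrying it."""
    index = defaultdict(list)
    for chunk_id, keywords in chunks.items():
        for keyword in keywords:
            index[keyword].append(chunk_id)
    return {keyword: sorted(ids) for keyword, ids in index.items()}


def search(chunks, keyword):
    """List (chunk id, all of its keywords) for chunks tagged with keyword."""
    index = build_index(chunks)
    return [(chunk_id, chunks[chunk_id]) for chunk_id in index.get(keyword, [])]


for chunk_id, keywords in search(CHUNKS, "lambda"):
    print(chunk_id, ":", " ".join(keywords))
```

Printing every keyword of each hit, rather than only the matching one, is what gives the reader the neighbouring concepts (example, varscope, definition) that the message relies on.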
You can easily see that all the existing information could be
presented from dozens of different points of view, and that examples,
USENET posts, definitions and tutorials may cooperate to give globally
complete information about any subject.

Now, let's think about XML; it basically reflects the structure of
information. I wonder whether it would be better first to know what
information we are talking about: if we had 10000 chunks of attributed
information (whose value is great), we could know which are the
possible structures (combinations of those chunks).

Sorry if you expected a description of the algorithm (analizatoo.py);
it is quite "irrelevant", I think. What really matters here is the
information. Anyway, if you want further information or a translation,
just say so.

I can't think of any easier, faster and more powerful way of reusing
existing "as is" information than this.

Imagine a very large set of chunks of information, sharing and
grouping ... Is that XML? Is it better? Is XML a subset, a hard-wiring
of a certain scheme of some groups of chunks of information?

Regards/Saludos
Manolo
-------------
My addresses / mis direcciones:
a="www.ctv.es/USERS/irmina"
b=[("Lritaunas Peki Project", ""),
("Spanish users of LaTeX(en Espanyol)", "/pyttex.htm" ),
("page of drawing utility for tex ", "/texpython.htm" ),
("CrossWordsLand","/cruo/cruo.html") ]
for i in b: print i[0],":", a+i[1]

Reality is nothing but a collective hunch.
-- Lily Tomlin

From david@hotjobs2000.com Tue Nov 9 18:54:18 1999
From: david@hotjobs2000.com (David Winsen)
Date: Tue, 9 Nov 1999 10:54:18 -0800
Subject: [Doc-SIG] WEB DATABASE PROGRAMMER POSITION AVAILABLE
Message-ID: <000801bf2ae3$d455d180$2a2565d8@pacbell.net>

This is a multi-part message in MIME format.

------=_NextPart_000_0005_01BF2AA0.C56185E0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

WEB DATABASE PROGRAMMER POSITION AVAILABLE

URGENT MESSAGE!

This e-mail is not intended to be un-solicited.
We apologize if you didn't want to receive this e-mail. Please reply to
be removed.

From: David Winsen - Senior Consultant - High Technology Executive Search

We have an out-dated copy of your resume in our database or have viewed
your credentials on the internet. HTES is an established national
Executive Search and Consulting Firm who has been serving the High Tech
Industries for over 25 years.

We have been confidentially retained by a Los Angeles, Ca. based Adult
Internet Fulfillment/Billing Company. Salary is $50k-$100k DOE.

We are confidentially pre-screening top candidates for the following
position: Web Database Programmer

Description

Candidates will need to be extremely detail oriented and have a solid
work ethic. Will be developing cutting edge software and e-commerce
applications. They are an established and ever growing Internet
Fulfillment/Billing Company. They offer a casual & unique work
environment unlike any other, full benefits, & room for growth and
advancement.

Requirements

Candidates will need experience in Perl, Python, PHP, Javascript,
UNIX, Linux or FreeBSD, MySQL a plus. BS Computer Science or equivalent.
At least 3 years of Web experience is a plus.

If you are interested, please E-mail me in MS Word 95-98 a recent copy
of your resume and a cover letter with your specific information,
including your recent compensation package to Position-for: Web Database
Programmer

My personal E-mail is david@hotjobs2000.com or fax your resume to (310)
855-0840. If you have any questions about the position(s), please call
me at (310) 855-0406 and I will discuss them in detail.

We also have developed an interactive Website that you can view over
6000 national openings www.hotjobs2000.com. This system is effective,
easy to use and new positions are posted daily.
We encourage you to use it and nominate yourself for other positions
you feel you are qualified for. We are looking forward to working with
you now and in the future.

------=_NextPart_000_0005_01BF2AA0.C56185E0
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
 

------=_NextPart_000_0005_01BF2AA0.C56185E0--

From S.I.Reynolds@cs.bham.ac.uk Wed Nov 10 18:05:50 1999
From: S.I.Reynolds@cs.bham.ac.uk (Stuart Reynolds)
Date: Wed, 10 Nov 1999 18:05:50 +0000
Subject: [Doc-SIG] PythonDoc - how to run
Message-ID: <3829B3FE.7A0A@cs.bham.ac.uk>

Hi,

I've just installed PythonDoc on my system hoping to use it to produce
documentation for one of our projects. I'm having a bit of trouble
getting it to output anything:

% ls
MDP.py MDP.pyc
% pythondoc MDP.py
Error: Couldn't import MDP (exceptions.ImportError: No module named MDP)

Same result with,

% pythondoc -d ../docs MDP.py
% pythondoc -d ../docs -i MDP.py
% pythondoc -d ../docs -i -v MDP.py
% pythondoc -d ../docs -i -s ../docs MDP.py

While from the directory above,

% pythondoc reps/MDP.py
% pythondoc -d docs reps/MDP.py

produces nothing (no output and no errors).

Any ideas? Or have I misread the README file?

Cheers

Stuart

PS. This is under Python 1.5.2 on Solaris.

From S.I.Reynolds@cs.bham.ac.uk Thu Nov 11 13:35:41 1999
From: S.I.Reynolds@cs.bham.ac.uk (Stuart Reynolds)
Date: Thu, 11 Nov 1999 13:35:41 +0000
Subject: [Doc-SIG] PythonDoc - how to run
References: <3829B3FE.7A0A@cs.bham.ac.uk>
Message-ID: <382AC62D.1720@cs.bham.ac.uk>

This is a multi-part message in MIME format.
--------------7BC813E56
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Edward Welbourne wrote:
>
> > % pythondoc MDP.py
> > Error: Couldn't import MDP (exceptions.ImportError: No module named MDP)
>
> Hm. Try adjusting your PYTHONPATH environment variable

Ha! Well spotted. It now has '.' in it (I'd removed it by mistake).
OK, that's fixed the first problem, but pythondoc still produces no
documents.

[12:56]~/toolkit >echo $PYTHONPATH
/:/home/pg/sir/toolkit/:.
[12:56]~/toolkit >cd reps
[12:56]~/toolkit/reps >python
Python 1.5.2 (#1, Apr 20 1999, 19:24:22) [GCC egcs-2.91.57 19980901 (egcs-1.1 re on sunos5
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> import MDP
>>> import sys
>>> sys.path
['', '/', '/home/pg/sir/toolkit/', '.', '/bham/ums/common/pd/packages/Python/lib/python1.5/', '/bham/ums/common/pd/packages/Python/lib/python1.5/plat-sunos5', '/bham/ums/common/pd/packages/Python/lib/python1.5/lib-tk', '/bham/ums/solaris/pd/bin/../packages/Python-1.5.2/lib/python1.5/lib-dynload', '/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages', '/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/numeric', '/bham/ums/solaris/pd/bin/../packages/Python-1.5.2/lib/python1.5/site-packages', '/bham/ums/solaris/pd/bin/../packages/Python-1.5.2/lib/python1.5/site-packages/numeric']
>>> ^D
[12:58]~/toolkit/reps >pythondoc MDP.py
[12:58]~/toolkit/reps >pythondoc -d ./ MDP.py
[12:58]~/toolkit/reps >ls
#MDP.py# MDP.py MDP.pyc __init__.py __init__.pyc
[12:58]~/toolkit/reps >pythondoc -d ./ -s ./ MDP.py
#MDP.py# MDP.dtr MDP.py MDP.pyc __init__.py __init__.pyc

Note that I can output the doctree.

I've also just tried running pythondoc on the test file included in
the distribution. This also produces no output.

Cheers

Stuart
--------------7BC813E56
Content-Type: text/plain; charset=us-ascii; name="out.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="out.txt"

Error: Couldn't import test.test_al (exceptions.ImportError: No module named al)
math module, testing with eps 1e-05
constants acos asin atan atan2 ceil cos cosh exp fabs floor fmod frexp hypot ldexp log log10 modf pow sin sinh sqrt tan tanh
test Warning: can't open /bham/ums/common/pd/packages/Python/lib/python1.5/test/output/test 1 test OK.
10 times sub 1.800 CPU seconds 10 times split 1.990 CPU seconds 10 times findall 2.010 CPU seconds From: bwarsaw@cnri.reston.va.us Date: Mon Feb 12 17:21:48 EST 1996 To: kss-submit@cnri.reston.va.us MIME-Version: 1.0 Content-Type: multipart/knowbot; boundary="801spam999"; version="0.1" This is a multi-part message in MIME format. --801spam999 Content-Type: multipart/knowbot-metadata; boundary="802spam999" --802spam999 Content-Type: message/rfc822 KP-Metadata-Type: simple KP-Access: read-only KPMD-Interpreter: python KPMD-Interpreter-Version: 1.3 KPMD-Owner-Name: Barry Warsaw KPMD-Owner-Rendezvous: bwarsaw@cnri.reston.va.us KPMD-Home-KSS: kss.cnri.reston.va.us KPMD-Identifier: hdl://cnri.kss/my_first_knowbot KPMD-Launch-Date: Mon Feb 12 16:39:03 EST 1996 --802spam999 Content-Type: text/isl KP-Metadata-Type: complex KP-Metadata-Key: connection KP-Access: read-only KP-Connection-Description: Barry's Big Bass Business KP-Connection-Id: B4 KP-Connection-Direction: client INTERFACE Seller-1; TYPE Seller = OBJECT DOCUMENTATION "A simple Seller interface to test ILU" METHODS price():INTEGER, END; --802spam999 Content-Type: message/external-body; access-type="URL"; URL="hdl://cnri.kss/generic-knowbot" Content-Type: text/isl KP-Metadata-Type: complex KP-Metadata-Key: generic-interface KP-Access: read-only KP-Connection-Description: Generic Interface for All Knowbots KP-Connection-Id: generic-kp KP-Connection-Direction: client --802spam999-- --801spam999 Content-Type: multipart/knowbot-code; boundary="803spam999" --803spam999 Content-Type: text/plain KP-Module-Name: BuyerKP class Buyer: def __setup__(self, maxprice): self._maxprice = maxprice def __main__(self, kos): """Entry point upon arrival at a new KOS.""" broker = kos.broker() # B4 == Barry's Big Bass Business :-) seller = broker.lookup('Seller_1.Seller', 'B4') if seller: price = seller.price() print 'Seller wants $', price, '... ' if price > self._maxprice: print 'too much!' else: print "I'll take it!" 
else: print 'no seller found here' --803spam999-- --801spam999 Content-Type: multipart/knowbot-state; boundary="804spam999" KP-Main-Module: main --804spam999 Content-Type: text/plain KP-Module-Name: main # instantiate a buyer instance and put it in a magic place for the KOS # to find. __kp__ = Buyer() __kp__.__setup__(500) --804spam999-- --801spam999-- Traceback (innermost last): File "/bham/ums/common/pd/bin/pythondoc", line 4, in ? pythondoc.pythondoc.generate_pages(modules, formats) File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/pythondoc.py", line 256, in generate_pages docobject = docobjects.create_docobject(object) File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 479, in create_docobject object = _class_map[type(pyobject)](pyobject) #, name) File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 211, in __init__ Composite.__init__(self, object, name) File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 154, in __init__ Object.__init__(self, object, name) File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 40, in __init__ self.subobjects() File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 95, in subobjects self.__subobjects = self.get_subobjects() File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 169, in get_subobjects items = self.get_allobjects() File "/bham/ums/common/pd/packages/Python/lib/python1.5/site-packages/pythondoc/docobjects.py", line 243, in get_allobjects if module.__name__ != modulename: AttributeError: 'None' object has no attribute '__name__' --------------7BC813E56-- From fdrake@acm.org Thu Nov 11 17:00:45 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Thu, 11 Nov 1999 12:00:45 -0500 (EST)
Subject: [Doc-SIG] Approaches to structuring module documentation
Message-ID: <14378.63037.571200.652453@weyr.cnri.reston.va.us>

--1wGfXqjHrK
Content-Type: text/plain; charset=us-ascii
Content-Description: message body text
Content-Transfer-Encoding: 7bit

Well, now that things have quieted down a little (where?!), I'll stir things up a little.

Two broad approaches to structuring the documentation have been presented: One is the current document-centric model, where there are a number of books/manuals/whatever that contain interesting information, but need to be used as really large chunks. Extracting specific information is (apparently) difficult for humans (witness the recent request for a random() function on the newsgroup by someone who said they looked in the index; just the wrong one); it's much worse for applications.

The other approach, first proposed by Sean McGrath, is to use a "microdocument" architecture, where each module is represented in a separate structured document that is designed specifically to handle that kind of information.

First, I'll define some terms and comment on both approaches.

Terms
-----

DOCUMENT-ORIENTED CONTENT: Documents which are structured similarly to the traditional presentation form; document-oriented DTDs feature things like chapters, sections, titles, articles, etc. This is what David Megginson called "book" DTDs in "Structuring XML Documents."

DOCUMENT-CENTRIC APPROACH: The human-read document is the primary way to encode information, including module reference material. A "monumental" DTD would describe the document structure. Supplemental data files could be used for highly specialized information; these could use alternate DTDs.

MICRODOCUMENT APPROACH: Multiple DTDs are used to encode document-level information and module reference material.
Let's only consider the case of one DTD to handle module reference material, and a small number (1 or 2) of document-oriented DTDs; possibly one for "sections" and one that could be used to compose sections and module references into chapters and manuals.

Document-centric Approach
-------------------------

This approach has the advantage of matching the current structure of the documentation. The conversion isn't terribly difficult or even time-consuming given the state of the things in Doc/tools/sgmlconv/ in the CVS repository. There's clearly some work to do regarding DTD specification and probably a bit of transformation, but a large part of the coding and testing is done. The existing documents are tolerably organized for direct human use, and incremental updates to the documents seem to work well.

Documenting a module using the document-centric approach requires little effort due to the simplicity of the existing markup, but it's not always clear what things "go together." This problem can be at least partly solved by evolving the markup to support additional forms of linkages between information chunks, and keeping the processing tools up to date with the markup changes. This can be done before or after a conversion to XML as it is largely orthogonal to syntax.

Microdocument Approach
----------------------

Using a separate DTD to document modules offers advantages when it comes time to extract information programmatically. Creating skeleton module references from the current documentation would be harder and would certainly require more code to be written, but the payoffs are potentially very high. To really make it work, a lot of attention would have to be applied to the result of the first-stage conversion to check the accuracy of the results, make the various bits of text actually land in the right place (since everything is pretty much thrown together now), and encode a lot of additional information about types, parameters, exceptions thrown, etc.
On the other hand, getting this information into the documents in the document-centric approach also requires a lot of this work.

An IDE could use the content provided by the module references very effectively to provide help and smart name completion. For performance, the documentation would probably be loaded into some sort of database so chunks of information could be retrieved very quickly, and probably in some pre-digested form. Inheritance diagrams can be generated, and protocols/interfaces can be documented much more clearly.

The most significant drawback I can see is that the markup can very easily become quite heavy, but this isn't unusual when there's a lot of structured information to present.

Comparison
----------

A wide variation in module documentation styles is possible using the document-centric approach. While most of the modules in the Library Reference are presented in a fairly formulaic way, some are not. Note the chapters on the debugger and profiler, which really don't use the styles used elsewhere in the Library Reference. I'm not sure if allowing this level of flexibility is good or bad; I could make the case for both. I can also see where allowing both could be a good idea, but it may be reasonable to require a "standard" structure for module documentation, regardless of the approach taken on the whole, and then allow additional material to be provided using document-oriented content.

At any rate, last night I sat down with one module and the existing documentation for it, and marked up a module reference for it using the microdocument approach. The markup is quite heavy compared to the current LaTeX file:

weyr(.../Doc/lib); wc libmailbox.tex mailbox.xml
     53     251    1938 libmailbox.tex
    159     504    5364 mailbox.xml
    212     755    7302 total

That's a 200% increase in line count and a 150% increase in file size. The latter isn't much of an issue, but the former is because it seriously impacts readability.
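As a quick check of those wc figures, here is a small sketch that recomputes the growth per column. It is written for a current Python 3 rather than the 1.5.2 of this thread, and the helper names (`percent_increase`, `growth`) are made up for illustration. With the counts as quoted, the line count does come out at 200%, while the byte count works out closer to 177% than 150%.

```python
# Counts copied from the wc output quoted above: (lines, words, bytes).
WC = {
    "libmailbox.tex": (53, 251, 1938),
    "mailbox.xml": (159, 504, 5364),
}


def percent_increase(old, new):
    """Return the growth from old to new as a percentage of old."""
    return (new - old) / old * 100.0


def growth(before, after):
    """Pair each wc column name with its percentage increase."""
    names = ("lines", "words", "bytes")
    return {
        name: percent_increase(b, a)
        for name, b, a in zip(names, WC[before], WC[after])
    }


if __name__ == "__main__":
    for name, pct in growth("libmailbox.tex", "mailbox.xml").items():
        print(f"{name}: +{pct:.0f}%")
```

The word count roughly doubles, so most of the extra bytes are markup characters rather than new prose.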
This explosion of markup is of most concern for authors; a lot of markup is required to encode enough information to justify changing the approach. As more markup is required, it is increasingly difficult to get contributions because it takes the authors more time to document their work. I'd like to maintain Python's standing as the best-documented free scripting language, and I'm not sure authors will be willing to use the more extensive markup.

I'd also need a small (large?) army of volunteers to help convert the generated skeleton module references to take advantage of the ability to encode far more detail about modules than is currently available. Are there enough people sufficiently interested? Doing this one myself would require someone directly supporting the work; the occasional evening would not get it done.

A Hybrid Approach
-----------------

A hybrid approach could be taken in which the architecture is that of the microdocument approach, but we support something similar to the current (document-centric) approach for the document-oriented content components. This would allow a slower migration, and facilities such as the debugger could be documented using the document structure rather than the module structure.

The payoffs for application of the documentation are approximately the same as for the strict microdocument approach. The most significant change is probably that some modules (those documented only in document-oriented components) may not be described in the help system, or at least not fully described.

The issues of conversion are largely the same as for the microdocument architecture since most modules would be documented in that way. The document-oriented DTD(s) may be a little different, but that's the only substantial technical difference I see in getting it done.

Status
------

I haven't ventured to write a DTD yet for either approach; there's still a lot to decide before that gets done.
I also don't want to write a bunch of DTDs that aren't going to be used! I think we do need to consider the two approaches in the immediate future. Dealing with the legacy conversion software is tolerable for now, but it's getting worse over time. Rich linking is difficult in the HTML output, which seems to be the most-used format, but I think that's something that a lot of people would like to see. If we elect to go with the document-centric approach, there's a bit of DTD design to do, and a bit of tweaking in the conversion tools, but we're a long way there. Adopting the microdocument approach offers the advantages of a very high long-term payoff, which is appealing, but please consider my comments and pleas above carefully. The hybrid approach can be considered as roughly the same as the microdocument approach, as discussed above. Comments? -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives --1wGfXqjHrK Content-Type: text/xml; charset=iso-8859-1 Content-Description: Sample module reference. Content-Disposition: inline; filename="mailbox.xml" Content-Transfer-Encoding: 7bit mailbox Read various mailbox formats. This module defines a number of classes that allow easy and uniform access to mail messages in a mailbox. Most of the supported mailbox formats come from the Unix world. None of the classes defined in this module lock the mailboxes that are accessed; this needs to be handled by application code. Mailbox The next message in the mailbox. The message's fp will be a file object, but not a real file object. If no messages have been read, this will be the first message. If all messages have been read, None will be returned. UnixMailbox Mailbox Access a classic Unix-style mailbox, where all messages are contained in a single file and separated by From name time lines. The file object fp points to the mailbox file. Initialize the mailbox object and point to the first message in the mailbox. 
MmdfMailbox Mailbox Access an MMDF-style mailbox, where all messages are contained in a single file and separated by lines consisting of four control-A characters. The file object fp points to the mailbox file. Initialize the mailbox object and point to the first message in the mailbox. MHMailbox Mailbox Access an MH mailbox, a directory with each message in a separate file with a numeric name. Messages that are added to the mailbox after the instance is created are not accessible; a new instance is needed to access newly added messages. The name of the mailbox directory. Initialize the list of messages that can be loaded from the mailbox. Maildir Mailbox Access a Qmail mail directory. All new and current mail for the mailbox is made available. Messages that are added to the mailbox after the instance is created are not accessible; a new instance is needed to access newly added messages. The name of the mailbox directory. The dirname parameter points to the mailbox directory. BabylMailbox Mailbox Access a Babyl mailbox, which is similar to an MMDF mailbox. Mail messages start with a line containing only '*** EOOH ***' and end with a line containing only '\037\014'. A file object fp that points to the mailbox file. Initialize the mailbox object and point to the first message in the mailbox. --1wGfXqjHrK-- From Moshe Zadka Fri Nov 12 07:29:09 1999 From: Moshe Zadka (Moshe Zadka) Date: Fri, 12 Nov 1999 09:29:09 +0200 (IST) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: <14378.63037.571200.652453@weyr.cnri.reston.va.us> Message-ID: On Thu, 11 Nov 1999, Fred L. Drake, Jr. wrote: > Well, now that things have quieted down a little (where?!), I'll > stir things up a little. Very good. > MICRODOCUMENT APPROACH: Multiple DTDs are used to encode > document-level information and module reference material. 
Let's only
> consider the case of one DTD to handle module reference material, and
> a small number (1 or 2) of document-oriented DTDs; possibly one for
> "sections" and one that could be used to compose sections and module
> references into chapters and manuals.

Well, no one who has read other mails of mine here will be surprised at my whole-hearted embracing of this approach.

> I'm not
> sure if allowing this level of flexibility is good or bad; I could
> make the case for both.

Here's a simple argument against it: in the TODO list, there are requests for explanations of how to use both the profiler and the debugger. Guido marked it "a library chapter isn't enough?". And he's right, but having the structure so flexible tempted Guido to put the chapters in the library reference, instead of as separate documents.

> That's a 200% increase in line count and a 150% increase in file
> size. The latter isn't much of an issue, but the former is because it
> seriously impacts readability.

Ummmm...it really depends on how much semantic information you put in.

Here's the strongest argument for the microdocument approach: as someone who uses both Perl and Python (though I much prefer the latter), I see the enormous benefit of a program like perldoc, which could only be written on a microdocument-based infrastructure. For those not familiar with this program, I will paint a rosy picture of how beautiful the future could be if we used microdocuments, and wrote a "pydoc" preprocessor:

    pydoc htmllib        --> show the documentation of htmllib
    pydoc string.reverse --> show the documentation of string.reverse
    pydoc -q reverse     --> show all FAQs which have the word reverse in them
    pydoc -f reverse     --> search for a function called "reverse"
    . . .

(For those of you on WIMP interfaces, substitute a dialog, and a fancy window which formats the documents)

We might need a PyML such that XML<[...] application of XML, only of SGML) and a convertor PyML->XPyML such that XPyML is an application of XML.
That way we could have whatever terseness from SGML we care to implement, and all the power of XML at our back. -- Moshe Zadka . INTERNET: Learn what you know. Share what you don't. From Manuel Gutierrez Algaba Fri Nov 12 17:12:17 1999 From: Manuel Gutierrez Algaba (Manuel Gutierrez Algaba) Date: Fri, 12 Nov 1999 17:12:17 +0000 (GMT) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: <14378.63037.571200.652453@weyr.cnri.reston.va.us> Message-ID: A heavily technical document!! Far from my usual "way of thinking". On Thu, 11 Nov 1999, Fred L. Drake, Jr. wrote: > > DOCUMENT-ORIENTED CONTENT: Documents which are structured similarly Is this the LaTeX one ? or the "traditional" XML ? > > DOCUMENT-CENTRIC APPROACH: The human-read document is the primary Is this TeEncontreX'es ? Are "module reference material" the "\indexpython" things ? > MICRODOCUMENT APPROACH: Multiple DTDs are used to encode > document-level information and module reference material. Let's only What's this ? > > Document-centric Approach > ------------------------- > > Microdocument Approach > ---------------------- > Using a separate DTD to document modules offers advantages when it > comes time to extract information programmatically. Creating skeleton > module references from the current documentation would be harder and > would certainly require more code to be written, but the payoffs are > potentially very high. To put it short: "Lot of work coding _details_". Just a comment, python is **much** better than C++, for example, because you have no need to declare every type, every detail, even, you can have large parts of a python programm broken, parts that a C++ compiler would mark as erroneous. 
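To make that last comment concrete, here is a minimal sketch (current Python 3, not the 1.5.2 of this thread) of a function with a broken branch that Python never complains about until the branch actually runs. The names `describe` and `format_headers` are hypothetical and exist only for this illustration.

```python
def describe(message, verbose=False):
    """Return the first line of a message; the verbose branch is broken."""
    if verbose:
        # 'format_headers' is not defined anywhere (hypothetical name).
        # Python only notices this when the line is actually executed.
        return format_headers(message)
    return message.split("\n", 1)[0]


# The working path runs fine even though the other branch cannot.
summary = describe("Subject: hello\nbody text")

# The broken path fails only at call time, with a NameError.
try:
    describe("Subject: hello\nbody text", verbose=True)
except NameError as exc:
    failure = str(exc)
```

A compiler for a statically checked language would reject the whole function up front, which is the trade-off being pointed at: less to declare, but errors surface later.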
> To really make it work, a lot of attention > would have to be applied to the result of the first-stage conversion > to check the accuracy of the results, make the various bits of text > actually land in the right place (since everything is pretty much > thrown together now), and encode a lot of additional information about > types, parameters, exceptions thrown, etc. More heavy work ! > On the other hand, getting > this information into the documents in the document-centric approach > also requires a lot of this work. But perhaps, in a more free way. Let's say that document-centric ( at least TeEncontreX) seems more robust for mark-up. Anyone may mark wrong, but you can readjust or define similarities among different markings. > Comparison > ---------- > That's a 200% increase in line count and a 150% increase in file > size. The later isn't much of an issue, but the former is because it > seriously impacts readability. > This explosion of markup is of most concern for authors; a lot of > markup is required to encode enough information to justify changing > the approach. As more markup is required, it is increasingly > difficult to get contributions because it takes the authors more time > to document their work. The biggest problem I see here is that you get a very good documentation ( due to the huge ammount of work) or you get nothing ( the author doesn't documentate). It'd be wise to provide several levels of marking-up , so people can mark-up little by little, some important things first and so... 
This is the "TeEncontreX" version of Mailbox, this should work if you have AnalizaToo.py: \newcommand{\indexmoduleinbox}{\index{module}\index{mail}\index{inbox}} \newcommand{\indexshortdescription}{\index{description}} \newcommand{\indexdescription}{\index{description}} \newcommand{\indexreturnvalue}{\index{returnvalue}\index{protocol}} \newcommand{\indexclassdefinition}{\index{classdefinition}\index{protocol}} \newcommand{\indexMmdfMailbox}{\index{MmdfMailbox}\index{MDMF}} \newcommand{\indexMHMailbox}{\index{MmdfMailbox}\index{MH}} \newcommand{\indexMailDir}{\index{Mail}\index{dir}} \newcommand{\indexBabylMailbox}{\index{Babyl}\index{\indexBabylMailbox}} \newcommand{\indexMMDF}{\index{MMDF}} \jiji mailbox Read various mailbox formats. \indexmoduleinbox \indexname \indexshortdescription \jiji This module defines a number of classes that allow easy and uniform access to mail messages in a mailbox. Most of the supported mailbox formats come from the Unix world. None of the classes defined in this module lock the mailboxes that are accessed; this needs to be handled by application code. \indexdescription \indexmoduleinbox \jiji The next message in the mailbox. The message's ("rfc822.Message") fp will be a file object, but not a real file object. If no messages have been read, this will be the first message. If all messages have been read, None will be returned. \indexmoduleinbox \indexreturnvalue \indexrfc822 \jiji UnixMailbox Access a classic Unix-style mailbox, where all messages are contained in a single file and separated by "From name time lines". The file object fp points to the mailbox file. Initialize the mailbox object and point to the first message in the mailbox. \indexmoduleinbox \indexUnixMailBox \indexclassdefinition \jiji MmdfMailbox Access an MMDF-style mailbox, where all messages are contained in a single file and separated by lines consisting of four control-A characters. The file object fp points to the mailbox file. 
Initialize the mailbox object and point to the first message in the mailbox. \indexmoduleinbox \indexMmdfMailbox \indexclassdefinition \jiji Access an MH mailbox, a directory with each message in a separate file with a numeric name. Messages that are added to the mailbox after the instance is created are not accessible; a new instance is needed to access newly added messages. \indexmoduleinbox \indexMHMailbox \indexclassdefinition \jiji Maildir Access a Qmail mail directory. All new and current mail for the mailbox is made available. Messages that are added to the mailbox after the instance is created are not accessible; a new instance is needed to access newly added messages. The name of the mailbox directory. Initialize the list of messages that can be loaded from the mailbox. The dirname parameter points to the mailbox directory. \indexmoduleinbox \indexMaildir \indexclassdefinition \jiji BabylMailbox Access a Babyl mailbox, which is similar to an MMDF mailbox. Mail messages start with a line containing only '*** EOOH ***' and end with a line containing only '\037\014'. A file object fp that points to the mailbox file. Initialize the mailbox object and point to the first message in the mailbox. \indexmoduleinbox \indexBabylMailbox \indexclassdefinition \indexmmdf \jiji Just some comments: - Thinking about it, I mentioned the need for an appropos utility one year ago, If you realise, this IS the apropos utility!! - If one name should bear TeEncontreX it'd be : pico-documentation. Every chunk of info, delimited by \jiji, is independent of the rest of the universe and you can have them in different files. The only thing links to the world are the \newcommand definitions. Every chunk of info is very, very small, although it could be very big. You have absolute freedom in size, and you can refine the info as much (as less as you want). 
(VERY IMPORTANT POINT): - Because of the tiny size of every chunk you can analize typical chunks to interpolate "obvious" marking: \jiji BabylMailbox ( this would be marked as name) (this would be marked as description) Access a Babyl mailbox, which is similar to an MMDF mailbox. Mail messages start with a line containing only '*** EOOH ***' and end with a line containing only '\037\014'. (these as params) A file object fp that points to the mailbox file. Initialize the mailbox object and point to the first message in the mailbox. ... Or you can simply use "positional" marking into these very small chunks. You can have a bunch of small python programms for intelligent analysis of typical chunks, because they have little, I guess they'd be easy. ( VERY IMPORTANT POINT): - As they're very small you can include in docstrings, or simply as comments ( everywhere ). (DEFINITE POINT): - Using a mix of chunks of data and low-intelligent python script for deciding on (structure, position, "hidden marks"...) you can create XML code. So that XML can be considered as a low level form of chunk of infos. (Object Oriented point): - Chunks are nothing less than objects with info and processes related to them. Do you like objects ? or do you like the Pascal-like syntax of XML? Down to XML!!! Regards/Saludos Manolo ------------- My addresses / mis direcciones: a="www.ctv.es/USERS/irmina" b=[("Lritaunas Peki Project", ""), ("Spanish users of LaTeX(en Espanyol)", "/pyttex.htm" ), ("page of drawing utility for tex ", "/texpython.htm" ), ("CrossWordsLand","/cruo/cruo.html") ] for i in b: print i[0],":", a+i[1] You have to run as fast as you can just to stay where you are. If you want to get anywhere, you'll have to run much faster. -- Lewis Carroll From fdrake@acm.org Fri Nov 12 18:07:57 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Fri, 12 Nov 1999 13:07:57 -0500 (EST) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: References: <14378.63037.571200.652453@weyr.cnri.reston.va.us> Message-ID: <14380.22397.664057.212083@weyr.cnri.reston.va.us> Moshe Zadka writes: [re: formulaic v. more flexible module references] > Here's a simple argument against it: in the TODO list, there are requests > for explanations of how to use both the profiler and the > debugger. Guido marked it "a library chapter isn't enough?". And he's > right, but having the structure so flexible tempted guido to put the > chapters in the library reference, instead as seperate documents. I'm not entirely sure what you're against: allowing general document structure in the library reference manual, or allowing less structured content to serve as module references. But I understand the problem you're referring to. My thought is that the profiler and debugger both need to be documented in two ways: as modules (they are, and their interfaces are directly useful), and as user-support facilities (with more narrative documentation). The current problem is that these components are conflated. The narrative "how-to-use-it" documentation should be removed from the Library Reference and made part of a the User's Manual, which simply hasn't been written (yet -- any takers?). > > That's a 200% increase in line count and a 150% increase in file > > size. The later isn't much of an issue, but the former is because it > > seriously impacts readability. > > Ummmm...it really depends on how much semantic information you put in. Yes, but there's a good bit I expect to be present regardless. I think there's a lot to be gained by being able to say "this method expects a pathname, an optional string, and an optional integer, and returns a file-like object." 
Saying it in natural language is easy (if tedious, given the number of functions/methods about which we can give that level of information), but saying it so tools can handle it... requires a lot of markup. ;-) We'll need to define a "vocabulary" that can encompass built-in types, "protocols" (or interfaces, or whatever they can be called), and actuall classes. Class names are easy, but protocols and built-in types need to be added. Variations include being able to say "exactly this" or "this or a subclass," etc. It probably makes sense to be able to say "non-complex numbers," or "standard number types," or "non-negative integers," etc. > Here's the strongest argument for the microdocument approach: as someone > who uses both Perl and Python (though I much prefer the later), I see the > enormous benefit of a program like perldoc, which could only be written > on a microdocument based infrastructure. For those not familiar with this Actually, my inclination would to run "pydoc" off a back-end database rather than directly off the XML. The database could be built once from the document sources and then contain data that's been as pre-digested as makes sense. That would be a lot faster than using XML; the entries could be pickled objects or whatever makes sense. > We might need a PyML such that XML application of XML, only of SGML) and a convertor PyML->XPyML such that > XPyML is an application of XML. That way we could have whatever terseness > from SGML we care to implement, and all the power of XML at our back. I think we can avoid this. I'd opt simply to use XML and be done with it before using SGML only as a way to author XML. If anyone really wants to do this, the tools are easy enough to build given ESIS-generating SGML & XML parsers & the tools in Doc/tools/sgmlconv/. -Fred -- Fred L. Drake, Jr. 
Corporation for National Research Initiatives From Moshe Zadka Fri Nov 12 20:03:59 1999 From: Moshe Zadka (Moshe Zadka) Date: Fri, 12 Nov 1999 22:03:59 +0200 (IST) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: <14380.22397.664057.212083@weyr.cnri.reston.va.us> Message-ID: On Fri, 12 Nov 1999, Fred L. Drake, Jr. wrote: > I'm not entirely sure what you're against: allowing general document > structure in the library reference manual, or allowing less structured > content to serve as module references. But I understand the problem > you're referring to. The former. The later would be classified in the "documentation is like sex: when it is good it is very good, and when it is bad, it is better then nothing" dept. I.e., it is better to have as much semantic information, but even without it, the docs are useful. > My thought is that the profiler and debugger both need to be > documented in two ways: as modules >and as user-support facilities I agree. In fact, the documentation for pdb is horrible as module documentation. It is quite good as a user's manual. > The narrative "how-to-use-it" documentation should be > removed from the Library Reference and made part of a the User's > Manual, which simply hasn't been written (yet -- any takers?). Hmmmmm....define User's Manual. What do you want from it? > Yes, but there's a good bit I expect to be present regardless. > I think there's a lot to be gained by being able to say "this method > expects a pathname, an optional string, and an optional integer, and > returns a file-like object." Saying it in natural language is easy > (if tedious, given the number of functions/methods about which we can > give that level of information), but saying it so tools can handle > it... requires a lot of markup. ;-) Again, it is a "real" problem, not an artifact of the solution: either you have AI, or you patiently tell the computer what every word means, or you live in a non-perfect world. 
Most solutions are a combination of all three approaches: use a bit of smart in the processor, put some markup, and live with the fact that some information will require a human to discover ;-) > Actually, my inclination would to run "pydoc" off a back-end > database rather than directly off the XML. The database could be > built once from the document sources and then contain data that's been > as pre-digested as makes sense. That would be a lot faster than using > XML; the entries could be pickled objects or whatever makes sense. It doesn't matter: you'd still have to use the micro-document approach for this to work. I just painted a rosy picture of what it would buy you. > > We might need a PyML such that XML > application of XML, only of SGML) and a convertor PyML->XPyML such that > > XPyML is an application of XML. That way we could have whatever terseness > > from SGML we care to implement, and all the power of XML at our back. > > I think we can avoid this. I'd opt simply to use XML and be done > with it before using SGML only as a way to author XML. If anyone > really wants to do this, the tools are easy enough to build given > ESIS-generating SGML & XML parsers & the tools in > Doc/tools/sgmlconv/. Oh, I forgot to spec that XPyML is a proper subset of PyML, so you can author directly in that. PyML is just thin syntactic sugar, so you can use some SGML minimizations, which are useful in practice. We can choose just a few (e.g., I'm for . INTERNET: Learn what you know. Share what you don't. From fdrake@acm.org Fri Nov 12 21:01:25 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Fri, 12 Nov 1999 16:01:25 -0500 (EST) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: References: <14378.63037.571200.652453@weyr.cnri.reston.va.us> Message-ID: <14380.32805.469825.418097@weyr.cnri.reston.va.us> --FBmNV/Tzqn Content-Type: text/plain; charset=us-ascii Content-Description: message body text Content-Transfer-Encoding: 7bit Manuel Gutierrez Algaba writes: > Is this the LaTeX one ? or the "traditional" XML ? I would describe the current approach as document-centric. "Document-oriented" is how I was referring to content which was naturally organized in documents, as opposed to data-structure-like constructions such as my sample module reference. The actual syntax wasn't specific to any of the three definitions. > > DOCUMENT-CENTRIC APPROACH: The human-read document is the primary > > Is this TeEncontreX'es ? Are "module reference material" the > "\indexpython" things ? No, by this I meant the entire section documenting the module. > MICRODOCUMENT APPROACH: Multiple DTDs are used to encode > document-level information and module reference material. Let's only > > What's this ? I'm not sure what "this" refers to; the term "microdocument approach"? I'll be more specific: Using a microdocument approach would involve using at least 2 DTDs, one for module references, and another for "everything else." Each module reference would be a document instance all by itself (in the SGML/XML sense), not just a file that's part of something larger (like the current module sections; there's no meaningful way to process them individually. To get something like the current Library Reference, another document (with another DTD) would specify how to put it together: put this module, then this one, and now that section of prose; in the next chapter, put .... We could define separate DTDs to document Python modules, C APIs, and more book- or article-like sections. Another would be the "glue" that defines a "manual" or "howto" document. 
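As a rough sketch of that arrangement (current Python 3 and `xml.etree.ElementTree`, neither of which is what this thread had in hand), each module reference below is parsed as a complete document of its own, and a separate "glue" document only says what goes where. No DTD exists yet, so every element and attribute name here (`module`, `synopsis`, `manual`, `chapter`, `section`, `include`), the chapter title, and the rfc822 synopsis text are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# Each module reference is a whole document instance (hypothetical markup).
MODULE_DOCS = {
    "mailbox": "<module name='mailbox'>"
               "<synopsis>Read various mailbox formats.</synopsis></module>",
    "rfc822": "<module name='rfc822'>"
              "<synopsis>Parse RFC 822 mail headers.</synopsis></module>",
}

# The glue document orders prose and module references; it holds no
# module details itself.
MANUAL = """
<manual title='Library Reference'>
  <chapter title='Internet Data Handling'>
    <section>Modules for working with mail messages.</section>
    <include module='rfc822'/>
    <include module='mailbox'/>
  </chapter>
</manual>
"""


def assemble(manual_xml, module_docs):
    """Return a plain-text outline built by following the glue document."""
    manual = ET.fromstring(manual_xml.strip())
    lines = [manual.get("title")]
    for chapter in manual.findall("chapter"):
        lines.append("  " + chapter.get("title"))
        for part in chapter:
            if part.tag == "section":
                lines.append("    " + part.text.strip())
            elif part.tag == "include":
                # Parse the module reference as an independent document.
                module = ET.fromstring(module_docs[part.get("module")])
                synopsis = module.findtext("synopsis")
                lines.append("    %s: %s" % (module.get("name"), synopsis))
    return lines


if __name__ == "__main__":
    print("\n".join(assemble(MANUAL, MODULE_DOCS)))
```

Because each module reference parses on its own, a help tool could load just `mailbox` without touching the manual, which is the point of keeping the glue separate.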
From your explanations and looking at TeEncontreX, I'd describe what you're doing as "indexing": you're assigning terminology from a controlled vocabulary to each entry in your document base, and using that as a retrieval mechanism. I think this is orthogonal to what I'm talking about. Regardless of a move toward a microdocument approach or document-centric approach, good indexing is critical to make the information accessible. The way you're using it (with lots of small articles) makes it very microdocument-flavored, aside from lumping all the documents in one file.

> To put it short: "Lot of work coding _details_". Just a comment,
> python is **much** better than C++, for example, because you
> have no need to declare every type, every detail, even, you can
> have large parts of a python programm broken, parts that a C++
> compiler would mark as erroneous.

I agree. I think things like type annotations should be completely optional in the documentation. However, I think there's a lot of value in supporting annotations that say things like "this returns a file-like object" that can be interpreted by programmer's tools (help system in an IDE, pylint-style analyzers, etc.). So it should be possible to add interesting annotations, so a programmer can ask a tool, "What are all the ways I can get a file object?"

> > To really make it work, a lot of attention
> > would have to be applied to the result of the first-stage conversion
> > to check the accuracy of the results, make the various bits of text
> > actually land in the right place (since everything is pretty much
> > thrown together now), and encode a lot of additional information about
> > types, parameters, exceptions thrown, etc.
>
> More heavy work !

But, as you point out for TeEncontreX, it's linear to the volume of information you have + what you want to get out of it.
> The biggest problem I see here is that you get a very good documentation
> ( due to the huge ammount of work) or you get nothing ( the author
> doesn't documentate).

We get the latter one now! ;(

> It'd be wise to provide several levels of marking-up , so people
> can mark-up little by little, some important things first and so...

This is another good reason to make a lot of the markup optional; my example probably didn't use "maximal" markup, but went a long way toward it. Let's try adjusting the assumed DTD a little, and cut out a fair bit of the markup (even if it's useful). The file is attached; here's the word count:

    weyr(.../Doc/lib); wc libmailbox.tex mailbox.xml mailbox-min.xml
         53     251    1938 libmailbox.tex
        159     504    5364 mailbox.xml
        118     370    3936 mailbox-min.xml

Still large, but definitely better. Good enough? I don't know. I do expect that at least one tool will emerge that will take a Python source file and spit out a skeleton documentation file that can be filled in.

> This is the "TeEncontreX" version of Mailbox, this should
> work if you have AnalizaToo.py:

Cool; I'll run this through as soon as your package downloads again! ;-) Aha! You didn't test this! ;-)

> Just some comments:
> - Thinking about it, I mentioned the need for an appropos utility
> one year ago, If you realise, this IS the apropos utility!!

Library science types would call this kind of data marking "indexing".

Saludos, amigo! (Hey, I'm learning Spanish! Cool! ;)

-Fred

--
Fred L. Drake, Jr.
Corporation for National Research Initiatives

--FBmNV/Tzqn
Content-Type: text/xml; charset=iso-8859-1
Content-Description: More minimal sample module reference.
Content-Disposition: inline; filename="mailbox-min.xml"
Content-Transfer-Encoding: 7bit

mailbox
Read various mailbox formats.

This module defines a number of classes that allow easy and uniform access to mail messages in a mailbox. Most of the supported mailbox formats come from the Unix world.
None of the classes defined in this module lock the mailboxes that are accessed; this needs to be handled by application code. Mailbox A message object, or None if there aren't any more message in the mailbox. UnixMailbox Mailbox Access a classic Unix-style mailbox, where all messages are contained in a single file and separated by From name time lines. Initialize the mailbox object and point to the first message in the mailbox. MmdfMailbox Mailbox Access an MMDF-style mailbox, where all messages are contained in a single file and separated by lines consisting of four control-A characters. Initialize the mailbox object and point to the first message in the mailbox. MHMailbox Mailbox Access an MH mailbox, a directory with each message in a separate file with a numeric name. Messages that are added to the mailbox after the instance is created are not accessible; a new instance is needed to access newly added messages. Initialize the list of messages that can be loaded from the mailbox. Maildir Mailbox Access a Qmail mail directory. All new and current mail for the mailbox is made available. Messages that are added to the mailbox after the instance is created are not accessible; a new instance is needed to access newly added messages. The dirname parameter points to the mailbox directory. BabylMailbox Mailbox Access a Babyl mailbox, which is similar to an MMDF mailbox. Mail messages start with a line containing only '*** EOOH ***' and end with a line containing only '\037\014'. Initialize the mailbox object and point to the first message in the mailbox. --FBmNV/Tzqn-- From fdrake@acm.org Fri Nov 12 21:14:58 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Fri, 12 Nov 1999 16:14:58 -0500 (EST) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: References: <14380.22397.664057.212083@weyr.cnri.reston.va.us> Message-ID: <14380.33618.749278.280311@weyr.cnri.reston.va.us> Moshe Zadka writes: > Hmmmmm....define User's Manual. 
What do you want from it? How to work with the interpreter and related tools. It wouldn't teach the language, but would teach the environment and provide reference material for things like the user interface (for PythonWin or IDLE, or readline information for Unix). Debuggers and profilers generally fall into this category of information. > Again, it is a "real" problem, not an artifact of the solution: either you > have AI, or you patiently tell the computer what every word means, or you > live in a non-perfect world. Most solutions are a combination of all three > approaches: use a bit of smart in the processor, put some markup, and live > with the fact that some information will require a human to discover ;-) I agree. I think we have too little useful markup now. > It doesn't matter: you'd still have to use the micro-document approach > for this to work. I just painted a rosy picture of what it would buy you. No, you can still use a non-microdocument architecture. I failed to present the mega-database-dump model for a reason, though. ;-) It's entirely possible to use the sort of markup I presented in my sample module reference without using micro-documents. It just gets very painful. > But I can live with straight XML, if that's the party line. I don't think there's a "party line"; I just want to avoid introducing new dialects and processing stages. There's enough that really needs doing on the content side of things that we don't need to create new problems. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From Moshe Zadka Fri Nov 12 22:36:38 1999 From: Moshe Zadka (Moshe Zadka) Date: Sat, 13 Nov 1999 00:36:38 +0200 (IST) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: <14380.33618.749278.280311@weyr.cnri.reston.va.us> Message-ID: On Fri, 12 Nov 1999, Fred L. Drake, Jr. wrote: > Moshe Zadka writes: > > Hmmmmm....define User's Manual. What do you want from it? 
> > How to work with the interpreter and related tools. It wouldn't > teach the language, but would teach the environment and provide > reference material for things like the user interface (for PythonWin > or IDLE, or readline information for Unix). Debuggers and profilers > generally fall into this category of information. Uh, OK. Would the things that are currently in the Python manual be part of it? Obscure options like -X, -S or -i? > > But I can live with straight XML, if that's the party line. > > I don't think there's a "party line"; I just want to avoid > introducing new dialects and processing stages. There's enough that > really needs doing on the content side of things that we don't need to > create new problems. I thought we're trying to come up with a party line. Which would include, among other things, a markup language...I just remarked I'm not adamant on the SGML->XML stage, though I think it would improve the life of documentation writers. I think the looks should be always to Perland, in that respect. They managed to come up with a standard so good, *every* module from CPAN has a half-decent documentation at the least, which is very accessible. POD is very light on the eyes and easy to write, and I do believe XML+ a bit of SGML minimization could approach the ease, but I doubt XML alone could do it. Just consider reverse Or It all depends on how much of SGML we wish to take. Other alternatives include: reverse (That last one is my favourite because modifying current XML tools to deal with it seems relatively easy: an empty closing tag is mapped to the last tag on the stack) -- Moshe Zadka . INTERNET: Learn what you know. Share what you don't. 
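
The "empty closing tag is mapped to the last tag on the stack" idea in the message above can be sketched in a few lines of Python. This is a hedged illustration only: the sample markup in the usage note is invented (the examples in the original message did not survive in the archive), the function name `expand_empty_end_tags` is hypothetical, and the regular expression deliberately ignores comments, CDATA sections and ">" characters inside attribute values.

```python
import re

# Matches a start tag, a named end tag, or an empty end tag "</>".
# Deliberately simplistic: no comments, CDATA, or ">" inside attributes.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)?([^<>]*)>")


def expand_empty_end_tags(text):
    """Rewrite every "</>" as the end tag of the innermost open element."""
    stack = []

    def fix(match):
        closing, name, rest = match.groups()
        if closing:
            if not stack:
                raise ValueError("end tag with no open element")
            open_name = stack.pop()
            if name and name != open_name:
                raise ValueError("expected </%s>, got </%s>" % (open_name, name))
            return "</%s>" % open_name
        if name and not rest.rstrip().endswith("/"):
            # Ordinary start tag: remember it until it is closed.
            stack.append(name)
        return match.group(0)

    result = TAG.sub(fix, text)
    if stack:
        raise ValueError("unclosed element: %s" % stack[-1])
    return result


print(expand_empty_end_tags("<function><name>reverse</></>"))
```

Run as a pre-processing step, something like this would let authors type the terse form while ordinary XML tools only ever see the expanded output.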
From Manuel Gutierrez Algaba Sat Nov 13 01:11:18 1999 From: Manuel Gutierrez Algaba (Manuel Gutierrez Algaba) Date: Sat, 13 Nov 1999 01:11:18 +0000 (GMT) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: <14380.32805.469825.418097@weyr.cnri.reston.va.us> Message-ID: On Fri, 12 Nov 1999, Fred L. Drake, Jr. wrote: > > Library science types would call this kind of data marking > "indexing". I do admit that At the beginning they were simple indices, now they still seem indexes. But you say that "good indexing" is crucial. If you say it's orthogonal, that means it's independent, so indexing could start today. But indexing means creating a vocabulary, a lexicon. This can be created on the fly at the same time the indexing goes on. But when you have such a lexicon, you would be tempted of using such lexicon or relationships amongs indixes in your XML marks. So, there's an implication indexes --> XML Second, if you have created HTML pages or LaTeX from TeEncontreX index system, then you'll have a kind of Python Documentation. If both micro-doc and document-centric have a strong implication in the generation of Python Doc, it's clear that TeEncontreX is not so orthogonal. You can say that it's a very weird thing, or not usual, but in the three fields: - info storage - info representation - lexicon definition TeEncontreX and XML have a good intersection of functionalities. And, it's not simple indexing, LaTeX indexing, for example, doesn't alter the structure of info. TeEncontreX ( it means Te Encontre --> I found you ) isolates the info to be indexed from the rest, so, if you take a Document-centric stuff and you apply TeEncontreX method in it, you don't have the same doc any longer, but the old doc, divided and attributed in 100 or 200 parts. It's like an XML, but whose structure is not based upon marks , it's based in the meanings of indexes and how are related to each other. Another comment: indexes usually are one-dimensional. 
If you have an item described by many indexes you have something multi-dimensional or an object. And finally, whatever decision is chosen. Let it be simple and natural, remember that not everybody can speak XML. Indexes may be help you in discovering the Lexicon of python, the Knowledge Zones of Python and with this you can decide the size and depth of any micro-doc or document-centric. So, I see that Indexes have an inmediate pay-back ( for the user) and they help further XML (and not XML ) design. Why not start with them ? Regards/Saludos Manolo ------------- My addresses / mis direcciones: a="www.ctv.es/USERS/irmina" b=[("Lritaunas Peki Project", ""), ("Spanish users of LaTeX(en Espanyol)", "/pyttex.htm" ), ("page of drawing utility for tex ", "/texpython.htm" ), ("CrossWordsLand","/cruo/cruo.html") ] for i in b: print i[0],":", a+i[1] Well, you know, no matter where you go, there you are. -- Buckaroo Banzai From Manuel Gutierrez Algaba Sat Nov 13 20:21:20 1999 From: Manuel Gutierrez Algaba (Manuel Gutierrez Algaba) Date: Sat, 13 Nov 1999 20:21:20 +0000 (GMT) Subject: [Doc-SIG] More about indexing, Propaganda and Crazy Wishes Message-ID: I think it's important to recall that : "make life to the helpers makes many helpers to join" "the easier is to help, the more helpers will be" So here are more general points: The ideal world for the helper would be the occasional ( almost at random), anonymous and in his own language documentation. This is possible. The idea came to me , thinking about TeEncontreX, ( starting to be sick of that name ? :) ). If you have downloaded it, you'd realise that the amount of files ( up to 700 ) is huge... But as basically AnalizaToo.py creates a database, it'd be not difficult to use it as a CGI script, and create the files on the fly. 
But you can have the inverse, you can present a random piece of document ( ideally a robot would extract it at random from the doc, perhaps looking for unattributed pieces of info ) and then ask the user to attribute it, the way he wanted to. You can handle different languages markup. Thinking about Fred Drake , I realise that he really deserved an English version of AnalizaToo.py but I don't want to do low level file-editing, so I have done a script , available in ...well you know where. Its name is translations.py . This very small, trivial thing may be used with \indexl... stuff, ... BTW, I've released a new version of TeEncontreX : 1.1 . 30 % more of data. And it's me alone! If we can make things easier enough for the people to contribute we can make them contribute massively and have all python documented in a matter of weeks! Make it easy, and it'll be easy done. Regards/Saludos Manolo ------------- My addresses / mis direcciones: a="www.ctv.es/USERS/irmina" b=[("Lritaunas Peki Project", ""), ("Spanish users of LaTeX(en Espanyol)", "/pyttex.htm" ), ("page of drawing utility for tex ", "/texpython.htm" ), ("CrossWordsLand","/cruo/cruo.html") ] for i in b: print i[0],":", a+i[1] You can never tell which way the train went by looking at the tracks. From Gerrit Holl Tue Nov 16 16:07:28 1999 From: Gerrit Holl (Gerrit Holl) Date: Tue, 16 Nov 1999 17:07:28 +0100 Subject: [Doc-SIG] Re: SMTP In-Reply-To: <80rn4o$bnl$1@news.fsu.edu> References: <80rn4o$bnl$1@news.fsu.edu> Message-ID: <19991116170728.A30970@optiplex.palga.uucp> Glenn Kidd wrote: > Does anyone know where some good info on Python's smtplib? Well, er... what about the default module index? > I have looked at > http://www.python.org but I was wondering if there was any more info out > there. Any help would be appreciated. Isn't it enough? body = '''From: me To: my brother Subject: hello! X-Mailer: a python script This is the bodddddddddyyyyyyyyyy........ ''' regards, Gerrit. 
-- We are using Linux daily to UP our productivity - so UP yours! (Adapted from Pat Paulsen by Joe Sloan) From Gerrit Holl Tue Nov 16 19:13:02 1999 From: Gerrit Holl (Gerrit Holl) Date: Tue, 16 Nov 1999 20:13:02 +0100 Subject: [Doc-SIG] Re: SMTP In-Reply-To: <19991116170728.A30970@optiplex.palga.uucp> References: <80rn4o$bnl$1@news.fsu.edu> <19991116170728.A30970@optiplex.palga.uucp> Message-ID: <19991116201302.A583@optiplex.palga.uucp> Gerrit Holl wrote: > Glenn Kidd wrote: > > Does anyone know where some good info on Python's smtplib? > > Well, er... what about the default module index? [cut] Please ignore... At first my content was interesting for this list but then I changed my content and I forgot to change the CC: also... Sorry! -- "A word to the wise: a credentials dicksize war is usually a bad idea on the net." (David Parsons in c.o.l.development.system, about coding in C.) From paul@prescod.net Mon Nov 22 03:46:22 1999 From: paul@prescod.net (Paul Prescod) Date: Mon, 22 Nov 1999 04:46:22 +0100 Subject: [Doc-SIG] Approaches to structuring module documentation References: <14378.63037.571200.652453@weyr.cnri.reston.va.us> Message-ID: <3838BC8E.B3CA2663@prescod.net> Sorry for the delay on this message. I need a long plane flight to be able to think about this issue properly. "Fred L. Drake, Jr." wrote: > > Well, now that things have quieted down a little (where?!), I'll > stir things up a little. > Two broad approaches to structuring the documentation have been > presented: One is the current document-centric model, where there are > a number of books/manuals/whatever that contain interesting > information, but need to be used as really large chunks. 
Extracting > specific information is (apparently) difficult for humans (witness > the recent request for a random() function on the newsgroup by someone > who said they looked in the index; just the wrong one);

I'm preaching to the choir when I say that there are three issues here:

* content
* structure of the content
* presentation

Okay, they are all related but they are still different. If somebody can't find something, I would tend to try to fix that in the presentation first, and then in the content and finally in the structure if all else fails. Let's not jump right to the structure. Consider this analogy: someone using a word processor cannot figure out how to bold something so we decide to change the file format? Sure, there is a small chance that the file format is to blame (e.g. it doesn't support bold!) but it is much more likely just a UI problem.

Also, consider our choices as a graph with two axes:

* specificity of markup
* granularity ("library", "package", "module", "class") of file

If you think of it that way, then you realize that you could have a very generic microdocument architecture (one HTML class per symbol) or a very specific (PyBook) but ungranular (the WHOLE book) DTD. And of course the other two options are also available.

> This approach has the advantage of matching the current structure of
> the documentation. The conversion isn't terribly difficult or even
> time consuming given the state of the things in Doc/tools/sgmlconv/ in
> the CVS repository. There's clearly some work to do regarding DTD
> specification and probably a bit of transformation, but a large part
> of the coding and testing is done.

I believe that this advantage strongly outweighs any benefits of going to a more theoretically pure markup. It's taken roughly a year to get our documents clean enough that they can even move to XML or something similar. How long would it take to completely reorganize them?
You, Fred, have a job that only partially includes documentation maintenance and the rest of us are not nearly so interested in re-writing DOCS as we are in re-writing CODE. I fear that a move to Microdocuments would never happen. > This explosion of markup is of most concern for authors; a lot of > markup is required to encode enough information to justify changing > the approach. As more markup is required, it is increasingly > difficult to get contributions because it takes the authors more time > to document their work. I'd like to maintain Python's standing as the > best-documented free scripting language, and I'm not sure authors will > be willing to use the more extensive markup. That's a killer argument. > The hybrid approach can be considered as roughly the same as the > microdocument approach, as discussed above. I propose an incremental approach. Let's get to PyBook XML and THEN re-evaluate PyBook in terms of microdocument. Here's an important issue: Perl and Java have achieved a relatively high level of module documentation conformity by putting the microdocument *in the code*. This appeals strongly to basic human nature. One file instead of two. Scroll to the top to fix up the documentation, and so forth. Python 2 should address this by having a first-class documentation feature built into the grammar. I would advise that it should NOT be XML. In fact it should probably be roughly JavaDOC or POD-ish so that we aren't reinventing the wheel. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself Bart: Dad, do I really have to brush my teeth? Homer: No, but at least wash your mouth out with soda. 
From Manuel Gutierrez Algaba Mon Nov 22 18:49:22 1999
From: Manuel Gutierrez Algaba (Manuel Gutierrez Algaba)
Date: Mon, 22 Nov 1999 18:49:22 +0000 (GMT)
Subject: [Doc-SIG] Approaches to structuring module documentation
In-Reply-To: <3838BC8E.B3CA2663@prescod.net>
Message-ID:

On Mon, 22 Nov 1999, Paul Prescod wrote:

> Sorry for the delay on this message. I need a long plane flight to be
> able to think about this issue properly.
>
> * content
> * structure of the content
> * presentation
>
> Okay, they are all related but they are still different. If somebody
> can't find something, I would tend to try to fix that in the
> presentation first, and then in the content and finally in the structure
> if all else fails. Let's not jump right to the structure. Consider this
> analogy: someone using a word processor cannot figure out how to bold
> something so we decide to change the file format? Sure, there is a small

I agree. The first issue is to supply content to the user, and the structure of the content will be resolved when we have enough content to build it. TeEncontreX (www.ctv.es/USERS/irmina/TeEncontreX.html) provides the least structure possible: it's pure marking of contents and a bit of relationship between contents, and it's pure presentation. I really can't understand why people don't get interested in it. I think most people think that the complex solution for a problem is the best solution....

> Also, consider our choices a graph with two axes:
>
> * specificity of markup
> * granularity ("library", "package", "module", "class) of file

specificity of markup
| I J
| X
|
|
| T
|------------------------------------
library package module class granularity

T : TeEncontreX
J: javadoc
X: XML
I: emacs texinfo ??

Would it be this way ? If so, it's clear that specific markup makes it more difficult, and that granularity is not a problem.

> documentation feature built into the grammar. I would advise that it > should NOT be XML.
In fact it should probably be roughly JavaDOC or > POD-ish so that we aren't reinventing the wheel. I think we must invent the wheel, or at least improve it. I don't like javadoc, it seems to me very low level ( type-driven ), useful for java, but python deserves a higher level stuff. The doc of a language is related to the very nature of that language, it's not the same a prolog documentation than COBOL doc. As I think Python is Lisp with OO, it should have a Lisp-ish doc... The question is : how is Lisp doc ? :-) BTW: Are you the Paul Prescod who wrote "Manual de XML" with Charles F. GoldFarb . Prentice Hall and... ? Regards/Saludos Manolo www.ctv.es/USERS/irmina/TeEncontreX.html /texpython.htm Nasrudin walked into a teahouse and declaimed, "The moon is more useful than the sun." "Why?", he was asked. "Because at night we need the light more." From fdrake@acm.org Mon Nov 22 18:37:08 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 22 Nov 1999 13:37:08 -0500 (EST) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: <3838BC8E.B3CA2663@prescod.net> References: <14378.63037.571200.652453@weyr.cnri.reston.va.us> <3838BC8E.B3CA2663@prescod.net> Message-ID: <14393.36180.24259.143634@weyr.cnri.reston.va.us> Paul Prescod writes: > * content > * structure of the content > * presentation A reminder of what the axes really are is always nice! > Okay, they are all related but they are still different. If somebody > can't find something, I would tend to try to fix that in the > presentation first, and then in the content and finally in the structure > if all else fails. Let's not jump right to the structure. 
Consider this > Also, consider our choices a graph with two axes: > > * specificity of markup > * granularity ("library", "package", "module", "class) of file > > If you think of it that way, then you realize that you could have a very > generic microdocument architecture (one HTML class per symbol) or a very > specific (PyBook) but ungranular (the WHOLE book) DTD. And of course the > other two options are also availble. I'd describe the current markup as being highly specific, and I think that makes authoring much easier in many ways (there's a limit to what needs to be typed to mark something in an interesting way). However, there are a bunch of marks that can be reasonably made, and there are a few people out there who think documentation isn't intrinsically interesting(!); this means they don't read the documentation for the markup (which is incomplete anyway), and there's some resistence to having to type much to "mark" something. This leads me to think that less marking would be nice. I said: > This approach has the advantage of matching the current structure of > the documentation. The conversion isn't terribly difficult or even > time consuming given the state of the things in Doc/tools/sgmlconv/ in [...] Paul Prescod writes: > I believe that this advantage strongly overwhelms any benefits of going > to a more theoretically pure markup. It's taken roughly a year to get > our documents clean enough that they can even move to XML or something I'm not convinced. If what we end up with is little different from what we have, I don't see why we need to convert at all. There are plenty of people who don't *like* LaTeX syntax, but those people won't be any happier with XML; I'd expect them to be less happy because there's more characters in the syntax. (On the other hand, the syntax is more clearly defined and involves fewer special characters, which is one of the advantages Guido sees with XML or even a carefully chosen SGML declaration.) > similar. 
How long would it take to completely reorganize them? You, > Fred, have a job that only partially includes documentation maintenance > and the rest of us are not nearly so interested in re-writing DOCS as we > are in re-writing CODE. I fear that a move to Microdocuments would never > happen.

> That's a killer argument.

That's been my biggest concern about it all. When working with this, I'm often in a quandary over how to get more detail out of the documentation without ending up being the author of the whole ball of wax.

> I propose an incremental approach. Let's get to PyBook XML and THEN
> re-evaluate PyBook in terms of microdocument.

Does an incremental approach really make sense? I suspect we want to avoid having to give module authors a new set of tools to do (essentially) the same thing too often, regardless of the merits of the new tools. (Where "tools" can include things like markup vocabularies and syntax.) This is a problem because it leads to increased resistance from potential authors.

> Here's an important issue: Perl and Java have achieved a relatively high
> level of module documentation conformity by putting the microdocument
> *in the code*. This appeals strongly to basic human nature. One file

After talking with Guido about these issues last week, I've been looking into this more. I've been discussing the benefits & failings of POD with Greg Ward (of distutils fame), who was a Perl programmer well before he learned Python. Needless to say, he's a *huge* fan of inline documentation (and lots of it).

So I've been playing with a little tool to create documentation from a Python parse tree. As with many things, it's been done before, but with limited success (docco, gendoc, pythondoc). I suspect the success rate is probably tightly tied to it being declared "the right way" by Guido.
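
As a rough idea of what "documentation from a Python parse tree" can mean, here is a minimal sketch. It is not the tool mentioned above, whose code is not shown in this thread; it assumes the present-day standard library `ast` module, and the sample source, `summarize` and `first_line` are hypothetical names made up for the illustration.

```python
import ast

# Invented sample module; only its structure matters for the sketch.
SOURCE = '''
"""Read various mailbox formats."""

class Mailbox:
    """Base class for mailbox readers."""

    def next(self):
        """Return the next message, or None."""

def _helper(path, mode="r"):
    pass
'''


def first_line(doc):
    """First line of a docstring, or a marker when there is none."""
    return doc.strip().splitlines()[0] if doc else "(undocumented)"


def summarize(source):
    """Return (kind, name, summary) tuples in source order."""
    tree = ast.parse(source)
    entries = [("module", "", first_line(ast.get_docstring(tree)))]

    def visit(body, prefix):
        for node in body:
            if isinstance(node, ast.ClassDef):
                name = prefix + node.name
                entries.append(
                    ("class", name, first_line(ast.get_docstring(node))))
                visit(node.body, name + ".")
            elif isinstance(node, ast.FunctionDef):
                # Parameter names come from the parse tree, so the
                # docstring does not have to repeat them.
                args = ", ".join(a.arg for a in node.args.args)
                entries.append(
                    ("function", "%s%s(%s)" % (prefix, node.name, args),
                     first_line(ast.get_docstring(node))))

    visit(tree.body, "")
    return entries


for entry in summarize(SOURCE):
    print(entry)
```

Names, nesting and parameter lists are taken from the parse tree itself, so the author only has to supply the prose.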
The script isn't near ready, but I'm aiming for being able to generate documentation one module or one package at a time with at least reasonable levels of internal linking among HTML files (other formats can wait; I want a hypertext format first to make sure I get the linking right). Once I have this, I should be able to construct a system that allows the docs to be created using either some XML/SGML language in a separate file or this POD-like/structured text inside the source file. Building a reference manual from those inputs would be very similar to what we have now, and is more a matter of gluing pieces together. Another advantage of using inline documentation in the sources is that the source can be used as part of the markup; a lot of information is already in the parse tree. Using that information to augment the explicit documentation may prove to be very valuable, especially for people interested in including lots of specific details in the documentation. The programmer should be able to declare that this not be done, preferably at both global and local levels within a package or module, since there are many situations in which the specific structure of the code is downright misleading in terms describing the public interface. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From paul@prescod.net Tue Nov 23 14:38:04 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 23 Nov 1999 15:38:04 +0100 Subject: [Doc-SIG] Approaches to structuring module documentation References: <14378.63037.571200.652453@weyr.cnri.reston.va.us> <3838BC8E.B3CA2663@prescod.net> <14393.36180.24259.143634@weyr.cnri.reston.va.us> Message-ID: <383AA6CC.FE4E1A8C@prescod.net> "Fred L. Drake, Jr." wrote: > > I'm not convinced. If what we end up with is little different from > what we have, I don't see why we need to convert at all. 
I think we want to be able to slice and dice the documentation with tools like Zope, 4XSLT, xt.exe and Internet Explorer 5.0 > So I've been playing with a little tool to create documentation from > a Python parse tree. Would the right compromise be to have very specific documentation inside the Python and relatively generic documentation outside? The reason I proposed that we might need a grammar change is because when I last thought about this I couldn't figure out how to document parameters without repeating their names. Maybe that's not such a significant thing though. Also, I was trying to avoid using comments because I wanted the same documentation to be available as docstrings. -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself It's such a Bore Being always Poor LANGSTON HUGHES http://www.northshore.net/homepages/hope/engHughes.html From paul@prescod.net Tue Nov 23 14:36:22 1999 From: paul@prescod.net (Paul Prescod) Date: Tue, 23 Nov 1999 15:36:22 +0100 Subject: [Doc-SIG] Approaches to structuring module documentation References: Message-ID: <383AA666.1BE107AD@prescod.net> Manuel Gutierrez Algaba wrote: > > I think we must invent the wheel, or at least improve it. I don't > like javadoc, it seems to me very low level ( type-driven ), useful > for java, but python deserves a higher level stuff. I don't follow. Perhaps you could give an example. Anyhow, we want to be very careful not to stray too far into only accepting a solution that is "best" and not the one that is good enough. > BTW: Are you the Paul Prescod who wrote "Manual de XML" with > Charles F. GoldFarb . Prentice Hall and... ? Well, I will admit to being the co-author of the XML Handbook. I will decide my relationship to "Manual de XML" when you tell me how good the translation was. 
:) -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself It's such a Bore Being always Poor LANGSTON HUGHES http://www.northshore.net/homepages/hope/engHughes.html From fdrake@acm.org Tue Nov 23 16:51:48 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Tue, 23 Nov 1999 11:51:48 -0500 (EST) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: <383AA6CC.FE4E1A8C@prescod.net> References: <14378.63037.571200.652453@weyr.cnri.reston.va.us> <3838BC8E.B3CA2663@prescod.net> <14393.36180.24259.143634@weyr.cnri.reston.va.us> <383AA6CC.FE4E1A8C@prescod.net> Message-ID: <14394.50724.408903.46036@weyr.cnri.reston.va.us> Paul Prescod writes: > I think we want to be able to slice and dice the documentation with > tools like Zope, 4XSLT, xt.exe and Internet Explorer 5.0 An excellent point. It would be nice to be able to use something other than PyDOM to manipulate the Python documentation! ;-) > Would the right compromise be to have very specific documentation inside > the Python and relatively generic documentation outside? It's not such a clean division, I think. I'm not doing anything about extension modules, so I need to be able to provide documentation about those modules outside the source code. > The reason I proposed that we might need a grammar change is because > when I last thought about this I couldn't figure out how to document > parameters without repeating their names. Maybe that's not such a > significant thing though. Also, I was trying to avoid using comments > because I wanted the same documentation to be available as docstrings. I don't think it's really significant; we can't use an IDREF attribute in SGML/XML without repeating the ID assigned to the target, and the locality of reference is a much greater problem there. 
The Python Tutorial and Guido's "Python Style Guide" essay both describe some ways to format docstrings such that information extraction isn't too hard; those guidelines can be augmented a bit and combined with limited markup using something that looks like the paragraph-level analysis from the old structured-text discussion and a more explicit (but minimal) markup similar to POD's C<...> for inline constructs. This sort of in-source documentation, with a little intelligent analysis, can be used to generate XML, HTML, or whatever fairly easily. If a good module-reference DTD can be created (even if part of a macro-document DTD), that can provide for both the XML output from the extraction tool and an authoring format for extension modules (or other modules if the author has reasons for not using the in-source documentation; the doc author may work for a different organization, etc.). -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From irmina@ctv.es Tue Nov 23 18:52:27 1999 From: irmina@ctv.es (Manuel Gutierrez Algaba) Date: Tue, 23 Nov 1999 18:52:27 +0000 (GMT) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: <383AA666.1BE107AD@prescod.net> Message-ID: On Tue, 23 Nov 1999, Paul Prescod wrote: > Manuel Gutierrez Algaba wrote: > > > > I think we must invent the wheel, or at least improve it. I don't > > like javadoc, it seems to me very low level ( type-driven ), useful > > for java, but python deserves a higher level stuff. > > I don't follow. Perhaps you could give an example. Anyhow, we want to be > very careful not to stray too far into only accepting a solution that is > "best" and not the one that is good enough. I have had these doubts when working with python: - How to do a specific thing ? ( how to get a random number? how to get the time? which module or which builtin function ? ) - If python can do : -1 * ( 3, 5). It can't, although it could give : (-3, -5). - How to inherit safely ?
- How to use CGI or Unix related stuff with python ? - How much memory a construction takes? How the memory is released ? How to improve speed a bit ? Most of my doubts can be summed up in one: the need for "high level information". I've never needed to know the signature of a function, or if I needed it, I found it seamlessly. I don't need javadoc, nor pythondoc, not at all certainly, I can read the .py , which are more compact than html info of java, usually. Lots of people, especially newbies, have the same problem: Doubts but not about the parameters of a function, ... real BIG DOUBTS! I don't care either about the genealogy of any class, if I care, I just follow the code! Even TkInter code! The funny (or sad) thing is that most of the info is available, out there, hidden in nice layouts and documents, spread over USENET, FAQs, modules, tutorial.... Imagine that you want to produce html code. Well, I'd go to /usr/local/lib/pyt... and then I'd have a glance at htmllib, well, it's undocumented ! or at least the 1.5.2 version I have, but that seems a parser, not a writer of html... Well, it happens that the StructuredText.py of pythondoc does exactly what I want. How could I have known that ? Simply by marking it with: \indexhtmlgeneration or something similar. Lots and lots of programs and modules perform auxiliary tasks, or many of their tasks are reusable ( you know , OO programming) or they have small pieces of info ( examples of UNIX CGI programming, environment variables, tk, ...). Well, if that's what we want, why don't we just do it? We have only to MARK those small pieces of info. Fred calls this indexing. Many people, me included, won't accept a method of doc that requires a syntax or the loss of freedom when doing things ( this includes docstrings ).
But it's acceptable to write python code, and then put: "\indexexamplelambda" or "\indexsocket" because that marking would be useful even for the programmer himself, even if he eventually got accustomed to doc things just writing indexes and ... but that'd be the second and third step. Anyway, most people won't be angry about such a simple measure. I suppose that this is like Parnassus's stuff: put a black background, some .gif's, and a Postgres database and you've got a GREAT site. :-P If Fred thinks that providing a "smart" way of showing how many modules, functions and exceptions are in a piece of python code is enough, I think that with that we don't go much further. Here the Pop SuperStar is "Information" ( brute, massive, overwhelming info... ). We need just a method of gluing as fast and as simple as possible "tons" of info. And that method is indexing ( as brute, massive and overwhelming as "Information" is). No traditional parser-driven or XML-driven or javadoc-driven approach will bear the richness, complexity, diversity of origin/source, state,... of so much info. BTW: Documenting the python library would be the minor and less interesting thing by far. We have a Golden Chance here, we can have the best info system of the Internet. Don't try to be The Big Hero, the Big guy who finds the smart solution. Here the Only One Hero is Information, the Information that should be at last available ! Indexing for the masses !! Masses of indexes !! Simple and effective! Death or victory ! Regards/Saludos Manolo www.ctv.es/USERS/irmina/TeEncontreX.html /texpython.htm Just remember, wherever you go, there you are. -- Buckaroo Banzai From fdrake@acm.org Tue Nov 23 22:13:46 1999 From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 23 Nov 1999 17:13:46 -0500 (EST) Subject: [Doc-SIG] Re: Documenting Python, Take 2 In-Reply-To: <199911232129.OAA03343@localhost.localdomain> References: <14378.63037.571200.652453@weyr.cnri.reston.va.us> <199911232129.OAA03343@localhost.localdomain> Message-ID: <14395.4506.538171.718001@weyr.cnri.reston.va.us> uche.ogbuji@fourthought.com writes: > F1) A highly-structured format for archiving and manipulation of low-level > documentation (what Fred Drake is calling "microdocuments"). Example: SGML > or XML schema. This format must be semantically complete, easy to > manipulate in code, with a broad toolset available for manipulation. Uche, I'd *love* to see a good definition of "semantically complete" for Python! ;-) > F2) An author-friendly format for low-level documentation. F2 has to be > structured enough for meaningful conversion to F1, but terse enough for use > in in-line documentation and adoption by authors for whom F1 would be too > much of a chore. Example: javadoc, POD. I'm working on this one, along with an extraction tool. > F3) An author and maintainer-friendly format for general documentation, > such as the Python profiler and debugger docs as well as the User guide and > all that. Example: Docbook, RTF. Abundance of author and manipulation > tools is important for this format. Yes; I think SGML/XML is probably fine for this. > T1: A tool for conversion from F1 to F2 and back. I understand the need for F2-->F1, but why F1-->F2? It certainly could not be general unless F1 is heavier than I imagine. Please provide the rationale for the F1-->F2 requirement. > T2: A tool for interactively querying authors for documentation elements: > basically a knowledge-acquisition tool from python module experts. (Maybe > you can guess what one of our recent contracts has been). This might be cool. We could then go from a parse tree (.py file; F2) to skeletal F1, and then augment using the interactive tool. 
In practice, perhaps you really point the tool to the source file and skip storing the skeletal F1 to disk if you aren't going to intervene with a text editor at that point. Allowing either to be accepted by the tool is probably a good idea. That would allow both documentation creation and editing within the tool. > T3: A tool for generating user-friendly doc-strings into python modules > from the information in F1. This sounds the same as T1 to me; do you see F2 being used outside of docstrings? (I've been working under the rubric that I should pull as much as possible out of the code to get the best possible docs when the programmer doesn't provide any additional information.) > T4: A command line tool that can display user-friendly docs from a > database of F1 docs, similar to perldoc. Agreed. > T5: A tool for turning F1 and F3 into the familiar Python User Guide and > Library Reference, preferably with richer linking. That's within my current plans (i.e., I haven't written it yet). Rich hypertext is one of the most important benefits I see for ditching the current tool chains -- doing much through LaTeX2HTML can be quite painful! > T6: A tool for generating man-pages based on F1 Documentation. This would > address the insistent crowing of Tom Christiansen about Python's "man-page > envy" Perhaps we could ask Tom to write this? ;) Since Tom's last crusade against the Python documentation, I've had one user comment that they'd like to see man pages for Python (paraphrased: it'd be nice to have). Tom's the only user to say that the rest doesn't count (regardless of how many words he took to say it). > In a separate message, I'll make a proposal based on this meta-proposal. I look forward to it! -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From fdrake@acm.org Tue Nov 23 21:03:33 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Tue, 23 Nov 1999 16:03:33 -0500 (EST) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: References: <383AA666.1BE107AD@prescod.net> Message-ID: <14395.293.372163.433259@weyr.cnri.reston.va.us> Manuel Gutierrez Algaba writes: > Well, if that's what we want, why don't we just do it? We have > only to MARK those small pieces of info. Fred calls this indexing. Please don't think I'm the only one! I just applied the name to the situation. Indexing is a good thing. > Many people, me included, won't accept a method of doc that requires > a syntax or the loss of freedom when doing things ( this includes > docstrings ). But it's acceptable to write python code, and then > put: > "\indexexamplelambda" or > "\indexsocket" > because that marking would be useful even for the programmer himself, > even if he eventually got accustomed to doc things just writing indexes and > ... but that'd be the second and third step. Anyway, most people > won't be angry about such a simple measure. The "simple" part of this isn't the problem, though it does turn into one. This approach hinges on what's called a "controlled vocabulary" (another one of those good Information Science words!). Without some agreement on the terms that enter the index, many related things are not similarly indexed. Achieving consistency in the case of many author/indexers is very difficult without either a well-defined controlled vocabulary or strong editorial oversight. The latter is (by far) easier to implement, and is what I've tried to provide for the standard documentation. (Compare this to a controlled vocabulary approach; think "Library of Congress Subject Headings," or other large cataloging systems used in libraries. Ever wonder why programming language books appear in at least a couple of different places in the computer science section of a good university library?)
Editorial control is tedious and can become difficult; but controlled vocabularies are the child of committees! (Which doesn't mean they're not useful, just that there's an *enormous* overhead to using them.) > If Fred thinks that providing a "smart" way of showing how many > modules, functions and exceptions are in a piece of python code is > enough, > I think that with that we don't go much further. Actually, I don't think that's enough, or that it solves that particular problem. The purpose of extracting information from the Python sources is not so much to provide new information (though it may) as to ease the burden on those authoring documentation. (Which does *not* mean me!) I'd like to see newly released modules from independent developers be documented in a consistent way; making this easy is a necessity for it to happen. There are still several things which have to be done, including index building. One of the catches of index building is that building a really useful index (not just a comprehensive one) is fundamentally a hard thing to do. I recently spoke to someone who once managed half of the indexing team at the Encyclopedia Britannica about this, and found that it's not at all clear what actually needs to be done to improve the situation. A *large* index, especially when presented "book style," is not particularly desirable. > No traditional parser-driven or XML-driven or javadoc-driven approach > will bear the richness, complexity, diversity of origin/source, > state,... of so much info. I agree: No automatic method will replace good human indexing. > BTW: Documenting the python library would be the minor and less > interesting thing by far. I think the current library documentation is actually pretty good; I'm interested in improving both the content and accessibility (via indexing or any other approach).
The Doc-SIG has long had the mandate of moving Python out of the LaTeX prehistoric period into the 21st century, however, which is one of the motivations for the work done to move from LaTeX to SGML/XML/whatever-comes-next. I know I've been beating on Guido about this for 4 1/2 years! > We have a Golden Chance here, we can have the best info system > of the Internet. Don't try to be The Big Hero, the Big guy who finds > the smart solution. Here the Only One Hero is Information, the > Information that should be at last available ! Perhaps we need another defined task for the SIG: locate all the resources that should be part of this all-encompassing Python Documentation Web? That's no small task! Perhaps you'd like to start a list of the documents you think should be included in the indexing effort, including current links to them. A Web page that simply lists them would be a good start. > Death or victory ! Don't do that! While alive you can work to improve things, once dead... well, *I've* never met anyone who came back. -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From uche.ogbuji@fourthought.com Tue Nov 23 21:29:59 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Tue, 23 Nov 1999 14:29:59 -0700 Subject: [Doc-SIG] Documenting Python, Take 2 In-Reply-To: Your message of "Thu, 11 Nov 1999 12:00:45 EST." <14378.63037.571200.652453@weyr.cnri.reston.va.us> Message-ID: <199911232129.OAA03343@localhost.localdomain> Sorry it took so long to get around to this. I think my earlier approach (let's call it a meta-proposal) to settling on a Python documentation system still applies, with some modifications. My original posting is at http://www.python.org/pipermail/doc-sig/1999-September/000726.html But in light of the current discussion and concerns that have been raised, changes are in order.
Python Documentation Python Meta-Proposal ----------------------------------------- I think in essence we must quickly decide on a set of documentation formats and enabling tools, and then answer the questions of how to get there from where we are. A step-wise transition, as Paul suggests, is fine, but I think it is important for us all to have a vision of where we're going. FORMATS: F1) A highly-structured format for archiving and manipulation of low-level documentation (what Fred Drake is calling "microdocuments"). Example: SGML or XML schema. This format must be semantically complete, easy to manipulate in code, with a broad toolset available for manipulation. F2) An author-friendly format for low-level documentation. F2 has to be structured enough for meaningful conversion to F1, but terse enough for use in in-line documentation and adoption by authors for whom F1 would be too much of a chore. Example: javadoc, POD. F3) An author and maintainer-friendly format for general documentation, such as the Python profiler and debugger docs as well as the User guide and all that. Example: Docbook, RTF. Abundance of author and manipulation tools is important for this format. CUSTOM TOOLS: T1: A tool for conversion from F1 to F2 and back. T2: A tool for interactively querying authors for documentation elements: basically a knowledge-acquisition tool from python module experts. (Maybe you can guess what one of our recent contracts has been). T3: A tool for generating user-friendly doc-strings into python modules from the information in F1. T4: A command line tool that can display user-friendly docs from a database of F1 docs, similar to perldoc. T5: A tool for turning F1 and F3 into the familiar Python User Guide and Library Reference, preferably with richer linking. T6: A tool for generating man-pages based on F1 Documentation. 
This would address the insistent crowing of Tom Christiansen about Python's "man-page envy" In a separate message, I'll make a proposal based on this meta-proposal. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From janssen@parc.xerox.com Wed Nov 24 03:00:55 1999 From: janssen@parc.xerox.com (Bill Janssen) Date: Tue, 23 Nov 1999 19:00:55 PST Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: Your message of "Tue, 23 Nov 1999 13:03:33 PST." <14395.293.372163.433259@weyr.cnri.reston.va.us> Message-ID: <99Nov23.190055pst."3638"@watson.parc.xerox.com> Well, I might as well put in my two cents worth. The GNU Info versions of the Python documentation are the most important to me, as I can put those right into my Emacs and have them at the tip of my fingers while programming. Whatever solution is found, I'd like to see that continued. There's some logic to javadoc, I suppose, in that the most common problem with documentation is that it goes out of date, because the modifier just changes the code. If the documentation is mixed with the code, perhaps that probability is reduced (though I'm not aware of any studies that show this to be true). Though something like Literate Programming might be a better system. Perhaps adapting a system like Noweb (http://www.eecs.harvard.edu/~nr/noweb/) would be of help. The home page lists a number of programming languages, but not Python. Bill From fdrake@acm.org Wed Nov 24 15:07:27 1999 From: fdrake@acm.org (Fred L. Drake, Jr.) 
Date: Wed, 24 Nov 1999 10:07:27 -0500 (EST) Subject: [Doc-SIG] Approaches to structuring module documentation In-Reply-To: <99Nov23.190055pst."3638"@watson.parc.xerox.com> References: <14395.293.372163.433259@weyr.cnri.reston.va.us> <99Nov23.190055pst."3638"@watson.parc.xerox.com> Message-ID: <14395.65327.352843.663682@weyr.cnri.reston.va.us> Bill Janssen writes: > Well, I might as well put in my two cents worth. Surely we can ask three or four cents worth from you; we've certainly not hesitated sending comments to the ILU list at various times! ;-) > The GNU Info versions of the Python documentation are the most > important to me, as I can put those right into my Emacs and have them > at the tip of my fingers while programming. Whatever solution is > found, I'd like to see that continued. I don't expect there will be any reduction in the set of output formats. Which is not to say that info will be the first one produced, but it's a safe bet it'll stay around. I suspect it'll be far easier to maintain if it's no longer dependent on the HTML rendering of the docs as well. > There's some logic to javadoc, I suppose, in that the most common > problem with documentation is that it goes out of date, because the > modifier just changes the code. If the documentation is mixed with > the code, perhaps that probability is reduced (though I'm not aware of > any studies that show this to be true). Though something like I think I've seen references to studies that showed it both ways, so I suspect that the specific set of programmers studied remains a poorly understood variable (hey, we're not numbers, we're variables!). I hope that a moderate amount of what gets marked up in JavaDoc comments would be generated automatically for Python, but I suspect that will be very difficult and prone to be wrong if people make heavy use of Python's dynamic features. But that's the case now as well, and screwing with the standard library is a de-facto no-no. 
> Literate Programming might be a better system. Perhaps adapting a > system like Noweb (http://www.eecs.harvard.edu/~nr/noweb/) would be of > help. The home page lists a number of programming languages, but not > Python. Now John Skaller will probably suggest that we all adopt Interscript. ;-) I'm not entirely sure what we'd expect to get out of a literate programming system that we can't get out of a JavaDoc/POD/Structured- Text-derived system, at least as far as module reference material goes. I've not done enough literate programming to be a good judge of how well it really works for library code. I'd love it if someone would send me a pointer to a really nice example of literate programming of a library that provided reference documentation, introductory material, and examples of use. (Esp. if there's both online and typeset versions of the documentation to look at.) I would *not* expect this to be applied to the Python libraries, however, since there are too many hands in there to get a major shift in methodology. Just getting decent docstrings will be hard enough some days! -Fred -- Fred L. Drake, Jr. Corporation for National Research Initiatives From paul@prescod.net Wed Nov 24 15:25:57 1999 From: paul@prescod.net (Paul Prescod) Date: Wed, 24 Nov 1999 16:25:57 +0100 Subject: [Doc-SIG] Approaches to structuring module documentation References: Message-ID: <383C0385.10EC2C0D@prescod.net> Manuel Gutierrez Algaba wrote: > > Most of my doubts can be resumed into one: The need of "high level > information". The lack of documentation for high level features (or the poor indexes for them) cannot be solved by new documentation systems. It must be solved by new *documentation* (or at least indexes). There is nothing in the current system that precludes solving the system you describe. > I've never needed to know the signature of a function, > or If i needed it , i found it seamlessly. 
I don't need javadoc, > nor pythondoc, not at all certainly, I can read the .py , which > are more compact than html info of java, usually. But not hypertext navigable, not nicely formatted and not appropriate for printing out. > > "\indexexamplelambda" or > "\indexsocket" > Indexing for the masses !! Masses of indexes !! Simple and effective! I agree with a need for indexing, but I think it is a separate issue with separate solutions. Those solutions are mostly unrelated to the markup strategy problem. As you point out, the markup for indexing is simple. As Fred points out, the name management is tricky! -- Paul Prescod - ISOGEN Consulting Engineer speaking for himself "Like most religious texts, the XML 1.0 spec has proven itself internally-inconsistent, so we're going to have to invent some kind of exegetical method now to show how it's really all an allegory." - Anon From gerrit@nl.linux.org Wed Nov 24 19:27:51 1999 From: gerrit@nl.linux.org (Gerrit Holl) Date: Wed, 24 Nov 1999 20:27:51 +0100 Subject: [Doc-SIG] Re: SMTP? In-Reply-To: <19991124171120.1623.qmail@hotmail.com>; from b2blink@hotmail.com on Wed, Nov 24, 1999 at 06:11:20PM +0100 References: <19991124171120.1623.qmail@hotmail.com> Message-ID: <19991124202751.A5717@stopcontact.palga.uucp> Ulf Engstrøm wrote: > I've built a little mailthingy based on the 11.9.2 SMTP Example from Python > Library Reference but when I use it I'll get an empty mail with no sender > and no msg, even though I don't get any errors whatsoever...Do I have to > change something with the headers? If yes, what? Hmm, questions about smtplib are asked *very* often. Is the documentation clear enough?
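One plausible cause of the symptom Ulf describes (a mail that arrives with no sender and no body) is that the text handed to smtplib has no header block followed by a blank line. A minimal sketch of building a message that does, assuming a current Python 3 and its standard email package rather than the 1.5.2-era library discussed in this thread; the addresses and the helper name build_message are made up for illustration:

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Return a message whose headers and body are separated correctly."""
    msg = EmailMessage()
    msg["From"] = sender          # the headers live in the message itself,
    msg["To"] = recipient         # not only in the SMTP envelope
    msg["Subject"] = subject
    msg.set_content(body)         # the body follows the blank line after the headers
    return msg


# Hypothetical addresses, used only for the example.
msg = build_message("ulf@example.org", "gerrit@example.org",
                    "test", "Hello from smtplib.")
text = msg.as_string()

# Sending is left commented out because it needs a reachable mail server:
# import smtplib
# with smtplib.SMTP("localhost") as server:
#     server.send_message(msg)
```

The point of the sketch is only the shape of the message text: header lines first, then an empty line, then the body.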
-- "The move was on to 'Free the Lizard'" -- Jim Hamerly and Tom Paquin (Open Sources, 1999 O'Reilly and Associates) 8:26pm up 2:50, 9 users, load average: 2.07, 1.98, 1.92 From uche.ogbuji@fourthought.com Thu Nov 25 02:00:45 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Wed, 24 Nov 1999 19:00:45 -0700 Subject: [Doc-SIG] Documenting Python, Take 2 (RE-POST) In-Reply-To: Your message of "Thu, 11 Nov 1999 12:00:45 EST." <14378.63037.571200.652453@weyr.cnri.reston.va.us> Message-ID: <199911250200.TAA07515@localhost.localdomain> Pardon me if you get this twice, but I had problems with mail at python.org yesterday. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Thu Nov 25 09:27:29 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Thu, 25 Nov 1999 02:27:29 -0700 Subject: [Doc-SIG] Documenting Python, Take 2 Message-ID: <199911250927.CAA08568@localhost.localdomain> Based on my meta-proposal, here is my suggestion for the Zen of Python documentation. My vote for format F1 is an XML schema. Fred's wonder about "semantically complete" is duly noted. Let's have this as a start Module name Module description Global Object References Functions Description Parameters (name and description) Return value (description) Classes Methods (see functions, maybe flag for initializer) Class-level object refs etc.
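The outline above could be pictured as an XML record along the following lines. This is only a sketch under stated assumptions: the element and attribute names (module, description, function, param, returns) are invented here and are not a schema proposed in the thread, the module "spam" is hypothetical, and the code uses today's standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET


def module_record(name, description, functions):
    """Build an F1-style record; every element name here is hypothetical."""
    module = ET.Element("module", name=name)
    ET.SubElement(module, "description").text = description
    for func in functions:
        node = ET.SubElement(module, "function", name=func["name"])
        ET.SubElement(node, "description").text = func["description"]
        for param_name, param_desc in func["params"]:
            # One element per parameter: name as attribute, description as text.
            ET.SubElement(node, "param", name=param_name).text = param_desc
        ET.SubElement(node, "returns").text = func["returns"]
    return module


# A made-up module with one made-up function.
record = module_record(
    "spam",
    "Toy module used only for this example.",
    [{
        "name": "fry",
        "description": "Fry the given number of eggs.",
        "params": [("eggs", "How many eggs to fry.")],
        "returns": "A list of fried eggs.",
    }],
)
xml_text = ET.tostring(record, encoding="unicode")
```

Classes, methods and object references from the outline would nest the same way; they are left out to keep the sketch short.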
Fred's example from his original message is a decent start. Remember that F1 needn't be terse. My vote for F2 is a modification of javadoc. It's very well known and very successful. Offhand, we should be able to use @version @author @param @return @exception @see without modification. "@see" would be _very_ nice, wouldn't it? We would need some additions, such as @module. My vote for F3 is docbook. There are tools to turn docbook into HTML, GNU info, *roff (man pages), ps, pdf, etc. There is an O'Reilly book out on it, an emacs mode, etc. I would volunteer to write a python-javadoc to XML converter. Note: I'm not saying I won't help unless my suggestions are accepted. I'm just volunteering for a known quantity that I know I can handle. FourThought already has an internal tool for querying an author for documentation automatically. We could adapt this to the new DTD once it is determined, and donate it to the cause. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From uche.ogbuji@fourthought.com Thu Nov 25 09:55:25 1999 From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com) Date: Thu, 25 Nov 1999 02:55:25 -0700 Subject: [Doc-SIG] Re: Documenting Python, Take 2 In-Reply-To: Your message of "Tue, 23 Nov 1999 17:13:46 EST." <14395.4506.538171.718001@weyr.cnri.reston.va.us> Message-ID: <199911250955.CAA08660@localhost.localdomain> > uche.ogbuji@fourthought.com writes: > > F1) A highly-structured format for archiving and manipulation of low-level > > documentation (what Fred Drake is calling "microdocuments"). Example: SGML > > or XML schema. This format must be semantically complete, easy to > > manipulate in code, with a broad toolset available for manipulation. > > Uche, > I'd *love* to see a good definition of "semantically complete" for > Python! ;-) OK, I overstated it.
A "semantically-complete" format would likely have to document all the nonterminals of the Python grammar. Let's say "reasonably complete". > > F2) An author-friendly format for low-level documentation. F2 has to be > > structured enough for meaningful conversion to F1, but terse enough for use > > in in-line documentation and adoption by authors for whom F1 would be too > > much of a chore. Example: javadoc, POD. > > I'm working on this one, along with an extraction tool. What does it look like? > > F3) An author and maintainer-friendly format for general documentation, > > such as the Python profiler and debugger docs as well as the User guide and > > all that. Example: Docbook, RTF. Abundance of author and manipulation > > tools is important for this format. > > Yes; I think SGML/XML is probably fine for this. Repeating myself, I vote Docbook. > > T1: A tool for conversion from F1 to F2 and back. > > I understand the need for F2-->F1, but why F1-->F2? It certainly > could not be general unless F1 is heavier than I imagine. Please > provide the rationale for the F1-->F2 requirement. My thinking was that some users would appreciate the concise form in their distro for quick reference without weighing down their modules (and memory footprint) with the heavyweight F1. > > T2: A tool for interactively querying authors for documentation elements: > > basically a knowledge-acquisition tool from python module experts. (Maybe > > you can guess what one of our recent contracts has been). > > This might be cool. We could then go from a parse tree (.py file; > F2) to skeletal F1, and then augment using the interactive tool. In > practice, perhaps you really point the tool to the source file and > skip storing the skeletal F1 to disk if you aren't going to intervene > with a text editor at that point. Allowing either to be accepted by > the tool is probably a good idea. That would allow both documentation > creation and editing within the tool.
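The step from a parse tree to a skeletal record, as described in the quoted paragraph, might look roughly like this. It is a sketch only: it assumes a current Python 3 and its standard ast module (not the parser tooling available in 1999), the dictionary layout is invented for the example, and the sample source is a toy.

```python
import ast

# Toy source text standing in for a .py file.
SOURCE = '''
"""Toy module."""

def fry(eggs, bacon=2):
    """Fry things."""

class Pan:
    def heat(self, degrees):
        pass
'''


def _function_entry(node):
    """Describe one function definition found in the tree."""
    return {
        "name": node.name,
        "params": [arg.arg for arg in node.args.args],
        "doc": ast.get_docstring(node),   # None when no docstring was written
    }


def skeleton(source):
    """Collect module, function and class docstrings without importing the code."""
    tree = ast.parse(source)
    record = {"doc": ast.get_docstring(tree), "functions": [], "classes": []}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            record["functions"].append(_function_entry(node))
        elif isinstance(node, ast.ClassDef):
            record["classes"].append({
                "name": node.name,
                "doc": ast.get_docstring(node),
                "methods": [_function_entry(item) for item in node.body
                            if isinstance(item, ast.FunctionDef)],
            })
    return record


result = skeleton(SOURCE)
```

Entries whose "doc" is None are the gaps an interactive tool of the kind discussed under T2 would ask the author to fill in.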
Well, I hadn't thought of the parse-tree angle, though that would be cool. The internal tool FourThought has in this vein is merely a menu-driven approach. The author selects "add new function", "modify class", and all that. It's up to him or her to determine the elements to be documented. We have plans for a web-i-fied version of the tool. > > T3: A tool for generating user-friendly doc-strings into python modules > > from the information in F1. > > This sounds the same as T1 to me; do you see F2 being used outside > of docstrings? (I've been working under the rubric that I should pull > as much as possible out of the code to get the best possible docs when > the programmer doesn't provide any additional information.) No. F2 in my mind is not user-friendly. I'm talking about something that converts "@param" to "parameter" with some salsa and bean dip tossed in to make it all palatable. > > T6: A tool for generating man-pages based on F1 Documentation. This would > > address the insistent crowing of Tom Christiansen about Python's "man-page > > envy" > > Perhaps we could ask Tom to write this? ;) Since Tom's last > crusade against the Python documentation, I've had one user comment > that they'd like to see man pages for Python (paraphrased: it'd be > nice to have). Tom's the only user to say that the rest doesn't count > (regardless of how many words he took to say it). Then let's just leave off that bit. Of course, if we use Docbook, making man pages should be no great endeavor. -- Uche Ogbuji FourThought LLC, IT Consultants uche.ogbuji@fourthought.com (970)481-0805 Software engineering, project management, Intranets and Extranets http://FourThought.com http://OpenTechnology.org From fredrik@pythonware.com Thu Nov 25 11:38:44 1999 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 25 Nov 1999 12:38:44 +0100 Subject: [Doc-SIG] Re: SMTP? 
References: <19991124171120.1623.qmail@hotmail.com> <19991124202751.A5717@stopcontact.palga.uucp>
Message-ID: <00cc01bf3739$a1d83d30$f29b12c2@secret.pythonware.com>

Gerrit Holl wrote:
> Ulf Engstrøm wrote:
> > I've build a little mailthingy based on the 11.9.2 SMTP Example from Python
> > Library Reference but when I use it I'll get an empty mail with no sender
> > and no msg, eventhough I don't get any errors whatsoever...Do I have to
> > change something with the headers? If yes, what?
>
> Hmm, questions about smtplib are asked *very* often. Is the documentation
> clear enough?

no. the library is pretty low-level, and the documentation
doesn't add much (it points to the relevant RFC's, but nobody
seems to be following those links).

I once contributed a (IMHO) better example, which 1) actually
imported all modules that were used in the example, 2) used
more reasonable python constructs (raw_input instead of that
prompt hack, etc), and 3) showed how to add the basic headers
to the message body.

as far as I can tell, only (1) made it into the docs...

From Manuel Gutierrez Algaba Fri Nov 26 19:51:06 1999
From: Manuel Gutierrez Algaba (Manuel Gutierrez Algaba)
Date: Fri, 26 Nov 1999 19:51:06 +0000 (GMT)
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To: <14395.293.372163.433259@weyr.cnri.reston.va.us>
Message-ID:

I've decided on making a try of project for my proposal, as I
think It's intrinsically good. I've placed at:

http://www.ctv.es/USERS/irmina/SantisimaInquisicion/index.html

Basically I'll place there the instructions of how to attribute,
the documents preferably to be documented ...

I'll maintain this project for two weeks, so:
- if people don't get involved in the project and send attributions,
  then I won't maintain it any longer
- if people send thousands and thousands of attributions, then
  I'll pass it to someone at python.org, because then it'd be a
  rather official thing.

I'll only maintain it if it has a moderate success.

I'd like to ask you a couple of things:

- Announce this project as semi-official in comp.lang.python and/or
  on the python.org announcement page
- Declare this aim of collecting info as an interest of the SIG, so even
  after the death of SantaInquisicion the idea will be alive.
- Support it in any way you may think of. ( the definitive support
  would be Guido's; what does he think about the idea? )

I think the project will fail ( people are basically lazy ), but it
will be a good lesson/precedent for the future.

Anyway, the idea is worth a try! Look at SantisimaInquisicion

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina/TeEncontreX.html /texpython.htm
www.ctv.es/USERS/irmina/SantisimaInquisicion/index.html

Do your part to help preserve life on Earth -- by trying to preserve
your own.

From da@ski.org Fri Nov 26 18:58:55 1999
From: da@ski.org (David Ascher)
Date: Fri, 26 Nov 1999 10:58:55 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To:
Message-ID:

On Fri, 26 Nov 1999, Manuel Gutierrez Algaba wrote:

>
> I've decided on making a try of project for my proposal, as I
> think It's intrinsically good. I've placed at:
>
> http://www.ctv.es/USERS/irmina/SantisimaInquisicion/index.html
>
> Basically I'll place there the instructions of how to attribute,

Can you explain how to use this website? I've looked at it and at
TeEncontreX, and all I seem to do is to click between pages describing the
system, but I can't find any real *DOC*.

You'll have a hard time getting folks to do anything if you don't give a
specific example of what they'll get out of it...

How about you do the markup for a given module, and show how it looks?

--david

From irmina@ctv.es Fri Nov 26 20:20:09 1999
From: irmina@ctv.es (Manuel Gutierrez Algaba)
Date: Fri, 26 Nov 1999 20:20:09 +0000 (GMT)
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To:
Message-ID:

On Fri, 26 Nov 1999, David Ascher wrote:

>
> On Fri, 26 Nov 1999, Manuel Gutierrez Algaba wrote:
>
> >
> > I've decided on making a try of project for my proposal, as I
> > think It's intrinsically good. I've placed at:
> >
> > http://www.ctv.es/USERS/irmina/SantisimaInquisicion/index.html
> >
> > Basically I'll place there the instructions of how to attribute,
>
> Can you explain how to use this website? I've looked at it and at
> TeEncontreX, and all I seem to do is to click between pages describing the
> system, but I can't find any real *DOC*.
>
> You'll have a hard time getting folks to do anything if you don't give a
> specific example of what they'll get out of it...
>
> How about you do the markup for a given module, and show how it looks?

http://www.ctv.es/USERS/irmina/SantisimaInquisicion/AutoDeFe.html

There you can see clearly two examples,... please let me know
if this is not enough.

Anyway, it's funny: a DOC project that is undocumented :)
It's so simple that it's difficult to explain. :)

Anyway, if this is not ENOUGH, feel free to insist and to
complain bitterly. If there's enough interest I'll explain
it moooree!

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina/TeEncontreX.html /texpython.htm

Once you've tried to change the world you find it's a whole bunch easier
to change your mind.

From da@ski.org Fri Nov 26 22:16:52 1999
From: da@ski.org (David Ascher)
Date: Fri, 26 Nov 1999 14:16:52 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To:
Message-ID:

On Fri, 26 Nov 1999, Manuel Gutierrez Algaba wrote:

> http://www.ctv.es/USERS/irmina/SantisimaInquisicion/AutoDeFe.html
>
> There you can see clearly two examples,... please let me know
> if this is not enough.

I understand the concept of markup.
Your markup is basically TeX. Fine
for TeX documents, but why in the world do you expect Python users who are
used to

  def foo():
      print 'a'

to suddenly like

  \newcommand{\indexCextension}{\index{Cextension}\index{extension}}

and why do you think they would even consider adding it to their code?
What is the benefit to them? You need to show the *end-result* of
indexing, which means hyperlinked TOC's, pretty HTML pages, etc.

Warning: rant ahead.

Generally, I think that the DOC-sig spends too much time arguing about
specific markups and trendy technology (sorry, I'm getting really
frustrated at the XML LPHBTSP (that's 'alphabet soup' without vowels)), and
not enough with the marketing aspect.

*If the problem is to encourage average Python coders to markup their
docs*, then you need to make it simple *and Python-like*. Define a
Pythonic syntax (e.g. what Jim Fulton uses in the StructuredText.py
module), provide a CGI script which has a "PUT" button which will take a
marked-up .py file, creates a hyperlinked TOC for that module, snazzy HTML
pages and whatnot, automatically add said module to some centralized
repository of 'cool documented modules', and folks *will* learn the
markup. Especially if you provide a few modules which show examples of
the markup and show how trivial it is.

On the other hand, if you put up a page which makes Python code look more
like TeX or XML, why in the world do you expect people to bother?

POD, Javadoc, and Autoduck work because they do 90% of the job with about
4 minutes of learning. That is *all* you can expect of 95% of the
programmers out there. Go for the biggest bang for the buck.

Enough with the rant. Back to normal DOC-SIG business.

--david

From uche.ogbuji@fourthought.com Sat Nov 27 01:55:01 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Fri, 26 Nov 1999 18:55:01 -0700
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: Your message of "Fri, 26 Nov 1999 14:16:52 PST."
Message-ID: <199911270155.SAA00967@localhost.localdomain>

David,

I confess I don't quite get it, which is too bad because I respect your
opinions and would like to know precisely what you're getting at.

Maybe the problem is that I'm a doc-sig newbie. I just showed up here a
couple of weeks ago to mention that we had a few internal tools at
FourThought for python documentation and I wondered if anyone was
interested. As such, I don't have any sense that the doc-sig has or is
failing in any way.

The way I see it, Python has a decent amount and quality of documentation.
True it is not as good as Java's or Perl's but it is about as much as can be
expected given its age and market profile. Fred Drake has done a phenomenal
job.

My understanding is that we're all discussing a way to push it to the next
level. If possible, it means leap-frogging Perl and Java, but mostly it just
means seeking the best solution. I don't see any need for haste or panic.

I don't understand the reasoning that python docs should look like Python.
I'm not as familiar with POD, but you also give Javadoc as an example, and it
looks _nothing_ like Java. Also note that several people have been
advocating a Javadoc-like system, including myself. So where is the terrible
divergence?

XML advocates here are mostly suggesting it for the "library" format of
python documentation, not the "author" format. So why does it matter if you
think authors bear such distaste for XML and TeX? They won't have to deal
with it. The reality, though, is that it's easier to go from XML or TeX to
any of the many formats Python users want than it would be from
Jim-Fulton-David-Ascher pythonic documentation format. Would you volunteer
to write the tools to go from JFDA to *roff for man pages, postscript, PDF,
HTML and GNU info? I doubt it, and even if you would, I'd advise against
re-inventing the wheel that the Linux Documentation Project has so admirably
crafted.

The way I see it, your key argument with Manuel's proposals would be that he
plans to inflict TeX on Python authors. I agree that that is a bad thing, and
I also wouldn't want to inflict XML on Python Authors. I don't think that's
an alien sentiment here, but your rant makes it sound that way.

So, what am I missing?

--
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com (970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com http://OpenTechnology.org

From da@ski.org Sat Nov 27 05:13:31 1999
From: da@ski.org (David Ascher)
Date: Fri, 26 Nov 1999 21:13:31 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <199911270155.SAA00967@localhost.localdomain>
Message-ID:

On Fri, 26 Nov 1999 uche.ogbuji@fourthought.com wrote:

> David,
>
> I confess I don't quite get it, which is too bad because I respect your
> opinions and would like to know precisely what you're getting at.

It's all right -- I wasn't being especially clear. That's the nature of a
rant, though, so at least you can't get me on the 'truth in advertising'
laws. =)

Here's my take on things, hopefully more rationally thought out and more
clearly expressed:

1) The current Python documentation is, in my opinion, just fine. I think
   that moving from LaTeX to something more modern is a great idea, and I
   think that Fred is doing a beautiful job for what is a thankless task.
   I have no problem of course with the discussion on that topic.

2) IMHO, the single most problematic aspect of Python documentation is
   the lack of a standard way for programmers to document their code
   inside the .py file, unlike e.g. POD., and that is a shame. If nothing
   else, I think that lacking this standard is one of the reasons for the
   lack of docstrings. If there was a good standard, then the use of
   docstrings already made in IDLE and Pythonwin (which is wonderful)
   could be made even deeper and richer, leading to a snowballing effect.

   There is at least one proposal to index in-code Python docstrings
   with TeX-like commands. In my opinion, anything that full of
   backslashes and braces will never fly in the Python community.

3) Programmers in general, smart programmers especially, try to "think
   out" all of the possible uses for something before they start to design
   it. That's why God Invented Managers and deadlines. We need one or
   the other.

> The way I see it, your key argument with Manuel's proposals would be that he
> plans to inflict TeX on Python authors. I agree that that is a bad thing, and
> I also wouldn't want to inflict XML on Python Authors. I don't think that's
> an alien sentiment here, but your rant makes it sound that way.

> So, what am I missing?

Possibly nothing. I posted in a moment of emotion, which is never a good
idea. I apologize for the rant. I suspect that what I was really reacting
to was a combination of:

- puzzlement as to the motivations for Manuel's proposal,
- a personal frustration with seeing design-by-committee lead to inaction
- a strong visceral reaction against TeX-style markup in Python.

I don't mind XML markup so-much, btw, and I suppose no one else in the
HTML age minds much as long as they don't have to mess w/ anything beyond
.... As soon as you allow anything beyond the trivial, you lose 50% of the
audience. KISS (Keep It Simple, Stupid) rules for these sorts of things.

Here's what I'd most like to see in the area of in-code doc (I have no
constructive opinion on the 'large' documents markup issue): a definition
for a set of entities (if that is the right word) to use in docstrings for
modules, classes, functions and methods. After two weeks of discussion
*at most*, Fred brings it to Guido, Guido gets his old melted-wax seal and
stamps his approval on it, and then we advertise the heck out of it. The
code will follow.

Straw Proposal 0.1 [da]:

"""
David Ascher
1.0
20/10/96
This is a module with one function in it.
...
"""

def len(input):
    """\
    Returns the length of the input sequence

    The input sequence

    IntType
    len, length, ln
    """
    ...

FWIW, I'm really not sure that the above is significantly easier to parse
in the long run than a more Pythonic:

"""
Author: David Ascher
Date: 10/25/99
...
"""

def len(input):
    """\
    Description:
        Returns the length of the input sequence.

    Arguments:
        input (sequence) -- The input sequence

    Return Type: IntType

    See Also: len, length, ln
    """

provided we make explicit exactly the format (just like Python syntax is
formalized and parseable) and keep the fancy stuff (embedded URLs,
hyperlinks, etc.) to a single escape code (e.g. like what Mark Hammond was
pushing for months ago -- see June 1998 archives). That said, what I
really care most about is a final decision, not the specific markup used.
That's why I think that what we need is a Guido Stamp Of Approval.

--david 'ranted out' ascher

PS: I'll pay for a new melted-wax seal if Guido lost the old one. =)

From pf@artcom-gmbh.de Sat Nov 27 09:30:45 1999
From: pf@artcom-gmbh.de (Peter Funk)
Date: Sat, 27 Nov 1999 10:30:45 +0100 (MET)
Subject: What is important (was Re: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion)
In-Reply-To: from David Ascher at "Nov 26, 1999 2:16:52 pm"
Message-ID:

Hi!

David Ascher wrote:
[...]
> Generally, I think that the DOC-sig spends too much time arguing about
> specific markups and trendy technology (sorry, I'm getting really
> frustrated at the XML LPHBTSP (that's 'alphabet soup' without vowels), and
> not enough with the marketing aspect.
>
> *If the problem is to encourage average Python coders to markup their
> docs*, then you need to make it simple *and Python-like*.
Define a
> Pythonic syntax (e.g what Jim Fulton uses in the StructuredText.py
> module), provide a CGI script which has a "PUT" button which will take a
> marked-up .py file, creates a hyperlinked TOC for that module, snazzy HTML
> pages and whatnot, automatically add said module to some centralized
> repository of 'cool documented modules', and folks *will* learn the
> markup. Especially if you provide a few modules which show examples of
> the markup and show how trivial it is.
>
> On the other hand, if you put up a page which makes Python code look more
> like TeX or XML, why in the world do you expect people to bother?

I agree with David. I've joined the list only recently (some weeks ago).

As someone who has yet much to learn, I think that the current
documentation for Python and the module library is very good.
Unfortunately some very important parts are still missing:
Something like Fredrik Lundh's "An Introduction to Tkinter" ---although it is
still somewhat incomplete in some regions--- would be a _VERY_ useful
addition to the library documentation.

Spending time on something like this seems far more important to me
than this discussion about the topics introduced in Chapter 8
(Future Directions) of Documenting Python.

Regards, Peter
--
Peter Funk, Oldenburger Str.86, D-27777 Ganderkesee, Germany, Fax:+49 4222950260
office: +49 421 20419-0 (ArtCom GmbH, Grazer Str.8, D-28359 Bremen)

From Manuel Gutierrez Algaba Sat Nov 27 11:27:37 1999
From: Manuel Gutierrez Algaba (Manuel Gutierrez Algaba)
Date: Sat, 27 Nov 1999 11:27:37 +0000 (GMT)
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To:
Message-ID:

On Fri, 26 Nov 1999, David Ascher wrote:

> On Fri, 26 Nov 1999, Manuel Gutierrez Algaba wrote:
>
> > http://www.ctv.es/USERS/irmina/SantisimaInquisicion/AutoDeFe.html
> >
> > There you can see clearly two examples,... please let me know
> > if this is not enough.
>
> I understand the concept of markup. Your markup is basically TeX.
Fine
> for TeX documents, but why in the world do you expect Python users who are
> used to

It happens that TeX documents are the easiest structure to represent
things, apart from raw .txt. A basically TeX markup is just
the easiest markup.

>
> def foo():
>     print 'a'

This example is too simple, it has almost no info at all,
perhaps : \indexbasicdeffunction
(\index{basic}\index{def}\index{function} )
after doing this, anyone searching for basic things, or for
definitions or for functions may find the piece of code you've done,
your WORK is PROFITABLE now. That piece of CODE IS VALUABLE for
the rest of community.

>
> to suddenly like
>
> \newcommand{\indexCextension}{\index{Cextension}\index{extension}}
>
> and why do you think they would even consider adding it to their code?

The answer is the same as for: "why do you document?"
- To understand what you are doing: If you put \indexCextension,
  in that very same word you're summarizing the whole functionality of
  a piece of code,
- you can have your internal vocabulary inside your program, so you
  can browse by different concepts and how they've been implemented,
  although this is fairly advanced yet.
- people can understand or join into their databases the information
  you've provided. That is, documentation is that thing used for
  the people to understand what others have done.

> What is the benefit to them? You need to show the *end-result* of
> indexing, which means hyperlinked TOC's, pretty HTML pages, etc.

The *end-result* is Sacramental.html stuff and the rest. But
that's *ONLY* one representation, more than enough, I think for
most of the cases. It can be prettier but I've shown the most
basic.

>
> Warning: rant ahead.

Warning: more SantisimaInquisicion propaganda ahead !!!

>
> Generally, I think that the DOC-sig spends too much time arguing about
> specific markups and trendy technology (sorry, I'm getting really
> frustrated at the XML LPHBTSP (that's 'alphabet soup' without vowels), and
> not enough with the marketing aspect.

Me too, that's why I'm proposing the most simple stuff, I think.

> *If the problem is to encourage average Python coders to markup their
> docs*, then you need to make it simple *and Python-like*. Define a

The question is whether stupid markup is going to encourage anything:
Imagine this :

  # name inter
  # param list
  # param list
  def inter(a,b):
      res = []
      for i in a:
          if i in b:
              res.append(i)
      return res

Do you think many people are encouraged to behave C-ish with Python
code? And what kind of info does that provide : there's an inter function,
and it has two parameters, but that SAYS NOTHING AT ALL about
the function itself, so it's useless for anybody searching for
a list intersection function!

> Pythonic syntax (e.g what Jim Fulton uses in the StructuredText.py

Pythonic syntax for a C-ish idea, python deals with functionality
and not with type rubbish.

> markup. Especially if you provide a few modules which show examples of
> the markup and show how trivial it is.

I've provided the examples, if people don't do it, it's just
because they're too lazy, the marking system I propose can't
be easier...

> On the other hand, if you put up a page which makes Python code look more
> like TeX or XML, why in the world do you expect people to bother?

Regretfully, a bit of collateral damage has to be done
to the code; even so, the TeEncontreX damage is not too much,
a few words or lines, here and there. In fact, TeEncontreX markup
is the one that needs the least writing ( by far) and in the most
free form ( by far ); it's rather painful to be strict, it
sounds to me java-ish.

Python is ( by far ) the least strict language I know; in fact, the tight
syntax/indentation of Python is the only (unnoticed) thing; afterwards you
enjoy the full power of overloading, eval(), map...

Python is not Java, it's much much more different than it seems
at first sight. Python deserves something strict in the
syntax ( \jiji \indexlkjljk), flexible in the use ( you can use
the indexes you want, you can place them wherever and whenever
you like! ) and that deals with the real problem: the problem
with DOC is to know what the CODE is DOING, not its params,
functions. The important thing is the WHAT; leave the HOW to
Java or to C.

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina /TeEncontreX.html /texpython.htm /SantisimaInquisicion/index.html

That, that is, is.
That, that is not, is not.
That, that is, is not that, that is not.
That, that is not, is not that, that is.

From Manuel Gutierrez Algaba Sat Nov 27 11:27:48 1999
From: Manuel Gutierrez Algaba (Manuel Gutierrez Algaba)
Date: Sat, 27 Nov 1999 11:27:48 +0000 (GMT)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <199911270155.SAA00967@localhost.localdomain>
Message-ID:

On Fri, 26 Nov 1999 uche.ogbuji@fourthought.com wrote:

> The way I see it, your key argument with Manuel's proposals would be that he
> plans to inflict TeX on Python authors. I agree that that is a bad thing, and
> I also wouldn't want to inflict XML on Python Authors. I don't think that's
> an alien sentiment here, but your rant makes it sound that way.

TeX is not TeEncontreX, TeX is something 100 billion times more
complex, TeEncontreX is just three things :
- definition of an index (newcommand )
- delimitation of the space in which index(es) are used (\jiji \jaja)
- use of the indexes ( \indeslkjk)

If this is too complex, or if this is TeX, then I'm missing something
very BIG.

Anyway, I don't want to inflict anything on anybody. For me it's enough
to leave in the annals of python the documentation system of the
future.
This system ( or something similar ) will succeed sooner or
later ( probably with somebody more powerful than me) and I'm more
than happy to state here : "I was the first". This is indexing but
in a non-indexing way: massive and inter-index-collaborative.
If I can make this clear enough, or anyone can clearly understand it
and want to spend the time, we'll get some years ahead of the rest.
That's all. If not, well, pity!

Manolo
www.ctv.es/USERS/irmina /TeEncontreX.html /texpython.htm /SantisimaInquisicion/index.html

That, that is, is.
That, that is not, is not.
That, that is, is not that, that is not.
That, that is not, is not that, that is.

From sean@digitome.com Sat Nov 27 10:52:28 1999
From: sean@digitome.com (Sean McGrath)
Date: Sat, 27 Nov 1999 10:52:28 +0000
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To:
References:
Message-ID: <3.0.6.32.19991127105228.0093c4b0@gpo.iol.ie>

[Manuel Gutierrez Algaba]
>
>It happens that TeX documents are the easiest structure to represent
>things, apart from raw .txt A basically TeX markup is just
>the easiest markup.
>

Completely subjective and unsubstantiated statements!

>>
>> def foo():
>>     print 'a'
>
>This example is too simple , it gots almost no info at all ,
>perhaps : \indexbasicdeffunction
>(\index{basic}\index{def}\index{function} )
>after doing this , anyone searching for basic things, or for
>definitions of for functions may find the piece of code you've done,
>your WORK is PROFITABLE now. That piece of CODE IS VALUABLE for
>the rest of community.

In my opinion, there is *NO CHANCE* that any developer would
voluntarily add the index stuff. In this example, isn't it
redundant? A text parsing tool that knows the syntax of
Python can work out that foo is a function. There is no need
for a programmer to spell it out in a second
syntactic form, right?

From mal@lemburg.com Sat Nov 27 12:39:58 1999
From: mal@lemburg.com (M.-A.
Lemburg)
Date: Sat, 27 Nov 1999 13:39:58 +0100
Subject: [Doc-SIG] On David Ascher's Rant
References:
Message-ID: <383FD11E.D8ADE8D0@lemburg.com>

David Ascher wrote:
>
> 1) The current Python documentation is, in my opinion, just fine. I think
> that moving from LaTeX to something more modern is a great idea, and I
> think that Fred is doing a beautiful job for what is a thankless task.
> I have no problem of course with the discussion on that topic.

I second that.

> 2) IMHO, the single most problematic aspect of Python documentation is
> the lack of a standard way for programmers to document their code
> inside the .py file, unlike e.g. POD., and that is a shame. If nothing
> else, I think that lacking this standard is one of the reasons for the
> lack of docstrings. If there was a good standard, then the use of
> docstrings already made in IDLE and Pythonwin (which is wonderful)
> could be made even deeper and richer, leading to a snowballing effect.
>
> There is at least one proposal to index in-code Python docstrings
> with TeX-like commands. In my opinion, anything that full of
> backslashes and braces will never fly in the Python community.

I don't think people will start to write TeX in their docstrings...
after all not everyone can read plain TeX and will get pretty
confused about all those backslashes and curly brackets.

IMHO, a clean plain text approach goes much further; together
with some conventions on how to format this text and intelligent
tools to extract the information encoded by those conventions
will certainly make writing docstrings much more popular.

BTW, in case someone cares, the format I use for docstrings and
function/method signature goes as follows:

def normlist(jlist,

             StringType=types.StringType):

    """ Return a normalized joinlist.

        All tuples in the joinlist are turned into real strings. The
        resulting list is a equivalent copy of the joinlist only
        consisting of strings.

    """
    ...

1.
Localizations are split from the true input arguments by
   an empty line or a comment line

2. The first line in the docstring includes a short description
   of what the function does.

3. The remaining lines are used for more detailed descriptions.

Additional markup e.g. for cross referencing would be nice
but shouldn't look awkward. One way to do this would be:

a. use .method() for methods of the same class
b. use Class.method() for methods of other classes
c. use *name for referencing defined names in the current context,
   e.g. class names, parameter names, module names, etc.
d. methods/functions which don't have docstrings shouldn't go
   into the automatic documentation output (this feature is
   often forgotten: you may not want to document certain parts
   of your module for some reason)

and so on...

> 3) Programmers in general, smart programmers especially, try to "think
> out" all of the possible uses for something before they start to design
> it. That's why God Invented Managers and deadlines. We need one or
> the other.

Right. And it's even worse in the Python community: they first try
to prove NP-completeness rather than think about good reasonable
approaches for the common case.

> Straw Proposal 0.1 [da]:
>
> """
> David Ascher
> 1.0
> 20/10/96
> This is a module with one function in it.
> ...
> """
>
> def len(input):
>     """\
>     Returns the length of the input sequence
>
>     The input sequence
>
>     IntType
>     len, lenth, ln
>     """
>     ...

Are you serious about the above ??? No one is going to write that
in his docstrings...

> FWIW, I'm really not sure that the above is significantly easier to parse
> in the long run than a more Pythonic:
>
> """
> Author: David Ascher
> Date: 10/25/99
> ...
> """
>
> def len(input):
>     """\
>     Description:
>         Returns the length of the input sequence.
>
>     Arguments:
>         input (sequence) -- The input sequence
>
>     Return Type: IntType
>
>     See Also: len, length, ln
>     """
>
> provided we make explicit exactly the format (just like Python syntax is
> formalized and parseable) and keep the fancy stuff (embedded URLs,
> hyperlinks, etc.) to a single escape code (e.g. like what Mark Hammond was
> pushing for months ago -- see June 1998 archives). That said, what I
> really care most about is a final decision, not the specific markup used.
> That's why I think that what we need is a Guido Stamp Of Approval.

Looks fine, but there is one catch: not everyone is going to
write his docstrings in English...

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 34 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

From da@ski.org Sat Nov 27 17:17:37 1999
From: da@ski.org (David Ascher)
Date: Sat, 27 Nov 1999 09:17:37 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <383FD11E.D8ADE8D0@lemburg.com>
Message-ID:

On Sat, 27 Nov 1999, M.-A. Lemburg wrote:

> BTW, in case someone cares, the format I use for docstrings and
> function/method signature goes as follows:
>
> def normlist(jlist,
>
>              StringType=types.StringType):
>
>     """ Return a normalized joinlist.
>
>         All tuples in the joinlist are turned into real strings. The
>         resulting list is a equivalent copy of the joinlist only
>         consisting of strings.
>
>     """
>     ...
>
> 1. Localizations are split from the true input arguments by
>    an empty line or a comment line

What's a localization? Do you really mean L10N stuff? FWIW, I think that
using whitespace in the non-docstring source as a significant delimiter
limits things, as it means that the encoding is not readable from the
parse tree.

> > Straw Proposal 0.1 [da]:
> >
> > """
> > David Ascher
> > 1.0
> > 20/10/96
> > This is a module with one function in it.
> > ...
> > """

> Are you serious about the above ???
Noone is going to write that
> in his docstrings...

It's not my favorite, but Uche mentioned that XML-ish syntax is much
easier to parse. While I don't really grant that point (or rather I think
that the hill needs to be climbed once for all), I want to emphasize:

  What I really care most about is a final decision, not the specific
  markup used.

> Looks fine, but there is one catch: not everyone is going to
> write his docstrings in English...

So add another keyword in the module docstring:

  Language: Francais-France

--david

From Manuel Gutierrez Algaba Sat Nov 27 19:14:45 1999
From: Manuel Gutierrez Algaba (Manuel Gutierrez Algaba)
Date: Sat, 27 Nov 1999 19:14:45 +0000 (GMT)
Subject: [Doc-SIG] [ANNOUNCE] SantisimaInquisicion
In-Reply-To: <3.0.6.32.19991127105228.0093c4b0@gpo.iol.ie>
Message-ID:

On Sat, 27 Nov 1999, Sean McGrath wrote:

> [Manuel Gutierrez Algaba]
> >
> >It happens that TeX documents are the easiest structure to represent
> >things, apart from raw .txt A basically TeX markup is just
> >the easiest markup.
> >
> Completely subjective and unsubstantiated statements!
>
> >>
> >> def foo():
> >>     print 'a'
> >
> >This example is too simple , it gots almost no info at all ,
> >perhaps : \indexbasicdeffunction
> >(\index{basic}\index{def}\index{function} )
> >after doing this , anyone searching for basic things, or for
> >definitions of for functions may find the piece of code you've done,
> >your WORK is PROFITABLE now. That piece of CODE IS VALUABLE for
> >the rest of community.
>
> In my opinion, there is *NO CHANCE* that any developer would
> voluntarily add th index stuff in this example isn't it
> redundant? A text parsing tool that knows the syntax of
> Python can work out that foo is a function. There is no need
> for a programmer to spell it out in a second
> syntactic form right?

Ufffffff! This example *ONLY* tried to show David Ascher that
*anything* can be documented/reused/usable using it!
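
On the redundancy point quoted above (a tool that knows Python's syntax can
work out that foo is a function), here is a minimal sketch of what such a
tool might look like. It assumes a modern Python 3 interpreter and its
standard `ast` module, neither of which is part of this thread; the names
SOURCE and list_definitions are made up for illustration only.

```python
import ast

# Hypothetical module text to inspect; foo() is the example from the thread,
# rewritten with print() so that a Python 3 parser accepts it.
SOURCE = '''
def foo():
    print('a')

class Bar:
    def baz(self):
        """Do nothing."""
'''

def list_definitions(source):
    """Return sorted (kind, name, has_docstring) tuples for each def/class."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            kind = "function" if isinstance(node, ast.FunctionDef) else "class"
            # ast.get_docstring() returns None when no docstring is present.
            documented = ast.get_docstring(node) is not None
            found.append((kind, node.name, documented))
    return sorted(found)

definitions = list_definitions(SOURCE)
for kind, name, documented in definitions:
    print(kind, name, "documented" if documented else "no docstring")
```

This only recovers what the syntax already says (what is a function, what is
a class, whether a docstring exists). It says nothing about what the code is
for, which is the part the hand-written index entries are meant to supply.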
Regards/Saludos Manolo www.ctv.es/USERS/irmina /TeEncontreX.html /texpython.htm /SantisimaInquisicion/index.html Disease can be cured; fate is incurable. -- Chinese proverb From Manuel Gutierrez Algaba Sat Nov 27 19:14:57 1999 From: Manuel Gutierrez Algaba (Manuel Gutierrez Algaba) Date: Sat, 27 Nov 1999 19:14:57 +0000 (GMT) Subject: [Doc-SIG] On David Ascher's Rant In-Reply-To: <383FD11E.D8ADE8D0@lemburg.com> Message-ID: On Sat, 27 Nov 1999, M.-A. Lemburg wrote: > > 1) The current Python documentation is, in my opinion, just fine. I think It's fine if you read it all of it, NOT FOR SEARCHING, FOR SEARCHING IS NOT FINE. > > There is at least one proposal to index in-code Python docstrings > > with TeX-like commands. In my opinion, anything that full of > > backslashes and braces will never fly in the Python community. > > I don't think people will start to write TeX in their docstrings... > after all not everyone can read plain TeX and will get pretty > confused about all those backslashes and curly brackets. Ok, I think the syntax I proposed is quite bad ( from your comments), instead of \newcommand{\indexalfa}{\index{alfa}} and \indexalfa why not ? <@indexalfa,alfa> and <#alfa> It's the SAME ! SantisimaInquisicion/TeEncontreX is NOT, I say, is NOT, TeX. It was TeX some billion years ago! > > IMHO, a clean plain text approach goes much further; together > with some conventions on how to format this text and intelligent > tools to extract the information encoded by those conventions > will certainly make the writing docstrings much more popular. Two big problems: tight conventions and intelligent tools. It seems to me hard stuff, for use and for programm. > BTW, in case someone cares, the format I use for docstrings and > function/method signature goes as follows: ... > All tuples in the joinlist are turned into real strings. The > resulting list is a equivalent copy of the joinlist only > consisting of strings. 
> > """ My method can be used for USENET post, FAQ, .py, and *anything* in ASCII form. Yours seem just a signature-teller, that is fine BTW, but it's not the idea I'm proposing, I'm just proposing to focus in the Semantic in the Meaning, in the What ( a function, module, post, whatever...) does. > > 3) Programmers in general, smart programmers especially, try to "think > > out" all of the possible uses for something before they start to design > > it. That's why God Invented Managers and deadlines. We need one or > > the other. > > Right. And it's even worse in the Python community: they first try > to prove NP-completeness rather than think about good reasonable > approaches for the common case. If you spent half an hour, just, attributing your own code or a FAQ with the \indexblabla stuff, you'd be ashtoundingly surprised of : - how fast is it - how powerful/flexible - how much can it help others understand what you've done. It seems to me you don't want to even try to understand my proposal. It's damned simple and direct, but of course, if you don't make the try of thinking/understanding ... then ...! > > Looks fine, but there is one catch: not everyone is going to > write his docstrings in English... My system, by default , can handle any kind of language... Pity that you don't make the try of understanding it. In 10 minutes you'd get the whole functioning of it all! Regards/Saludos Manolo www.ctv.es/USERS/irmina /TeEncontreX.html /texpython.htm /SantisimaInquisicion/index.html Disease can be cured; fate is incurable. -- Chinese proverb From Manuel Gutierrez Algaba Sat Nov 27 19:15:15 1999 From: Manuel Gutierrez Algaba (Manuel Gutierrez Algaba) Date: Sat, 27 Nov 1999 19:15:15 +0000 (GMT) Subject: [Doc-SIG] On David Ascher's Rant In-Reply-To: Message-ID: On Sat, 27 Nov 1999, David Ascher wrote: """ \jaja \jiji > > > > > > """ > > > David Ascher > > > 1.0 > > > 20/10/96 > > > This is a module with one function in it. > > > ... 
> > > """ > > What I really care most about is a final decision, not the specific > markup used. \indexfinaldecision \indexmarkupdiscussion \indexexampleXML \jiji """ My proposal is so flexible that it could live with any other marking. And USENET post and emails can be sorted in source so they'd be reusable. That's why I say it's a kind of XML, because we can reuse if we can handle the info. Please make the try of understanding, I'm proposing something far more powerful than javadoc, It's two orders of magnitude higher level! And we can get set of compatible indexes , because the biggest problem of this is when we get too many indexes, but then It'll be a success, not a real problem. Just imagine, having all comp.lang.python attributed and reusable, I think somebody has made a book doing this, I propose lets make the book ourselves, little by little, not by chapter, but by concepts, by families of concepts, ... a kind of book, but upside-down: the indexes are the chapters. And please forget javadoc-ish stuff, I'm not talking about that. This is much better! Regards/Saludos Manolo www.ctv.es/USERS/irmina /TeEncontreX.html /texpython.htm /SantisimaInquisicion/index.html Disease can be cured; fate is incurable. -- Chinese proverb From irmina@ctv.es Sat Nov 27 19:17:44 1999 From: irmina@ctv.es (Manuel Gutierrez Algaba) Date: Sat, 27 Nov 1999 19:17:44 +0000 (GMT) Subject: [Doc-SIG] Success of TeEncontreX Message-ID: TeEncontreX is registered in freshmeat.net where it has got 769 hits ( webpage ) and 148 (downloads ) in a month. This makes me thing the idea is not bad at all. But the effort in the case of SantisimaInquisicion will be useless unless many people support it. Regards/Saludos Manolo www.ctv.es/USERS/irmina /TeEncontreX.html /texpython.htm /SantisimaInquisicion/index.html What we Are is God's gift to us. What we Become is our gift to God. 
From da@ski.org Sat Nov 27 22:45:00 1999 From: da@ski.org (David Ascher) Date: Sat, 27 Nov 1999 14:45:00 -0800 (Pacific Standard Time) Subject: [Doc-SIG] On David Ascher's Rant In-Reply-To: Message-ID: On Sat, 27 Nov 1999, Manuel Gutierrez Algaba wrote: > My proposal is so flexible that it could live with any other marking. Manuel, I am somewhat at a loss as to what your proposal is. Can you describe it more precisely, without beliefs such as "it's more powerful" or "This is much better" but rather with a precise definition of exactly what it is you're proposing? Just looking at the website doesn't really help me at least. Are you proposing: 1) a markup syntax (e.g. \newcommand vs vs ...)? 2) a set of tags (e.g. function, extensionmodule, usenetpost, ...)? 3) something else? I gather that all documents can be indexed with your system, and that you do not intend to propose a specific set of indexing 'keywords'. That seems to fly in the face of decades if not hundreds of years of prior art which shown the success of domain-specific keyword lists. Can you try again, please, without hyperbolae? --david From Manuel Gutierrez Algaba Sun Nov 28 12:28:51 1999 From: Manuel Gutierrez Algaba (Manuel Gutierrez Algaba) Date: Sun, 28 Nov 1999 12:28:51 +0000 (GMT) Subject: [Doc-SIG] On David Ascher's Rant In-Reply-To: Message-ID: On Sat, 27 Nov 1999, David Ascher wrote: > On Sat, 27 Nov 1999, Manuel Gutierrez Algaba wrote: > > > My proposal is so flexible that it could live with any other marking. > > Manuel, I am somewhat at a loss as to what your proposal is. Can you > describe it more precisely, without beliefs such as "it's more powerful" > or "This is much better" but rather with a precise definition of exactly > what it is you're proposing? Just looking at the website doesn't really > help me at least. > > Are you proposing: > > 1) a markup syntax (e.g. \newcommand vs vs ...)? > 2) a set of tags (e.g. function, extensionmodule, usenetpost, ...)? > 3) something else? 
>
> I gather that all documents can be indexed with your system, and that you
> do not intend to propose a specific set of indexing 'keywords'. That
> seems to fly in the face of decades if not hundreds of years of prior art
> which shown the success of domain-specific keyword lists. Can you try
> again, please, without hyperbolae?

Hyperbolae are needed when you can't explain anything and then you try
to win the hearts instead of the minds.

In this email I won't mention "simple", "easier" nor "better" nor "best".
I'll explain the idea plainly.

The current situation in the Python world is this:
- We have humans
- We have doc
- We have code

There's a relationship between humans and doc:
Humans want to pull information from docs; in fact, they need that
information mainly to produce code. Because of that, code holds some
kind of frozen information (the information the coder used to build it).

There's a relationship between code and doc:
Code can be seen as a thing that may help to produce new code or to
understand other code; the code itself is a special kind of doc.

It's obvious that code can produce new code, but for that we usually
need the essence (the doc) of what that code does. We need that essence
because IMO code is just one implementation of a piece of information;
even in a high-level programming language, the code keeps personal
preferences or solutions. Those preferences are things like:

def inter(a,b):
    result = []
    for i in a:
        if i in b:
            result.append(i)
    return result

There are almost 20 versions of this code, using filter, map, default
params, dicts... For one very specific "idea"/info (intersection
of two lists) we can have 20 versions of it. If I doc this function
properly a la "javadoc", what I really have is info about "one
implementation" of an idea. But obviously when reusing code we need two
things: to look for implementations of the idea we have, and to adapt
them to our code.
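[Editorial sketch, not part of the original message.] To make the "many
versions of one idea" point concrete, here are three ways of writing the
same list intersection. This is only an illustration in present-day
Python 3, not code from the thread; the names inter_loop, inter_filter
and inter_lookup are made up for this sketch, and the loop version passes
the item to append(), which the snippet in the message omits.

```python
def inter_loop(a, b):
    """Explicit loop: keep every item of a that also occurs in b."""
    result = []
    for item in a:
        if item in b:
            result.append(item)
    return result


def inter_filter(a, b):
    """The same idea expressed with filter()."""
    return list(filter(lambda item: item in b, a))


def inter_lookup(a, b):
    """The same idea using a dict as a lookup table (items must be hashable)."""
    seen = dict.fromkeys(b)
    return [item for item in a if item in seen]


if __name__ == "__main__":
    for func in (inter_loop, inter_filter, inter_lookup):
        print(func.__name__, func([1, 2, 3, 4], [2, 4, 6]))
```

All three return the items of the first list, in order, that also occur
in the second; they differ only in implementation detail, which is the
distinction the message draws between an idea and its implementations.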
The idea we are looking for is far more important than the implementation itself, the implementation may be adapted, and in the case of python, the implementation of an idea ( params, code itself) is not so obscure to need a strong doc support ( think about assembler or prolog or perl). Perhaps, you agree now with me in : - that code is basically "frozen implementations" of ideas/info - OO programming involves frequent reuse of code, frequent reuse of ideas. - it's important to deal with ideas - we can't leave details of implementation for later Once we've identified our main target: ideas ( how to reuse and handle) the question is how to do it ? Well, I say: If this code : class.... bla... bla.... bla.... is the implementation of an idea, let's mark it explicitly with: <#idea_A> Once we've marked it we now see if that solves our question: "how to reuse and handle ?" Can I reuse and handle the idea of that code that has been marked with <#idea_A>? This involves : - is idea_A the idea I'm looking for? If yes: then it works If not but it's a similar idea (sorry If I tend to overuse latin words, similar=very close) Then the question is now "what is a similar idea to another?" It seems, that similar (very close) implies that the ideas group themselves into groups. Let's think about it, let's think about a example with sockets : socket, asyncronous, buffering, internet, CGI, server, telnet port, RFC, timeout... Apparently some ideas involve more basic ideas ( telnet port for example), but It seems that basically those ideas are widely general and single-meaning. Is there any field whose ideas are not widely general or single-meaning? Well, fortunately, we're not talking about philosophy but about technical ideas. Can you identify clearly ideas in the code? Lots of times, I guess. 
But even in those cases that they're not clear or are not very general ideas then It may happen: - you're considering implementation information, not the idea that that code implements - your code solves a non common idea, but even so, that idea will be related with any more common idea. If you've got this far, you see I'm talking about relationship among ideas, not about code. Fred and Paul worry about those relationships, they can be many, they could not be clear. But even so, we won't know until we have the ideas , until we have the problem. Then and even if that problem can't be properly solved We'll have a library of ideas, spread over FAQ, USENET, code, HOWTO, ..., ready to be searched/compared/handled. Library of ideas mean simply a library of doc/code ready to be used to generate new code. Is going people to mark their code? Is it worth the effort ? I guess, they're not. In fact, you can express a broad idea with five lines of doc or with a single <#idea>, this is valid for general ideas. But this requires a change in the mind of people: - concise, direct, high-level. It's the same how is this done ! It's the same \indexbla than <#bla> It's the same if somebody writes \indexsocket or \indexport or \indexcomms. We can relate each other, afterwards. \indexsocket \indextelnet \indexexpect is the same that: \indextelnetexpect We can have tools that unite different notations. See this as a pyramid, the higher you are the less space there's. Is this "indexing" ? No, I use indexes and it seems indexing, but It's a kind of Plato-python-world-building. Libraries of ideas, not libraries of implementations. It don't think it's a good idea to impose a "limited set of keywords", let people express freely, because in fact, in the field we're working (high level ideas ) there's not too much space left. Freedom in this level means "expressing" not "confusion". Of course, "my idea" involves www-web pages ftp , pages of written books,.... anything. 
This is like modern art, people like XVIII century paintings because they can understand it, but modern art is richer in concepts and information. Is people ready? If they're ready, the place is here: python-world. But, I'm rather pesimistic, people is lazy and brute. I can't give a list of all the possible ideas, nor give the relationships... look at TeEncontreX ( the most basic implementation of that "pyramid") it's easy to handle/search, I guess so. If it'd be 10 times bigger it wouldn't be much harder to handle, but it'd carry 10 times more info !! My idea is a kind of "inverse-video" of Yahoo, and improved of course. Sorry , for the hyperbolae :) :P Regards/Saludos Manolo www.ctv.es/USERS/irmina /TeEncontreX.html /texpython.htm /SantisimaInquisicion/index.html Life can be so tragic -- you're here today and here tomorrow. From mal@lemburg.com Sun Nov 28 22:27:23 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 28 Nov 1999 23:27:23 +0100 Subject: [Doc-SIG] On David Ascher's Rant References: Message-ID: <3841AC4B.3A29C921@lemburg.com> David Ascher wrote: > > On Sat, 27 Nov 1999, M.-A. Lemburg wrote: > > > BTW, in case someone cares, the format I use for docstrings and > > function/method signature goes as follows: > > > > def normlist(jlist, > > > > StringType=types.StringType): > > > > """ Return a normalized joinlist. > > > > All tuples in the joinlist are turned into real strings. The > > resulting list is a equivalent copy of the joinlist only > > consisting of strings. > > > > """ > > ... > > > > 1. Localizations are split from the true input arguments by > > an empty line or a comment line > > What's a localization? Do you really mean L10N stuff? FWIW, I think that > using whitespace in the non-docstring source as a significant delimiter > limits things, as it means that the encoding is not readable from the > parse tree. No, I meant the StringType=types.StringType part: it localizes symbols which would otherwise be looked up in the global name- space. 
I often do this to speed up routines which deal with static APIs like string.split and string.join, e.g. def f(x, split=string.split,join=string.join): ... The l10n stuff is something which will appear in Python 1.6 -- hopefully that is ;-) > > > Straw Proposal 0.1 [da]: > > > > > > """ > > > David Ascher > > > 1.0 > > > 20/10/96 > > > This is a module with one function in it. > > > ... > > > """ > > > Are you serious about the above ??? Noone is going to write that > > in his docstrings... > > It's not my favorite, but Uche mentioned that XML-ish syntax is much > easier to parse. While I don't really grant that point (or rather I think > that the hill needs to be climbed once for all), I want to emphasize: > > What I really care most about is a final decision, not the specific > markup used. I guess doc strings are just as personal to the programmer as indention or naming styles: you won't get everybody to agree on one way to do it. Besides, I don't think this is really needed: as long as the programmer can provide routines to parse his code everything should be fine. This could e.g. be implemented by subclassing a reader implementation which then passes the parsed tokens to other code processing them for some other use. Of course, you could provide a few standard markup schemes, e.g. an XML one and StructuredText one. > > Looks fine, but there is one catch: not everyone is going to > > write his docstrings in English... > > So add another keyword in the module doctring: > > Language: Francais-France I was referring to "Language:" being English :-) E.g. my doc strings in German would look quite silly if I would insert some English markers in there... But one could of course simply define a few sets of these markers which then get chosen by a command line option -- one for each language. Or perhaps simply look for all of them. Well, anyway, these are just some ideas. I'm not going to code anything or proceed discussing these things. 
Fred is doing a great job and I'll continue to document my code by hand. Perfect for me ;-) -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 33 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From mal@lemburg.com Sun Nov 28 22:16:14 1999 From: mal@lemburg.com (M.-A. Lemburg) Date: Sun, 28 Nov 1999 23:16:14 +0100 Subject: [Doc-SIG] On David Ascher's Rant References: Message-ID: <3841A9AE.9D00FEB8@lemburg.com> Manuel Gutierrez Algaba wrote: > > On Sat, 27 Nov 1999, M.-A. Lemburg wrote: > > > 1) The current Python documentation is, in my opinion, just fine. I think > > It's fine if you read it all of it, NOT FOR SEARCHING, FOR SEARCHING > IS NOT FINE. Hmm, not sure I can follow you here: there is a very nice index which helps you pin-point most details and if you use the PDF version you even get full-text search at no extra cost. > > > There is at least one proposal to index in-code Python docstrings > > > with TeX-like commands. In my opinion, anything that full of > > > backslashes and braces will never fly in the Python community. > > > > I don't think people will start to write TeX in their docstrings... > > after all not everyone can read plain TeX and will get pretty > > confused about all those backslashes and curly brackets. > > Ok, I think the syntax I proposed is quite bad ( from your comments), > instead of \newcommand{\indexalfa}{\index{alfa}} and \indexalfa > why not ? > <@indexalfa,alfa> and <#alfa> This would probably make things a little less TeX-like. > It's the SAME ! SantisimaInquisicion/TeEncontreX is NOT, I say, > is NOT, TeX. It was TeX some billion years ago! Well, it sure looks a lot like TeX. Believe me, I've written LaTeX and TeX for many years -- I know that people don't like it. Even I had my troubles with it at first. The syntax simply isn't compatible with human reading habits and this is basically what doc strings are all about: online help. 
> > IMHO, a clean plain text approach goes much further; together > > with some conventions on how to format this text and intelligent > > tools to extract the information encoded by those conventions > > will certainly make the writing docstrings much more popular. > > Two big problems: tight conventions and intelligent tools. > It seems to me hard stuff, for use and for programm. The conventions need not be too tight. I've been using the ones I mentioned for some time now and incorporated some of it in my doc.py tool (which you can find on my Python Pages). Works fine... for me at least. > > BTW, in case someone cares, the format I use for docstrings and > > function/method signature goes as follows: > ... > > All tuples in the joinlist are turned into real strings. The > > resulting list is a equivalent copy of the joinlist only > > consisting of strings. > > > > """ > > My method can be used for USENET post, FAQ, .py, and *anything* > in ASCII form. Yours seem just a signature-teller, that is fine BTW, > but it's not the idea I'm proposing, Right. The intention is to extract data from python scripts, nothing more. > I'm just proposing to focus in the Semantic in the Meaning, in the > What ( a function, module, post, whatever...) does. > > > > 3) Programmers in general, smart programmers especially, try to "think > > > out" all of the possible uses for something before they start to design > > > it. That's why God Invented Managers and deadlines. We need one or > > > the other. > > > > Right. And it's even worse in the Python community: they first try > > to prove NP-completeness rather than think about good reasonable > > approaches for the common case. > > If you spent half an hour, just, attributing your own code or a > FAQ with the \indexblabla stuff, you'd be ashtoundingly surprised > of : > - how fast is it > - how powerful/flexible > - how much can it help others understand what you've done. 
> > It seems to me you don't want to even try to understand my proposal. > It's damned simple and direct, but of course, if you don't make > the try of thinking/understanding ... then ...! Of course I have tried to get the idea... from what I understood I can say, that I don't like the syntax you use. The system itself may have its merrits, but the syntax is a bummer, IMHO. Something about the general idea of automatic documentation: I have tried to proceed in that direction a few years ago to document my Python code. From that experience I can say that automatic documentation -- for me at least -- only serves as aid in finding APIs etc. fast during the programming phase. It is not useable as final documentation. All my packages include HTML documentation which is carefully crafted to include all those things which I intend to publish and deliberately leave parts undocumented or only partially documented. This is not easily possible using automatic documentation or other literate programming approaches. The written docs simply are different because they focus on a different intent (and sometimes even a different audience). > > Looks fine, but there is one catch: not everyone is going to > > write his docstrings in English... > > My system, by default , can handle any kind of language... Hmm, what language do "jiji" and "jaja" come from ? \newcommand also sounds very English ;-) How about making up some more programming compatible tags to delimit code from docs, e.g. #doc and #/doc... -- Marc-Andre Lemburg ______________________________________________________________________ Y2000: 33 days left Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/ From d@pobox.com Sun Nov 28 23:02:17 1999 From: d@pobox.com (David Arnold) Date: Mon, 29 Nov 1999 09:02:17 +1000 Subject: [Doc-SIG] On David Ascher's Rant In-Reply-To: Your message of "Fri, 26 Nov 1999 21:13:31 PST." 
Message-ID: <199911282302.JAA13867@piglet.dstc.edu.au> -->"David" == David Ascher writes: David> IMHO, the single most problematic aspect of Python David> documentation is the lack of a standard way for programmers David> to document their code inside the .py file, unlike e.g. POD., David> and that is a shame. agreed. David> There is at least one proposal to index in-code Python David> docstrings with TeX-like commands. In my opinion, anything David> that full of backslashes and braces will never fly in the David> Python community. are you refering to gendoc/settext from a few years ago? David> Straw Proposal 0.1 [da]: some feedback: - i believe that the use of "special" string variables is more immediately useful, and maybe more "pythonic", than XML in a docstring for module-level stuff like this. eg. __author__ = "David Ascher" __version__ = "$Revision$[11:-2] __date__ = "$Date$ - my personal preference would be for the RFC 822-style "tag: value" format over XML. it's about equal for parsing, but significantly better for humans to read - similarly, i think i'd prefer a simple, punctuation character-based markup over XML for method/function comments. i really find XML annoying to read. David> That said, what I really care most about is a final decision, David> not the specific markup used. agreed. we've been dithering for years, with software being developed to support various proposals, but never really being given the Guido Stamp Of Approval(tm). David> PS: I'll pay for a new melted-wax seal if Guido lost the old David> one. =) i'll chip in a coupla bucks too ;-) d From da@ski.org Mon Nov 29 00:48:54 1999 From: da@ski.org (David Ascher) Date: Sun, 28 Nov 1999 16:48:54 -0800 (Pacific Standard Time) Subject: [Doc-SIG] Re: Docstrings [was: On David Ascher's Rant] In-Reply-To: <199911282302.JAA13867@piglet.dstc.edu.au> Message-ID: (Sorry, but I felt the need to change the Subject line. 
My wife looked at my inbox and said "you're ranting?") On Mon, 29 Nov 1999, David Arnold wrote: > David> There is at least one proposal to index in-code Python > David> docstrings with TeX-like commands. In my opinion, anything > David> that full of backslashes and braces will never fly in the > David> Python community. > > are you refering to gendoc/settext from a few years ago? No, I was referring to Manuel's proposal, which I obviously misunderstood. I don't recall the gendoc/settext proposal. > - i believe that the use of "special" string variables is more > immediately useful, and maybe more "pythonic", than XML in a > docstring for module-level stuff like this. > eg. > > __author__ = "David Ascher" > __version__ = "$Revision$[11:-2] > __date__ = "$Date$ > - my personal preference would be for the RFC 822-style "tag: value" > format over XML. it's about equal for parsing, but significantly > better for humans to read I think it's important to note that these will still be in strings, and that one should not confuse them with code. Some reactions to this weekend's posts: 1) Manuel's proposal is interesting, but IMHO much broader in scope than what I feel is needed and doable. Indexing ideas is a larger-than-encyclopedic endeavor, and I could but won't argue its impracticality on statistical grounds alone. Suffice it to say that I agree with Manuel that people are lazy and that it won't work. More positively, Manuel, do you agree that the kind of markup that I advocate (a la POD/javadoc) is a subset of yours (in other words that "Author" and "Argument 1" are 'trivial ideas', hence belong to the set of ideas? And that if this 'minimal' markup is used, then you can use the tags too along with the other, higher-level notions that you propose? 2) So far, folks seem to like a 'lightweight' structure for docstrings. (note that by lightweight I do not mean ambiguous or vague [*]). 
MAL wants to allow multiple formats as long as the person writing the
docstring writes a parser for his/her specific format. That's fine with
me as long as there is a format which we can assume is readable by
default without *having* to write such a parser.

Uche has mentioned that he 1) agrees that XML shouldn't be imposed on
Python Authors, and that 2) XML is easier to parse than StructuredText.
While I grant him both, I'd like his reaction to this specific point.

I would like to claim that we can define a format which is

 - easily learned
 - easily parsed and debugged
 - rich enough
 - extensible enough
 - pleasing to the eye

I will make a concrete proposal in a separate message titled "docstring
grammar".

3) The i18n issue is IMHO a red herring. We can allow a 'keyword
renaming' facility so that I can start a module with:

    import docstring
    docstring.set_language('Francais', 'Quebec')

and then I can use Quebecois keywords in the doc, as long as there is a
table mapping the default (US-English) keywords to the Quebecois
keywords. Would that be OK with you, Marc-Andre? After all, 'import'
is an English word, and no one complains about that. I once programmed
in a version of Basic where the keywords were translated into French, and
I can testify that it was a massive failure.

--david

[*]: I have found StructuredText as implemented in StructuredText.py to
be somewhat vague and non-trivial to use to produce exactly formatted
doc. This is probably due to its bigger aims than what I have in mind
for this.

From da@ski.org Mon Nov 29 00:57:03 1999
From: da@ski.org (David Ascher)
Date: Sun, 28 Nov 1999 16:57:03 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
Message-ID:

Proposed format for docstrings:

The whitespace at the beginning of a docstring is ignored.

Paragraphs are separated by one or more blank lines.

For compatibility with Guido, IDLE and Pythonwin (and increasing the
likelihood that the proposal will be accepted by GvR), the docstrings of
callables must follow the following convention established in Python's
builtins:

    >>> print len.__doc__
    len(object) -> integer

    Return the number of items of a sequence or mapping.

In other words, the first paragraph must fit on a line, repeat the name
of the callable, with a 'wordy' signature, the ' -> ' string, and the
type of the return value. The second paragraph must be a one-sentence
description of the callable.

It is also allowed to have those two bits separated by a " -- " string:

    >>> print [].pop.__doc__
    L.pop([index]) -> item -- remove and return item at index (default last)

and functions which don't return anything can omit the " -> foo" bit:

    L.append(object) -- append object to end

Each paragraph is either 'text' or a 'keyword-tagged block'. A keyword
is a case-sensitive element of [a-zA-Z_]+ followed by two colons (with
optional whitespace between the keyword and the colons, but no
whitespace allowed between the two colons). A paragraph which doesn't
start with a keyword is 'text'.

Characters between # signs and the end of the line are stripped by the
docstring parser.

A 'keyword-tagged block' is nested much like Python code. Just like in
Python, the block can either be on the same line as the keyword if it is
one-line long (I'll refer to such blocks as 'text' blocks even though
they aren't in visual paragraphs), or needs to be indented relative to
the keyword.

Examples:

    Author:: Guido van Rossum        # comments are stripped

    Date_of_release :: 1/1/1999      # The key is "Date_of_release" and the
                                     # whitespace before the : is stripped

    Contributors::                   # The value is a block of lines.
        John Doe
        Ronald Reagan
        Francois Mitterand

Some keywords can have special parsing rules, as the block of text which
the keyword designates is well-specified by the rules above.
The first example of such a keyword-specific parsing rule is for
Arguments:

    Arguments::
        self -- instance
        input (sequence) -- the sequence which is being processed

(the specific syntax of Arguments:: is left for a later discussion).

Other candidates which can impose specific parsing rules are:
ReturnType, Date, Version, etc.

Text blocks can be followed by indented blocks as well -- those are
'children' blocks of the outdented block.

'text' blocks which start with * or - are tagged as 'bullet items' for
rendering. The bullet marker has to be consistent within a given level
of indentation. Example:

    * this is one bullet
        - this is a sub-bullet
        - this is another sub-bullet
    * this is another bullet

In text blocks, some strings are recognized as links:

    .foo in the docstring of a class will refer to the foo attribute
    of the class. In the docstring of a method, it will refer to
    the foo attribute of the method's class. In the docstring of a
    module it will refer to a function or class defined in that module.

    foo.bar will refer to the bar attribute of foo, which will be
    looked up in the following namespaces in order: (to be determined)

    URL notation is automatically recognized.

    [foo] refers to the keyword 'foo' in the section 'References' of
    the current docstring. [..] links cannot span multiple lines
    or contain whitespaces (as keywords can't). (In other words, if a
    [ is not matched by a ] in the same line or before a whitespace
    character is hit, then it is a syntax error.)

    References::
        foo:: My Dissertation, University Press, 1902

The set of keywords which are 'officially sanctioned' is:

    For module docstrings:
        [see Trove discussion for a good starting set -- this
        discussion has been had!]
   For class docstrings:
      [To be determined]

   For method docstrings:
      [To be determined]

   For function docstrings:
      [To be determined]

Miscellaneous Thoughts:

   I chose double-colon notation for keywords so that one can have
   text paragraphs which match the 'word:' notation without having
   them be interpreted as keywords.

   Does this proposal make docstrings whitespace-heavy -- the
   requirement to break each paragraph with a line of whitespace
   means that a lot of lines are blank, especially when doing
   'bulleted lists'

   The above was (quickly) written with parsing in mind.  Is it
   really easily parseable?  If not, what needs to be changed so
   that it is parseable?  I also wanted to make sure that syntax
   errors could be flagged early and 'localized' for aid in
   debugging.  I'm not sure that I did that carefully enough.

   Are there normal uses in docstrings where one wants to turn off
   the automatic link detection?

   Is there value in having string interpolation?  David Arnold
   mentioned

      __version__ = "$Revision$[11:-2]
      __date__ = "$Date$

   which raises some issues.  I don't think that having [11:-2]
   evaluated by the docstring parser is a wise idea.  However, I can
   imagine that the module author could do:

      __version__ = "$Revision$"[11:-2]

   in the Python code, and then

      Version:: %(__version__)s

   in the docstring and that such a simple string interpolation
   mechanism could have value.  I'm not sure it's worth the
   complication though.  What dictionary would be used to do the
   interpolation?

Hopefully constructively,

--david

PS: It goes without saying that while I railed against design by
    committee, I am of course hopeful for feedback, for technical
    reasons (dummy, you forgot special cases X, Y and Z!) and because
    I realize that a standards proposal needs at least broad
    agreement if not consensus to be effective in the long run.  The
    sharper-eyed will note that I stacked the deck in my favor in the
    above proposal by including what Guido does naturally as valid in
    the proposed grammar.
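
The first-line convention in the proposal above (the callable with a
'wordy' signature, an optional ' -> ' return type, an optional ' -- '
summary) could be checked mechanically.  What follows is only a sketch
in present-day Python 3, not anything posted to the list: the pattern
FIRST_LINE_RE and the helper parse_first_line are hypothetical names,
and the regular expression is one guess at what "wordy signature" would
have to allow.

```python
import re

# Hypothetical pattern for the proposed first line of a callable's docstring:
#   name(signature) [-> return-type] [-- one-line summary]
FIRST_LINE_RE = re.compile(
    r'^(?P<call>[\w.]+\(.*?\))'        # e.g. L.pop([index])
    r'(?:\s+->\s+(?P<returns>.+?))?'   # optional " -> item"
    r'(?:\s+--\s+(?P<summary>.+))?$'   # optional " -- remove and return ..."
)


def parse_first_line(line):
    """Split a first docstring line into its parts.

    Returns a dict with the keys 'call', 'returns' and 'summary'
    (missing parts are None), or None if the line does not follow
    the convention at all.
    """
    match = FIRST_LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    for first_line in (
        "len(object) -> integer",
        "L.pop([index]) -> item -- remove and return item at index (default last)",
        "L.append(object) -- append object to end",
        "Return the number of items of a sequence or mapping.",
    ):
        print(repr(first_line), "=>", parse_first_line(first_line))
```

A doc tool could run such a check on every callable and report the
ones whose first line does not parse, which would give the early,
'localized' error reporting the proposal asks for.
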
From jack@oratrix.nl  Mon Nov 29 09:41:38 1999
From: jack@oratrix.nl (Jack Jansen)
Date: Mon, 29 Nov 1999 10:41:38 +0100
Subject: [Doc-SIG] docstring grammar
In-Reply-To: Message by David Ascher ,
	     Sun, 28 Nov 1999 16:57:03 -0800 (Pacific Standard Time) ,
Message-ID: <19991129094139.05ADF370CF2@snelboot.oratrix.nl>

Very nice proposal!

> I chose double-colon notation for keywords so that one can have text
> paragraphs which match the 'word:' notation without having them be
> interpreted as keywords.

If you can get rid of this, and use single colon instead, I would be
100% happy.  As most of the keywords are fixed (only in the References
section could I find user-defined keywords) this should be doable.  And
it would make the document that little bit more readable.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm



From tony@lsl.co.uk  Mon Nov 29 09:50:49 1999
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Mon, 29 Nov 1999 09:50:49 -0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To:
Message-ID: <000701bf3a4f$375a85d0$f0c809c0@lslp7o.lsl.co.uk>

I would *love* to see a standard for doc strings, and although I've often
objected to specific proposals in the past, by now I'd take almost anything.
Well, no, that's NEVER true, but David's proposal doesn't cause *too* many
knee-jerk reactions...

David Ascher wrote:
> Paragraphs are separated by one or more blank lines.

As you say later on, I think this does cause some over-use of whitespace...

> Characters between # signs and the end of the line are stripped by
> the docstring parser.

This is a Bad Thing - I have quite often needed to discuss things in doc
strings which include use of the "#" character - not least if I'm parsing a
little language that uses "#" as its comment character! So losing stuff thus
would be difficult.
Either (a) why do we need comments in doc strings, or (b)
provide a way to escape the "#" character.

(Also, if one were using Tim Peters' "test using the doc string as template"
thingy, one needs to be able to put generic Python code in the doc strings,
and that means that stopping comment characters from going through to the
ultimate documentation may be a bad thing.)

> A 'keyword-tagged block' is nested much like Python code.  Just like
> in Python, the block can either be on the same line as the keyword
> if it is one-line long

I *like* this.

>    Contributors::               # The value is a block of lines.
>
>        John Doe
>
>        Ronald Reagan
>
>        Francois Mitterand

but the above gets oververbose. I suppose one could instead use a list
syntax:

        Contributors::
                - John Doe
                - Ronald Reagan
                - Francois Mitterand

since I don't see the ambiguity in allowing the omission of the vertical
whitespace here, *if* one allows that some care would be needed with
hyphenation! (i.e., one can't allow one's hyphens to start a line, which is
awkward but probably not too bad).

Another possibility might be to allow "Python list" syntax - I started off
disliking this, but over the last few minutes it has grown on me:

        Contributors:: [ John Doe, Ronald Reagan, Francois Mitterand ]

(again, hijacking Python's syntax).

> Text blocks can be followed by indented blocks as well -- those are
> 'children' blocks of the outdented block.

And this solves my "I want a list item to have multiple paragraphs" problem,
which has been a bugbear of mine in the past with other proposals... The
exact indentation of a second paragraph in a list item (whether aligned with
the bullet or the text) would need addressing later, but I don't much care
(provided it is with the text, of course).

> 'text' blocks which start with * or - are tagged as 'bullet items'
> for rendering.  The bullet marker has to be consistent within a
> given level of indentation.
>
> Example:
>
>    * this is one bullet
>
>      - this is a sub-bullet
>
>      - this is another sub-bullet
>
>    * this is another bullet

Again, sometimes I'd like to allow the blank lines to be missing.

Another way to do this is to have a "special" character to introduce the
bullet items - so maybe instead:

        Example:
                @* this is one bullet
                        @- this is a sub-bullet

but that's horrible in its own way - maybe the white space is just what we
have to live with (I certainly WOULD live with it if it was the only thing
standing in the way of adopting the proposal!).

No, on thinking about it, I would vote for either:

        1) use of white space as David proposes
           (pro: utter simplicity,
            con: doesn't quite look as nice as I'd like)
        2) allow Python list syntax
           (pro: emphasises this is for short lists,
            con: a bit odd)
        3) detect bullet characters at the "start of line"
           (pro: still fairly simple,
            con: one has to take care about, e.g., dashes in text)
           Ah - I just realised that negative numbers at the start of a line
           probably kill that one...

Could we do numbered/lettered/named lists by, for instance:

        *1      This list item is numbered, and one expects all items at
                this indentation in this list to be numbered
                -a      Ditto for "lettered" items in this list
                        @fred   And this sub-list has item names
        -2      This may well get flagged as a mistake
        *B      Unless we're allowing the author to do odd things if they
                like...

(is that simple enough?)

> Is there value in having string interpolation?  David Arnold mentioned
>
>      __version__ = "$Revision$[11:-2]
>      __date__ = "$Date$

There's also a semi-convention I've seen where a module's doc string is also
used as its documentation for Unix commands, and one substitutes in
sys.argv[0] - i.e., the command used to invoke the script - as a string into
the "Usage:" line. It's a rather hacky trick, and perhaps not to worry about
too much.
> The sharper-eyed will note that I stacked the
> deck in my favor in the above proposal by including what Guido does
> naturally as valid in the proposed grammar.

Yea, go for it!

desperately hoping this will get off the ground, but with no time to do
anything more than comment on it, Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.demon.co.uk/
2 wheels good + 2 wheels good = 4 wheels good?
3 wheels good + 2 wheels good = 5 wheels better?
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)



From Edward Welbourne   Mon Nov 29 11:42:48 1999
From: Edward Welbourne  (Edward Welbourne)
Date: Mon, 29 Nov 1999 11:42:48 +0000
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <383FD11E.D8ADE8D0@lemburg.com>
References: <383FD11E.D8ADE8D0@lemburg.com>
Message-ID:

>> 1) ... moving from LaTeX ... is a great idea ... Fred is doing a
>> beautiful job ... thankless task.
> I second that.
Yup, one problem with silent majorities is a failure to say Thank You.

> I don't think people will start to write TeX in their docstrings...
Nope.  Even those of us with fond memories of it.

> IMHO, a clean plain text approach ...
and in my downright intemperate and opinionated arrogance there can be
no decent documentation format *but* plain text.  Everything else just
leads to mess, confusion and demands for `extensions' to support things
that ... oh sod it, you're all smart enough to understand the analogy:
the road to Hell is paved with good intentions.

> BTW, in case someone cares
Yay, another doc format ;*}
It contributes to the pool of fragments that'll go into the final recipe
and I bet Marc-Andre can write a trivial tool that parses *his* format
into whatever we settle on, so won't mind a bit if it doesn't look like
it.  Each of us can handle our own transitions just as soon as we agree
on a common target ...

> ... they first try to prove NP-completeness rather than ...
no, it's worse than that - we try to work out what would be needed for
Turing completeness while trying to keep it straightforward but are so
busy thinking about whether we can prove NP-complete that we end up
digressing indefinitely.

> Are you serious about the above ???
Of course he was - it was a perfectly serious straw man (and so well
designed for knocking down that you'd done it before you needed to).
He's showing you how bad it would all look if we did things that way.
I actually *like* HTML and used it in my doc strings for a while, but it
just looked wrong and ugly and it was cumbersome and just plain *not*
the right answer.

Why do folk have to say so much during the weekend when I'm not looking ?
Eventually I'll catch up with this proposal of David's that Tony says is
further down the list ...

Oh, and for the record, I refuse to accept David's apology for the rant.
I can't accept an apology for something and
  a) be grateful for
  b) admire
it at the same time,

	Eddy.



From mal@lemburg.com  Mon Nov 29 12:06:34 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Nov 1999 13:06:34 +0100
Subject: [Doc-SIG] docstring grammar
References:
Message-ID: <38426C4A.ACC76AB5@lemburg.com>

David Ascher wrote:
>
> Proposed format for docstrings:
> ...
> Is there value in having string interpolation?  David Arnold mentioned
>
>      __version__ = "$Revision$[11:-2]
>      __date__ = "$Date$
>
> which raises some issues.  I don't think that having [11:-2]
> evaluated by the docstring parser is a wise idea.  However, I can
> imagine that the module author could do:
>
>      __version__ = "$Revision$"[11:-2]
>
> in the Python code, and then
>
>      Version:: %(__version__)s
>
> in the docstring and that such a simple string interpolation
> mechanism could have value.  I'm not sure it's worth the
> complication though.  What dictionary would be used to do the
> interpolation?

This raises the question of whether to parse or evaluate the
loaded module.
Evaluation has the benefit of providing "automatic"
context, i.e. the symbols defined in the global namespace
are exactly the ones relevant for class definitions, etc. It
probably makes construction of interdependence graphs a lot easier
to write. On the downside you have unwanted side effects due to
loading different modules.

Some notes on the proposal:

· Mentioning the function/method signature is ok, but sometimes
  not needed since e.g. the byte code has enough information to
  deduce the signature from it. This is not true for builtin
  function which is probably the reason for all builtin doc
  strings to include the signature.

· I would extend the reference scheme to a lookup in the module
  globals in case the local one (in the Reference section) fails.
  You could then write e.g. "For details see the [string] module."
  and the doc tool would then generate some hyperlink to the string
  module provided the string module is loaded into the global
  namespace.

· Standard symbols like __version__ could be included and used
  by the doc tool per default without the user specifying
  any special "Version:: %(__version__)s" % globals() tags.

BTW, for some code which does online formatting of the doc strings,
have a look at my hack.py script. It includes a function called
docs() which prints out all the information it can find on the
given target object. Here's an example:

>>> docs(string.upper)
upper :
    upper(s) -> string

    Return a copy of the string s converted to uppercase.

>>> docs(string.zfill)
zfill(x, width) :
    zfill(x, width) -> string

    Pad a numeric string x with zeros on the left, to fill a field
    of the specified width.  The string x is never truncated.
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    32 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From Edward Welbourne   Mon Nov 29 12:48:16 1999
From: Edward Welbourne  (Edward Welbourne)
Date: Mon, 29 Nov 1999 12:48:16 +0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <38426C4A.ACC76AB5@lemburg.com>
References: <38426C4A.ACC76AB5@lemburg.com>
Message-ID:

MAL said:
· I would extend the reference scheme to a lookup in the module
  globals in case the local one (in the Reference section) fails.
  You could then write e.g. "For details see the [string] module."
  and the doc tool would then generate some hyperlink to the string
  module provided the string module is loaded into the global
  namespace.

We have, it occurs to me, another important namespace: unimported
modules.  Thus the string module doesn't import re, I assume, but may
wish to refer to it (e.g. to say `this function is a cheap variant of
the eponymous one in re') in its doc-strings.

Fortunately, we also have a handy name to hang this namespace off (which
can't coincide with a name in either of our namespaces): import.  Thus:
`this function is a cheap variant of import.re.search' could be sensible
in doc strings.  Note, however, that some bypassing of this may be
achieved using the [blah] notation (which is good).

I have a problem with too much vertical white space, but I believe the
perturbations Tibs suggested (and which match what's in gendoc /
pythondoc - if my memory isn't disserving me again - so must be
feasible) suffice to deal with that.  I can make my editor window more
than a hundred columns wide if I want, and know that code lines jutting
past that are too long; but I still only get 55 lines in sight at the
same time, and real code often involves wanting to see more than that.
This situation gets badly exacerbated by being obliged to throw
gratuitous blank lines (though not as much as by my tendency to
verbosity).  But, like Tibs, I can live with the vspace if I must.

What happened to gendoc / pythondoc ?

	Eddy.



From mhammond@skippinet.com.au  Mon Nov 29 12:57:41 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Mon, 29 Nov 1999 23:57:41 +1100
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To:
Message-ID: <004101bf3a69$535a09d0$0501a8c0@bobcat>

> >> 1) ... moving from LaTeX ... is a great idea ... Fred is doing a
> >> beautiful job ... thankless task.
> > I second that.
> Yup, one problem with silent majorities is a failure to say Thank You.

Me too - thanks Fred!  The doc is excellent and a thankless task!

Mark.



From Manuel Gutierrez Algaba   Mon Nov 29 16:30:00 1999
From: Manuel Gutierrez Algaba  (Manuel Gutierrez Algaba)
Date: Mon, 29 Nov 1999 16:30:00 +0000 (GMT)
Subject: [Doc-SIG] Lisp oriented docstrings (Manolo's strikes back. )
Message-ID:

Pity, pity and pity that you don't like the "python-encyclopedia"
idea, anyway...

\indexauthor is a single idea, unattributed, so David is not the
same that :
"Author: Walt Disney"

But anyway, If you really want to mess with lowlevel stuff and
that still is interfering for higher things then I'll propose
the definitive answer to this nasty stuff of Authors,params and
so on and so on.

The problem seems how to establish the low level stuff: spaces,
syntax, colons,... Really nasty in my opinion.

Python is not java, and because of that we have the glorious
list and the glorious eval. Let's use them!

def function_A( list_A, list_B ):
	"""
	author('Guido van Rossum').date('1/1/1999').\
	contributors(["John Doe"]).santi(['\indexpollo',
	'\indexrojo'], radius = 6).arg(1, []).arg(2,[])
	"""

Advantages of this approach:
- Needs no parsing! Just eval(function_A.__doc__)
- No low level details
- Automatical i18n
- You can use default arguments ( author ) for all the functions
  of a module.
- If you know python you know how to write the docs
- All the flexibility and power of python
- Nice syntax ( if you like python, of course !)
- Absolute extensibility and freedom of use ( you can make default
  what you want, and omit what you want ).

author, date .... will return an object, let's name it : Bandurria.
The Bandurria objects get more and more attributions and it can
be affected by global switches when generating the final doc.

def author(name):
	b = Bandurria()
	b.author(name)
	return b

the same with date ...

class Bandurria:
	def author(self, name):
		self.name = name
		return self
	the same with date ...

I hope you understand it the first time, if so, let's approve it
and let's face the real interesting thing:
SantisimaInquisicion

Yes, MA Lemburg, Jaja is Spanish and Jiji too, They're the
sounds of the laughter!

But , that's low level stuff, not interesting at all! The
eagle seems small when flying high, but in fact when it stands
in the ground is a really big animal.

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
/SantisimaInquisicion/index.html

If something has not yet gone wrong then it would ultimately have
been beneficial for it to go wrong.



From Manuel Gutierrez Algaba   Mon Nov 29 16:30:13 1999
From: Manuel Gutierrez Algaba  (Manuel Gutierrez Algaba)
Date: Mon, 29 Nov 1999 16:30:13 +0000 (GMT)
Subject: [Doc-SIG] Re: Docstrings [was: On David Ascher's Rant]
In-Reply-To:
Message-ID:

On Sun, 28 Nov 1999, David Ascher wrote:

> Some reactions to this weekend's posts:
>
> 1) Manuel's proposal is interesting, but IMHO much broader in scope than
>    what I feel is needed and doable.  Indexing ideas is a
>    larger-than-encyclopedic endeavor, and I could but won't argue its
>    impracticality on statistical grounds alone.  Suffice it to say that I
>    agree with Manuel that people are lazy and that it won't work.

But, if we can do it, then that'll be great!!!
>    More positively, Manuel, do you agree that the kind of markup that I
>    advocate (a la POD/javadoc) is a subset of yours (in other words that
>    "Author" and "Argument 1" are 'trivial ideas', hence belong to the set
>    of ideas?  And that if this 'minimal' markup is used, then you can use
>    the tags too along with the other, higher-level notions that you
>    propose?

That's not my original idea, but if we can handle indexes with
attributions, simple ideas when attributions, then yes!

The simplest syntax is just indexes, then indexes with attributes,
then indexes with complex attributes, then XML-ish stuff

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
/SantisimaInquisicion/index.html

If something has not yet gone wrong then it would ultimately have
been beneficial for it to go wrong.



From Edward Welbourne   Mon Nov 29 17:46:32 1999
From: Edward Welbourne  (Edward Welbourne)
Date: Mon, 29 Nov 1999 17:46:32 +0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To:
References:
Message-ID:

Manuel: if David includes `Keyword::' in his bits and pieces, would

Keyword:: indexing keyword data retrieval searching

(within a doc-string) contain the information you've been wanting to
take out of your

\indexaboutindexing \indexaboutkeyword \indexretrieval \indexsearching

etc. (with apologies for not having followed your system well enough to
mimic the names you'd actually use) ?
What I've understood of your scheme appears to tell me the answer Yes.
If so, I guess you could just slurp the Keyword slice out of a
namespace-tree generated from doc-strings and, I suspect, happiness
would abound and confusion abate.

I know you have bits that define an indexing command that expands to
several indexing commands, which this lacks: but could the same effect
be arrived at by turning your set of indexing command definitions into
an `expert system' that expands some keywords ?

And ...
to folk who know about the state of the craft of indexing:
is there a better way to go with this ?  After all, I'm pretty much just
borrowing from one of HTML's META tags here ...

Now, back to the spec itself:

> For compatibility with Guido, IDLE ...
>        len(object) -> integer
i.e.
docstring-startline: archetypical-call [ '->' return ] ['--' summary ]

Quite apart from compatibility - this is a *good* approach.
I guess that could be why Guido does it ...

> Each paragraph is either 'text' or a 'keyword-tagged block'.
Sounds good.  Flesh and skeleton.

I'm with Tibs on the #-comment stuff - particularly the liberty to
simply embed a piece of python code in a doc string.

> A 'keyword-tagged block' is nested much like Python code.
Yes, thank you very much, beautiful - this will give us scope for nested
sub-structures in the keyword-tagged data: in particular, get rid of
that Date_of_release ... use

Author:: David Ascher
Release::
    Date:: 1999/11/28
    Name:: post-gendoc-0.1
    Stability:: draft

etc.

I was initially confused about : or :: because your examples began with
the first keyword I'd thought of, namely Example, and only used one :
with that one, going on to :: for the rest - then I noticed that you
weren't offering it as an example keyword but using it to introduce your
list of examples.  While I would far sooner have only one :, those of us
advocating this need to watch for the danger that the parser will get
similarly confused between the author's use of `Example:' in the manner
of English idiom and in its keyword sense (and, of course, it isn't the
only word to worry about).

(The flip-side is: I can see myself getting irritated by the need to say
Example:: as a keyword immediately after I've ended a paragraph with the
word example ...)

Note: this keyword representation is isomorphic to XML via `the usual'
equivalences between (pythonic) indentation-structuring and the
begin-end style of structuring that C and XML use.
  keyword: single-liner         -> single-liner
  keyword: indent block dedent  -> block
        (possibly transformed down a bit itself)

> Some keywords can have special parsing rules,
coo, context-sensitive parsing ;^)
Good idea.  Lets some things only be keywords where they need to be ...

> The above was (quickly) written with parsing in mind.  Is it really
> easily parseable?  If not, what needs to be changed so that it is
> parseable?
Well, the bulleting (and descriptive list stuff) has been explored
already in pythondoc / gendoc, so clearly it's all `within scope'.
Heh.  And between David and Tibs, surely we have the parsing technology ...

On the subject of vertical space ... I'd guess the parser won't need a
blank line between
  * the end of a paragraph and

  * the start of its first indented subordinate ?

Though, indeed, I do want to take out the other blank line here, and I
thought gendoc managed that ...

> Is there value in having string interpolation?
Yes.  Definitely.  I hadn't realised it was possible until you mentioned
it, now I'm sure it's Needed.

> Hopefully constructively,
having had some time to think on it, I'd say Thoroughly so.

  Hierarchical namespaces,
  Context-sensitive parsing,
  Mappable to XML but written like python,
  Scope for indexing, and for arbitrary extension within sub-namespaces,
  Conformance to the only important standard (Guido's de facto habits ;^)
  Proposed by someone who knows how to write parsers ...
  No need for the run-time system to bother with any of it
    (all hidden inside the doc string)

Thank you David,

	Eddy.
--
PS - David: you do realise, though, that the committee won't keep up the
momentum on this unless you ruthlessly play Gdo until he joins in ...



From da@ski.org  Mon Nov 29 17:51:59 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 09:51:59 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To:
Message-ID:

On Mon, 29 Nov 1999, Edward Welbourne wrote:

> I'm with Tibs on the #-comment stuff - particularly the liberty to
> simply embed a piece of python code in a doc string.

Agreed.  I am removing that bit about ignoring #'ed text from my proposal.

> I was initially confused about : or :: because your examples began with
> the first keyword I'd thought of, namely Example, and only used one :
> with that one, going on to :: for the rest - then I noticed that you
> weren't offering it as an example keyword but using it to introduce your
> list of examples.  While I would far sooner have only one :, those of us
> advocating this need to watch for the danger that the parser will get
> similarly confused between the author's use of `Example:' in the manner
> of English idiom and in its keyword sense.

After a little thought, I'm tempted to remove the :: requirement as well.
In my proposal, I think that using the : after Example was a mistake in
style.  If it was a heading then it should just be text w/o a colon. If it
was supposed to be more of a sentence then it should have been spelled
out, as in:

   For example, we can have:

The *intent* was, however, to avoid the 'danger' you note above.  I'm
still open to go either way, "safe" or "comfortable".

I forgot two markups: *this* is bold and _this_ is italic.  Bold and
italic markups must begin and end within a paragraph (I'd say 'within a
sentence' but I don't want to complicate the parser with a sentence type).
No space allowed between *'s and _'s and their contents.

> On the subject of vertical space ... I'd guess the parser won't need a
> blank line between
>   * the end of a paragraph and
>
>   * the start of its first indented subordinate ?
>
> Though, indeed, I do want to take out the other blank line here, and I
> thought gendoc managed that ...

By all means, we should borrow from gendoc if it's already solved those
issues.  I admit not to having looked deeply into gendoc.  I'll look into
this some more a bit later.

>   Proposed by someone who knows how to write parsers ...

Uh?  Me?  No way. You must be confusing me with someone else!

--david



From da@ski.org  Mon Nov 29 18:31:05 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 10:31:05 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <000701bf3a4f$375a85d0$f0c809c0@lslp7o.lsl.co.uk>
Message-ID:

On Mon, 29 Nov 1999, Tony J Ibbs (Tibs) wrote:

> > Characters between # signs and the end of the line are stripped by
> > the docstring parser.
>
> This is a Bad Thing - I have quite often needed to discuss things in doc

As I mentioned in another email, yes, you're right.

> (Also, if one were using Tim Peters' "test using the doc string as template"
> thingy, one needs to be able to put generic Python code in the doc strings,
> and that means that stopping comment characters from going through to the
> ultimate documentation may be a bad thing.)

This raises a deeper issue: introducing Python code in a docstring.  Such
text cannot be parsed like text because linebreaks, indentation etc. are
important.  Here's one idea which I like -- introduce a new keyword which
is the equivalent of HTML's
<PRE> tag:

  Code:

    def foo(): ...
       return ...

In other words, Python code is just another kind of text, but the
processing rules applied to that block are different. The only restriction
is that the text in a Code: block *cannot* be outdented more than the
first line in the block.  The rendering in HTML would omit the label
"Code:" and instead change font to the monospace font or whatnot.

One related comment:  multiple instances of a given keyword can occur
within a docstring.
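
The Code: rule above (keep the text verbatim, and end the block at the
first line that is outdented past the block's first line) might look
something like the following.  This is a minimal sketch in present-day
Python 3, not part of any posted implementation: extract_code_block,
indent_of and the list-of-lines input are hypothetical, and tabs are
deliberately not handled.

```python
def indent_of(line):
    """Number of leading spaces (tabs are not handled in this sketch)."""
    return len(line) - len(line.lstrip(" "))


def extract_code_block(lines, start):
    """Collect the verbatim body of a 'Code:' keyword found at lines[start].

    The body begins at the first non-blank line after the keyword and ends
    at the first non-blank line indented less than that first line.
    Returns (body_lines, index_of_first_line_after_block).
    """
    keyword_indent = indent_of(lines[start])
    base = None            # indentation of the first line of code
    body = []
    i = start + 1
    while i < len(lines):
        line = lines[i]
        if not line.strip():
            body.append("")        # blank lines inside code are kept
            i += 1
            continue
        indent = indent_of(line)
        if base is None:
            if indent <= keyword_indent:
                break              # nothing indented under the keyword
            base = indent
        elif indent < base:
            break                  # outdented: the block is over
        body.append(line[base:])   # strip only the common indentation
        i += 1
    # Blank lines at either end are layout, not code.
    while body and body[0] == "":
        body.pop(0)
    while body and body[-1] == "":
        body.pop()
    return body, i


if __name__ == "__main__":
    docstring_lines = [
        "  Code:",
        "",
        "    def foo(): ...",
        "       return ...",
        "",
        "One related comment",
    ]
    code, next_index = extract_code_block(docstring_lines, 0)
    print("\n".join(code))
    print("block ended before line", next_index)
```

Stripping only the indentation of the first code line keeps the relative
indentation of the rest intact, which is the whole point of treating the
block differently from ordinary text.
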

> [... on the issue of how to 'shorten' lists... ]
>
> No, on thinking about it, I would vote for either:
> 
> 	1) use of white space as David proposes
> 	   (pro: utter simplicity,
> 	    con: doesn't quite look as nice as I'd like)
> 	2) allow Python list syntax
> 	   (pro: emphasises this is for short lists,
> 	    con: a bit odd)
> 	3) detect bullet characters at the "start of line"
> 	   (pro: still fairly simple,
> 	    con: one has to take care about, e.g., dashes in text)
> 	   Ah - I just realised that negative numbers at the start of a line
> 	   probably kill that one...

How about another keyword?

  List:
     * foo
     * bar
     * spam

Again, such keywords would not be rendered in 'output formats' (HTML, PS,
etc.).

> There's also a semi-convention I've seen where a module's doc string is also
> used as its documentation for Unix commands, and one substitutes in
> sys.argv[0] - i.e., the command used to invoke the script - as a string into
> the "Usage:" line. It's a rather hacky trick, and perhaps not to worry about
> too much.

I'd rather leave that to the coder who does the if __name__ == '__main__'
code.  sys.argv is a runtime-built construct, and I think docstrings
should be dependent on compile-time information only.

--david



From mal@lemburg.com  Mon Nov 29 18:29:08 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Nov 1999 19:29:08 +0100
Subject: [Doc-SIG] docstring grammar
References: 
Message-ID: <3842C5F4.91B2C4BB@lemburg.com>

David Ascher wrote:
> 
> > I was initially confused about : or :: because your examples began with
> > the first keyword I'd thought of, namely Example, and only used one :
> > with that one, going on to :: for the rest - then I noticed that you
> > weren't offering it as an example keyword but using it to introduce your
> > list of examples.  While I would far sooner have only one :, those of us
> > advocating this need to watch for the danger that the parser will get
> > similarly confused between the author's use of `Example:' in the manner
> > of English idiom and in its keyword sense.
> 
> After a little thought, I'm tempted to remove the :: requirement as well.
> In my proposal, I think that using the : after Example was a mistake in
> style.  If it was a heading then it should just be text w/o a colon. If it
> was supposed to be more of a sentence then it should have been spelled
> out, as in:
> 
>    For example, we can have:
> 
> The *intent* was, however, to avoid the 'danger' you note above.  I'm
> still open to go either way, "safe" or "comfortable".

I'd suggest using '^ *[a-zA-Z_]+[a-zA-Z_0-9]*: *' as RE for
keywords, i.e. keywords are Python identifiers immediately followed
by a colon starting a line of a doc string. That should avoid
most complications, I guess.

	For example: blablablba
and
	...long sentence..., for
	example :

would not be parsed as keywords, while

	Example: a=1;b=2

does fit the above definition (I don't see a problem with including
examples in the parsed sections, BTW... examples are often much
more intuitive to understand than complex definitions).
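
The three cases above can be tried directly against the suggested
expression.  A small sketch, assuming present-day Python 3 and its re
module; the helper name keyword_of is hypothetical, and only the
pattern itself is taken from the message.

```python
import re

# The expression exactly as suggested in the message above.
KEYWORD_RE = re.compile(r'^ *[a-zA-Z_]+[a-zA-Z_0-9]*: *')


def keyword_of(line):
    """Return the keyword that starts `line`, or None if there is none."""
    match = KEYWORD_RE.match(line)
    if match is None:
        return None
    # The match covers optional leading spaces, the name, the colon and
    # any spaces after it; keep just the name.
    return match.group(0).strip().rstrip(":")


if __name__ == "__main__":
    for line in (
        "For example: blablablba",   # two words before the colon
        "example :",                 # space between the word and the colon
        "Example: a=1;b=2",          # identifier immediately followed by ':'
        "\tExample: a=1;b=2",        # same line, but indented with a tab
    ):
        print(repr(line), "->", keyword_of(line))
```

One thing the experiment shows: as written, the expression allows only
spaces before the identifier, so a keyword line indented with a tab
does not count as a keyword.
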

Something else:

How would the following be handled:

Arguments: file -- a file like object
	   mode -- file mode indicator as defined in [__builtin__.open]
Arguments: buffersize -- optional buffer size in bytes

that is, what happens if a keyword appears twice ? In the above
case an error should be raised, but sometimes this may be
useful:

Example:
	first multi-line example

Example:
	second multi-line example

Hmm, perhaps these two examples should be wrapped using bullets:

Examples:
	- first example spanning multiple lines
	- second example

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    32 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From da@ski.org  Mon Nov 29 18:48:45 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 10:48:45 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <38426C4A.ACC76AB5@lemburg.com>
Message-ID: 

On Mon, 29 Nov 1999, M.-A. Lemburg wrote:

> This raises the question of whether to parse or evaluate the
> loaded module. Evaluation has the benefit of providing "automatic"
> context, i.e. the symbols defined in the global namespace
> are exactly the ones relevant for class definitions, etc. It
> probably makes construction of interdependence graphs a lot easier
> to write. On the downside you have unwanted side effects due to
> loading different modules.

Good point. Too many modules "do things" on import, some exceedingly
expensive. I have written modules where the import never ends, by design
=).  I'm afraid that parsing is all we can do safely with the Python code.
That does make interpolation much more delicate.  Maybe we can do
everything but string interpolation w/ parsing, and then defer string
interpolation until and if the module can be evaluated safely.  Somehow
we'd need to indicate to the docstring processor whether that evaluation
is safe or not.
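
To make the parse-only option concrete, here is a rough sketch that
collects docstrings without executing the module. It assumes the standard
library ast module of current Python, which did not exist when this was
written; the function name docstrings_without_import is made up for
illustration, and string interpolation is not attempted at all.

```python
import ast


def docstrings_without_import(source):
    """Map dotted names to docstrings by parsing the source, never running it."""
    tree = ast.parse(source)
    found = {"<module>": ast.get_docstring(tree)}

    def visit(node, prefix):
        # Only look at definitions that are direct children of this node.
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                name = prefix + child.name
                found[name] = ast.get_docstring(child)
                visit(child, name + ".")

    visit(tree, "")
    return found


# A module whose import would never end; parsing it is harmless.
SOURCE = '''"""Module doc."""

class A:
    """Class doc."""
    def m(self):
        """Method doc."""

def f():
    pass

while True:
    pass
'''

docs = docstrings_without_import(SOURCE)
```

Definitions nested inside if or try blocks are missed by this sketch, and
nothing here can evaluate names such as __version__, which is exactly the
part that would have to wait until evaluation is known to be safe.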

> Some notes on the proposal:
>
> · Mentioning the function/method signature is ok, but sometimes
>   not needed since e.g. the byte code has enough information to
>   deduce the signature from it. This is not true for builtin
>   function which is probably the reason for all builtin doc
>   strings to include the signature.

Right.  It's not true for builtins, extension module functions, and I'm
not sure how easy it is for JPython code.  I have no problem with somehow
making it easy to omit those in cases where the information can be
obtained through the bytecode.

> · I would extend the reference scheme to a lookup in the module
>   globals in case the local one (in the Reference section) fails.
>   You could then write e.g. "For details see the [string] module."
>   and the doc tool would then generate some hyperlink to the
>   string module provided the string module is loaded into the
>   global namespace.

Sounds good to me!

> · Standard symbols like __version__ could be included and used
>   by the doc tool per default without the user specifying
>   any special "Version:: %(__version__)s" % globals() tags.

Fine.  I think that falls somewhat outside of the 'docstring' proposal,
but I agree with it.

--david

PS: Marc-Andre, how do you get these nice bullet characters in your
    emails? What character is that? =)



From da@ski.org  Mon Nov 29 18:54:47 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 10:54:47 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <3842C5F4.91B2C4BB@lemburg.com>
Message-ID: 

On Mon, 29 Nov 1999, M.-A. Lemburg wrote:

> How would the following be handled:
> 
> Arguments: file -- a file like object
> 	   mode -- file mode indicator as defined in [__builtin__.open]

That, btw, is illegal -- the block must either be a single-line block or
an indented block.

> Arguments: buffersize -- optional buffer size in bytes
> 
> that is, what happens if a keyword appears twice ? In the above
> case an error should be raised, but sometimes this may be
> useful:

Agreed -- I made a similar point in another email which waved 'hi!' to
yours as they crossed somewhere over the Atlantic. =)

> Example:
> 	first multi-line example
> 
> Example:
> 	second multi-line example
> 
> Hmm, perhaps these two examples should be wrapped using bullets:
> 
> Examples:
> 	- first example spanning multiple lines
> 	- second example

Depends on the case.  In a long docstring, one might want to have several
sections, each with Examples: subsections.

I propose that part of the definition of a keyword is (along with any
special parsing rules) whether it can be duplicated in a docstring.
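
A rough sketch of what that could look like, purely as an illustration.
The registry REPEATABLE, its contents and the function name are all
hypothetical, and treating unknown keywords as non-repeatable is an
assumption of this sketch, not something agreed on the list.

```python
import re

# Keyword RE as suggested by Marc-Andre, with a group around the identifier.
KEYWORD_RE = re.compile(r'^ *([a-zA-Z_]+[a-zA-Z_0-9]*): *')

# Hypothetical registry: keyword -> may it appear more than once?
REPEATABLE = {"Arguments": False, "Example": True}


def duplicate_keyword_errors(docstring):
    """Return the keywords that are repeated although they must be unique."""
    seen = set()
    errors = []
    for line in docstring.split("\n"):
        match = KEYWORD_RE.match(line)
        if match is None:
            continue
        keyword = match.group(1)
        # Unknown keywords are treated as non-repeatable in this sketch.
        if keyword in seen and not REPEATABLE.get(keyword, False):
            errors.append(keyword)
        seen.add(keyword)
    return errors


DOC = """Arguments: file -- a file like object
Arguments: buffersize -- optional buffer size in bytes
Example:
    first multi-line example

Example:
    second multi-line example
"""

errors = duplicate_keyword_errors(DOC)  # ['Arguments']
```

This looks at the docstring as a flat list of lines; it makes no attempt
to distinguish sections or nesting levels.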

--david



From friedrich@pythonpros.com  Mon Nov 29 19:10:19 1999
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Mon, 29 Nov 1999 13:10:19 -0600
Subject: [Doc-SIG] docstring grammar
References: 
Message-ID: <002301bf3a9d$63542a80$f25728a1@UNITEDSPACEALLIANCE.COM>

Some people on this list should remember the development days of gendoc and
its cleaner successor pythondoc written by Dan Larsson (gosh I hope I'm not
the only one)! This thread rehashes much of what has already been discussed.
We pleaded back then for ideas/opinions/hacked code to help improve the
working code Dan wrote but got little response. I'm glad to see folks
thinking along these lines again. Please take a look at pythondoc and use it
as a starting point for a full featured documentation generator. It uses the
structured text approach for doc string parsing, and has options for either
parsing the source or importing the module to gather metadata (the latter is
necessary to document C modules).
-Robin Friedrich
See:
http://starship.python.net/crew/danilo/



From da@ski.org  Mon Nov 29 19:22:45 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 11:22:45 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <002301bf3a9d$63542a80$f25728a1@UNITEDSPACEALLIANCE.COM>
Message-ID: 

On Mon, 29 Nov 1999, Robin Friedrich wrote:

> Some people on this list should remember the development days of gendoc and
> its cleaner successor pythondoc written by Dan Larsson (gosh I hope I'm not
> the only one)! 

Yes, I remember it.  Thanks for the reminder and pointer, Robin!

> We pleaded back then for ideas/opinions/hacked code to help improve the
> working code Dan wrote but got little response. 

FWIW, I think that one problem gendoc/pythondoc had in terms of strategy
was that it was billed as a 'tool'.  I think that if we establish a
'blessed standard' then any standard-compliant tool has a guaranteed user
base, and has a far greater likelihood of long-term success.  Also, once
the format is documented, then folks who don't like gendoc or for whatever
reason want to do it 'their own way' can still do it in a compatible way.

I'll start digging in gendoc to see the differences between its format and
what I've been discussing.  I'd love to leverage it to build a reference
implementation.

Dan Larsson, are you reading this discussion?  We could use your
experience here!

--david




From friedrich@pythonpros.com  Mon Nov 29 19:44:23 1999
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Mon, 29 Nov 1999 13:44:23 -0600
Subject: [Doc-SIG] docstring grammar
References: 
Message-ID: <002d01bf3aa2$23e9c8a0$f25728a1@UNITEDSPACEALLIANCE.COM>

http://www.python.org/sigs/doc-sig/status.html

Contains an old summary of the formatting rules for Structured Text use in
doc strings.

Oddly Dan's subscription to this list is disabled, probably from an old
address. The latest address I have for him is Daniel.Larsson@telia.com





From mal@lemburg.com  Mon Nov 29 20:55:35 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Nov 1999 21:55:35 +0100
Subject: [Doc-SIG] docstring grammar
References: 
Message-ID: <3842E847.DE11400D@lemburg.com>

David Ascher wrote:
> 
> On Mon, 29 Nov 1999, M.-A. Lemburg wrote:
> 
> > This raises the question of whether to parse or evaluate the
> > loaded module. Evaluation has the benefit of providing "automatic"
> > context, i.e. the symbols defined in the global namespace
> > are exactly the ones relevant for class definitions, etc. It
> > probably makes construction of interdependence graphs a lot easier
> > to write. On the downside you have unwanted side effects due to
> > loading different modules.
> 
> Good point. Too many modules "do things" on import, some exceedingly
> expensive. I have written modules where the import never ends, by design
> =).  I'm afraid that parsing is all we can do safely with the Python code.
> That does make interpolation much more delicate.  Maybe we can do
> everything but string interpolation w/ parsing, and then defer string
> interpolation until and if the module can be evaluated safely.  Somehow
> we'd need to indicate to the docstring processor whether that evaluation
> is safe or not.

I think gendoc did this with a command line switch... well the
early versions did (I think under a different name though, or
perhaps the name is different now ?).
 
> > Some notes on the proposal:
> >
> > · Mentioning the function/method signature is ok, but sometimes
> >   not needed since e.g. the byte code has enough information to
> >   deduce the signature from it. This is not true for builtin
> >   function which is probably the reason for all builtin doc
> >   strings to include the signature.
> 
> Right.  It's not true for builtins, extension module functions, and I'm
> not sure how easy it is for JPython code.  I have no problem with somehow
> making it easy to omit those in cases where the information can be
> obtained through the bytecode.

There's code in hack.py for the extraction and also a more
generic module by Fredrik Lundh for building signature strings.

> > · I would extend the reference scheme to a lookup in the module
> >   globals in case the local one (in the Reference section) fails.
> >   You could then write e.g. "For details see the [string] module."
> >   and the doc tool would then generate some hyperlink to the
> >   string module provided the string module is loaded into the
> >   global namespace.
> 
> Sounds good to me!

Without too much parsing overhead this only works for
the evaluation technique though. Would be nice to have...
even if it doesn't work for some reason (the doc tool could
then just produce some different markup for the reference
string, e.g. put it in italics).
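
Roughly what I have in mind, as a sketch only. The function name, the HTML
output and the shape of the local reference table are assumptions made for
the example; the lookup order (local references first, then module
globals, then a fallback markup) is the part that matters.

```python
import re

# [name] or [module.name], as in "[string]" or "[__builtin__.open]".
REF_RE = re.compile(r'\[([A-Za-z_][A-Za-z_0-9.]*)\]')


def render_references(text, local_refs, module_globals):
    """Replace [name] references with links, or italics if unresolved."""

    def replace(match):
        name = match.group(1)
        # 1. The docstring's own Reference section wins.
        if name in local_refs:
            return '<a href="%s">%s</a>' % (local_refs[name], name)
        # 2. Otherwise fall back to the module's global namespace.
        if name.split(".")[0] in module_globals:
            return '<a href="#%s">%s</a>' % (name, name)
        # 3. Unresolved: keep the text, but mark it up differently.
        return "<i>%s</i>" % name

    return REF_RE.sub(replace, text)


html = render_references(
    "For details see the [string] module, or [nosuch].",
    {},
    {"string": None},  # stands in for the globals of an evaluated module
)
```

With a parse-only tool the third argument would have to be built from the
import statements found in the source instead of from real globals.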
 
> > · Standard symbols like __version__ could be included and used
> >   by the doc tool per default without the user specifying
> >   any special "Version:: %(__version__)s" % globals() tags.
> 
> Fine.  I think that falls somewhat outside of the 'docstring' proposal,
> but I agree with it.

True. It's something I've added to my hack.py formatting
functions and I thought it would be nice to have... (it also
encourages people to use __version__).
 
> --david
> 
> PS: Marc-Andre, how do you get these nice bullet characters in your
>     emails? What character is that? =)

It's chr(183) in Latin-1: the famous center dot ;-) I've tweaked
my keyboard setup to have it handy...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    32 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From da@ski.org  Mon Nov 29 23:28:28 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 15:28:28 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: 
Message-ID: 

On Mon, 29 Nov 1999, Edward Welbourne wrote:

> As I remember, gendoc used *emphasis* and **strong**, which does
> adequately (and may be in use by some of us), though I can see a case
> against the doubling.

Fine with me.

> to be honest, equating it to PRE I don't like; Code deserves to be a
> keyword which switches the context-sensitive parsing to expecting python
> code...

All good points, and fine with me.  

--david




From Daniel.Larsson@telia.com  Mon Nov 29 23:48:48 1999
From: Daniel.Larsson@telia.com (Daniel Larsson)
Date: Tue, 30 Nov 1999 00:48:48 +0100
Subject: [Doc-SIG] docstring grammar
References:  <002d01bf3aa2$23e9c8a0$f25728a1@UNITEDSPACEALLIANCE.COM>
Message-ID: <002901bf3ac4$4a3a50c0$3a1e54c3@danilo>

Hmm, I think I had an old email address on the list, and since the latest
employment hasn't enabled me to do much Python programming :-(, I sort of
forgot to fix the problem. I'll fix that. There is an archive for the list,
right? So I can catch up on what you all are talking about.

Daniel Larsson




From fdrake@acm.org  Tue Nov 30 00:09:36 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 29 Nov 1999 19:09:36 -0500 (EST)
Subject: [Doc-SIG] Re: SMTP?
In-Reply-To: <00cc01bf3739$a1d83d30$f29b12c2@secret.pythonware.com>
References: <19991124171120.1623.qmail@hotmail.com>
 <19991124202751.A5717@stopcontact.palga.uucp>
 <00cc01bf3739$a1d83d30$f29b12c2@secret.pythonware.com>
Message-ID: <14403.5568.677456.595721@weyr.cnri.reston.va.us>

Fredrik Lundh writes:
 > I once contributed a (IMHO) better example, which
 > 1) actually imported all modules that were used in
 > the example, 2) used more reasonable python con-
 > structs (raw_input instead of that prompt hack, etc),
 > and 3) showed how to add the basic headers to the
 > message body.
 > 
 > as far as I can tell, only (1) made it into the docs...

  I don't recall the specific patch (though there have been patches to 
that example), so I probably just missed it.  I've just checked in
some changes (based on your comments here) to the maintenance branch,
so the next version should be better.
  Sorry for not getting your patch integrated!


  -Fred

--
Fred L. Drake, Jr.	     
Corporation for National Research Initiatives


From mhammond@skippinet.com.au  Tue Nov 30 03:29:44 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Tue, 30 Nov 1999 14:29:44 +1100
Subject: [Doc-SIG] docstring grammar
In-Reply-To: 
Message-ID: <006b01bf3ae3$25b6fe00$0501a8c0@bobcat>

> After a little thought, I'm tempted to remove the ::
> requirement as well.

I agree this would be a good thing.  I originally intended to reply in
context to all the good suggestions - however, I don't look like
finding time until after Christmas :-(

So here is my 2c worth, mainly echoing comments from others:

Drop the absolute requirement for the whitespace, especially with
bulleted lists.  People will generally not be editing these strings in
a word-processor, so will have control over the line breaks.

Thus:
* Any line starting with a word followed by a colon can be considered
a keyword.  If you don't want this, just make sure it's not the first
word on the line.
* A star or dash starting a line can be considered a new list item.
Again, if it is truly a hyphen or whatever else, just adjust your line
wrap slightly so it is no longer the first word.

Other random thoughts:
* The [blah] notation is good, but needs to be well defined.  eg,
"[module.function]" when used in the context of a package should use
the same "module scoping" that Python itself uses.  However, the use
of brackets may conflict with people who use inline code (rather than
an example "block") - maybe something like "@" could be used?
@module.function@ would be reasonable.

* IMO, importing the module to extract this information is fine.  For
the 1% of cases where it is not and the author of the module needs to
use the tool, we could offer a hack - eg "sys.doc_building" will be
defined when the tool is running, so could fine tune their code
appropriately.  For the vast majority of cases, I guess that importing
would be just fine and make the tool simpler, thereby giving more
chance of it one day existing :-)  Indeed, do it the simple way, and
the first person who needs the parse-only option can help code it :-)

* Example/test code should be clearly identifiable.  Tim Peters'
docstring tester could also be hacked to work with this format.
Further, it should be possible to have lots of discrete sample code,
each with their own discussion - eg:
"""
The following code shows how to do this:
Example:
  def foo():
    etc

/Example:
The following code shows how to do that:
Example:
  def bar():
    etc

As a final note:  The tool should be written with distinct "generate"
and "collate" phases, simply to resolve the cross-references.  It is
unreasonable to expect that all cross-references will be capable of
being resolved in a single pass.  Not sure exactly what this means
from an implementation POV, but it is important.
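
One possible reading of the two phases, as a small sketch. The names
generate and collate and the data shapes are invented for illustration and
say nothing about how a real tool would store its intermediate results.

```python
import re

REF_RE = re.compile(r'\[([A-Za-z_][A-Za-z_0-9.]*)\]')


def generate(module_name, docstrings):
    """Phase 1: per module, record what it defines and what it refers to.

    docstrings maps an object name to its docstring text.
    """
    defined = set()
    pending = []
    for name, text in docstrings.items():
        qualified = "%s.%s" % (module_name, name)
        defined.add(qualified)
        for target in REF_RE.findall(text):
            pending.append((qualified, target))
    return defined, pending


def collate(results):
    """Phase 2: with every module seen, decide which references resolve."""
    all_defined = set()
    for defined, _pending in results:
        all_defined |= defined
    resolved, unresolved = [], []
    for _defined, pending in results:
        for source, target in pending:
            bucket = resolved if target in all_defined else unresolved
            bucket.append((source, target))
    return resolved, unresolved


a = generate("a", {"f": "See [b.g]."})
b = generate("b", {"g": "Uses [a.f] and [c.h]."})
resolved, unresolved = collate([a, b])
```

When module a is processed nothing is known about b yet, so the reference
to [b.g] can only be settled in the second phase.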

That's about it.  I really like this, and feel it is both powerful
and extensible enough to grow with us.  All we need now is the tool
:-)

Mark.



From da@ski.org  Tue Nov 30 07:04:18 1999
From: da@ski.org (David Ascher)
Date: Mon, 29 Nov 1999 23:04:18 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <006b01bf3ae3$25b6fe00$0501a8c0@bobcat>
Message-ID: 

On Tue, 30 Nov 1999, Mark Hammond wrote:

> * The [blah] notation is good, but needs to be well defined.  eg,
> "[module.function]" when used in the context of a package should use
> the same "module scoping" that Python itself uses.  However, the use
> of brackets may conflict with people who use inline code (rather than
> an example "block") - maybe something like "@" could be used?
> @module.function@ would be reasonable.

I personally would prefer to keep [] for references and introduce @..@ (or
some other delimiter) for inline code, mostly because [] is so common in
journals as a way of indicating bibliographic references.  I do *not* like
StructuredText's use of quotes to do inline code markup.

> * IMO, importing the module to extract this information is fine.  For
> the 1% of cases where it is not and the author of the module needs to
> use the tool, we could offer a hack - eg "sys.doc_building" will be
> defined when the tool is running, so could fine tune their code
> appropriately.  For the vast majority of cases, I guess that importing
> would be just fine and make the tool simpler, thereby giving more
> chance of it one day existing :-)  Indeed, do it the simple way, and
> the first person who needs the parse-only option can help code it :-)

I see.  So the workaround for those scripts which can't be imported is to
start them with:

import sys
if sys.doc_building: sys.exit()

Not too bad.

> * Example/test code should be clearly identifiable.  Tim Peters'
> docstring tester could also be hacked to work with this format.

I need to go back and look at Tim's code again.

> Further, it should be possible to have lots of discrete sample code,
> each with their own discussion - eg:
> """
> The following code shows how to do this:
> Example:
>   def foo():
>     etc
> 
> /Example:
> The following code shows how to do that:
> Example:
>   def bar():
>     etc

That would be written (with the current proposal):

  The following code shows how to do this:
    Example:
      def foo():
        etc
 
  The following code shows how to do that:
    Example:
      def bar():
        etc

Is that ok w/ you?

--david



From tim_one@email.msn.com  Tue Nov 30 07:50:59 1999
From: tim_one@email.msn.com (Tim Peters)
Date: Tue, 30 Nov 1999 02:50:59 -0500
Subject: [Doc-SIG] docstring grammar
In-Reply-To: 
Message-ID: <000b01bf3b07$a465ad40$c92d153f@tim>

[MarkH]
> * Example/test code should be clearly identifiable.  Tim Peters'
> docstring tester could also be hacked to work with this format.

[DavidA]
> I need to go back and look at Tim's code again.

I already did.  Tim's code looks for:

   ^\s*>>>

and then sucks up everything following until the next all-whitespace line or
end of docstring (whichever comes first).
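
For anyone who wants to see that rule spelled out, here is a small sketch.
It is not the code in doctest.py, only an illustration of the rule as
stated; the function name interactive_blocks is made up.

```python
import re

# A line whose first non-blank characters are the PS1 prompt.
PS1_RE = re.compile(r'^\s*>>>')


def interactive_blocks(docstring):
    """Collect each run of lines from a '>>>' line to the next blank line."""
    blocks = []
    current = None
    for line in docstring.split("\n"):
        if current is None:
            if PS1_RE.match(line):
                current = [line]
        elif line.strip() == "":
            # An all-whitespace line ends the block.
            blocks.append("\n".join(current))
            current = None
        else:
            current.append(line)
    if current is not None:
        # End of docstring also ends the block.
        blocks.append("\n".join(current))
    return blocks


DOC = """Add things.

    >>> 1 + 1
    2

More text.
    >>> x = 3
    >>> x
    3"""

blocks = interactive_blocks(DOC)
```

The same scan would let a docstring formatter treat each such run as a
structureless code paragraph.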

That is, I figured the contents of an interactive shell window didn't need
any markup beyond the leading PS1 Python already sticks there.  Given that
doctest.py is meant to be usable with near-zero effort, it wouldn't do to
require more markup than that.

Luckily, it almost fits your definition of a paragraph already.  It
shouldn't be any real effort to declare that ">>>" introduces a
structureless code paragraph extending until the next all-whitespace etc --
given that it's a format for Python docstrings, Python's own output deserves
some special treatment.

As to whether doctest should be fiddled to try to interpret some other form
of markup too, I don't think so.  The markup it inherits from the Python
shell is both sufficient and pleasant for its users.  Any other kind of
embedded sample code almost certainly isn't intended to be auto-verified, so
doctest *should* ignore it.

Nothing you're likely to do with docstrings is going to create problems for
doctest, so the only question is whether doctest's conventions create
problems for docstring markup.  I think they do now, but "shouldn't":
anyone pasting in an interactive session, whether for use with doctest or
for some other purpose, is going to want it treated as a code block.

full-speed-ahead-ly y'rs  - tim




From mhammond@skippinet.com.au  Tue Nov 30 08:38:25 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Tue, 30 Nov 1999 19:38:25 +1100
Subject: [Doc-SIG] docstring grammar
In-Reply-To: 
Message-ID: <007b01bf3b0e$45cdcf40$0501a8c0@bobcat>

> I personally would prefer to keep [] for references and
> introduce @..@ (or
> some other delimiter) for inline code, mostly because [] is
> so common in
> journals as a way of indicating bibliographic references.  I

Fair enough.

> I see.  So the workaround for those scripts which can't be
> imported is to
> start them with:
>
> import sys
> if sys.doc_building: sys.exit()
>
> Not too bad.

I more had in mind:

if sys.doc_building:
  # Normally critical we do this.
  dont_do_something_really_expensive()

We don't need to execute the bulk of the code, just import the module
and get a few of the symbols.

> That would be written (with the current proposal):
>
>   The following code shows how to do this:
>     Example:
>       def foo():
>         etc
>
>   The following code shows how to do that:
>     Example:
>       def bar():
>         etc
>
> Is that ok w/ you?

Perfect.

Mark.



From mhammond@skippinet.com.au  Tue Nov 30 08:45:18 1999
From: mhammond@skippinet.com.au (Mark Hammond)
Date: Tue, 30 Nov 1999 19:45:18 +1100
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <007b01bf3b0e$45cdcf40$0501a8c0@bobcat>
Message-ID: <007c01bf3b0f$4339c170$0501a8c0@bobcat>

> I more had in mind:
> 
> if sys.doc_building:
>   # Normally critical we do this.
>   dont_do_something_really_expensive()

Sheesh - I obviously meant:

if not sys.doc_building:
  do_something_really_expensive()

But I'm sure you got my drift :-)
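
Spelled out a little more, as a sketch. sys.doc_building is only the flag
proposed in this thread and no Python release defines it, so the sketch
reads it with getattr and a default; the helper names are invented for the
example.

```python
import sys


def doc_tool_is_running():
    # sys.doc_building is hypothetical; without the default this would
    # raise AttributeError whenever the doc tool is not running.
    return bool(getattr(sys, "doc_building", False))


def do_something_really_expensive():
    # Stand-in for work that should not happen during a doc build.
    return "expensive work done"


result = None
if not doc_tool_is_running():
    result = do_something_really_expensive()
```

The doc tool would set the flag before importing the module and remove it
afterwards, so ordinary imports are unaffected.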

Mark.



From mal@lemburg.com  Mon Nov 29 21:59:23 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Nov 1999 22:59:23 +0100
Subject: [Doc-SIG] docstring grammar
References: 
Message-ID: <3842F73B.CB64BC8B@lemburg.com>

David Ascher wrote:
> 
> > Some notes on the proposal:
> >
> > · Mentioning the function/method signature is ok, but sometimes
> >   not needed since e.g. the byte code has enough information to
> >   deduce the signature from it. This is not true for builtin
> >   function which is probably the reason for all builtin doc
> >   strings to include the signature.
> 
> Right.  It's not true for builtins, extension module functions, and I'm
> not sure how easy it is for JPython code.  I have no problem with somehow
> making it easy to omit those in cases where the information can be
> obtained through the bytecode.

Perhaps we could use a convention: if the first line starts
with a Python identifier followed by '(' and the identifier
matches the name of the doc string owning object (function or
method), then no byte code lookup is done. Otherwise such
a lookup causes a new first line to be prepended to the
processed doc string (with '-> ?' return value).

This should cover most cases.
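
A sketch of that convention, assuming a current Python where the code
object is reachable as func.__code__. The function name with_signature is
made up, and only plain positional arguments are recovered; defaults,
*args and **kwargs are ignored here.

```python
import re

# First line of the form "name(" ...
SIGNATURE_RE = re.compile(r'([a-zA-Z_][a-zA-Z_0-9]*)\(')


def with_signature(func):
    """Return the docstring, prepending 'name(args) -> ?' if it lacks one."""
    doc = func.__doc__ or ""
    first_line = doc.lstrip().split("\n", 1)[0]
    match = SIGNATURE_RE.match(first_line)
    if match and match.group(1) == func.__name__:
        return doc  # the author already wrote the signature
    code = getattr(func, "__code__", None)
    if code is None:
        return doc  # builtins and extension functions carry no byte code
    args = code.co_varnames[:code.co_argcount]
    return "%s(%s) -> ?\n%s" % (func.__name__, ", ".join(args), doc)


def spam(a, b):
    """Do things."""


def eggs(x):
    """eggs(x) -> int

    Double x.
    """


print(with_signature(spam))
```

For spam this prints a new first line "spam(a, b) -> ?" followed by the
original text, while the docstring of eggs is returned unchanged.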

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    32 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/




From tony@lsl.co.uk  Tue Nov 30 10:31:43 1999
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 30 Nov 1999 10:31:43 -0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To: 
Message-ID: <001301bf3b1e$1875bc50$f0c809c0@lslp7o.lsl.co.uk>

All of the following are minor nit-pickings, because it all looks VERY GOOD.
(Personally, I'm not too worried about the tool as-such, I just want the
grammar defined so I can use it!).

David Ascher wrote:
> I forgot two markups:  *this* is bold and _this_ is italic.  Bold and
> italic markups must begin and end within a paragraph (I'd say 'within a
> sentence' but I don't want to complicate the parser with a sentence type).
> No space allowed between *'s and _'s and their contents.

And I hope it's also possible to nest them arbitrarily, with some "sensible"
effect (yes, this *is* useful in English text, and I would not want to lose
it in documentation!). [Technically, that's a viewer problem, but I want the
grammar to *say* this can be done, so the software writers have an onus on
them to cope with it.]

Marc-Andre Lemburg wrote:
> I'd suggest using '^ *[a-zA-Z_]+[a-zA-Z_0-9]*: *' as RE for
> keywords, i.e. keywords are Python identifiers immediately followed
> by a colon starting a line of a doc string. That should avoid
> most complications, I guess.

Sounds sensible to me - the advantages outweigh the disadvantages.

On Tim Peters' test texts - I think this is actually an important enough
idea that it might warrant its own keyword - perhaps "TestScript" (no, I
know that's clumsy) - thus giving subliminal encouragement to the concept
(hmm - must use it someday, he said guiltily). This would also allow us to
distinguish odd chunks of code which are NOT test scripts (a new ability,
since at the moment the tester will try to use all >>> text?), which I think
could sometimes be useful...

David Ascher wrote:
> How about another keyword?
>
>  List:
>     * foo
>     * bar
>     * spam

I would vote against that, firstly on the grounds that it doesn't read well,
and secondly that it is probably the sort of thing that people wouldn't do
(!). As with what others think, I believe we can hack lists without the
keyword (is this now the consensus?).

In another message, David continued:
> I propose that part of the definition of a keyword is (along with any
> special parsing rules) whether it can be duplicated in a docstring.

Hmm - then I think we're going to need some serious support in "The Standard
Editors" to give a hint about whether something can be included more than
once, since I have a sneaky feeling we're getting quite a lot of keywords
(is it about 7 things that humans remember easily?). On the other hand,
modulo the clever peoples' time, I rate that as "not a problem".

NB: how picky is the tool going to be about getting the indentation exactly
right? I'm not fussed by it being very picky, but I know I'm odd that way.

Mark Hammond votes for doing lists by detecting the bullets (good), but I'd
like to reserve more than two characters (hyphen and asterisk are OK, but I
do sometimes use 3 level lists, and would like another one - on the other
hand, I'm not sure what other than @ and he wants that for something else...
hmm - if we're not worried by hyphen confusing us with negative numbers,
maybe plus would be sensible).

I also tend to agree with Mark Hammond and David Ascher that [ and ] are very
valuable AS TEXT. The use of @..@ is visually very obvious to me, which is
presumably a good thing in context, so I also vote for that (gosh, I've just
voted positively for something delimited by the same character at start and
end - obviously the start of the slippery road to hell).

Whilst I don't know owt about parsing (well, more precisely, parse trees
scare me), I don't see any of the proposals so far as giving any great
problems with extracting information from the text.

David (Ascher) - is it time to re-release your initial "docstring grammar"
email with the comments you're happy with edited in? I *really* don't have
time to do it, or I already would...

Tibs
--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.demon.co.uk/
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
[I've read it twice. I've thought it over. I'm sending it anyway.]



From Edward Welbourne   Tue Nov 30 12:44:47 1999
From: Edward Welbourne  (Edward Welbourne)
Date: Tue, 30 Nov 1999 12:44:47 +0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To: 
References: <3842C5F4.91B2C4BB@lemburg.com>
 
Message-ID: 

David Ascher wrote:
> I propose that part of the definition of a keyword is (along with any
> special parsing rules) whether it can be duplicated in a docstring.

FAPP we can approach both this and the `context-sensitive' stuff from
the same point of view as SGML: precisely because

Blah:
    something legitimate
    in a Blah block

maps directly to

<BLAH>
something legitimate
within a BLAH
</BLAH>

so all the kinds of rule that a DTD could have imposed on BLAH are
sensible things to impose on Blah.  In particular, rather than `whether
it can be duplicated in a docstring' we have a nested tree structure in
our hands, so we can ask whether it can be duplicated as a child of its
parent.  Suppose Date to be unique:

Author: David Ascher
Release:
    Date: 1999/Nov/28
    Name: proto-post-gendoc:0.2
    Media: e-mail
Bugs:
    Report:
        Date: 1999/Nov/29

        As initially specified, the denotation for italic conflicts with
        python identifiers where these genuinely start and end in
        underscore.

        Status: resolved, adopting gendoc's approach

    Report:
        Date: ...

in which Date is `unique' but shows up many times in one doc-string.  I
would suggest that a tag is either unique in all contexts that allow for
it, or in none (so we don't have ickiness in which *some* tags allow
several Date subordinates - that kind of stuff makes it harder for folk
to remember what's unique and what isn't).  The right layer

A note on tags: we seem to be headed for `python identifier followed by
a colon'.  I'd like to argue for RFC 822 headers - that is,
specifically, to allow hyphens, so as to allow

Bugs:
    Reports-to: doc-sig@python.org
    Report: ... as above ...

and, indeed, to change Bugs: to Known-bugs:

Of course we could use _, but hyphen comes more naturally to text and
the parser for our keywords (unlike that for python identifiers) doesn't
have to worry about subtraction as `something we might be doing here' to
confuse with recognising the keyword.
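
As a rough illustration of the tag syntax argued for here, the sketch
below recognises RFC 822-style keywords: letters, digits and hyphens,
followed by a colon.  The pattern and the names TAG and split_tag are
assumptions made for the illustration, not an agreed part of the grammar.

```python
import re

# Hypothetical pattern: a tag starts with a letter, may contain letters,
# digits and hyphens, and ends with a colon.  Anything after the colon
# must be separated from it by whitespace.
TAG = re.compile(r"^\s*([A-Za-z][A-Za-z0-9-]*):(?:\s+(.*))?$")

def split_tag(line):
    """Return (tag, rest) for a keyword line, or None if it is not one."""
    m = TAG.match(line)
    if m is None:
        return None
    return m.group(1), (m.group(2) or "")
```

Because a hyphen is only accepted between the first letter and the
colon, a line such as `x - y` is not taken for a keyword.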


For the sake of a coarse reprise of where I think we are:

Within docstrings, paragraphs, `text fields', descriptive and bulleted
lists are marked up using pretty much what gendoc used, though we seem
to be making some tweaks.  The main addition of David's proposal is a
structured data format entirely analogous to a *ML's begin-end
structure, but transformed to indent/dedent format - in exactly the same
way that one transforms the begin-end structure of C or Pascal into
python code.  This gets us all the desiderata that XML would provide,
but it does it in a pythonic format.

The typical block allows (depending on the keyword which introduced it)
an assortment of keywords to be used to introduce sub-blocks; it may
also allow paragraphs and/or lists within it.  The docstring is a block
which is willing to hold all `outer' structural groups (i.e. top-level
keywords, with their blocks, and paragraphs).  A paragraph is a block
which (possibly along with the blocks started by some keywords) may have
sub-blocks which are list items.

We can effectively write the rules for all this as a DTD and parse it
into a form which can be manipulated *as if* it had been obtained by
parsing a lump of XML - in particular, it should be trivial to perform
XSL-ish tree transformations to convert it to whatever DTD The Manual
wants as its input; while leaving ample scope for the inventive
toolwright to perform sophisticated information massaging on docstrings,
and not obliging us to use all that ugly XML taggery in the source.


We need a moderately short list (of order a dozen) of `top-level' tags:
subordinate to each we may introduce a few others (context sensitivity)
but simplicity demands vocabulary restraint and re-use.  The top level
seems to run to:

In all docstrings:
   Author(s), Release, Contributors
   Example(s), Test-script, Code
   Warning

In docstrings of callables:
   Argument(s), Return, Raises

In docstrings of classes:
   Supports/Implements/Mimics... (one synonym)
   Subclassing (for folk using this class as a base - what to override)
   Attributes, Methods (each supporting Private and Public as subordinates)

In docstrings of modules:
   Contents

so 7 universally-applicable keywords, (up to) the rest of a dozen in
each of the specific contexts for docstrings.  I would reckon we can
keep to about another dozen keywords spread around as subordinates of
the above (Date, Private, Public, Expect (for Test-script), Required &
Optional (for arguments), ...).


On test-scripts (in the manner of Tim Peters) we may not need a
Test-script keyword at all: simply using >>> is how the tool recognises
it, and there's nothing to stop the docstring parser recognising this as
a special indent mark that transforms to target XML *as if* it had come
from a block introduced by Test-script:.

	Eddy.


From Manuel Gutierrez Algaba   Tue Nov 30 15:31:09 1999
From: Manuel Gutierrez Algaba  (Manuel Gutierrez Algaba)
Date: Tue, 30 Nov 1999 15:31:09 +0000 (GMT)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: 
Message-ID: 

On Mon, 29 Nov 1999, Edward Welbourne wrote:

> Manuel: if David includes `Keyword::' in his bits and pieces, would
> 
> Keyword::
>      indexing
>      keyword
>      data retrieval
>      searching
> 
> (within a doc-string) contain the information you've been wanting to
> take out of your
> 
> \indexaboutindexing
> \indexaboutkeyword
> \indexretrieval
> \indexsearching

No. They're completely different things. In fact there's consensus:
David insists on "bullets" for args and javadoc-ish things, and
I insist on Encyclopedia-Higher-Level-python-stuff. My system,
currently, is for marking/sorting "general" info.

> I know you have bits that define an indexing command that expands to
> several indexing commands, which this lacks: but could the same effect
> be arrived at by turning your set of indexing command definitions into
> an `expert system' that expands some keywords ?

Yes, expert systems are fine, but the greatest difficulty with my
proposal is that people *must* input hundreds of attributed pieces of
info if we want to do anything useful. The expert system is phase 2,
when we have to group indexes and extract info.

Regards/Saludos
Manolo
www.ctv.es/USERS/irmina    /TeEncontreX.html   /texpython.htm
 /SantisimaInquisicion/index.html 

  Everything in this book may be wrong. -- Messiah's Handbook : Reminders for the Advanced Soul







From Edward Welbourne   Tue Nov 30 14:35:21 1999
From: Edward Welbourne  (Edward Welbourne)
Date: Tue, 30 Nov 1999 14:35:21 +0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <006b01bf3ae3$25b6fe00$0501a8c0@bobcat>
References: 
 <006b01bf3ae3$25b6fe00$0501a8c0@bobcat>
Message-ID: 

> Thus:
> * Any line starting with a word followed by a colon can be considered
> a keyword.  If you dont want this, just make sure its not the first
> word on the line.

Not happy.  A paragraph of text which precedes an example may be relied
upon to end in `for example:', in which the last contiguous block of
non-space characters is of length 8; if I modify an earlier part of the
paragraph, I'm going to ask my authoring tool (python-mode.el) to
reformat the paragraph, without necessarily being aware of a gotcha
waiting for me at the paragraph's end; my margins will be within 72
characters of one another, giving a roughly 1 in 9 chance that
`example:' ends up being alone on the last line ... gotcha.

A cure for this would just be to do keyword-recognition case
sensitively, and Capitalise keywords; otherwise, we have to insist on
either a dedent or a blank line preceding any keyword.  Which offends
folk worse: case sensitivity or needing a dedent/vspace ?
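
To make the gotcha concrete, here is a small sketch comparing the quoted
rule with the capitalised-keyword cure.  The 1-in-9 figure is just the 8
characters of `example:' against a line of about 72 columns.  Both
function names and the pattern are assumptions for the illustration.

```python
import re

# First word on the line, immediately followed by a colon.
WORD_COLON = re.compile(r"^\s*([A-Za-z][A-Za-z0-9-]*):(?:\s|$)")

def naive_keyword(line):
    """The quoted rule: any line starting with a word and a colon."""
    m = WORD_COLON.match(line)
    return m.group(1) if m else None

def capitalised_keyword(line):
    """The case-sensitive cure: only Capitalised words count as keywords."""
    tag = naive_keyword(line)
    return tag if tag and tag[0].isupper() else None

# The tail of a paragraph after the editor has refilled it unluckily.
stranded = ["This can be done in several ways, as the following shows, for",
            "example:"]
```

With these definitions the stranded `example:' is taken for a keyword by
the first rule and ignored by the second, while `Example:' is accepted
by both.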


> * A star or dash starting a line can be considered a new list item.
> Again, if it is truly a hyphen or whatever else, just adjust your line
> wrap slightly so it is no longer the first word.

Alternatively, all lists use the same `item-introducer' character and
follow it with an optional character indicating what bullet to use.
Thus one might have (taking ~ as the introducer for the illustration)

  ~ outermost list, first item
  ~ outer second which may contain a subordinate
    ~ which is indented so it can use the same introducer without
      confusion
    ~ and output formatters can choose different symbols
      in place of the star for successive nesting layers
    ~ by the way, should further lines line up with the text or the
      bullet ?  my reckoning is with the text ...
  ~ outer third, whose subordinate might want Roman numerals
    ~i so it indicates them thus
    ~i and can choose to leave the engine to sort out numbering
    ~iii but can effectively assert that one item (referred to
         elsewhere) has a particular number
    ~i without having to mention numbers for the rest
    ~i and of course 
       ~1 we can use the other numbering styles
       ~2 including alphabetic, upper or lower, using ~A or ~a.
       ~1 with use of first in series taken as `work out right number'
       ~7 but I think the tool should complain if you get later
          positions wrong: it's an assertion, and it indicates that this
          item is going to be referred to from other text as item 7 - I
          need to be told I got it wrong !  Obviously I've deleted a few
          items before this one without realising what's happening below ...
  ~ outer fourth
    ~o must the bullets in a given list all match ?
       ~. should stand for mid-dot, and star is likewise easy using *
    ~o I think so, anyway
      ~- dash is obvious and now unambiguous, as are + and =
    ~o mind you, o requires care: if it's the first item in a list, that
       list is going to use o as its bullet; but if it appears in a list
       which began with a ~a then we have to read it as item fifteen.
      ~ and if we're insisting on all items in a list having the same
        bullet, does it make sense to allow items after the first to
        just use an unadorned star meaning re-use of first item's
        symbol, thus saving us lots of editing when we want to change
        the symbol in use by a list, or shuffle an item from a sub-list
        out into its parent list (or etc.)
      ~ of course, ~ needn't be the bullet-introducer, we could use
        pretty much any punctuator as long as it doesn't obviously
        clash; candidate egs: #, @, $, %, &, * and even |
  ~ outer fifth
    ~ as for descriptive lists, I'd go with the old gendoc form, which

      uses double dash -- which just feels so natural, but

      needs vspace -- to separate items, given that -- might be used
      within an item on a later-than-first line.  I can live with this.
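
A minimal sketch of how a parser might read such items, assuming `~' as
the introducer and treating whatever non-space characters follow it as
the bullet style.  It records indentation rather than building the
nested lists, and checks only the numeric assertions described above.
All names here are hypothetical.

```python
import re

# Leading whitespace, the introducer, an optional style, then the text.
ITEM = re.compile(r"^(?P<indent>\s*)~(?P<style>\S*)\s+(?P<text>.*)$")

def parse_items(lines):
    """Return one dict per list item; further lines are folded into the text."""
    items = []
    for line in lines:
        m = ITEM.match(line)
        if m:
            items.append({"indent": len(m.group("indent")),
                          "style": m.group("style"),
                          "text": m.group("text")})
        elif items and line.strip():
            items[-1]["text"] += " " + line.strip()
    return items

def check_numbers(styles):
    """Return (position, asserted) for numeric styles that assert a wrong number.

    A style of "1" is read as `work out the right number'; any other
    number is an assertion about the item's position in its list.
    """
    complaints = []
    for position, style in enumerate(styles, start=1):
        if style.isdigit() and int(style) not in (1, position):
            complaints.append((position, int(style)))
    return complaints
```

For the numbered sub-list in the illustration, the styles 1, 2, 1, 7
produce a single complaint: the fourth item claims to be item 7.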

> Other random thoughts:
> * The [blah] notation is good, but needs to be well defined.  eg,
> "[module.function]" when used in the context of a package should use
> the same "module scoping" that Python itself uses.

The thing that saves [this] from being problematic is that the format in
which it was introduced presumed that one was going to use a brief
mnemonic as [this] word and end the docstring with a chunk which
explains the cross-references (new keyword: Xrefs ?) and, in particular,
tells the doc-string-reader which [tokens] actually have a translation,
the rest being left as typed; thus, if this paragraph appeared in a
docstring which says how to translate [this] (giving an xref and -
optionally - a text to use (default `this') in place of [this]), the
digested form would duly replace [this] but leave [tokens] as it is.

To further simplify life, I'd understood the [this] keys that are
translatable to insist on [nowhitespace] to save the parser most of its
`this might be an xref' pending decisions - which is why the Xrefs
section needs to at least have the option of specifying the text to be
used in place of [this] as well as the Xref to point it at.  What we're
doing is citation, which is widely done with [].

No need for [this] to be a [module.function] or anything like - the
Xrefs section provides the translation.

Xrefs:
   [gendoc] http://www.python.org/contrib/gendoc/
   [this] http://www.python.org/lists/doc-sig/hideous?with=data&as=you+will The present message
   [copy] string.copy the standard string copy function
   [etc] location sub sti tute

[sorry, all exhibited xrefs are bogus - illustrative only]
I'm sure that's only a minor paraphrase of a spec I saw a while ago on
this list ...

Of course, Xrefs might better be called Bibliography.

We can use as `location' some pythonic reference that can be resolved in
the ways that the suggested module.function semantics point to: indeed,
I would take this as what to try first, falling back on recognising
other stuff as URLs and similar.
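
The Xrefs scheme can be sketched as follows, assuming each entry is
`[token] location optional text' on one line.  The output form
`text <location>' is invented for the illustration; a real formatter
would emit whatever anchor its target format uses.

```python
import re

# A citation token: square brackets with no whitespace inside.
CITATION = re.compile(r"\[([^\s\[\]]+)\]")
ENTRY = re.compile(r"\s*\[([^\s\[\]]+)\]\s+(\S+)\s*(.*)$")

def parse_xrefs(lines):
    """Map token -> (location, text) from the lines of an Xrefs block."""
    table = {}
    for line in lines:
        m = ENTRY.match(line)
        if m:
            token, location, text = m.groups()
            # When no text is given, the token itself is the default text.
            table[token] = (location, text or token)
    return table

def digest(body, table):
    """Replace translatable [tokens]; leave every other [token] as typed."""
    def replace(m):
        token = m.group(1)
        if token not in table:
            return m.group(0)
        location, text = table[token]
        return "%s <%s>" % (text, location)
    return CITATION.sub(replace, body)
```

Tokens without an entry, such as [tokens] or the [None] in `[None]*10',
pass through untouched, which is the behaviour described above.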

> ... However, the use
> of brackets may conflict with people who use inline code (rather than
> an example "block" - maybe something like "@" could be used?
> @module.function@ would be reasonable.

With the above, can we evade this ?
The fact that [citations] are so widely used argues for the [form]; and
the fact that [anything with space in it] isn't a citation should make
all the `ordinary text' and `python denotations' [usages] unproblematic,
while leaving untranslated ones as [literal] uses of [ and ].  If
nothing else, I find my eye latches onto [cite] better than @cite@ ...
and bear in mind that @ has some other magic uses,

parser error - unclosed citation at line 137:
      Sender: eddyw@lsl.co.uk

All told, we seem to have a fairly good spec ... save for some
nitpickery ;^>

Tibs said:
> David (Ascher) - is it time to re-release your initial "docstring
> grammar"
and I confess that's something I'd like to see too.
After all, we have to have someone to play Gdo ...

	Eddy.


From Edward Welbourne   Tue Nov 30 15:32:00 1999
From: Edward Welbourne  (Edward Welbourne)
Date: Tue, 30 Nov 1999 15:32:00 +0000
Subject: [Doc-SIG] docstring grammar (erratum)
In-Reply-To: 
References: 
 <006b01bf3ae3$25b6fe00$0501a8c0@bobcat> 
Message-ID: 

I said:
    ~ and output formatters can choose different symbols
      in place of the star for successive nesting layers
and
      ~ and if we're insisting on all items in a list having the same
        bullet, does it make sense to allow items after the first to
        just use an unadorned star meaning re-use of first item's
        symbol, thus saving us lots of editing when we want to change
        the symbol in use by a list, or shuffle an item from a sub-list
        out into its parent list (or etc.)

but `unadorned star' should be `unadorned twiddle' - I missed a
conversion after being persuaded that *'s font role prohibits its use
as, for instance, *o or *1, which would match `begin italic': hence the
use of ~ and remarks about other candidates.  Likewise, in the first,
the presumption was that * is the default symbol, but I don't imagine
we'd be using ~ as a bullet much (well, we could), so that snippet
should have vanished.  The output formatters choose symbols as
appropriate: the parser just identifies the list structure and which
bits are subordinate to which others.

	Eddy.


From mal@lemburg.com  Tue Nov 30 16:58:36 1999
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 30 Nov 1999 17:58:36 +0100
Subject: [Doc-SIG] docstring grammar
References: 
 <006b01bf3ae3$25b6fe00$0501a8c0@bobcat> 
Message-ID: <3844023C.41B12CDD@lemburg.com>

Edward Welbourne wrote:
> 
> > Thus:
> > * Any line starting with a word followed by a colon can be considered
> > a keyword.  If you dont want this, just make sure its not the first
> > word on the line.
> 
> Not happy.  A paragraph of text which precedes an example may be relied
> upon to end in `for example:', in which the last contiguous block of
> non-space characters is of length 8; if I modify an earlier part of the
> paragraph, I'm going to ask my authoring tool (python-mode.el) to
> reformat the paragraph, without necessarily being aware of a gotcha
> waiting for me at the paragraph's end; my margins will be within 72
> characters of one another, giving a roughly 1 in 9 chance that
> `example:' ends up being alone on the last line ... gotcha.
> 
> A cure for this would just be to do keyword-recognition case
> sensitively, and Capitalise keywords; otherwise, we have to insist on
> either a dedent or a blank line preceding any keyword.  Which offends
> folk worse: case sensitivity or needing a dedent/vspace ?

Why not just raise an exception ? I don't think that the
usage of "some text:" is common in doc strings except for
maybe examples, which should then be adapted to use the new
"Example:" keyword.

Here's an example docstring... the format looks pretty nice,
IMHO.

"""
foo(bar,rab,oof) -> integer -- single line description

Longer description spanning
multiple lines

Arguments:
    bar -- some string
    rab -- another string
    oof -- an integer       

Returns:
    42 in most cases

History:
    19991130 MAL -- Added oof argument
    19991101 MAL -- Created

"""

Not sure if this is already somewhere in the proposal, but
I would like to see '--' as indicator of a single line
text block. This would be useful in vertically compressing
the docstrings somewhat (and it is already being used in the
signature line for such a purpose).
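
A small sketch of how a tool might read the `--' form, assuming the
separator is always surrounded by single spaces.  The function name is
hypothetical.

```python
def split_item(line):
    """Split 'name -- description' in two; return None if there is no ' -- '."""
    name, sep, description = line.strip().partition(" -- ")
    if not sep:
        return None
    return name, description
```

The same function serves for the argument entries and for the signature
line, since both put the one-line text after the first ` -- '.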
 
> > * A star or dash starting a line can be considered a new list item.
> > Again, if it is truly a hyphen or whatever else, just adjust your line
> > wrap slightly so it is no longer the first word.
> 
> Alternatively, all lists use the same `item-introducer' character and
> follow it with an optional character indicating what bullet to use.
> Thus one might have (taking ~ as the introducer for the illustration)
> 
> ...

Let's leave this to some list parser (are we starting to head
for NP-completeness again ;-).

> > Other random thoughts:
> > * The [blah] notation is good, but needs to be well defined.  eg,
> > "[module.function]" when used in the context of a package should use
> > the same "module scoping" that Python itself uses.

Right. It should ideally perform the same lookup as Python would
in the global namespace. The resulting object could then either
be handled recursively by the doc tool or simply stored by reference
for later use (e.g. via the file name of a module or the id of an
object).
 
> The thing that saves [this] from being problematic is that the format in
> which it was introduced presumed that one was going to use a brief
> mnemonic as [this] word and end the docstring with a chunk which
> explains the cross-references (new keyword: Xrefs ?) and, in particular,
> tells the doc-string-reader which [tokens] actually have a translation,
> the rest being left as typed; thus, if this paragraph appeared in a
> docstring which says how to translate [this] (giving an xref and -
> optionally - a text to use (default `this') in place of [this]), the
> digested form would duly replace [this] but leave [tokens] as it is.
> 
> To further simplify life, I'd understood the [this] keys that are
> translatable to insist on [nowhitespace] to save the parser most of its
> `this might be an xref' pending decisions - which is why the Xrefs
> section needs to at least have the option of specifying the text to be
> used in place of [this] as well as the Xref to point it at.  What we're
> doing is citation, which is widely done with [].
> 
> No need for [this] to be a [module.function] or anything like - the
> Xrefs section provides the translation.
> 
> Xrefs:
>    [gendoc] http://www.python.org/contrib/gendoc/
>    [this] http://www.python.org/lists/doc-sig/hideous?with=data&as=you+will The present message
>    [copy] string.copy the standard string copy function
>    [etc] location sub sti tute
> 
> [sorry, all exhibited xrefs are bogus - illustrative only]
> I'm sure that's only a minor paraphrase of a spec I saw a while ago on
> this list ...
> 
> Of course, Xrefs might better be called Bibliography.

Or perhaps "References:" as in David's proposal ?!

> We can use as `location' some pythonic reference that can be resolved in
> the ways that the suggested module.function semantics point to: indeed,
> I would take this as what to try first, falling back on recognising
> other stuff as URLs and similar.
> 
> > ... However, the use
> > of brackets may conflict with people who use inline code (rather than
> > an example "block" - maybe something like "@" could be used?
> > @module.function@ would be reasonable.
> 
> With the above, can we evade this ?
> The fact that [citations] are so widely used argues for the [form]; and
> the fact that [anything with space in it] isn't a citation should make
> all the `ordinary text' and `python denotations' [usages] unproblematic,
> while leaving untranslated ones as [literal] uses of [ and ].  If
> nothing else, I find my eye latches onto [cite] better than @cite@ ...
> and bear in mind that @ has some other magic uses,
> 
> parser error - unclosed citation at line 137:
>       Sender: eddyw@lsl.co.uk
> 
> All told, we seem to have a fairly good spec ... save for some
> nitpickery ;^>

Since [] is only used for lists in Python, we could
define the RE '\[[a-zA-Z0-9_.]+\]' for our purposes and
raise an exception in case the enclosed reference cannot
be mapped to a symbol in the global namespace (note: no
whitespace, no commas) which either evaluates to a function,
method, module or reference object.
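
A minimal sketch of this rule, using the RE exactly as given and a plain
dictionary to stand in for the module's global namespace.  The function
names and the choice of LookupError are assumptions; the check that the
object is a function, method, module or reference object is left out.

```python
import re

REFERENCE = re.compile(r"\[[a-zA-Z0-9_.]+\]")   # the RE proposed above

def resolve(dotted, namespace):
    """Look up 'a.b.c' in a namespace dict, then follow attributes."""
    first, *rest = dotted.split(".")
    try:
        obj = namespace[first]
        for name in rest:
            obj = getattr(obj, name)
    except (KeyError, AttributeError):
        raise LookupError("unresolved reference [%s]" % dotted) from None
    return obj

def references(docstring, namespace):
    """Return {name: object} for every [name] found in the docstring."""
    found = {}
    for m in REFERENCE.finditer(docstring):
        name = m.group(0)[1:-1]          # strip the square brackets
        found[name] = resolve(name, namespace)
    return found
```

Note that the pattern matches the [None] in `[None]*10' but not
`[ None ]', because whitespace is excluded from the character class.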

Doc strings like "...use [None]*10 as argument..." will fail,
but are easily avoided by inserting some extra whitespace, e.g.
"...use [ None ] * 10 as argument...".

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    31 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From da@ski.org  Tue Nov 30 17:27:43 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 09:27:43 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <3844023C.41B12CDD@lemburg.com>
Message-ID: 

Mark Hammond:

> Thus: * Any line starting with a word followed by a colon can be
> considered a keyword.  If you dont want this, just make sure its not
> the first word on the line.

I agree with Edward on this one -- this is too fragile.

I consider the whitespace issue to be real only in the context of lists,
and I think that gendoc has shown that it's solvable within the context of
lists.  I stand by the keyword notation I presented:  either

   Keyword:
     text block
     spanning one or more lines 

or

   Keyword: one-line block

as long as they are both in separate paragraphs.

> Not sure if this is already somewhere in the proposal, but
> I would like to see '--' as indicator of a single line
> text block. This would be useful in vertically compressing
> the docstrings somewhat (and it is already being used in the
> signature line for such a purpose).

Isn't that just redundant with the : notation?  Note that I don't mind a
little redundancy, but it's unpythonic.  

> > > * A star or dash starting a line can be considered a new list item.
> > > Again, if it is truly a hyphen or whatever else, just adjust your line
> > > wrap slightly so it is no longer the first word.
> > 
> > Alternatively, all lists use the same `item-introducer' character and
> > follow it with an optional character indicating what bullet to use.
> > Thus one might have (taking ~ as the introducer for the illustration)
> > 
> > ...
> 
> Let's leave this to some list parser (are we starting to head
> for NP-completeness again ;-).

Absolutely!

Mark:
> Other random thoughts:
> * The [blah] notation is good, but needs to be well defined.  eg,

MAL:

> Right. It should ideally perform the same lookup as Python would
> in the global namespace. The resulting object could then either
> be handled recursively by the doc tool or simply stored by reference
> for later use (e.g. via the file name of a module or the id of an
> object).

Edward:
> The thing that saves [this] from being problematic is that the format in
> which it was introduced presumed that one was going to use a brief
> mnemonic as [this] word and end the docstring with a chunk which
> explains the cross-references (new keyword: Xrefs ?) 

I think that both are needed.  I believe that the namespaces looked up
should be:
  1) the local namespace of the docstring -- i.e., the set of keywords
     defined in the "References" keyword block in the current docstring.
  2) the global namespace of the docstrings -- i.e. the set of keywords 
     defined in the "References" keyword block in the MODULE docstring.
  3) The global Python namespace for that module
  4) Some namespace corresponding to builtins & unimported modules, yet
     ill-defined.

The point of 2) is that I often want to introduce references that I use in
a given module at the level of a docstring, but then want to refer to
those documents in specific function docstrings.
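
The lookup order can be sketched with a chained mapping, assuming each
namespace is available as a dictionary.  The fourth level is
ill-defined, as noted above, so Python's builtins are used here purely
as a stand-in; the function name is hypothetical.

```python
import builtins
from collections import ChainMap

def reference_scope(local_refs, module_refs, module_globals):
    """Chain the namespaces so that earlier ones shadow later ones."""
    return ChainMap(local_refs,        # 1) References in this docstring
                    module_refs,       # 2) References in the module docstring
                    module_globals,    # 3) the module's Python namespace
                    vars(builtins))    # 4) stand-in for the ill-defined rest
```

A key defined in a function's own References block therefore wins over
the same key defined at module level, which in turn wins over a Python
name.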

(Good thing we don't have to worry about garbage collection with these
circular references =)

> Since [] is only used for lists in Python, we could
> define the RE '\[[a-zA-Z0-9_.]+\]' for our purposes and
> raise an exception in case the enclosed reference cannot
> be mapped to a symbol in the global namespace (note: no
> whitespace, no commas) which either evaluates to a function,
> method, module or reference object.
> 
> Doc strings like "...use [None]*10 as argument..." will fail,
> but are easily avoided by inserting some extra whitespace, e.g.
> "...use [ None ] * 10 as argument...".

I like that bit, especially since the 'complete' tagging of that example
would wrap [None]*10 in whatever inline code markup is chosen.

--david



From da@ski.org  Tue Nov 30 17:28:35 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 09:28:35 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: 
Message-ID: 

On Tue, 30 Nov 1999, Edward Welbourne wrote:

> Tibs said:
> > David (Ascher) - is it time to re-release your initial "docstring
> > grammar"
> and I confess that's something I'd like to see too.
> After all, we have to have someone to play Gdo ...

I must have missed Tibs' posting.  I agree, and I'll try to do that ASAP.

--david



From da@ski.org  Tue Nov 30 17:35:30 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 09:35:30 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <007b01bf3b0e$45cdcf40$0501a8c0@bobcat>
Message-ID: 

On Tue, 30 Nov 1999, Mark Hammond wrote:

> I more had in mind:
> 
> if sys.doc_building:
>   # Normally critical we do this.
>   dont_do_something_really_expensive()
>
> We dont need to execute the bulk of the code, just import the module
> and get a few of the symbols.

But lots of modules currently do everything in the leftmost column
(they're called "scripts" =).  Some of them never end (they're called
"daemons" =).  I don't want to force someone to take their 'global' code
and put it in a function just to get around the docstring tool.  Anyway,
the point is moot, as one or the other solution will work, depending on
the script.
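
For reference, the flag-based approach quoted above might look like the
sketch below.  Note that sys.doc_building is hypothetical: no such
attribute exists in Python, so a doc tool would have to set it before
importing the module, and the code reads it defensively.

```python
import sys

def doc_building():
    """True when a (hypothetical) doc tool has set sys.doc_building."""
    return getattr(sys, "doc_building", False)

def main():
    if doc_building():
        # Only the symbols and docstrings are wanted; skip the real work.
        return "skipped"
    return "ran expensive work"
```

A script whose work sits in the leftmost column would need the same
guard around that top-level code, which is the burden objected to above.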

--david






From Edward Welbourne   Tue Nov 30 17:34:28 1999
From: Edward Welbourne  (Edward Welbourne)
Date: Tue, 30 Nov 1999 17:34:28 +0000
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <3844023C.41B12CDD@lemburg.com>
References: 
 <006b01bf3ae3$25b6fe00$0501a8c0@bobcat> 
 <3844023C.41B12CDD@lemburg.com>
Message-ID: 

> Since [] is only used for lists in Python, we could
> define the RE '\[[a-zA-Z0-9_.]+\]' for our purposes and
> raise an exception in case the enclosed reference cannot
> be mapped to a symbol in the global namespace (note: no
> whitespace, no commas) which either evaluates to a function,
> method, module or reference object.

umm ... hang on, two things seem stirred up here.  The proposal I
remember from ages ago and tried to echo has [token] and the token
doesn't have to be intelligible to the python engine: elsewhere in the
doc string, we'll have

References:
   [token] reference text

which the parsed docstring uses to decode each use of [token] that
appeared in the docstring.  Here, reference would normally be something
recognised by the python engine (and would be the thing I understand you
to be putting in [brackets]), but the Reference-handler might also cope
with it being, e.g., an URL.  The text that ends the reference becomes
the text of the `anchor' generated: 

-> ... and tried to echo has text and the token ...

note non-appearance of [token] in the digested form: but if `text' had
been omitted from the Reference spec, [token] is the default text
(e.g. when what you're doing really is a citation and that's just how
you want it to appear).  Then, for any uses of [None] that appear in your
doc string, meaning `the list with one entry, None', it suffices that your
References section doesn't have an entry for [None] - the parsed
docstring will then just say [None] (and not even attempt to wrap an
anchor round it).

The only real relevance to forbidding [spaces within] the citation token
is to ensure that where authors use [square brackets] for parenthetical
remarks or as list denotations, the parser hasn't got to do the piece of
jiggery-pokery that marks it as `maybe a xref' and obliges it to come
back later to settle the maybe once it knows.  This cost will remain for
[None], but it'll be well-defined that the parser marks it as a maybe,
discovers that it isn't and settles on it being just text, not a reference.

Now, it seems to me that what you were describing was slightly different ...
am I merely confused ?

	Eddy.


From da@ski.org  Tue Nov 30 18:02:30 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 10:02:30 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <000b01bf3b07$a465ad40$c92d153f@tim>
Message-ID: 

On Tue, 30 Nov 1999, Tim Peters wrote:

> Luckily, it almost fits your definition of a paragraph already.  It
> shouldn't be any real effort to declare that ">>>" introduces a
> structureless code paragraph extending until the next all-whitespace etc --
> given that it's a format for Python docstrings, Python's own output deserves
> some special treatment .

The only question I suppose is whether one should require a keyword (Test:
or other) to keep the top-level syntax trivial, or special-case the
recognition of >>>-beginning paragraphs.

I'm leaning for the former, as it can evolve to the latter if there is
sufficient call for it from the user base, and I think it does keep the
code simpler.  But I'm willing to be swayed.
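
The two options differ by one test in the paragraph classifier, as the
sketch below suggests.  It assumes paragraphs are separated by
all-whitespace lines and uses `Test:' as the keyword; both function
names are hypothetical.

```python
import re

def paragraphs(docstring):
    """Split a docstring into paragraphs at all-whitespace lines."""
    parts = re.split(r"\n\s*\n", docstring.strip("\n"))
    return [p for p in parts if p.strip()]

def classify(paragraph, special_case=True):
    """Return 'test' for a test paragraph and 'text' for anything else."""
    first = paragraph.lstrip()
    if first.startswith("Test:"):
        return "test"                   # keyword form, always recognised
    if special_case and first.startswith(">>>"):
        return "test"                   # the special-cased recognition
    return "text"
```

Starting with special_case switched off and enabling it later changes
no existing docstring that already uses the keyword.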

--david



From uche.ogbuji@fourthought.com  Tue Nov 30 18:07:51 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 30 Nov 1999 11:07:51 -0700
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: Your message of "Sat, 27 Nov 1999 09:17:37 PST."
 
Message-ID: <199911301807.LAA01801@localhost.localdomain>

> > Are you serious about the above ??? Noone is going to write that
> > in his docstrings...
> 
> It's not my favorite, but Uche mentioned that XML-ish syntax is much
> easier to parse.  While I don't really grant that point (or rather I think
> that the hill needs to be climbed once for all), I want to emphasize:

Huh?  Where? What? When? WHO?

I'm sure I _explicitly_ said that XML in doc-strings is a bad idea.


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From da@ski.org  Tue Nov 30 18:12:03 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 10:12:03 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <199911301807.LAA01801@localhost.localdomain>
Message-ID: 

On Tue, 30 Nov 1999 uche.ogbuji@fourthought.com wrote:

> > It's not my favorite, but Uche mentioned that XML-ish syntax is much
> > easier to parse.  While I don't really grant that point (or rather I think
> > that the hill needs to be climbed once for all), I want to emphasize:
> 
> Huh?  Where? What? When? WHO?
> 
> I'm sure I _explicitly_ said that XML in doc-strings is a bad idea.

Indeed.  I was referring to the bit where you said:

> The reality, though, is that it's easier to go from XML or TeX to any
> of the many formats Python users want than it would be from
> Jim-Fulton-David-Ascher pythonic documentation format.  

I apologize if I misunderstood or misquoted you.

I believe that the docstring syntax being discussed would be mappable to
some form of XML, which I think is what we all agree is a good idea, so
that the docstrings can be used to build library docs.  Please jump in if
something is apparently agreed to which would make this hard!

--david



From friedrich@pythonpros.com  Tue Nov 30 18:24:50 1999
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Tue, 30 Nov 1999 12:24:50 -0600
Subject: [Doc-SIG] docstring grammar
References:            <006b01bf3ae3$25b6fe00$0501a8c0@bobcat>            <3844023C.41B12CDD@lemburg.com> 
Message-ID: <004901bf3b60$30f49de0$f25728a1@UNITEDSPACEALLIANCE.COM>

Ed is correct.

Gendoc solved the HREF problem with:
"...An addition was made to support hypertext references. Hypertext
references are marked with double quotes in the body of the doc string. At
the end of the doc string will be a matching line starting with two dots '..
' and a space followed by the same quoted text and then followed by the
mapping (URL). This is patterned after the footnote notion in setext but is
easier on the eyes. For example, "Pythonland" will be marked as a
hyper-reference to Python.org. If no matching trailing reference is found
then nothing is done. "

Which might be modified with current thinking to yield:
"""
Marking refs with [brackets], and at the end of the doc string place the
annotations ala bibliography one per line. Key "brackets" is placed in the
local namespace and used by other (lower) doc strings. In the gendoc
implementation if the key doesn't match anything stored in the ref mapping
no markup is done, so that things like [None]*5 are safe and no exception
need be raised.

[brackets] -> http://www.howto.python.org/rtfm.html
"""
-Robin
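
A minimal sketch of the bracket scheme described above, assuming annotation
lines of the form "[key] -> target" and an HTML rendering. The function names
are hypothetical, this is not the gendoc implementation, and the sharing of
keys with other (lower) doc strings is not modelled.

```python
import re

# Annotation lines of the form: [key] -> target
ANNOTATION = re.compile(r"^\[([^\]]+)\]\s*->\s*(\S+)\s*$")
# Any bracketed span in the body; keys may contain spaces here.
MENTION = re.compile(r"\[([^\]]+)\]")


def split_annotations(docstring):
    """Separate body lines from '[key] -> target' annotation lines."""
    refs = {}
    body = []
    for line in docstring.splitlines():
        match = ANNOTATION.match(line.strip())
        if match:
            refs[match.group(1)] = match.group(2)
        else:
            body.append(line)
    return "\n".join(body), refs


def link_mentions(docstring):
    """Render known [key] mentions as HTML anchors; leave the rest alone."""
    body, refs = split_annotations(docstring)

    def render(match):
        key = match.group(1)
        if key not in refs:
            return match.group(0)  # e.g. [None]*5 stays untouched
        return '<a href="%s">%s</a>' % (refs[key], key)

    return MENTION.sub(render, body)
```

Because unknown keys are returned unchanged, no exception is needed for
ordinary list syntax in the text.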

----- Original Message -----
From: Edward Welbourne 
To: M.-A. Lemburg 
Cc: ; 'David Ascher' ;

Sent: Tuesday, November 30, 1999 11:34 AM
Subject: Re: [Doc-SIG] docstring grammar


> > Since [] is only used for lists in Python, we could
> > define the RE '\[[a-zA-Z0-9_.]+\]' for our purposes and
> > raise an exception in case the enclosed reference cannot
> > be mapped to a symbol in the global namespace (note: no
> > whitespace, no commas) which either evaluates to a function,
> > method, module or reference object.
>
> umm ... hang on, two things seem stirred up here.  The proposal I
> remember from ages ago and tried to echo has [token] and the token
> doesn't have to be intelligible to the python engine: elsewhere in the
> doc string, we'll have
>
> References:
>    [token] reference text
>
> which the parsed docstring uses to decode each use of [token] that
> appeared in the docstring.  Here, reference would normally be something
> recognised by the python engine (and would be the thing I understand you
> to be putting in [brackets]), but the Reference-handler might also cope
> with it being, e.g., an URL.  The text that ends the reference becomes
> the text of the `anchor' generated:
>
> -> ... and tried to echo has <a href="reference">text</a> and the token
...
>
> note non-appearance of [token] in the digested form: but if `text' had
> been omitted from the Reference spec, [token] is the default text
> (e.g. when what you're doing really is a citation and that's just how
> you want it to appear).  Then any uses of [None] that appear in your doc
> string, meaning `the list with one entry, None', it suffices that your
> References section doesn't have an entry for [None] - the parsed
> docstring will then just say [None] (and not even attempt to wrap an
> anchor round it).
>
> The only real relevance to forbidding [spaces within] the citation token
> is to ensure that where authors use [square brackets] for parenthetical
> remarks or as list denotations, the parser hasn't got to do the piece of
> jiggery-pokery that marks it as `maybe a xref' and obliges it to come
> back later to settle the maybe once it knows.  This cost will remain for
> [None], but it'll be well-defined that the parser marks it as a maybe,
> discovers that it isn't and settles on it being just text, not a reference.
>
> Now, it seems to me that what you were describing was slightly different
...
> am I merely confused ?
>
> Eddy.
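
A rough sketch of the two-step handling discussed above, assuming the quoted
pattern for tokens and a References section of "[token] reference text" lines.
The function names and the HTML anchor output are illustrative assumptions,
not anything specified in the thread.

```python
import re

# The pattern quoted above: no whitespace or commas inside the brackets.
TOKEN = re.compile(r"\[([a-zA-Z0-9_.]+)\]")


def parse_references(lines):
    """Read '[token] reference text' entries into {token: (reference, text)}."""
    table = {}
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) < 2:
            continue
        match = TOKEN.fullmatch(parts[0])
        if match is None:
            continue
        token = match.group(1)
        # With no text given, the default anchor text is the [token] itself.
        text = parts[2] if len(parts) == 3 else "[%s]" % token
        table[token] = (parts[1], text)
    return table


def resolve(body, table):
    """Settle each 'maybe a xref': known tokens become anchors, others stay text."""
    def settle(match):
        entry = table.get(match.group(1))
        if entry is None:
            return match.group(0)
        reference, text = entry
        return '<a href="%s">%s</a>' % (reference, text)

    return TOKEN.sub(settle, body)
```

With this split, [None] is matched as a candidate and then left as plain text,
while [square brackets] never matches because of the embedded space.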



From friedrich@pythonpros.com  Tue Nov 30 19:22:55 1999
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Tue, 30 Nov 1999 13:22:55 -0600
Subject: [Doc-SIG] docstring grammar
References: 
Message-ID: <005301bf3b68$4df4f540$f25728a1@UNITEDSPACEALLIANCE.COM>

----- Original Message -----
From: David Ascher 
To: 
Sent: Tuesday, November 30, 1999 12:02 PM
Subject: RE: [Doc-SIG] docstring grammar


> On Tue, 30 Nov 1999, Tim Peters wrote:
>
> > Luckily, it almost fits your definition of a paragraph already.  It
> > shouldn't be any real effort to declare that ">>>" introduces a
> > structureless code paragraph extending until the next all-whitespace
> > etc -- given that it's a format for Python docstrings, Python's own
> > output deserves some special treatment.
>
> The only question I suppose is whether one should require a keyword (Test:
> or other) to keep the top-level syntax trivial, or special-case the
> recognition of >>>-beginning paragraphs.
>
> I'm leaning for the former, as it can evolve to the latter if there is
> sufficient call for it from the user base, and I think it does keep the
> code simpler.  But I'm willing to be swayed.
>
> --david

I would rather minimize the invention (and consequential memorization) of
special keywords. Parsing them is not quite as trivial as it seems
(especially when alternate languages are involved). Structured text had the
favorable trait of being very easy to remember. Parsers are built using
formal definition of special case rules anyway. Where special casing based
on context becomes non-obvious to remember is where I would draw the line
and resort to literal keywords.

-Robin
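
For comparison, special-casing ">>>" paragraphs without a keyword could be as
small as the sketch below. It assumes paragraphs are separated by blank lines,
and the function name is made up for illustration.

```python
import re


def split_paragraphs(docstring):
    """Split on runs of blank lines and tag each paragraph as code or text."""
    paragraphs = []
    for block in re.split(r"\n\s*\n", docstring.strip()):
        # A paragraph whose first non-blank characters are '>>>' is treated
        # as a structureless code paragraph.
        kind = "code" if block.lstrip().startswith(">>>") else "text"
        paragraphs.append((kind, block))
    return paragraphs
```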



From fdrake@acm.org  Tue Nov 30 19:27:08 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 30 Nov 1999 14:27:08 -0500 (EST)
Subject: [Doc-SIG] Party!
Message-ID: <14404.9484.505714.922927@weyr.cnri.reston.va.us>

  Well, I turn my back for a few days of turkey-feasting and
kid-chasing, and what do I find when I turn back around?
  Great party on the list!  I'll try and actually read this before I
write too many posts.  ;-)


  -Fred

--
Fred L. Drake, Jr.	     
Corporation for National Research Initiatives


From da@ski.org  Tue Nov 30 19:58:05 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 11:58:05 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <004901bf3b60$30f49de0$f25728a1@UNITEDSPACEALLIANCE.COM>
Message-ID: 

On Tue, 30 Nov 1999, Robin Friedrich wrote:

> """
> Marking refs with [brackets], and at the end of the doc string place the
> annotations ala bibliography one per line. Key "brackets" is placed in the
> local namespace and used by other (lower) doc strings. In the gendoc
> implementation if the key doesn't match anything stored in the ref mapping
> no markup is done, so that things like [None]*5 are safe and no exception
> need be raised.
> 
> [brackets] -> http://www.howto.python.org/rtfm.html
> """

Nicely said.  I'd like to point out that the transformation I had in mind
is in fact, given the above and an HTML output:

[brackets] -> <a href="http://www.howto.python.org/rtfm.html">brackets</a>

In other words the keyword is kept until the rendering stage. I suppose
that it might be necessary to allow the reference to define a different
bit of text to render instead of the keyword.

So given:

  """
  ...
  References:

     PythonDotOrg: 
       Text: "Python's Main Website"
       Link: http://www.python.org
  """

we could have:

[PythonDotOrg] -> <a href="http://www.python.org">Python's main website</a>

Or not.  Luckily I think that issue can be left to the 'bibliography
engine', just like the bullet processing can be left to the 'list engine'.

--david

PS: I would suggest that the 'if no key exists, no markup is done'
    behavior be modifiable at runtime to 'a warning is emitted', as I
    think that this sort of silent behavior is problematic given the
    presence of typos in the world.
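
A sketch of how a 'bibliography engine' might handle this at the rendering
stage, assuming the References section has already been parsed into a mapping
with "Text" and "Link" entries. The function name and the warn_on_missing flag
are hypothetical.

```python
import re
import warnings

MENTION = re.compile(r"\[(\w+)\]")


def render_html(body, references, warn_on_missing=False):
    """Replace [Key] with an HTML anchor built from a references mapping.

    `references` maps a key to a dict with a "Link" and an optional "Text".
    """
    def render(match):
        key = match.group(1)
        entry = references.get(key)
        if entry is None:
            if warn_on_missing:
                warnings.warn("no reference found for [%s]" % key)
            return match.group(0)  # default: no markup is done
        text = entry.get("Text", key)  # fall back to the key itself
        return '<a href="%s">%s</a>' % (entry["Link"], text)

    return MENTION.sub(render, body)
```

Since the key survives until rendering, a non-HTML back end could use the same
mapping to emit its own output format.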



From fdrake@acm.org  Tue Nov 30 20:28:18 1999
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 30 Nov 1999 15:28:18 -0500 (EST)
Subject: [Doc-SIG] On David Ascher's Rant
In-Reply-To: <004101bf3a69$535a09d0$0501a8c0@bobcat>
References: 
 <004101bf3a69$535a09d0$0501a8c0@bobcat>
Message-ID: <14404.13154.994319.599412@weyr.cnri.reston.va.us>

Mark Hammond writes:
 > Me too - thanks Fred!  The doc is excellent and a thankless task!

  You're welcome!

  (Wow, all that talk about how thankless the job is, and this is the
first thank-you in the thread!  One free doc download for Mark!  ;)


  -Fred

--
Fred L. Drake, Jr.	     
Corporation for National Research Initiatives


From skip@mojam.com (Skip Montanaro)  Tue Nov 30 20:39:17 1999
From: skip@mojam.com (Skip Montanaro)
Date: Tue, 30 Nov 1999 14:39:17 -0600 (CST)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: 
References: <004901bf3b60$30f49de0$f25728a1@UNITEDSPACEALLIANCE.COM>
 
Message-ID: <14404.13813.670091.131230@dolphin.mojam.com>

In David's original proposal he wrote:

    For compatibility with Guido, IDLE and Pythonwin (and increasing the
    likelihood that the proposal will be accepted by GvR), the docstrings of
    callables must follow the following convention established in Python's
    builtins:
    
         >>> print len.__doc__
         len(object) -> integer
    
         Return the number of items of a sequence or mapping.
    
    In other words, the first paragraph must fit on a line, repeat the name
    of the callable, with a 'wordy' signature, the ' -> ' string, and the
    type of the return value.

Chiming in rather late.  Perhaps this was already discussed, but I didn't
see it in the immediate followups to David's original proposal...

The one complaint I have with the wordy signature is that it partially types
the function.  It specifies a return type, but not the input parameter
types.  Why go only halfway?  I suggest you either use type names for
parameters and return value or annotate the parameter names with types:

    len(o:sequence) -> IntType

There should be a couple shorthands, for instance, using "sequence",
"mapping" or "number" to represent objects that exhibit the given behavior,
or "object" to represent an arbitrary (untyped) parameter or return value.
Otherwise, I'd suggest the types be the names defined by the types module.
Of course, I'm ignoring the types of the elements of aggregate types.  I'll
let someone smarter make a more concrete proposal in this regard.

Why worry about this?  Well, people have been asking over and over for type
information.  This looks parseable to me, doesn't change the language, yet
could be used by a type inferencer, "safer" compiler or other type-oriented
tools.
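
To illustrate the "looks parseable" claim, here is a small sketch that splits
such a first line into name, typed parameters and return type. The function
name is made up, and treating an unannotated parameter as "object" is an
assumption based on the shorthand suggested above.

```python
import re

SIGNATURE = re.compile(r"^(\w+)\((.*)\)\s*->\s*(\w+)$")


def parse_signature(line):
    """Split 'name(arg:type, ...) -> rtype' into its parts.

    Untyped parameters default to 'object', the proposed shorthand for an
    arbitrary value.
    """
    match = SIGNATURE.match(line.strip())
    if match is None:
        return None
    name, arglist, return_type = match.groups()
    params = []
    for arg in arglist.split(","):
        arg = arg.strip()
        if not arg:
            continue
        arg_name, _, arg_type = arg.partition(":")
        params.append((arg_name.strip(), arg_type.strip() or "object"))
    return name, params, return_type
```

Types of the elements of aggregate types are ignored here, as in the message.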

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/
847-971-7098   | Python: Programming the way Guido indented...



From friedrich@pythonpros.com  Tue Nov 30 20:38:45 1999
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Tue, 30 Nov 1999 14:38:45 -0600
Subject: [Doc-SIG] docstring grammar
References: 
Message-ID: <006901bf3b72$f2ca06a0$f25728a1@UNITEDSPACEALLIANCE.COM>

From: David Ascher 
> On Tue, 30 Nov 1999, Robin Friedrich wrote:
>
> > """
> > Marking refs with [brackets], and at the end of the doc string place the
> > annotations ala bibliography one per line. Key "brackets" is placed in the
> > local namespace and used by other (lower) doc strings. In the gendoc
> > implementation if the key doesn't match anything stored in the ref mapping
> > no markup is done, so that things like [None]*5 are safe and no exception
> > need be raised.
> >
> > [brackets] -> http://www.howto.python.org/rtfm.html
> > """
>
> Nicely said.  I'd like to point out that the transformation I had in mind
> is in fact, given the above and an HTML output:
>
> [brackets] -> <a href="http://www.howto.python.org/rtfm.html">brackets</a>

grumble grumble...see below.
>
> In other words the keyword is kept until the rendering stage. I suppose
> that it might be necessary to allow the reference to define a different
> bit of text to render instead of the keyword.

Why? keywords are arbitrary strings. (may include spaces, etc.)
>
> So given:
>
>   """
>   ...
>   References:
>
>      PythonDotOrg:
>        Text: "Python's Main Website"
>        Link: http://www.python.org
>   """
>
> we could have:
>
> [PythonDotOrg] -> <a href="http://www.python.org">Python's main website</a>
>
> Or not.  Luckily I think that issue can be left to the 'bibliography
> engine', just like the bullet processing can be left to the 'list engine'.

Yes. However I really don't like the idea of HTML finding its way into the
doc string. The BiblioEngine would be told the information of the reference
and, along with what rendering mode she is in, emit the appropriate output
format, be it HTML, XML, PDF, etc.
>
> --david
>
> PS: I would suggest that the 'if no key exists, no markup is done'
>     behavior be modifiable at runtime to 'a warning is emitted', as I
>     think that this sort of silent behavior is problematic given the
>     presence of typos in the world.

Agreed.




From da@ski.org  Tue Nov 30 20:56:10 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 12:56:10 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <006901bf3b72$f2ca06a0$f25728a1@UNITEDSPACEALLIANCE.COM>
Message-ID: 

On Tue, 30 Nov 1999, Robin Friedrich wrote:

> > Nicely said.  I'd like to point out that the transformation I had in mind
> > is in fact, given the above and an HTML output:
> >
> > [brackets] -> <a href="http://www.howto.python.org/rtfm.html">brackets</a>
> 
> grumble grumble...see below.
> >
> > In other words the keyword is kept until the rendering stage. I suppose
> > that it might be necessary to allow the reference to define a different
> > bit of text to render instead of the keyword.
> 
> Why? keywords are arbitrary strings. (may include spaces, etc.)

We should watch our language =).  Keywords in my proposal are things
before :'s which lead a paragraph and cannot contain whitespaces. Maybe we
don't need that restriction on things in []'s.

> >   References:
> >
> >      PythonDotOrg:
> >        Text: "Python's Main Website"
> >        Link: http://www.python.org

> Yes. However I really don't like the idea of HTML finding its way into
> the doc string. The BiblioEngine would be told the information of the reference
> and, along with what rendering mode she is in, emit the appropriate output
> format, be it HTML, XML, PDF, etc.

I don't recall putting HTML in the docstring.  Just a URL.





From da@ski.org  Tue Nov 30 21:01:23 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 13:01:23 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <14404.13813.670091.131230@dolphin.mojam.com>
Message-ID: 

On Tue, 30 Nov 1999, Skip Montanaro wrote:

> The one complaint I have with the wordy signature is that it partially types
> the function.  It specifies a return type, but not the input parameter
> types.  Why go only halfway?  I suggest you either use type names for
> parameters and return value or annotate the parameter names with types:
> 
>     len(o:sequence) -> IntType

I propose to defer this discussion.  I think it's a fine idea in general,
but raises a whole bunch of issues, and mixes with other threads like
typing etc.  Furthermore, the current uses of this first line (popups in
IDLE and Pythonwin) might suffer from a significant lengthening of said
line.  Getting the type information in the docstring is however a worthy
goal, but perhaps best left for a subsection:

  Arguments:
     o (sequence) -- an arbitrary sequence object
     
I'd like to finalize the top-level structure, get it in front of GvR's
eyeballs, and then we can tackle each subtopic (so far: list processing,
reference handling, signature, mandatory keywords, keyword registration
process, multilingual keyword support, etc.) at a later date.

--david
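
One possible reading of the "Arguments:" subsection, sketched under the
assumption that each entry is a single line of the form
"name (type) -- description". The helper name is hypothetical.

```python
import re

# One entry per line: name (type) -- description
ARGUMENT = re.compile(r"^\s*(\w+)\s*\(([^)]*)\)\s*--\s*(.*)$")


def parse_arguments(section_lines):
    """Return [(name, type, description)] for lines in an Arguments: body."""
    entries = []
    for line in section_lines:
        match = ARGUMENT.match(line)
        if match:
            name, type_name, description = match.groups()
            entries.append((name, type_name.strip(), description.strip()))
    return entries
```

This keeps the type information out of the one-line signature, so popups in
IDLE and Pythonwin would not grow longer.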



From friedrich@pythonpros.com  Tue Nov 30 21:33:42 1999
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Tue, 30 Nov 1999 15:33:42 -0600
Subject: [Doc-SIG] docstring grammar
References: 
Message-ID: <008501bf3b7a$93855160$f25728a1@UNITEDSPACEALLIANCE.COM>

My bad.
----- Original Message -----
From: David Ascher 
> > > [brackets] -> <a href="http://www.howto.python.org/rtfm.html">brackets</a>

I was interpreting the above as a doc string rewrite of my
[brackets] -> http://www.howto.python.org/rtfm.html
*in* the doc string.  Sorry.

> > Why? keywords are arbitrary strings. (may include spaces, etc.)
>
> We should watch our language =).  Keywords in my proposal are things
> before :'s which lead a paragraph and cannot contain whitespaces. Maybe we
> don't need that restriction on things in []'s.
>
> > >   References:
> > >
> > >      PythonDotOrg:
> > >        Text: "Python's Main Website"
> > >        Link: http://www.python.org

Hmmm.  Gosh we need a glossary quick! Yup, we had different notions of
"keyword".
Do you really want arbitrary DAkeywords (stuff before colons) usable for
internal/external references?  Since this confused me, I might conclude that
it would confuse others as well.
I would have placed the following in my doc string and been satisfied...
""".....
    For further information visit:
        [Python Language Web Site] is the main source for Python itself.
        [Starship Python] houses a number of Python user resources.

[Python Language Web Site] -> http://www.python.org
[Starship Python] -> http://starship.python.net
"""
Intuitively I don't think of the word "visit" as a keyword that can be
referenced, while anything in brackets seems fair game. What other features
did you have in mind?
Dejavu'ly yours,
Robin



From uche.ogbuji@fourthought.com  Tue Nov 30 22:01:04 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 30 Nov 1999 15:01:04 -0700
Subject: [Doc-SIG] docstring grammar
In-Reply-To: Your message of "Sun, 28 Nov 1999 16:57:03 PST."
 
Message-ID: <199911302201.PAA02300@localhost.localdomain>

> Proposed format for docstrings:
> 
>   The whitespace at the beginning of a docstring is ignored.
> 
>   Paragraphs are separated by one or more blank lines.
> 
>   For compatibility with Guido, IDLE and Pythonwin (and increasing the
>   likelihood that the proposal will be accepted by GvR), the
>   docstrings of callables must follow the following convention
>   established in Python's builtins:
> 
>        >>> print len.__doc__
>        len(object) -> integer
> 
>        Return the number of items of a sequence or mapping.

The only thing I'd _maybe_ suggest in order to allow some structure is to 
eliminate the non-keyword sections:

        >>> print len.__doc__
        sig:: len(object) -> integer
 
        desc:: Return the number of items of a sequence or mapping.

I know this loses a bit from the point of view of the user's readability, but 
it would provide some structure which increases the author's flexibility, and 
makes conversion to "library format" easier.
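
A sketch of what the fully keyword-modified form buys a parser, assuming every
section starts with "word::" and any other non-blank line continues the
current section. The function name is illustrative only.

```python
import re

# A keyword is a word followed by a double colon, as in 'sig::' or 'desc::'.
KEYWORD = re.compile(r"^\s*([A-Za-z_][\w-]*)::\s*(.*)$")


def parse_sections(docstring):
    """Map each 'keyword::' to its text; continuation lines are appended."""
    sections = {}
    current = None
    for line in docstring.splitlines():
        match = KEYWORD.match(line)
        if match:
            current = match.group(1)
            sections[current] = match.group(2).strip()
        elif current is not None and line.strip():
            sections[current] = (sections[current] + " " + line.strip()).strip()
    return sections
```

Blank lines carry no meaning in this version, which is why the requirement to
separate paragraphs with whitespace could be dropped.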

Otherwise, your proposal seems a good start.

> Miscellaneous Thoughts:
> 
>   I chose double-colon notation for keywords so that one can have text
>   paragraphs which match the 'word:' notation without having them be
>   interpreted as keywords.

There are other conventions that would work, but '::' is as good as any.

>   Does this proposal make docstrings whitespace-heavy -- the
>   requirement to break each paragraph with a line of whitespace
>   means that a lot of lines are blank, especially when doing
>   'bulleted lists'

I would suggest dropping the requirement, which can be done if everything is 
keyword-modified.

>   The above was (quickly) written with parsing in mind.  Is it really
>   easily parseable?  If not, what needs to be changed so that it is
>   parseable?

I see no major parsing problems.  Bullets might be a bit of a bore, but 
nothing to kill progress.

>   Are there normal uses in docstrings where one wants to turn off the
>   automatic link detection?

I think we can come up with a basic escaping mechanism for this.  Maybe by 
preceding not-to-be-processed URLs and link keywords with '!'.

>   Is there value in having string interpolation?  David Arnold mentioned
> 
>        __version__ = "$Revision$"[11:-2]
>        __date__ = "$Date$"

I'd say leave this to a later version.

> PS: It goes without saying that while I railed against design by
> committee, I am of course hopeful for feedback, for technical reasons
> (dummy, you forgot special cases X, Y and Z!) and because I realize that a
> standards proposal needs at least broad agreement if not consensus to be
> effective in the long run.  The sharper-eyed will note that I stacked the
> deck in my favor in the above proposal by including what Guido does
> naturally as valid in the proposed grammar.

Damn the politics.  Full speed ahead.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From da@ski.org  Tue Nov 30 22:06:39 1999
From: da@ski.org (David Ascher)
Date: Tue, 30 Nov 1999 14:06:39 -0800 (Pacific Standard Time)
Subject: [Doc-SIG] docstring grammar
In-Reply-To: <008501bf3b7a$93855160$f25728a1@UNITEDSPACEALLIANCE.COM>
Message-ID: 

On Tue, 30 Nov 1999, Robin Friedrich wrote:

> Hmmm.  Gosh we need a glossary quick! Yup, we had different notions of
> "keyword".

  A keyword is a case-sensitive string which:
      - starts a paragraph
      - matches  '^ *[a-zA-Z_]+[\-a-zA-Z_0-9]*: +' 
        (Python identifiers with the addition of hyphens and which end
        with a : and one or more spaces)

As (I think it was) Tibs mentioned, it's syntactic sugar for XML
notation, with the same aim of making a 'labeled' hierarchy.  Maybe the
word 'Label' is better.

  Foo:
    this is the body of foo
    which spans multiple lines

is isomorphic to

  <Foo>
  this is the body of foo
  which spans multiple lines
  </Foo>
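
The label-to-XML mapping can be sketched as follows. Note the assumptions: the
pattern is a variant of the one above that captures the label and also accepts
a label alone on its line (as in the Foo: example, where nothing follows the
colon), and the function name is hypothetical.

```python
import re
from xml.sax.saxutils import escape

# Variant of the pattern above: captures the label, and allows the colon to
# be followed either by spaces or by the end of the line.
LABEL = re.compile(r"^ *([a-zA-Z_]+[\-a-zA-Z_0-9]*):(?: +|$)", re.MULTILINE)


def paragraph_to_xml(paragraph):
    """Turn one labeled paragraph into an element named after its label."""
    match = LABEL.match(paragraph)
    if match is None:
        return None  # not a labeled paragraph
    label = match.group(1)
    body_lines = paragraph[match.end():].splitlines()
    body = "\n".join(line.strip() for line in body_lines if line.strip())
    return "<%s>\n%s\n</%s>" % (label, escape(body), label)
```

A paragraph such as "Note that: ..." is not treated as labeled, because the
text before the colon contains a space.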

> Do you really want arbitrary DAkeywords (stuff before colons) usable for
> internal/external references?  Since this confused me, I might conclude that
> it would confuse others as well.

No.  I intend only the DAKeywords listed in a special "References:"
section to be available as the targets of references (see below).

> I would have placed the following in my doc string and been satisfied...
> """.....
>     For further information visit:
>         [Python Language Web Site] is the main source for Python itself.
>         [Starship Python] houses a number of Python user resources.
> 
> [Python Language Web Site] -> http://www.python.org
> [Starship Python] -> http://starship.python.net
> """

This is, I would assume, harder to parse -- you must have some implicit
rules in there regarding which [Starship Python] is a 'mention of
something else' and which is a 'this is the thing I mentioned'.  Is it the
sequential order, the 0-indent?

My vision for the same semantics as above was:

 """.....
      For further information visit:
         [PythonLanguageWebSite] is the main source for Python itself.
         [StarshipPython] houses a number of Python user resources.

      References:
         PythonLanguageWebSite:  http://www.python.org
         StarshipPython: http://starship.python.net
 """

Which leaves open the question of how we can have 'space-enabled' labels
for references which can't have spaces in them.  

One idea is to tag the [] markup with a ="stringlabel":

         [PythonLanguageWebSite="The Python.org website"] is the main
          source for Python itself.

Another possibility hinted at previously is to enrich the References
section:

     References:
        PythonLanguageWebSite:
          Label: The Python.org website
          Link: http://www.python.org

either of which, when rendered, would 'do the right thing'.  I only expect
this to be an issue when referring to URLs.  Python modules, classes and
functions already have perfectly good names.  For things which are more
like *real* bibliographic references, I'd be just as happy with the
conventional [keyword] notation seen in many CS papers.

     See [ascher29] for the source of the algorithm.

     References:
       ascher29: My famous Ph.D. Dissertation, Foo University, 2029.

which would get rendered just the way it looks on your screen even in a
printed format.
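
A sketch of the first idea, the ="stringlabel" tag inside the brackets,
assuming the References section has already been reduced to a name-to-URL
mapping. The function name and the HTML output are assumptions for
illustration.

```python
import re

# [Name] or [Name="display label"]; Name itself has no spaces.
MENTION = re.compile(r'\[([A-Za-z_][\w-]*)(?:="([^"]*)")?\]')


def render_mentions(body, links):
    """Render mentions using `links`, a {name: url} map from References:."""
    def render(match):
        name, label = match.group(1), match.group(2)
        url = links.get(name)
        if url is None:
            return match.group(0)  # e.g. [ascher29] with no URL stays as-is
        return '<a href="%s">%s</a>' % (url, label or name)

    return MENTION.sub(render, body)
```

A bibliographic key such as [ascher29] has no URL in the mapping, so it is
rendered just the way it looks in the source.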

> Intuitively I don't think of the word "visit" as a keyword that can be
> referenced, while anything in brackets seems fair game. What other features
> did you have in mind?

I don't understand the above paragraph.  The word 'visit' isn't a
DAKeyword because it wasn't starting a paragraph.

--david

PS: I'm working on updating the proposal, but I have other pressing
    deadlines (such as getting the JPython tutorial ready for IPC8!), so
    it may not be ready for a couple of days.




From uche.ogbuji@fourthought.com  Tue Nov 30 22:27:16 1999
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 30 Nov 1999 15:27:16 -0700
Subject: [Doc-SIG] docstring grammar
In-Reply-To: Your message of "Mon, 29 Nov 1999 09:50:49 GMT."
 <000701bf3a4f$375a85d0$f0c809c0@lslp7o.lsl.co.uk>
Message-ID: <199911302227.PAA02358@localhost.localdomain>

> David Ascher wrote:
> >   Paragraphs are separated by one or more blank lines.
> 
> As you say later on, I think this does cause some over-use of whitespace...

Agreed.  Let's kill them.

> >   Characters between # signs and the end of the line are stripped by
> >   the docstring parser.
> 
> This is a Bad Thing - I have quite often needed to discuss things in doc
> strings which include use of the "#" character - not least if I'm parsing a
> little language that uses "#" as its comment character! So losing stuff thus
> would be difficult. Either (a) why do we need comments in doc strings, or
> (b) provide a way to escape the "#" character.

I forgot to mention this in my original reply.  I also think that this is a 
bad idea.  I don't think we need meta-comments for the doc-strings.  I don't 
like the idea even if we find a way to escape '#'.

> but the above gets oververbose. I suppose one could instead use a list
> syntax:
> 
> 	Contributors::
> 		- John Doe
> 		- Ronald Reagan
> 		- Francois Mitterand

Yes, and this goes with what David had in his proposal about bullets.

> since I don't see the ambiguity in allowing the omission of the vertical
> whitespace here, *if* one allows that some care would be needed with
> hyphenation! (i.e., one can't allow one's hyphens to start a line, which is
> awkward but probably not too bad). Another possibility might be to allow
> "Python list" syntax - I started off disliking this, but over the last few
> minutes it has grown on me:
> 
> 	Contributors::
> 		[ John Doe,
> 		  Ronald Reagan,
> 		  Francois Mitterand ]
> 
> (again, hijacking Python's syntax).

Again as long as we don't go having meta-compilation in the first version of 
the system.

> No, on thinking about it, I would vote for either:
> 
> 	1) use of white space as David proposes
> 	   (pro: utter simplicity,
> 	    con: doesn't quite look as nice as I'd like)
> 	2) allow Python list syntax
> 	   (pro: emphasises this is for short lists,
> 	    con: a bit odd)
> 	3) detect bullet characters at the "start of line"
> 	   (pro: still fairly simple,
> 	    con: one has to take care about, e.g., dashes in text)
> 	   Ah - I just realised that negative numbers at the start of a line
> 	   probably kill that one...

This one is also a bit ugly, but how about a hybrid:

List
	[
	* item 1
	* item 2
		[
		* sub-item 1
		* sub-item 2
		]
	* item 3
	]
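
A rough sketch of a parser for this hybrid form, assuming "[" and "]" each sit
on their own line and every item starts with "* ". The function name is made
up for illustration.

```python
def parse_list(lines):
    """Parse '[' ... ']' delimited bullet lists into nested Python lists.

    Each '* item' becomes a string; a bracketed block becomes a sub-list
    placed right after the item it follows.
    """
    stack = [[]]
    for raw in lines:
        line = raw.strip()
        if line == "[":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif line == "]":
            if len(stack) == 1:
                raise ValueError("unbalanced ']'")
            stack.pop()
        elif line.startswith("* "):
            stack[-1].append(line[2:].strip())
        elif line:
            raise ValueError("unexpected line: %r" % line)
    if len(stack) != 1:
        raise ValueError("unbalanced '['")
    return stack[0]
```

Because nesting comes from the explicit brackets, indentation is not
significant, and dashes or negative numbers at the start of a line are not
mistaken for bullets.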


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org