From MHammond@skippinet.com.au Mon Mar 2 06:49:14 1998
From: MHammond@skippinet.com.au (Mark Hammond)
Date: Mon, 2 Mar 1998 16:49:14 +1000
Subject: [DOC-SIG] Xrefs
Message-ID: <025901bd45a7$9b7e42c0$0a01a8c0@skippy.skippinet.com.au>

-----Original Message-----
From: Robin Friedrich
To: Doc-SIG
Date: Tuesday, 24 February 1998 7:59
Subject: Re: [DOC-SIG] Xrefs

> ...
>.. [O'Reilly] http://www.oreilly.com/catalog/html2/index.html
>for an ordinary URL, or:
>.. [Wizbang] win32api:spamDialog.toolBar.wizBangPrime
>for an external reference object called win32api, or:
>.. [Geo-model] self.GeoPotentials.Model
>to point to an object within our current package with an absolute path.
>Since we mandate full pathing to python objects we don't need to specify
>what the starting point of the reference is. Note also that the bracket
>highlighted text does not have to correspond to the object it's pointing to.

This does seem pretty good, and the last word so far on the matter. This
fits well with the existing code, and I imagine that if someone _really_
wanted the completely inline style, they could implement it with "<" tags.

I'm probably way over-generalising, but it does seem a good idea to reserve
some tokens for additional growth - I don't know - say an inline image
reference (ok - bad example - umm - anyway). Maybe we could reserve "<" and
">" as special indicators to allow future growth. At this stage, all it
means is that you must escape literal "<" and ">"...

Either way, the scheme above seems to meet my requirements, so we can steam
ahead :-) How exactly do we get started (he says, suffering work overload
as it is :-)

Are there still design issues to resolve?

eg, I saw the note saying "you must explicitly reference all objects". I
don't like that idea - IMO, you should require the same level of reference
that the code would. Eg:

def foo():
    pass
def bar():
    """ See also [foo function]

    ..[foo function] foo # Should not need to know the full location
    """

Or did I misunderstand?
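
A minimal sketch, in present-day Python, of how reference definitions like
the ones quoted above might be recognised. The names parse_xref and
XREF_DEF, and the three-way split into "url", "external" and "internal"
targets, are assumptions made for illustration; the thread does not
specify how gendoc actually handles these lines.

```python
import re

# ".. [label] target", with optional whitespace after the two dots so that
# the "..[foo function] foo" spelling used in the docstring also matches.
XREF_DEF = re.compile(r"^\.\.\s*\[(?P<label>[^\]]+)\]\s+(?P<target>\S+)")


def parse_xref(line):
    """Return (label, kind, target) for a reference definition, else None."""
    match = XREF_DEF.match(line.strip())
    if match is None:
        return None
    label, target = match.group("label"), match.group("target")
    if re.match(r"^[a-z][a-z0-9+.-]*://", target, re.IGNORECASE):
        kind = "url"        # ordinary URL
    elif ":" in target:
        kind = "external"   # "win32api:..." style external reference object
    else:
        kind = "internal"   # dotted path to a Python object
    return label, kind, target
```

Under this sketch the short form asked for in the message, a bare "foo"
target, is simply classified as internal; deciding which scope it resolves
in is left open, as it is in the discussion.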

Thanks,

Mark.

_______________
DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________

From friedrich@pythonpros.com Mon Mar 2 13:52:09 1998
From: friedrich@pythonpros.com (Robin Friedrich)
Date: Mon, 02 Mar 1998 07:52:09 -0600
Subject: [DOC-SIG] Xrefs
References: <025901bd45a7$9b7e42c0$0a01a8c0@skippy.skippinet.com.au>
Message-ID: <34FAB989.2082CE47@pythonpros.com>

Mark Hammond wrote:
> I'm probably way over-generalising, but it does seem a good idea to reserve
> some tokens for additional growth - I don't know - say an inline image
> reference (ok - bad example - umm - anyway). Maybe we could reserve "<" and
> ">" as special indicators to allow future growth. At this stage, all it
> means is that you must escape literal "<" and ">"...

I might add here that such tokens need not be escaped because we have
adopted the structured text approach of inferring markup based on context.
For example, if we want to assign special meaning to "<" and ">", then the
rule would be written so that they are only special when set off by
whitespace on only the outside. That means would be special and 1< x < 5
would not. (or something to that effect) I don't want to see 1 \< x \< 5
in my doc strings! But still I will always defend the policy of minimizing
inline markup on legibility grounds.

> Either way, the scheme above seems to meet my requirements, so we can steam
> ahead :-) How exactly do we get started (he says, suffering work overload
> as it is :-) Are there still design issues to resolve?

Not really. Now some more detailed architecture (APIs etc.) is in order.
Daniel outlined the classes a while back and we need to follow up on that,
resulting in a detailed module/package division of functions and a firm API
for the objects.

> eg, I saw the note saying "you must explicitly reference all objects". I
> don't like that idea - IMO, you should require the same level of reference
> that the code would.
> Eg:
>
> def foo():
>     pass
> def bar():
>     """ See also [foo function]
>
>     ..[foo function] foo # Should not need to know the full location
>     """
>
> Or did I misunderstand?

didn't misunderstand. That's because it was thought that it would make it
easier for the reader (of the source) to know which object without having to
calculate the python scoping rules and deduce the correct path. Maybe this is
unnecessary. It would make things somewhat more complicated for gendoc
though. If you guys think it's easy enough, fine (I hereby volunteer NOT to
code this bit).

--
Robin K. Friedrich                  Houston, Texas
Python Professional Services, Inc.  friedrich@pythonpros.com
http://www.pythonpros.com

From da@skivs.ski.org Tue Mar 3 02:35:27 1998
From: da@skivs.ski.org (David Ascher)
Date: Mon, 2 Mar 1998 18:35:27 -0800 (PST)
Subject: [DOC-SIG] Xrefs
In-Reply-To: <34FAB989.2082CE47@pythonpros.com>
Message-ID:

On Mon, 2 Mar 1998, Robin Friedrich wrote:

> > eg, I saw the note saying "you must explicitly reference all objects". I
> > don't like that idea - IMO, you should require the same level of reference
> > that the code would. Eg:
> >
> > def foo():
> >     pass
> > def bar():
> >     """ See also [foo function]
> >
> >     ..[foo function] foo # Should not need to know the full location
> >     """
> >
> > Or did I misunderstand?
>
> didn't misunderstand. That's because it was thought that it would make it
> easier for the reader (of the source) to know which object without having to
> calculate the python scoping rules and deduce the correct path. Maybe this is
> unnecessary. It would make things somewhat more complicated for gendoc
> though. If you guys think it's easy enough, fine (I hereby volunteer NOT to
> code this bit).

Do we really want to limit ourselves to Python scoping rules (e.g. the
two-scope rule) in a purely textual description?
It strikes me that Python's rules, which make some sense in the context of
evaluated code, make very little sense in the context of documentation.

E.g. I think it'd be nice to be able to have:

class Klass:
    def f3():
        print 'Ni!'

def f1():
    foo = 'SPAM!'

def f2():
    """ and here I refer to f1, Klass, f2, Klass.f3, and f1.foo """
    something...

After all, .py files are pretty static...

--da

From papresco@technologist.com Wed Mar 11 14:28:51 1998
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 11 Mar 1998 09:28:51 -0500
Subject: [DOC-SIG] What does this mean for Python?
Message-ID: <35069FA2.1AE651AC@technologist.com>

http://www.perl.com/perl-xml.html

----
How to Make Perl The Language of Choice for XML

Perl has been the language of choice for anyone doing serious text
processing. Now efforts are underway to make Perl the language of choice for
those doing "structured" text processing using the Extensible Markup
Language (XML).

The XML 1.0 specification was recently (Feb. 10, 1998) released as a
recommendation by the World Wide Web Consortium. XML is a subset of SGML
(Standard Generalized Markup Language) and it seems to be emerging as a
universal syntax for defining non-proprietary document markup and data
formats. XML made significant changes to SGML to reflect the nature of the
Web and to make it easier to build tools that process XML.

Tim Bray, co-editor of the XML 1.0 specification, has used Perl extensively
for huge text processing applications. He had a special interest in seeing a
bridge built from Perl to XML -- one that would make it simple for
programmers to process XML data. So, out of this interest, a small group of
developers met at O'Reilly & Associates in Sebastopol, California for a
one-day Perl/XML summit.
In addition to Tim, those attending the summit were:

    Larry Wall, creator of Perl, and senior developer, O'Reilly & Associates
    Dick Hardt, developer of Perl for Win 32, and Chief Technology Officer,
        ActiveState Tool Corp.
    Tim O'Reilly, President and CEO, O'Reilly & Associates
    Dale Dougherty, CEO, Songline Studios
    Gina Blaber, Director, Software Products Group, O'Reilly & Associates.

"In the design of XML, we were continuously mindful of the need to enable
the fast, efficient creation of scripts and programs for processing XML,"
says Tim Bray.
----

My commentary:

Perl has nothing to recommend it over Python right now. In fact this
February's Dr. Dobb's already has an article on Python and XML. The only
snag is Unicode support. Perl doesn't have Unicode support but Larry has
promised it.

"One of the summit group's first priorities is to get Perl working with
Unicode (ISO 10646). Unicode enables code to be easily translated into other
languages; XML requires Unicode. Larry Wall will lead the team working on
this task."

Many people equate CGI and Perl. I would hate to see that happen with XML
and hope I can help to stop that from happening. In the short term, I will
integrate JPython with a Java XML parser and write a tutorial on how to use
that (JPython inherits Unicode support from Java, right?).

 Paul Prescod  - http://itrc.uwaterloo.ca/~papresco

Can we afford to feed that army, while so many children are naked and hungry?
Can we afford to remain passive, while that soldier-army is growing so massive?
  - "Gabby" Barbadian Calypsonian in "Boots"

From akuchlin@CNRI.Reston.VA.US Wed Mar 11 14:55:26 1998
From: akuchlin@CNRI.Reston.VA.US (Andrew Kuchling)
Date: Wed, 11 Mar 1998 09:55:26 -0500
Subject: [DOC-SIG] [STRING-SIG] What does this mean for Python?
In-Reply-To: <35069FA2.1AE651AC@technologist.com>
References: <35069FA2.1AE651AC@technologist.com>
Message-ID: <199803111455.JAA23221@newcnri.CNRI.Reston.Va.US>

Paul Prescod writes:
>How to Make Perl The Language of Choice for XML

Thanks for finding this, Paul. Now, how should we respond?

1) The String-SIG's been pretty dead lately; I've been posting the odd
bugfix patch for the PCRE code, and that's about it. Can we please start
considering a Unicode string type? This would kill two birds with one stone,
since Unicode is important both for XML and for Mark Hammond's PythonWin.

2) The JPython idea is a good one.

3) What about XML support for CPython? I'd like to be able to do XML
processing without requiring external programs such as SP or nsgmls. Writing
an XML DTD parser, and after that a well-formedness verifier, has therefore
been on my project list for a bit. I'll push it up in importance. Once we
can parse DTDs, we could write an XML parser that created a tree (or grove,
or whatever the precise terminology is) for a document. (A module that read
SP's output would still be useful, of course.)

4) What else is there that could be done? Perhaps, if the attempt to convert
the documentation to XML is begun, that large application will drive
development of further XML tools.

What seem like useful deliverables?

A.M. Kuchling                   http://starship.skyport.net/crew/amk/
Dream casts a human shadow, when it occurs to him to do so.
    -- From SANDMAN: "Season of Mists", episode 0

From hugunin@CNRI.Reston.VA.US Wed Mar 11 15:27:27 1998
From: hugunin@CNRI.Reston.VA.US (Jim Hugunin)
Date: Wed, 11 Mar 1998 10:27:27 -0500
Subject: [DOC-SIG] Re: [STRING-SIG] What does this mean for Python?
References: <35069FA2.1AE651AC@technologist.com>
Message-ID: <3506AD5F.9AC854AD@cnri.reston.va.us>

Paul Prescod wrote:
> I will integrate JPython with a Java XML parser and write a tutorial on
> how to use that (JPython inherits Unicode support from Java, right?).

This sounds cool. JPython does inherit Unicode support from Java in the
standard string objects. The string and re modules are also designed to
handle Unicode strings. I should warn you that I haven't tested this
functionality much at all (the curse of being an English speaker).

-Jim

From jefu@k2.knowledge2000.com Wed Mar 11 15:44:50 1998
From: jefu@k2.knowledge2000.com (Jefu!)
Date: Wed, 11 Mar 1998 08:44:50 -0700
Subject: [DOC-SIG] [STRING-SIG] What does this mean for Python?
In-Reply-To: Your message of "Wed, 11 Mar 1998 09:55:26 EST." <199803111455.JAA23221@newcnri.CNRI.Reston.Va.US>
Message-ID: <199803111544.IAA30127@k2.knowledge2000.com>

I think that perhaps the most useful thing that can be done to pry xml away
from the perl evangelists would be to work on making XSL a more attractive
alternative.

To do this XSL probably needs to be seen as a general XML to XML
transformation language (or perl (PML?) will fill that niche). Then
construct a version of XSL with embedded Python (for string manipulation and
suchlike) and with Python objects which could be used for constructing XML
structures. Finally build a simple display engine and voila....

(I've been actually thinking about doing something like this myself maybe
trying to use jade as a back end but jade is so complex...)

Though it seems so futile to resist the perl-borg juggernaut.
I had to fight hard to write cgi scripts in Python instead of perl - after
all "perl is the only language for CGI"

--
jeff putnam - jefu@knowledge2000.com - knowledge 2000

From papresco@technologist.com Wed Mar 11 16:18:16 1998
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 11 Mar 1998 11:18:16 -0500
Subject: [DOC-SIG] [STRING-SIG] What does this mean for Python?
References: <35069FA2.1AE651AC@technologist.com> <199803111455.JAA23221@newcnri.CNRI.Reston.Va.US>
Message-ID: <3506B948.A2DB2A23@technologist.com>

On the Python marketing side, we could actually use this as an opportunity
to get some publicity (if that interests the powers that be). Media loves a
horse race and if we promote comparisons with Perl, people will at least
know that Perl isn't the only language in the class. "We have nothing to
lose but our obscurity."

Andrew Kuchling wrote:
>
> 1) The String-SIG's been pretty dead lately; I've been posting
> the odd bugfix patch for the PCRE code, and that's about it. Can we
> please start considering a Unicode string type? This would kill two
> birds with one stone, since Unicode is important both for XML and for
> Mark Hammond's PythonWin.

And also more generally for being the best scripting language in the world
:) (and not just in English speaking countries).

> 3) What about XML support for CPython? I'd like to be able to
> do XML processing without requiring external programs such as SP or
> nsgmls. Writing an XML DTD parser, and after that a well-formedness
> verifier, has therefore been on my project list for a bit. I'll push
> it up in importance. Once we can parse DTDs, we could write an XML
> parser that created a tree (or grove, or whatever the precise
> terminology is) for a document. (A module that read SP's output would
> still be useful, of course.)

I've written that latter (nsgmls output) module. I haven't done a lot of
Python work since the Rise of XML, so I have nothing XML specific.

I would say that instead of writing an XML parser in Python (probably not
fast enough), or writing one from scratch in C (a bunch of needless work),
we should start with James Clark's XMLTok, which is written in ANSI C.

The "tree" you describe should probably be a W3C DOM[1]. That spec. isn't
totally solid yet, but it is usually better to conform to a shifting
standard than a completely proprietary API of your own designing. There
should also be an event interface based on SAX[2].

[1] http://www.w3.org/TR/WD-DOM/
[2] http://www.microstar.com/XML/SAX/

I think that JPython gets most of this for "free" with a very little bit of
glue. All I need to do is document how to use the glue.

 Paul Prescod  - http://itrc.uwaterloo.ca/~papresco

Can we afford to feed that army, while so many children are naked and hungry?
Can we afford to remain passive, while that soldier-army is growing so massive?
  - "Gabby" Barbadian Calypsonian in "Boots"

From digitome@iol.ie Wed Mar 11 15:58:22 1998
From: digitome@iol.ie (Sean Mc Grath)
Date: Wed, 11 Mar 1998 15:58:22 GMT
Subject: [DOC-SIG] [STRING-SIG] What does this mean for Python?
Message-ID: <199803111558.PAA21797@mail.iol.ie>

>Paul Prescod writes:
>>How to Make Perl The Language of Choice for XML
[Press release snipped]
> [Andrew Kuchling]
> Thanks for finding this, Paul. Now, how should we respond?

I was hoping to un-bury myself enough to get started on some of the stuff
below but so far no joy! With the help of experienced Python extension
module developers we can make a lot of progress in a short time on this one
and we *need* to.
I know I speak for Paul Prescod and other SGML/XML people who are Python
fans in saying that any XML related help needed by Python people working on
this stuff will be provided by us lot double quick!

Guys, this XML thing is really moving. Python has more power in its little
finger for XML processing than pretty much any language I can think of.
There is an opportunity here to grab the XML wave and show Python off to the
world as the killer language it is. There is also an opportunity here to be
lost. Time is of the essence!

1. We need to take James Clark's C implementation of a non-validating XML
   parser and wrap it as a Python extension module. Until such time as
   Python does Unicode it will only be able to handle 8 bit character sets
   and good old UTF-8. James has specifically designed it to be integrated
   into other applications. I do not think this would take very long and
   was hoping to have a shot at it myself:-( Volunteer C extension
   developers, please take one step forward.

2. We need to provide a SAX based interface to the parser (event based)
   (SAX is an emerging standard API for XML parsers)

3. We need to provide a DOM interface (tree based)
   DOM = Document Object Model - a W3C initiative to develop a
   language independent read/write interface to HTML/XML documents

4. We need to move on Unicode

5. We need to move on accessing Java XML parsers via JPython
   (go for it Paul!)

6. I need to get the finger out and make Lumberjack freely available

7. We need to implement the XSL stylesheet language using Python as
   the scripting environment instead of JavaScript

I have a small amount of Python stuff in an upcoming book on XML. I could
have a helluva lot more if the XML non-validating extension module existed,
occupied space and exerted gravitational force.
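
The event-based versus tree-based split in items 2 and 3 can be illustrated
with a small sketch in present-day Python. It relies on two standard library
modules that did not exist when this message was written: xml.parsers.expat,
which wraps Expat, James Clark's C parser, and xml.dom.minidom. The sample
document and the helper names event_trace and speakers are made up for
illustration only.

```python
import xml.parsers.expat
from xml.dom import minidom

# Hypothetical sample document, loosely modelled on marked-up plays.
DOC = "<play><speech speaker='HAMLET'><line>To be</line></speech></play>"


def event_trace(text):
    """Event-based: collect parser callbacks in document order."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = (
        lambda name, attrs: events.append(("start", name, dict(attrs))))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.CharacterDataHandler = lambda data: events.append(("text", data))
    parser.Parse(text, True)  # True marks this as the final chunk of input
    return events


def speakers(text):
    """Tree-based: build a DOM, then query it after parsing has finished."""
    dom = minidom.parseString(text)
    return [node.getAttribute("speaker")
            for node in dom.getElementsByTagName("speech")]
```

The event style never holds the whole document in memory, which is why it
suits a thin wrapper around a C parser; the tree style costs more but lets
the caller revisit any node once parsing is done.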

Sean

From larsga@ifi.uio.no Wed Mar 11 16:55:39 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 11 Mar 1998 17:55:39 +0100
Subject: [DOC-SIG] What does this mean for Python?
In-Reply-To: <35069FA2.1AE651AC@technologist.com>
References: <35069FA2.1AE651AC@technologist.com>
Message-ID:

* Paul Prescod
|
| Many people equate CGI and Perl. I would hate to see that happen
| with XML and hope I can help to stop that from happening. In the
| short term, I will integrate JPython with a Java XML parser and
| write a tutorial on how to use that

For the last week I've been working on a validating XML parser in pure
Python that builds a grove-like tree. Initially, I built on xmllib and added
a DTD parser, a catalog file handler and some grove objects. I'm now making
my own replacement for xmllib (xmlproc), which I'm planning to integrate
with the existing validator/grove builder.

At this stage I'm able to produce basic ESIS output from xmlproc and the
validator/tree builder is advanced enough to parse Tim Bray's plays and
religious texts. In fact I wrote a small script that went through the tree
and counted the number of speeches and lines for each character in a
play.[1]

The downside is that it takes 50 seconds to parse Hamlet (275 k) on my Win95
Pentium 166MHz, which is much too slow.

This is how I envisioned my XML package:

1) SAX driver for xmllib
2) xmlproc uses SAX natively instead of using a driver, although it
   will probably need to add some things beyond SAX later

That gives us well-formedness-checking and a simple standardized event-based
API.

Building on that I'd planned on making:

1) A simple ESIS outputter, for demo/testing purposes.
2) A grove builder, eventually with DOM support, although there are
   things I dislike about DOM.
3) A validator.
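
The "simple ESIS outputter" from the list above might look roughly like the
following in present-day Python. This is a sketch built on the standard
library's xml.sax module, not on the 1998 saxlib or xmlproc code discussed
in this thread; the class name ESISPrinter and the helper to_esis are
hypothetical, and the output is only ESIS-like (attribute lines, "(" and
")" for element boundaries, "-" for character data), treating every
attribute as CDATA.

```python
import io
import xml.sax


class ESISPrinter(xml.sax.ContentHandler):
    """Write ESIS-like lines for each SAX event to a file-like object."""

    def __init__(self, out):
        super().__init__()
        self.out = out

    def startElement(self, name, attrs):
        # Attribute lines come before the element's "(" line.
        for key in attrs.keys():
            self.out.write("A%s CDATA %s\n" % (key, attrs[key]))
        self.out.write("(%s\n" % name)

    def endElement(self, name):
        self.out.write(")%s\n" % name)

    def characters(self, content):
        # Escape backslashes and newlines so each event stays on one line.
        escaped = content.replace("\\", "\\\\").replace("\n", "\\n")
        self.out.write("-%s\n" % escaped)


def to_esis(xml_text):
    """Parse a string and return the ESIS-like event listing."""
    out = io.StringIO()
    xml.sax.parseString(xml_text.encode("utf-8"), ESISPrinter(out))
    return out.getvalue()
```

Because the handler only sees SAX events, the same class would work
unchanged on top of any SAX driver, whether the parser underneath is
written in Python or in C.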

I also wanted to be able to have groves, validation or both.

The main catches I see here are:

1) Lack of Unicode support
2) Lack of speed

IMHO the solution to the speed is to do the xmllib/xmlproc part in C,
possibly via XMLTok, like Paul suggested. I think we should have a Python
version of this as well, and thanks to SAX, we can have our cake and eat it
too.

Given the reaction from people to this Perl thing I'm uncertain as to what I
should do. Perhaps I should rush out a minimal package consisting of a SAX
shell, an ESIS outputter building on it and a SAX driver for xmllib? That
would give any C volunteers something to build towards and those who want to
deal with the grove/validation part something to build from.

What say ye, good people?

[1] Hamlet has 359 speeches and 1459 lines, more than three times what any
    other character in Hamlet/Tempest/Romeo&Juliet has. Kenneth Branagh
    must have a photographic memory. :)

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct,
and much more neat / is to see that he is living somewhere in the middle /
of the twentieth century, and leave him there."
    -- Edwin Brock

http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From Fred L. Drake, Jr."
References: <35069FA2.1AE651AC@technologist.com>
Message-ID: <199803111700.MAA18227@weyr.cnri.reston.va.us>

Lars Marius Garshol writes:
> Given the reaction from people to this Perl thing I'm uncertain as to
> what I should do. Perhaps I should rush out a minimal package
> consisting of a SAX shell, an ESIS outputter building on it and a SAX
> driver for xmllib? That would give any C volunteers something to build
> towards and those who want to deal with the grove/validation part
> something to build from.

I'm willing to do the xmltok C module.
It will be a week or so before I can get to it; I really need to get the
Python documentation source distribution finished up.

-Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191

From akuchlin@CNRI.Reston.VA.US Wed Mar 11 17:11:07 1998
From: akuchlin@CNRI.Reston.VA.US (Andrew Kuchling)
Date: Wed, 11 Mar 1998 12:11:07 -0500
Subject: [DOC-SIG] [STRING-SIG] What does this mean for Python?
In-Reply-To: <3506B948.A2DB2A23@technologist.com>
References: <35069FA2.1AE651AC@technologist.com> <199803111455.JAA23221@newcnri.CNRI.Reston.Va.US> <3506B948.A2DB2A23@technologist.com>
Message-ID: <199803111711.MAA27067@newcnri.CNRI.Reston.Va.US>

[This is the last message I'll be cross-posting to both the Doc-SIG
 and String-SIG. The Doc-SIG is "a forum for discussing both the form
 and content of Python documentation" (from Michael McLay's
 description) and not document processing in general, so I conclude
 that the String-SIG is more appropriate.]

Paul Prescod writes:
>be). Media loves a horse race and if we promote comparisons with Perl,
>people will at least know that Perl isn't the only language in the
>class. "We have nothing to lose but our obscurity."

We also promote flame wars. It's better to stand alone, and to demonstrate
how much simpler the job is in Python.

>I would say that instead of writing an XML parser in Python (probably
>not fast enough), or writing one from scratch in C (a bunch of needless
>work), we should start with James Clark's XMLTok, which is written in
>ANSI C.

Fred Drake just said much the same thing to me, but I'm interested in pure
Python processing for particular personal purposes.
I'd like to be able to do XML processing on various machines, from my home
machine to the ones at work to starship, preferably without having to
install C extensions or external SGML parsers. Perhaps we can follow
string/strop's lead, and provide a Python version, replacing it with a
faster-but-compatible version if the C extension is available.

>I think that JPython gets most of this for "free" with a very little bit
>of glue. All I need to do is document how to use the glue.

So there's one deliverable. Another deliverable: an XML-HOWTO which provides
an overview of Python and XML processing. I'll happily work on that.

Sean McGrath wrote about XMLTok:
>James has specifically designed it to be integrated into other
>applications. I do not think this would take very long and was
>hoping to have a shot at it myself:-( Volunteer C extension
>developers, please take one step forward.

So that's probably another deliverable: a C interface to XMLTok. I just
received Lars Marius Garshol's message; that code is certainly going to be
worth a look when it's released, and perhaps it can be made to use XMLtok if
available.

The November issue of Linux Journal will be about Web programming languages,
and they've already agreed to one Python article; another one about XML
would probably interest them, too. I'll commit to that as well.

Deliverables:
        * JPython glue and documentation
        * XML HOWTO
        * C interface to XMLTok
        * Code to parse a document and return a grove.
        * At least one magazine article about XML & Python.

A.M. Kuchling                   http://starship.skyport.net/crew/amk/
What a terrible thing to have lost one's mind. Or not to have a mind at all.
How true that is.
    -- J. Danforth Quayle

From Robin.K.Friedrich@USAHQ.UnitedSpaceAlliance.com Wed Mar 11 17:26:19 1998
From: Robin.K.Friedrich@USAHQ.UnitedSpaceAlliance.com (Friedrich, Robin K)
Date: Wed, 11 Mar 1998 11:26:19 -0600
Subject: [DOC-SIG] [STRING-SIG] What does this mean for Python?
Message-ID:

(Note: I think this is a Great Project, but...) I would propose that
the web-sig is the most appropriate current SIG for this XML project.
Just my $.02. Any reasons why not?

>----------
>From: Andrew Kuchling[SMTP:akuchlin@cnri.reston.va.us]
>Sent: Wednesday, March 11, 1998 11:11 AM
>To: string-sig@python.org; Doc-SIG
>Subject: Re: [DOC-SIG] [STRING-SIG] What does this mean for Python?
>
>[This is the last message I'll be cross-posting to both the Doc-SIG
> and String-SIG. The Doc-SIG is "a forum for discussing both the form
> and content of Python documentation" (from Michael McLay's
> description) and not document processing in general, so I conclude
> that the String-SIG is more appropriate.]

From mskow@earthling.net Wed Mar 11 19:49:58 1998
From: mskow@earthling.net (Mike)
Date: Wed, 11 Mar 1998 14:49:58 -0500 (EST)
Subject: [DOC-SIG] [STRING-SIG] What does this mean for Python?
In-Reply-To:
Message-ID:

where is the web-sig? can't find a ref to it at the python site.

Mike

On Wed, 11 Mar 1998, Friedrich, Robin K wrote:

> (Note: I think this is a Great Project, but...) I would propose that
> the web-sig is the most appropriate current SIG for this XML project.
> Just my $.02. Any reasons why not?
>
> >----------
> >From: Andrew Kuchling[SMTP:akuchlin@cnri.reston.va.us]
> >Sent: Wednesday, March 11, 1998 11:11 AM
> >To: string-sig@python.org; Doc-SIG
> >Subject: Re: [DOC-SIG] [STRING-SIG] What does this mean for Python?
> >
> >[This is the last message I'll be cross-posting to both the Doc-SIG
> > and String-SIG. The Doc-SIG is "a forum for discussing both the form
> > and content of Python documentation" (from Michael McLay's
> > description) and not document processing in general, so I conclude
> > that the String-SIG is more appropriate.]

From janssen@parc.xerox.com Wed Mar 11 21:25:15 1998
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 11 Mar 1998 13:25:15 PST
Subject: [DOC-SIG] Re: [STRING-SIG] What does this mean for Python?
In-Reply-To: <3506AD5F.9AC854AD@cnri.reston.va.us>
References: <35069FA2.1AE651AC@technologist.com> <3506AD5F.9AC854AD@cnri.reston.va.us>
Message-ID:

Excerpts from ext.python: 11-Mar-98 [DOC-SIG] Re: [STRING-SIG] .. Jim
Hugunin@CNRI.Reston. (626*)

> JPython does inherit Unicode support from Java in the
> standard string objects.

Hmmm. In Python, strings are also used to represent arbitrary byte
sequences. Will this feature interact unfortunately with the Unicode-based
string support?

Bill

From larsga@ifi.uio.no Wed Mar 11 23:15:38 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 12 Mar 1998 00:15:38 +0100
Subject: [DOC-SIG] saxlib
Message-ID:

(Which list is the correct one for XML processing? Is there a Web SIG?
Should there be an XML SIG? There certainly seems to be interest in this
now.)

I've now made a skeleton SAX library to see how SAX translates to Python, as
a convenience for others doing SAX-related stuff in Python and as a focal
point for further development. I found that SAX translated quite naturally
to Python and that the xml-dev developers had IMHO done a very good job.

The library provides minimal base classes with the default behaviour
prescribed by the SAX spec and a trivial SAX-based ESIS printer. The only
deviations from the letter of the specification are these:

- HandlerBase has a superclass that is not described in SAX in order
  to make subclassing HandlerBase a little easier. This was done
  because HandlerBase has been made to ignore all unknown method
  calls, which might not be desirable in subclassed handlers.

  (As I write this I'm beginning to think that I should instead add
  empty methods to the class to avoid this problem. Opinions?)

- Two methods have been added to AttributeMap: __getitem__ and keys,
  in order to make it possible to use AttributeMap as an ordinary
  Python hash table.

Comments of all kinds are most welcome. The URL is

Also, if the __getitem__ and keys additions are accepted as part of the
Python SAX specification, perhaps we should make a PySAX specification that
describes the additions to the generic SAX?

Now I'll get some sleep and tomorrow I'll start working on a SAX driver for
xmllib. I'll also add a command-line interface so that we get a demo
ESIS-producing XML parser.

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct,
and much more neat / is to see that he is living somewhere in the middle /
of the twentieth century, and leave him there."
-- Edwin Brock
http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From fermigie@math.jussieu.fr Wed Mar 11 23:36:40 1998
From: fermigie@math.jussieu.fr (Stefane Fermigier)
Date: Thu, 12 Mar 1998 00:36:40 +0100
Subject: [DOC-SIG] What does this mean for Python?
In-Reply-To: <199803111558.PAA21797@mail.iol.ie>; from Sean Mc Grath on Wed, Mar 11, 1998 at 03:58:22PM +0000
References: <199803111558.PAA21797@mail.iol.ie>
Message-ID: <19980312003640.62557@riemann.math.jussieu.fr>

On Wed, Mar 11, 1998 at 03:58:22PM +0000, Sean Mc Grath wrote:
>
> 3. We need to provide a DOM interface (tree based)
>    DOM = Document Object Model - a W3C initiative to develop a
>    language independent read/write interface to HTML/XML documents

I have put a very prototypical DOM package at the URL http://www.math.jussieu.fr/~fermigie/python

There are some DOM objects (not all of them, for reasons stated in the README file), a builder, a transformer and a lineariser.

This is the result of several hours of hacking over a period of more than 3 months, and I won't have the time to improve the result significantly in the near future; that's why I'm releasing it as it is now.

Of course, I'm eagerly waiting for comments. (I can justify some of the decisions, for example: why did I use Dan Connolly's parser? Well, it was the only one available 9 months ago when I started hacking XML, and it can also parse a reasonable subset of HTML, which is good for my legacy Web site.)

Cheers, S.

--
Stéfane Fermigier, MdC à l'Université Paris 7. Tel: 01.44.27.61.01 (Bureau).
Mathematician, hacker, bassist. http://www.math.jussieu.fr/~fermigie/
"He who can properly define and divide is to be considered a god." Platon.
_______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From fredrik@pythonware.com Thu Mar 12 12:10:22 1998 From: fredrik@pythonware.com (Fredrik Lundh) Date: Thu, 12 Mar 1998 13:10:22 +0100 Subject: [DOC-SIG] What does this mean for Python? Message-ID: <01bd4daf$d533cfc0$f29b12c2@panik.pythonware.com> >1) SAX driver for xmllib >2) xmlproc uses SAX natively instead of using a driver, although it > will probably need to add some things beyond SAX later > >That gives us well-formedness-checking and a simple standardized >event-based API. > >Building on that I'd planned on making: > >1) A simple ESIS outputter, for demo/testing purposes. >2) A grove builder, eventually with DOM support, although there are > things I dislike about DOM. >3) A validator. > >I also wanted to be able to have groves, validation or both. > What say ye, good people? Ouch, my head hurts. Does anyone have a good reference (website, book, whatever) to recommend that covers all important aspects of XML and stuff like groves, validations, and all the related acronyms? Cheers /F _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From papresco@technologist.com Thu Mar 12 15:31:33 1998 From: papresco@technologist.com (Paul Prescod) Date: Thu, 12 Mar 1998 10:31:33 -0500 Subject: [DOC-SIG] Re: What does this mean for Python? References: <01bd4daf$d533cfc0$f29b12c2@panik.pythonware.com> Message-ID: <3507FFD5.C8190CA4@technologist.com> Okay, let's play acronym expansion. 
(BTW, with so much XML/SGML activity projected for the next few months, I think we really should have a mailing list or SIG. For now, I'll keep doc-sig in the loop, as I was instructed the last time we discussed this stuff.)

Fredrik Lundh wrote:
>
> >1) SAX driver for xmllib

SAX ("Simple API for XML") is an event-driven API for getting information out of SGML documents. http://www.microstar.com/XML/SAX/ It has all of the usual benefits of APIs. You can swap in your favourite (fastest, or most convenient) parser.

> >2) xmlproc uses SAX natively instead of using a driver, although it
> > will probably need to add some things beyond SAX later

xmlproc is Lars' software. When he says he "uses it natively" instead of "through a driver", I think he means that his software is not yet set up to drop in someone else's parser easily.

> >That gives us well-formedness-checking and a simple standardized
> >event-based API.

Well-formedness-checking is simple syntactic checking. SAX is the simple, standardized event-based API.

> >Building on that I'd planned on making:
> >
> >1) A simple ESIS outputter, for demo/testing purposes.

ESIS is a simple linearized format for the output of SGML documents where every element starts on a line, attributes are on their own lines and so forth. ESIS is not SGML. It's like a "pickle" of SGML. I would encourage Lars to use a newer XML linearization format: http://www.jclark.com/xml/canonxml.html

> >2) A grove builder, eventually with DOM support, although there are
> > things I dislike about DOM.

A grove is an abstract model for the in-memory representation of SGML documents. The DOM ("Document Object Model") is a World Wide Web Consortium API for accessing the contents of an SGML document. In other words the grove represents the data model and the DOM is a particular API for providing access to it.
So where SAX concentrates on generating *events* for stream-based handling of documents, the DOM is an API for explicitly traversing and navigating an in-memory tree. > >3) A validator. A validator reads the declarations in the document type definition and verifies that the document conforms to it. Paul Prescod - http://itrc.uwaterloo.ca/~papresco Our lives shall not be sweated from birth until life closes; Hearts starve as well as bodies; give us bread, but give us roses. - http://www.columbia.edu/~melissa/petronella/songs/bread-roses.html _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From Jack.Jansen@cwi.nl Thu Mar 12 16:06:36 1998 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Thu, 12 Mar 1998 17:06:36 +0100 Subject: [DOC-SIG] What does this mean for Python? In-Reply-To: Message by "Fredrik Lundh" , Thu, 12 Mar 1998 13:10:22 +0100 , <01bd4daf$d533cfc0$f29b12c2@panik.pythonware.com> Message-ID: > Ouch, my head hurts. Does anyone have a good reference (website, > book, whatever) to recommend that covers all important aspects of > XML and stuff like groves, validations, and all the related acronyms? I find http://www.sil.org/sgml/xml.html to be a good starting point. -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@cwi.nl | ++++ if you agree copy these lines to your sig ++++ http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From larsga@ifi.uio.no Thu Mar 12 16:49:40 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 12 Mar 1998 17:49:40 +0100 Subject: [DOC-SIG] Re: What does this mean for Python? 
In-Reply-To: <3507FFD5.C8190CA4@technologist.com> References: <01bd4daf$d533cfc0$f29b12c2@panik.pythonware.com> <3507FFD5.C8190CA4@technologist.com> Message-ID: * Paul Prescod | | Okay, let's play acronym expansion. Fredrik: sorry about the headache. I sort of assumed that people were familiar with XML terminology, which was of course a mistake. Thank you for doing the expansion, Paul. :) | xmlproc is Lars' software. That's right. It assumes pretty much the same role as xmllib: parsing a raw XML document and providing hooks for applications that want to do something with the data. | When he says he "uses it natively" instead of "through a driver", I | think he means that his software is not yet set up to drop in | someone else's parser easily. Sorry, what I meant was that it doesn't use a SAX driver, but instead speaks SAX natively, so to speak. It looks like that approach will become cumbersome as xmlproc becomes more complete, so I may have to use another approach later. | I would encourage Lars to use a newer XML linearization format: | | http://www.jclark.com/xml/canonxml.html Thanks for that pointer, Paul! I'll add support for canonical XML output to saxlib since that looks like it can be very useful for testing parsers. | So where SAX concentrates on generating *events* for stream-based | handling of documents, the DOM is an API for explicitly traversing | and navigating an in-memory tree. It's worth noting here that one can build a DOM implementation using the information that comes out of the SAX API so that the DOM library is completely independent of whatever parser is used. This means that if we have a C XML parser and some Python ones that all have SAX drivers the DOM library can use whichever of these happens to be available in each particular installation. Don Park has already made such a DOM implementation on top of SAX in Java, called SAXDOM[1]. 
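To make the point about layering a tree on top of SAX concrete, here is a minimal sketch. It is not SAXDOM and not the saxlib code discussed in this thread; it assumes the `xml.sax` module from today's Python standard library as a stand-in, and the `Node` and `TreeBuilder` names are made up for illustration. The builder only sees SAX events, so any parser with a SAX driver could feed it.

```python
import xml.sax


class Node:
    """Hypothetical tree node; not part of any DOM specification."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = dict(attrs.items())  # copy the SAX attribute map
        self.children = []
        self.text = ""


class TreeBuilder(xml.sax.ContentHandler):
    """Builds a Node tree purely from SAX events (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []  # currently open elements, innermost last

    def startElement(self, name, attrs):
        node = Node(name, attrs)
        if self._stack:
            self._stack[-1].children.append(node)
        else:
            self.root = node
        self._stack.append(node)

    def endElement(self, name):
        self._stack.pop()

    def characters(self, content):
        # Character data may arrive in several pieces, so accumulate it.
        if self._stack:
            self._stack[-1].text += content


def build_tree(xml_bytes):
    builder = TreeBuilder()
    xml.sax.parseString(xml_bytes, builder)
    return builder.root


root = build_tree(b'<play title="Hamlet"><act n="1">Words, words, words.</act></play>')
print(root.name, root.attrs, [child.name for child in root.children])
```

Because the builder never talks to the parser directly, swapping in a faster parser should only require a different SAX driver, which is the independence described above.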
I've now made a naive SAX driver for xmllib and added it to my web page[2] together with the ESIS outputter. It's not complete since I don't know how complete xmllib is, but once I add the canonical XML outputter I can test that easily. It's all extremely simple, but should provide a reasonable demonstration of the potential of SAX for now. I will try to improve this to comply more fully with the spec later.

With SAX support in both xmllib and my own incomplete xmlproc I was able to do some speed comparisons. For good measure I threw in James Clark's XP[3] parser written in Java (and written to be as fast as possible) and DataChannel's DXP[4] Java parser.

Here are the results on my 166 MHz Pentium:

Time to run hamlet.xml through validation and grove building via SAX:

  Parser       1st    2nd    3rd    Avg
  xmllib.py    50.1   48.4   49.8   49.4
  xmlproc.py   40.8   39.4   39.5   39.9
  xp.java      1.49   1.43   1.43   1.45
  dxpcl.java   14     -      -      14

With no validation or grove building (empty document handler):

  Parser       1st    2nd    3rd    Avg
  xmllib.py    38.6   37.2   38.7   38.2
  xmlproc.py   32.5   33     32     32.5

The numbers speak for themselves, I think. I'll have to read the XP sources closely to see whatever James Clark did to XP to make it that fast.

(The comparison between xmllib and xmlproc is not entirely fair since I've still got to add some stuff to xmlproc that will slow it down, but then I haven't tried optimizing it yet either.)

[1] [2] [3] [4]

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there."
-- Edwin Brock http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/ _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From Jack.Jansen@cwi.nl Fri Mar 13 11:25:10 1998 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 13 Mar 1998 12:25:10 +0100 Subject: [DOC-SIG] Re: What does this mean for Python? In-Reply-To: Message by Lars Marius Garshol , 12 Mar 1998 17:49:40 +0100 , Message-ID: > Time to run hamlet.xml through validation and grove building via SAX: > > Parser 1st 2nd 3rd Avg > xmllib.py 50.1 48.4 49.8 49.4 > xmlproc.py 40.8 39.4 39.5 39.9 > xp.java 1.49 1.43 1.43 1.45 > dxpcl.java 14 - - 14 Lars, could you make your timing scripts available? Just for fun I ran hamlet.xml through James Clark's xmltok (or, actually, through the xmlwf program that comes with it), and it clocks at 0.07 seconds on my SGI O2 (180 Mhz R5000). Since the API appears pretty pythonizeable this appears to be a nice groundwork to start getting better performance for python/xml... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@cwi.nl | ++++ if you agree copy these lines to your sig ++++ http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From Sjoerd.Mullender@cwi.nl Fri Mar 13 11:38:56 1998 From: Sjoerd.Mullender@cwi.nl (Sjoerd Mullender) Date: Fri, 13 Mar 1998 12:38:56 +0100 Subject: [DOC-SIG] Re: What does this mean for Python? In-Reply-To: Your message of 12 Mar 1998 17:49:40 +0100. References: <01bd4daf$d533cfc0$f29b12c2@panik.pythonware.com> <3507FFD5.C8190CA4@technologist.com> Message-ID: On Thu, Mar 12 1998 Lars Marius Garshol wrote: > With SAX support in both xmllib and my own incomplete xmlproc I was > able to do some speed comparisons. 
For good measure I threw in James
> Clark's XP[3] parser written in Java (and written to be as fast as
> possible) and DataChannel's DXP[4] Java parser.
>
> Here are the results on my 166 MHz Pentium:
>
> Time to run hamlet.xml through validation and grove building via SAX:
>
>   Parser       1st    2nd    3rd    Avg
>   xmllib.py    50.1   48.4   49.8   49.4
>   xmlproc.py   40.8   39.4   39.5   39.9
>   xp.java      1.49   1.43   1.43   1.45
>   dxpcl.java   14     -      -      14
>
> With no validation or grove building (empty document handler):
>
>   Parser       1st    2nd    3rd    Avg
>   xmllib.py    38.6   37.2   38.7   38.2
>   xmlproc.py   32.5   33     32     32.5
>
> The numbers speak for themselves, I think. I'll have to read the XP
> sources closely to see whatever James Clark did to XP to make it that
> fast.

I have a question about the timings here. How was the data fed to the XML parser in xmllib.py? If you do

    python xmllib.py hamlet.xml

the data is fed to the parser one character at a time. But it is also possible to feed everything at once. There are very significant performance differences between these two methods: if the XML parser sees that a tag is incomplete (usually after parsing the first part of the tag), it saves the data until you feed more data. This means that if you feed the data one character at a time, tags will be parsed partially many times before they are parsed completely, slowing down the process quite a bit.

> (The comparison between xmllib and xmlproc is not entirely fair since
> I've still got to add some stuff to xmlproc that will slow it down,
> but then I haven't tried optimizing it yet either.)

I haven't done any optimisations in xmllib either. One obvious optimization is to use regex instead of re (but I am not planning to do that).
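The two feeding strategies can be tried out with a small sketch. xmllib itself is not used here; this assumes the incremental `feed()`/`close()` interface of the parser returned by `xml.sax.make_parser()` in the current standard library, and the `Collector` and `parse_in_chunks` names are invented for the example. It only shows that the chunk size changes how the work is split up, not what the application finally receives; how much slower the small chunks are depends on the parser.

```python
import xml.sax


class Collector(xml.sax.ContentHandler):
    """Hypothetical handler that records element names and character data."""

    def __init__(self):
        super().__init__()
        self.names = []
        self.text = []

    def startElement(self, name, attrs):
        self.names.append(name)

    def characters(self, content):
        # Chunking may split character data differently, so keep the pieces.
        self.text.append(content)


def parse_in_chunks(data, chunk_size):
    """Feed `data` to an incremental parser `chunk_size` characters at a time."""
    handler = Collector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for start in range(0, len(data), chunk_size):
        parser.feed(data[start:start + chunk_size])
    parser.close()
    return handler.names, "".join(handler.text)


DOC = "<play><title>Hamlet</title><act>To be, or not to be</act></play>"

one_at_a_time = parse_in_chunks(DOC, 1)       # one character per feed() call
in_blocks = parse_in_chunks(DOC, 16 * 1024)   # 16K blocks, as in the thread

print(one_at_a_time == in_blocks)
```

Wrapping each call in a timer on a large document such as hamlet.xml would be one way to measure the difference on a particular parser.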
-- Sjoerd Mullender _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From larsga@ifi.uio.no Fri Mar 13 12:10:05 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: Fri, 13 Mar 1998 13:10:05 +0100 Subject: [DOC-SIG] Re: What does this mean for Python? In-Reply-To: References: <01bd4daf$d533cfc0$f29b12c2@panik.pythonware.com> <3507FFD5.C8190CA4@technologist.com> Message-ID: <3.0.1.32.19980313131005.00688b18@ifi.uio.no> At 12:38 13.03.98 +0100, Sjoerd Mullender wrote: > > I have a question about the timings here. How was the data fed to the > XML parser in xmllib.py? If you do > python xmllib.py hamlet.xml > the data is fed to the parser one character at the time. I think I fed it to the parser in 16K blocks, but I don't actually remember how I did it. Anyway, I will add a timer application to saxlib, so that anyone can do their own speed testing and modify it as they wish. (I hope that satisfies you as well, Jack.) I'll release that tonight (when I get home from work) together with a driver for David Scherers XML-Toolkit (announced on comp.lang.python on Wednesday). Hopefully I'll be able to get xmlproc out some time during the weekend. The really important issue here, I think, is standardizing the parser APIs. We now have Dan Connolys XML scanner/parser, xmllib and David Scherers parser, with at least two more coming up. I'm still waiting for reactions to my SAX proposal. What do you people out there think? Does it look usable? Should we make it the standard Python API or should we scrap it? Or should we modify it? Should we change the method names to be more Python-like? And can it be used with JPython to interoperate with things like Don Parks SAXDOM? All comments/thoughts on this would be very welcome. >I haven't done any optimisations in xmllib either. One obvious >optimization is to use regex instead of re (but I am not planning to >do that). 
I also use re and don't have any intention of changing, either. Sjoerd, please don't feel threatened by my making my own parser. I did it partly for fun and partly to better understand the interplay between XML entities, well-formedness checking, validation, grove building and what actually goes to the application. So it was not because of dissatisfaction with xmllib, but because I wanted to understand these things better. In fact, when I use xmllib with the SAX canonical XML outputter I seem to get the same results that James Clarks XP gives, so it looks as though xmllib pretty much follows the standard. (I haven't done any rigorous testing, just tested some features I were uncertain about.) I've been telling my colleagues here at STEP Infotek (an SGML firm) about this Python/XML effort and at least two of them (who now use Java and Perl) reacted with "Hmmm... Maybe I should start using Python for my XML work." One of them has even printed out the Python tutorial already. So I think this can be very beneficial for Python if we do it right. And I definitely agree with Sean McGrath: Python is infinitely much better than Perl for this kind of thing. Having a healthy crop of XML parsers and tools written in Python would help make this clear to people. In fact, I now have a list with links to free XML tools and it looks as though I should split the parser section into Java parsers, Python parsers and other parsers. IMHO, that's the kind of thing that would make an impression on people getting into XML and looking for tools. Just my $0.02, of course. --Lars M. _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From Sjoerd.Mullender@cwi.nl Fri Mar 13 12:26:15 1998 From: Sjoerd.Mullender@cwi.nl (Sjoerd Mullender) Date: Fri, 13 Mar 1998 13:26:15 +0100 Subject: [DOC-SIG] Re: What does this mean for Python? 
In-Reply-To: Your message of Fri, 13 Mar 1998 13:10:05 +0100. <3.0.1.32.19980313131005.00688b18@ifi.uio.no>
References: <01bd4daf$d533cfc0$f29b12c2@panik.pythonware.com> <3507FFD5.C8190CA4@technologist.com> <3.0.1.32.19980313131005.00688b18@ifi.uio.no>
Message-ID:

On Fri, Mar 13 1998 Lars Marius Garshol wrote:
>
> At 12:38 13.03.98 +0100, Sjoerd Mullender wrote:
> >
> > I have a question about the timings here. How was the data fed to the
> > XML parser in xmllib.py? If you do
> > python xmllib.py hamlet.xml
> > the data is fed to the parser one character at the time.
>
> I think I fed it to the parser in 16K blocks, but I don't actually
> remember how I did it.

16K blocks shouldn't give too much extra overhead because of the reparsing, so the figures should be pretty close to optimal for xmllib.

> Sjoerd, please don't feel threatened by my making my own parser. I did it
> partly for fun and partly to better understand the interplay between XML
> entities, well-formedness checking, validation, grove building and what
> actually goes to the application. So it was not because of dissatisfaction
> with xmllib, but because I wanted to understand these things better.

I don't feel threatened. I was the first to create an XML parser for Python, and nobody can take that away. :-)

> In fact, when I use xmllib with the SAX canonical XML outputter I seem to
> get the same results that James Clarks XP gives, so it looks as though
> xmllib pretty much follows the standard. (I haven't done any rigorous
> testing, just tested some features I were uncertain about.)

I looked hard at the XML spec when implementing it, so I feel pretty confident that it is reasonably close. I did some more work after 1.5 came out, so my current version is even better (though not necessarily faster).
-- Sjoerd Mullender _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From larsga@ifi.uio.no Fri Mar 13 12:36:55 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: Fri, 13 Mar 1998 13:36:55 +0100 Subject: [DOC-SIG] Re: What does this mean for Python? In-Reply-To: References: <01bd4daf$d533cfc0$f29b12c2@panik.pythonware.com> <3507FFD5.C8190CA4@technologist.com> <3.0.1.32.19980313131005.00688b18@ifi.uio.no> Message-ID: <3.0.1.32.19980313133655.0068c52c@ifi.uio.no> At 13:26 13.03.98 +0100, Sjoerd Mullender wrote: > >I don't feel threatened. I was the first to create an XML parser for >Python, and nobody can take that away. :-) True enough. xmllib is also part of the standard distribution, which is another point in your favour. :) >I looked hard at the XML spec when implementing it, so I feel pretty >confident that it is reasonably close. I did some more work after 1.5 >came out, so my current version is even better (though not necessarily >faster). Do you have a URL to it? It would be nice to both have the newest version and to be able to link to xmllib specifically and not just as part of the standard distribution. --Lars M. _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From Sjoerd.Mullender@cwi.nl Fri Mar 13 13:02:52 1998 From: Sjoerd.Mullender@cwi.nl (Sjoerd Mullender) Date: Fri, 13 Mar 1998 14:02:52 +0100 Subject: [DOC-SIG] Re: What does this mean for Python? In-Reply-To: Your message of Fri, 13 Mar 1998 13:36:55 +0100. 
<3.0.1.32.19980313133655.0068c52c@ifi.uio.no> References: <01bd4daf$d533cfc0$f29b12c2@panik.pythonware.com> <3507FFD5.C8190CA4@technologist.com> <3.0.1.32.19980313131005.00688b18@ifi.uio.no> <3.0.1.32.19980313133655.0068c52c@ifi.uio.no> Message-ID: On Fri, Mar 13 1998 Lars Marius Garshol wrote: > At 13:26 13.03.98 +0100, Sjoerd Mullender wrote: > > > >I don't feel threatened. I was the first to create an XML parser for > >Python, and nobody can take that away. :-) > > True enough. xmllib is also part of the standard distribution, which > is another point in your favour. :) But that could be taken away from me. :-) > >I looked hard at the XML spec when implementing it, so I feel pretty > >confident that it is reasonably close. I did some more work after 1.5 > >came out, so my current version is even better (though not necessarily > >faster). > > Do you have a URL to it? It would be nice to both have the newest > version and to be able to link to xmllib specifically and not just as > part of the standard distribution. ftp://ftp.cwi.nl/pub/sjoerd/xmllib.tar.gz http://www.cwi.nl/ftp/sjoerd/xmllib.tar.gz -- Sjoerd Mullender _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From Jack.Jansen@cwi.nl Fri Mar 13 13:28:44 1998 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 13 Mar 1998 14:28:44 +0100 Subject: [DOC-SIG] Re: What does this mean for Python? In-Reply-To: Message by Lars Marius Garshol , Fri, 13 Mar 1998 13:10:05 +0100 , <3.0.1.32.19980313131005.00688b18@ifi.uio.no> Message-ID: Okay, I'll put my fingers where my mouth is: I'm creating a Python interface to xmltok. Expect (untested) code later today. 
-- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@cwi.nl | ++++ if you agree copy these lines to your sig ++++ http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From Jack.Jansen@cwi.nl Fri Mar 13 13:30:19 1998 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 13 Mar 1998 14:30:19 +0100 Subject: [DOC-SIG] Re: What does this mean for Python? In-Reply-To: Message by Lars Marius Garshol , Fri, 13 Mar 1998 13:10:05 +0100 , <3.0.1.32.19980313131005.00688b18@ifi.uio.no> Message-ID: > I'm still waiting for reactions to my SAX proposal. What do you people > out there think? Does it look usable? Should we make it the standard > Python API or should we scrap it? Or should we modify it? Should we > change the method names to be more Python-like? And can it be used with > JPython to interoperate with things like Don Parks SAXDOM? All > comments/thoughts on this would be very welcome. It looks good to me... -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@cwi.nl | ++++ if you agree copy these lines to your sig ++++ http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From Jack.Jansen@cwi.nl Fri Mar 13 14:11:33 1998 From: Jack.Jansen@cwi.nl (Jack Jansen) Date: Fri, 13 Mar 1998 15:11:33 +0100 Subject: [DOC-SIG] Re: What does this mean for Python? In-Reply-To: Message by Jack Jansen , Fri, 13 Mar 1998 14:28:44 +0100 , Message-ID: > Okay, I'll put my fingers where my mouth is: I'm creating a Python interface > to xmltok. Expect (untested) code later today. 
I've put the module (plus the XMLTok sources it needs, plus a very tiny test program) in ftp://ftp.cwi.nl/pub/jack/pyxmltop.tar.gz . So far the only test I've done is to parse the Hamlet document, which it does in 2 seconds on my O2. I think I've exported all the XMLTok functionality we need, let me know if this isn't so. If someone else is willing to put a SAX wrapper around this: be my guest. I don't really have more time to spend on this at the moment (but I thought that creating this module will at least allow me to be vocal in further discussions:-). -- Jack Jansen | ++++ stop the execution of Mumia Abu-Jamal ++++ Jack.Jansen@cwi.nl | ++++ if you agree copy these lines to your sig ++++ http://www.cwi.nl/~jack | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From larsga@ifi.uio.no Fri Mar 13 14:31:05 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: Fri, 13 Mar 1998 15:31:05 +0100 Subject: [DOC-SIG] Re: What does this mean for Python? In-Reply-To: References: Message-ID: <3.0.1.32.19980313153105.0068c46c@ifi.uio.no> At 15:11 13.03.98 +0100, Jack Jansen wrote: > >I've put the module (plus the XMLTok sources it needs, plus a very tiny test >program) in ftp://ftp.cwi.nl/pub/jack/pyxmltop.tar.gz . So far the only test >I've done is to parse the Hamlet document, which it does in 2 seconds on my >O2. I think I've exported all the XMLTok functionality we need, let me know if >this isn't so. Great! I'll look at it when I get home. >If someone else is willing to put a SAX wrapper around this: be my guest. This is my cue, isn't it? :) I'll do it, if I can manage to compile it and make it work with Python. Actually, I can do it anyway, but I may not be able to test it properly. --Lars M. 
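For readers wondering what "a SAX wrapper around this" might look like, here is a rough sketch. It does not use the xmltok binding from this thread, whose API is not shown here; instead it assumes the standard library's `xml.parsers.expat` module (a binding to James Clark's later expat parser) as a stand-in. The `ExpatDriver` and `EventLog` classes and their method names are hypothetical and only imitate the SAX document-handler callbacks.

```python
import xml.parsers.expat


class EventLog:
    """Hypothetical SAX-style document handler that records every event."""

    def __init__(self):
        self.events = []

    def startDocument(self):
        self.events.append(("startDocument",))

    def endDocument(self):
        self.events.append(("endDocument",))

    def startElement(self, name, attrs):
        self.events.append(("startElement", name, dict(attrs)))

    def endElement(self, name):
        self.events.append(("endElement", name))

    def characters(self, data):
        self.events.append(("characters", data))


class ExpatDriver:
    """Illustrative driver: translates expat callbacks into handler calls."""

    def __init__(self, handler):
        self.handler = handler

    def parse(self, data):
        parser = xml.parsers.expat.ParserCreate()
        parser.buffer_text = True  # deliver adjacent character data in one call
        parser.StartElementHandler = self.handler.startElement
        parser.EndElementHandler = self.handler.endElement
        parser.CharacterDataHandler = self.handler.characters
        self.handler.startDocument()
        parser.Parse(data, True)  # True marks this as the final chunk
        self.handler.endDocument()


log = EventLog()
ExpatDriver(log).parse(b"<scene n='1'>Elsinore</scene>")
for event in log.events:
    print(event)
```

The driver is the only piece that knows about the underlying parser; the handler would work unchanged with any other driver that makes the same calls.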
From larsga@ifi.uio.no Fri Mar 13 21:38:14 1998
From: larsga@ifi.uio.no (Lars Marius Garshol)
Date: 13 Mar 1998 22:38:14 +0100
Subject: [DOC-SIG] More saxlib
Message-ID:

I've now updated saxlib with

 - a driver for David Scherer's parser
 - a demo SAX application for parser benchmarking
 - saxdemo.py can now output canonical XML as well, which is nice for comparing parser output with James Clark's XP for testing.
 - some other minor changes

The web page has also been updated with some links to the different packages that have been announced and a page with benchmark results. The URL is still

BTW: I've looked at the xmltok package and from the Python demo it looks nice and simple. However, to be able to make and test a driver I'll have to find out how to load C modules into Python, and to be frank I haven't a clue as to how to do that. So since there is so much else to do I'm putting the SAX driver on hold. If someone else can pick it up, that would be nice. If not, I'll get round to it myself eventually.

And as I'll have to start using the C module eventually: pointers to clues on how to do it would be appreciated. :)

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there."
-- Edwin Brock
http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/

From fermigie@math.jussieu.fr Sat Mar 14 11:29:48 1998
From: fermigie@math.jussieu.fr (Stefane Fermigier)
Date: Sat, 14 Mar 1998 12:29:48 +0100
Subject: [DOC-SIG] Re: What does this mean for Python?
In-Reply-To: ; from Sjoerd Mullender on Fri, Mar 13, 1998 at 02:02:52PM +0100 References: <01bd4daf$d533cfc0$f29b12c2@panik.pythonware.com> <3507FFD5.C8190CA4@technologist.com> <3.0.1.32.19980313131005.00688b18@ifi.uio.no> <3.0.1.32.19980313133655.0068c52c@ifi.uio.no> Message-ID: <19980314122948.43151@riemann.math.jussieu.fr> New snapshot of ``PyDOM'' (DOM-light in python) available at: http://www.math.jussieu.fr/~fermigie/python/PyDOM/ Didn't receive any comments until now. Cheers, S. _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From larsga@ifi.uio.no Sat Mar 14 16:33:48 1998 From: larsga@ifi.uio.no (Lars Marius Garshol) Date: 14 Mar 1998 17:33:48 +0100 Subject: [DOC-SIG] xmlproc Message-ID: I've now put out an early version of xmlproc. There are a couple of obscure bugs and the well-formedness checking that requires parsing the internal DTD subset is not implemented yet. Other than that it is complete and I'm pretty pleased with it so far. Hopefully I'll have full well-formedness checking tomorrow and most of the validation on Monday. (I have the code for all this, but it needs redesigning.) Also: since no-one seems to have any complaints about the SAX interface, perhaps we should start building tools on top of it? Stephane, can you change your DOM package to build on SAX instead of Dan Connolly's parser? That would give the SAX interface a real test and would also mean that your package could be used with all Python parsers. (I've had a glance at your source, but will need to look more closely before I can comment on it.) And should we announce it on the xml-dev list so that the Java people there know about it? I feel a bit uneasy about this, since so far only Jack Jansen has commented on the interface and if it is to be of any use everyone needs to support it. -- "These are, as I began, cumbersome ways / to kill a man. 
Simpler, direct, and much more neat / is to see that he is living somewhere in the middle / of the twentieth century, and leave him there." -- Edwin Brock http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/ _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From papresco@technologist.com Sun Mar 15 15:42:54 1998 From: papresco@technologist.com (Paul Prescod) Date: Sun, 15 Mar 1998 10:42:54 -0500 Subject: [DOC-SIG] Re: [STRING-SIG] More saxlib References: Message-ID: <350BF6FE.6F2880F0@technologist.com> Lars Marius Garshol wrote: > The URL is still > I've looked at SaxLib and it seems like a very straightforward (and thus elegant) transliteration of the Java version. Your extensions seem quite appropriate. I propose that we "bless" this as the "standard" Python sax interface and call it Sax.py version 0.9. As we get more experience and are confident there are no bugs, we can promote it to version 1.0. > And as I'll have to start using the C module eventually: pointers > to clues on how to do it would be appreciated. :) It really wasn't hard. I was worried that it would be painful the first time I did it, but it turns out to be trivial, because of the great design of the build system. Just look in the "modules" directory. Essentially you copy the module in, add a line to the setup file and recompile! Paul Prescod - http://itrc.uwaterloo.ca/~papresco "You have the wrong number." "Eh? Isn't that the Odeon?" "No, this is the Great Theater of Life. Admission is free, but the taxation is mortal. You come when you can, and leave when you must. The show is continuous. Good-night." 
-- Robertson Davies, "The Cunning Man" _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From akuchlin@CNRI.Reston.VA.US Mon Mar 16 18:03:08 1998 From: akuchlin@CNRI.Reston.VA.US (Andrew Kuchling) Date: Mon, 16 Mar 1998 13:03:08 -0500 (EST) Subject: [DOC-SIG] XML SIG created Message-ID: <199803161803.NAA29699@newcnri.cnri.reston.va.us> An XML-SIG has been created for discussing XML and Python, and for developing a set of Python tools for processing XML documents. See below for the mission statement. To join the discussion, you can use the Mailman Web interface at . Alternatively, you can send an email with the word "subscribe" in either the subject line or the body of the message, to xml-sig-request@python.org . This SIG was inspired by recent XML discussions on the Doc-SIG and String-SIG; coincidentally, a few people have posted XML-related messages to the newsgroup. There seems to be a lot of community interest in this topic, and development will probably be fast; prototype implementations of the Document Object Model (DOM) and Simple API for XML (SAX) have appeared with startling speed. ================= XML-SIG: A Special Interest Group for XML Processing in Python This list has been created to provide a forum for discussion and implementation of tools to make Python an excellent choice for XML processing. XML is the 'Extensible Markup Language', a data format for structured document interchange. It seems to have considerable momentum behind it, and will probably become very important over the next few years. Consult http://www.w3.org/XML/ for more information. With appropriate software packages, documentation, and a bit of publicity, Python could become the premier language for XML processing. The goal of this SIG is to decide what software is required for this purpose, and coordinate its implementation and documentation. 
Concrete goals of the first mandate should include:

 * The glue code required to use existing C/Java XML parsers from
   CPython/JPython, and the appropriate documentation.
 * An XML-HOWTO which provides an overview of Python and XML processing.
 * Implementations of the DOM (Document Object Model) and SAX (Simple
   API for XML)
 * As work progresses, the SIG community may realize that other
   deliverables are required, and add them to the SIG's goals.

Wrap-up date: September 1998. This date may be extended if the SIG
community decides that more work is required.

A.M. Kuchling http://starship.skyport.net/crew/amk/
Whatever you do, stamp out abuses, and love those who love you.
 -- Voltaire

From tratt@dcs.kcl.ac.uk Tue Mar 17 18:17:03 1998
From: tratt@dcs.kcl.ac.uk (Laurence Tratt)
Date: Tue, 17 Mar 1998 18:17:03 GMT
Subject: [DOC-SIG] Python Library Reference in new HTML form
Message-ID:

I've (hopefully) just about completed a project for my platform to
convert the Python Library Reference and Tutorial to a format specific
to the platform. However I realised I'd left the code open-ended enough
for me to start an HTML version, which I did on Saturday; I think it's
now complete. I say "I think", because I haven't really had time (and
certainly not the inclination) to check all the pages that it generates
for 100% correctness; having said that, I'm fairly confident that the
conversion has been, at worst, largely successful.

If you want to get a download URL and skip lots of boring prose, there's
a URL at the middle of the message :)

Due to the original intention of my project, the pages generated are
ever so slightly different from LaTeX2HTML (this is an understatement).
For a start, I rip everything up into subpages (so if you want to, you
should be able to do something like "netscape string/index.html" to view
a particular page. The reason I've done this is because I can now press
F1 on a function in my text editor, and get whisked straight away to the
appropriate page) and stick things in directories. There are an awful
lot of directories, but it does mean you can put the mouse over a link
and not get a horrendous "node198374836495763452" type number, which
I've found especially useful for some links!

Secondly, my front page is, intentionally, nothing like the page for the
current Library Reference. It's all in a big table in essentially
alphabetical order; I found that finding things with the old page was
generally quicker if I loaded the page into my text editor, found the
position of what I was looking for and then searched in my web
browser...

Thirdly, there are quite a lot of inter-manual links; if one page refers
to another then generally (working out foolproof rules is not trivial)
there'll be a nice HTML link so you can click and go there. This has got
to be a big boon for novice users and I'm sure even some of you expert
users will find it useful. It also tries very hard to get web and e-mail
links working properly, so you can click on them as per normal.

Fourthly, the tutorial is bunged in there as well because I know I still
find it useful for looking up syntax bits when I get confused :) Also,
novice users probably don't want that differentiation between tutorial
and library reference because it's a pain having lots of web browser
windows up (especially if you're unlucky enough to be using "window at
the front" Windows), so it seemed like a good idea. It also allows there
to be links from some things in the tutorial to the PLR, another novice
boon.

Lastly there are two indices included; one is a traditional "book" (ish)
index, and the other is a coder's "method/data" type index.
Both of them have their uses to my way of thinking, so they're both included. They're also split up into alphabetically named files, so it's not one monolithic index like LaTeX2HTML generates. At the moment, I'm not convinced that there is actually any use for this product. I believe somebody is working on SGML versions of the PLR so presumably there'll be *another* HTML version of the manuals coming along sometime soon**; it's in a rather different format to what people are used to, and computer people (and I speak from experience here) are often reluctant to change to something different...; I wouldn't be entirely sure the conversion process has been 100% accurate so there may be some goofs in there. However, I like it quite a lot over the original and find it aids productivity, so I'm releasing a sort of test version to the doc-sig to see if there's actually any use for this. If there isn't, I'll wrap it up and it won't budge off my machine; if people do think it's useful, and perhaps it'll be some time before another lot of HTML documentation will come along, then I will consider releasing it to a wider community if I think it'll stand up to it. So, here's the URL to download the Python Library Reference (and tutorial) in a different HTML form: http://yi.com/home/TrattLaurence/comp/python/man.html WARNINGS: 1) There's a .tar.gz file to download, it contains lots (nearly 2300 files if I'm being honest) files with 150ish directories at the top level which contain the HTML, so you probably want to decompress this in a directory on its own. If you're wondering, you do get meaningful filenames for this :) 2) The filenames contain potentially "funny" characters (noticeabley "!, ;" and hard spaces), so if your OS doesn't like them, tell me so the next version of the manual won't have them in. I *think* all the filenames are OK on Windows machines, so I'm guessing they're probably alright on UNIX. 
They don't cause me any problems, but my filing system is a little
unusual

3) I'm not promising any support though if, as I've said, there's enough
interest I would like to develop this to an appropriate level

The conversion is almost 100% automatic (I generated the index file and
needed to tidy up the tutorial's index (I'd never intended that such
files should work with my project, so was quite happy when with only
minimal mucking about the tutorial worked) but apart from that, this is
a genuine build from "this morning"), very slow but entirely in Python.
I will release the source in the near future (hopefully within a month),
but at the moment, a couple of things are a bit bonkers because I'm
running an early beta of Python 1.5 to my machine, and one or two things
like package support are not functioning correctly yet. Once I've got a
platform independent version ready (and if there's any interest in the
project as a whole) and got some documentation done, I'll put the source
up for download. If it's of any interest (and as a sort of disclaimer),
I don't know LaTeX so the process is based on purely empirical
observations; this may have been an advantage :)

At the moment the manual is, if I remember correctly, based on the 1.5b2
documentation which should be almost entirely up to date. However, could
anyone point me to a downloadable directory with the very latest PLR
(and tutorial?) in LaTeX format? The HTML on www.python.org had a date
of February last time I looked, but I can't find LaTeX of that date on
ftp.python.org :(

Please don't hesitate to contact me if you find this interesting /
useless or if I've trodden on anyone's toes. Please don't distribute the
manual outside of the doc-sig; it's not in my opinion ready yet, and
certainly is not before some Python experts have looked at it.

e-mail to: tratt@dcs.kcl.ac.uk

Hope this is of use to someone and sorry for this huge message,

Laurie

** If anyone's interested, it's very easy to make outputters for other
formats, so if you want the PLR in format xyz, give me a shout. SGML
should be relatively easy if anyone wants that, though presumably
somebody else is already on that case

From Fred L. Drake, Jr."
References:
Message-ID: <13582.49086.416363.145690@weyr.cnri.reston.va.us>

Laurence Tratt writes:
> I've (hopefully) just about completed a project for my platform to convert
> the Python Library Reference and Tutorial to a format specific to the
> platform. However I realised I'd left the code open-ended enough for me to

Laurence, I don't see where you mention what platform / format you were
originally targeting. Could you tell us about it?

> Due to the original intention of my project, the pages generated are ever
> so slightly different from LaTeX2HTML (this is an understatement). For a

I will definitely be taking a look at your version, but I'm a little
swamped right now. I'll stash a copy aside and take sneak peeks. ;-)

> it does mean you can put the mouse over a link and not get a horrendous
> "node198374836495763452" type number, which I've found especially useful

The next release of the Python documentation that I'm coordinating will
be "bookmarkable" and avoid most of that nastiness.

> Secondly, my front page is, intentionally, nothing like the page for the
> current Library Reference. It's all in a big table in essentially

Were you aware of the module index in the last release? A single page
with a list of modules alphabetically sorted.
> Thirdly, there are a quite lot of inter-manual links; if one page refers to > another then generally (working out foolproof rules is not trivial) I have mixed feelings about extensive links, but a lot of that has to do with the aweful presentation of web browsers. CSS can help, but there does't appear to be a common understanding among the browsers as to what "getting it right" means. Oh, the font problems on Netscape/UNIX drive me bonkers! Future releases will improve the semantic content of the markup, making it a little easier to create links automatically. > find it useful. It also tries very hard to get web and e-mail links working > properly, so you can click on them as per normal. The version I'll be releasing shortly makes URLs "hot", but leaves email addresses alone (though they are marked and could easily be turned into hyperlinks). I do this to avoid swamping people with mail. > Lastly there are two indices included; one is a traditional "book" (ish) > index, and the other is a coders "method/data" type index. Both of them > have their uses to my way of thinking, so they're both included. They're > also split up into alphabetically named files, so it's not one monolithic Good idea. And I've been wrestling with LaTeX2HTML indexing a lot lately. ;-( > At the moment, I'm not convinced that there is actually any use for this > product. I believe somebody is working on SGML versions of the PLR so I guess my name is "somebody", then. ;-) I've been working on them a bit, but I've been spending a fair amount of time lately on Q/A for the LaTeX documents. My expectation is that I'll be able to generate more usable SGML from them. The intention of the SGML conversion project is to move toward SGML for the official documentation sources. There's still a lot to do, though, mostly due to my schedule constraints. 
> presumably there'll be *another* HTML version of the manuals coming along > sometime soon**; it's in a rather different format to what people are used > to, and computer people (and I speak from experience here) are often > reluctant to change to something different...; I wouldn't be entirely sure > the conversion process has been 100% accurate so there may be some goofs in > there. However, I like it quite a lot over the original and find it aids > productivity, so I'm releasing a sort of test version to the doc-sig to see > if there's actually any use for this. If there isn't, I'll wrap it up and > it won't budge off my machine; if people do think it's useful, and perhaps > it'll be some time before another lot of HTML documentation will come > along, then I will consider releasing it to a wider community if I think > 1) There's a .tar.gz file to download, it contains lots (nearly 2300 > files if I'm being honest) files with 150ish directories at the top level That is a lot of files.... > which contain the HTML, so you probably want to decompress this in a > directory on its own. If you're wondering, you do get meaningful filenames See my note above; the next version I release will have usable file names that should survive across releases. > build from "this morning"), very slow but entirely in Python. I will > release the source in the near future (hopefully within a month), but at I look forward to this. I hope you had the good sense to ignore partparse.py from the current distribution! ;-) (The LaTeX scanning isn't really that bad, but the code is unmaintainable!) > anyone point me to a downloadable directory with the very latest PLR (and > tutorial?) in LaTeX format? The HTML on www.python.org had a date of I plan to make available a documentation release in HTML, LaTeX, PDF, and PostScript next week. > Hope this is of use to someone and sorry for this huge message, Thanks for the efforts! And please don't apologize for reporting on your work! 
That's what this SIG is for!

-Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191

From da@skivs.ski.org Tue Mar 17 18:58:15 1998
From: da@skivs.ski.org (David Ascher)
Date: Tue, 17 Mar 1998 10:58:15 -0800 (PST)
Subject: [DOC-SIG] Python Library Reference in new HTML form
In-Reply-To:
Message-ID:

First impressions are very good. I have a few quibbles, which are only
intended to help make a good product very good.

1) On my screen, I had to make the netscape window very big to see the
   four columns of the index.html page. I'm not sure what the right
   solution to that is.

2) It's too bad that when looking at a given function, I have to look at
   the "Parent" link which is in a small font to find out what module
   it's in. Is the Parent link always the containing module? Would it be
   possible to make that relationship a bit more salient?

3) Since you seem able to generate a lot of indices, I do think that
   a lot could be done there:
   - a Table of Modules - Alphabetical [sort of like your traditional
     index but without the contents of the module, just the links]
   - a Table of Platform-Specific Modules, Alphabetical by platform, with
     redundancy for 'os':

        - Mac
        - Unix
           * posix
           * os

        - SGI
        - Sun
        - Windows

4) Filename problems: on Windows NT SP3, using gnuwin's tar or WinZip,
   the directories with spaces in them get written with a \240 as the
   space character. At least some HTML links have spaces in them. In
   other words, I can't get to the "Built-in Exceptions" or "Python
   Tutorial" pages. I suggest using _ as the space-replacing character,
   both on disk and in HTML =).

5) Some slight mistakes in linking:

   UserList/index.html, the first link to UserDict points to the class,
   whereas it should point to the module.

6) Wacky idea, and not for the novice, but still potentially useful:
   allow hyperlinks to the source of the modules. It might even be a
   nice way to allow semi-novices to learn more about Python. Something
   to keep on the back burner.

7) Typesetting comments:
   - overall, I like the clean look.
   - I don't think the christmas-tree table of contents for the tutorial
     is very readable -- I'd make it look more like Python code =):

        Whetting Your Appetite:
            Disclaimer
            Introduction
            Where from Here
        Using the Python Interpreter:
            Invoking the Interpreter
            The Interpreter and its Environment
        etc.

8) Pages which are within the tutorial should have backlinks to the TOC
   of the tutorial.

Cheers,

--da

From Edward Welbourne Tue Mar 17 20:16:45 1998
From: Edward Welbourne (Edward Welbourne)
Date: Tue, 17 Mar 1998 20:16:45 GMT
Subject: [DOC-SIG] Python Library Reference in new HTML form
In-Reply-To:
References:

Message-ID: <9803172016.AA17978@lslr6g.lsl.co.uk> > 1) On my screen, I had to make the netscape window very big to see the > four columns of the index.html page. I'm not sure what the right > solution to that is. Ah yes. This is what the DIR list in HTML is for. Pity none of the browser authors have cared to honour it. With tables you have to say how many columns you want: but DIR could sensibly be understood to allow the user agent to chose how many columns to have, based on available space &c., rather as ls does. Sadly, knowing that The Right solution is DIR still doesn't help find a right solution (without capitals). Eddy. _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From tratt@dcs.kcl.ac.uk Wed Mar 18 09:38:08 1998 From: tratt@dcs.kcl.ac.uk (Laurence Tratt) Date: Wed, 18 Mar 1998 09:38:08 +0000 Subject: [DOC-SIG] Python Library Reference in new HTML form Message-ID: <199803180938.JAA26459@helium.dcs.kcl.ac.uk> >> I've (hopefully) just about completed a project for my platform to conver >> the Python Library Reference and Tutorial to a format specific to the >> platform. >I don't see where you mention what platform / format you were > originally targetting. Could you tell us about it? Um, it's Acorn RISC OS, and if you've actually heard of it, I'd be very impressed :) The format I converted it to is vaguely similar to HTML, but has several advantages in that it's searchable (and the searching is massively fast. It's virtually instantaneous on 2Mb of data, and it has to load it all from my not too fast hard disk), the program that displays it keeps windows nice and small so you can have them at the side of the screen whilst you're programming. The page splitting thing came from the fact that this was the way I could click "F1" on a function in my text editor, and get help on it straight away. 
> I will definately be taking a look at your version, but I'm a little > swamped right now. I'll stash a copy aside and take sneak peeks. ;-) I did guess that even if this is of no obvious help, it might give the next HTML generation of documentation some good ideas. > Were you aware of the module index in the last release? A single > page with a list of modules alphabetically sorted. Yes, but I find it's annoying to have to wade through things to get to what I want when really I want a front page where I can click on what I want straight away. Perhaps the order of the current manuals front page is more useful for total novices, but after that I think that most people probably find it a pain unless they know their way around it very well. It seems distinctly un-user-friendly, and slow. This is just my opinion, I could be wrong here :) > I have mixed feelings about extensive links, but a lot of that has > to do with the aweful presentation of web browsers. Well, I've kept to a fairly small subset of HTML. There's tables in there (obviously), but that's as difficult as it gets. There's no font size commands or anything else that makes things look good on one browser and awful on another. I only tested it out on Acorn Browse and ANT Fresco (no, you probably haven't heard of either of them), but I'll try Netscape later today. Hopefully, things will look OK. > CSS can help, but there does't appear to be a common understanding > among the browsers as to what "getting it right" means. Oh, the > font problems on Netscape/UNIX drive me bonkers! I think it is easy to get embroiled in platform specific stuff like this. I'm lucky; RISC OS has had a very good antianialised font manager in since 1988, so there's no problems. 
I kept things simple because I know that as soon as people start marking around with font size and font faces on most OSs, things start breaking down (Windows is horrendous with text that isn't relatively large, for example, because it doesn't antianialise text). > Future releases will improve the semantic content of the markup, > making it a little easier to create links automatically. Hmmm, you should see the regular expression code I've got to do links. It's not very long, perhaps 3 or 4Kb, but already unmaintainable :) > The version I'll be releasing shortly makes URLs "hot", but leaves > email addresses alone (though they are marked and could easily be > turned into hyperlinks). I do this to avoid swamping people with > mail. Fair point. What do other people think of this? > Good idea. And I've been wrestling with LaTeX2HTML indexing a lot > lately. ;-( Never run it. I had to look at it before I could guess what {\rm} meant :) > I guess my name is "somebody", then. ;-) I've been working on them > a bit, but I've been spending a fair amount of time lately on Q/A for > the LaTeX documents. My expectation is that I'll be able to generate > more usable SGML from them. The intention of the SGML conversion > project is to move toward SGML for the official documentation sources. > There's still a lot to do, though, mostly due to my schedule > constraints. Could you explain this a little more? I'm guessing that at the moment you're updating the LaTeX docs and once you've got those stable you'll convert them all to SGML, and use those as the base documentation? Sounds very sensible to me. LaTeX is not particularly easy to parse, I've found. >> 1) There's a .tar.gz file to download, it contains lots (nearly 2300 >> files if I'm being honest) files with 150ish directories at the >> top level > That is a lot of files.... It's an unusual system :) >> build from "this morning"), very slow but entirely in Python. 
I will >> release the source in the near future (hopefully within a month), >> but at > I look forward to this. I hope you had the good sense to ignore > partparse.py from the current distribution! ;-) (The LaTeX scanning > isn't really that bad, but the code is unmaintainable!) I'm afraid the code is about 99.8% based on me looking at LaTeX, guessing what the output would look like if I'd invented the language, then going about doing it. I had to look in the LaTeX2HTML source to guess what {\rm} could be, and every so often I referred to the original HTML documentation to check to see if something that I thought was wrong was intentional or not... This is probably not good programming practice :) Everything is a little fragile, because the parser was only ever intended to work with the PLR. > I plan to make available a documentation release in HTML, LaTeX, > PDF, and PostScript next week. Good. How much has been updated since the last release? Laurie _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From tratt@dcs.kcl.ac.uk Wed Mar 18 09:56:18 1998 From: tratt@dcs.kcl.ac.uk (Laurence Tratt) Date: Wed, 18 Mar 1998 09:56:18 +0000 Subject: [DOC-SIG] Python Library Reference in new HTML form Message-ID: <199803180956.JAA28482@helium.dcs.kcl.ac.uk> > 1) On my screen, I had to make the netscape window very big to see the > four columns of the index.html page. I'm not sure what the right > solution to that is. Yes, I did think of that just after I'd released it :) The problem is, there's no real way to put things in a table (which is good, because you want to fit as much stuff on screen as possible), whilst making sure it doesn't place unreasonable limitations on the user. Would putting it down to 4 or 3 columns be better? 
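[Editorial sketch, not part of the original message.] The column question above could be settled by making the column count a parameter of whatever builds the front page. The following is only an illustration under stated assumptions: `index_table` is a hypothetical helper, not code from Laurie's converter, and the `name/index.html` link form is borrowed from the "string/index.html" example earlier in the thread. Entries run down each column in turn, as `ls` does.

```python
from html import escape


def index_table(names, columns=3):
    """Lay out names alphabetically in an HTML table with `columns` columns.

    Hypothetical helper: entries fill the first column top to bottom,
    then the second, and so on.
    """
    names = sorted(names, key=str.lower)
    rows = -(-len(names) // columns)  # ceiling division
    lines = ["<table>"]
    for r in range(rows):
        cells = []
        for c in range(columns):
            i = c * rows + r
            if i < len(names):
                name = escape(names[i])
                # Assumed link layout: one directory per page, e.g. string/index.html
                cells.append('<td><a href="%s/index.html">%s</a></td>' % (name, name))
            else:
                cells.append("<td></td>")  # pad a short final column
        lines.append("<tr>" + "".join(cells) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(index_table(["string", "re", "os", "sys"], columns=2))
```

With the count as an argument, a narrower two- or three-column front page is a one-word change rather than a redesign.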
> 2) It's too bad that when looking at a given function, I have to look at
>    the "Parent" link which is in a small font to find out what module it's in.
>    Is the Parent link always the containing module? Would it be possible
>    to make that relationship a bit more salient?

Hmm... On my browser, H5 is relatively large. This is a problem; this
sort of thing is *so* browser dependent, I'm not too sure what the best
thing to do is. Perhaps H4?

> 3) Since you seem able to generate a lot of indices, I do think that
>    a lot could be done there:
>    - a Table of Modules - Alphabetical [sort of like your traditional
>      index but without the contents of the module, just the links]
>    - a Table of Platform-Specific Modules, Alphabetical by platform, with
>      redundancy for 'os':
>
>         - Mac
>         - Unix
>            * posix
>            * os
>
>         - SGI
>         - Sun
>         - Windows

I'm not too sure if this is possible. There is the fact that you can
access, say, "Macintosh Specific Services" from the front page; I'll see
if I can get the code to do a reliable conversion of such an index at
the weekend but it might not be as easy as it seems.

> 4) Filename problems: on Windows NT SP3, using gnuwin's tar or WinZip, the
>    directories with spaces in them get written with a \240 as the space
>    character. At least some HTML links have spaces in them. In other
>    words, I can't get to the "Built-in Exceptions" or "Python
>    Tutorial" pages. I suggest using _ as the space-replacing character,
>    both on disk and in HTML =).

Ah, yes, I did think there might be some problems here. Windows '95 (and
presumably NT 4) manages OK here but I guess that's because they use
that awful DOS name mangling thing with tildes flying around everywhere.
I'm not averse to using "_" although I think hard spaces are a little
easier on the eye if your OS supports them. Perhaps two different
versions, one with hard spaces and one with "_" should be included, or
would people prefer just a dashed version?
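[Editorial sketch, not part of the original message.] The underscore suggestion above might look something like the following. This is a minimal illustration, not the converter's actual code: `safe_name` and `safe_href` are hypothetical names, and the set of characters kept is an assumption chosen to avoid the "funny" characters mentioned earlier in the thread.

```python
import re

# Ordinary spaces and hard (non-breaking) spaces; \240 octal is U+00A0.
_SPACE_LIKE = re.compile("[ \u00a0]+")

# Assumption: keep only letters, digits, ".", "-" and "_"; drop the rest
# (for example "!", "," and ";").
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def safe_name(title):
    """Return a page title rewritten as a portable file or directory name."""
    name = _SPACE_LIKE.sub("_", title.strip())
    return _UNSAFE.sub("", name)


def safe_href(*parts):
    """Build a relative link from path components, sanitising each one.

    Using the same function for names on disk and for HTML links keeps
    the two in step.
    """
    return "/".join(safe_name(part) for part in parts)


if __name__ == "__main__":
    print(safe_href("Built-in Exceptions", "index.html"))
```

The point of routing both the directory names and the link targets through one function is that the two cannot drift apart, which is the failure described in item 4.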
> 5) Some slight mistakes in linking:
>
>    UserList/index.html, the first link to UserDict points to the class,
>    whereas it should point to the module.

This is, I think, one of the few (UserDict and UserList are basically
the same page) places where this happens. The only solution is to change
these links after the conversion process. If it's any consolation, I'm
not aware of any other mistakes in linking, although there are obviously
lots of places where there's no linking where one might think there
ought to be. cf:

    w = re.compile(r"woo")
    m = w.match("woo woo")

re.compile will get matched, but w.match won't. If anyone wants full
Python parser support, they're quite welcome to write it, I may add ;)

> 6) Wacky idea, and not for the novice, but still potentially useful: allow
>    hyperlinks to the source of the modules. It might even be a nice way
>    to allow semi-novices to learn more about Python. Something to keep on
>    the back burner.

But I can't be sure where they will be stored on the user's computer :(
When I release the source, it should be easy to quickly alter it to do
this sort of individual-dependent thing if you so wish, but I can't see
any flexible way to do it. If anyone can, it is a good idea.

> 7) Typesetting comments:
>    - overall, I like the clean look.
>    - I don't think the christmas-tree table of contents for the tutorial
>      is very readable -- I'd make it look more like Python code =):

Fair enough :) This is a good point, I wasn't happy with the current
look of the tutorial index, and you have come up with a better way of
doing things. I'll change it at the weekend.

> 8) Pages which are within the tutorial should have backlinks to the TOC of
>    the tutorial.

I think they already do, don't they? Or do you mean if you click on a
link which takes you into the PLR, the link now points to the parent of
the function?
Ah, the catch is that "parent" is not analogous to "back" or "previous"
in this manual :) Parent is literally the parent of the function/data
whatever, not necessarily the page you have just come from.

Thanks for the help,

Laurie

From Sjoerd.Mullender@cwi.nl Wed Mar 18 12:49:21 1998
From: Sjoerd.Mullender@cwi.nl (Sjoerd Mullender)
Date: Wed, 18 Mar 1998 13:49:21 +0100
Subject: [DOC-SIG] Python Library Reference in new HTML form
In-Reply-To: Your message of Wed, 18 Mar 1998 09:56:18 +0000. <199803180956.JAA28482@helium.dcs.kcl.ac.uk>
References: <199803180956.JAA28482@helium.dcs.kcl.ac.uk>
Message-ID:

On the whole, I must say, it looks good.

On Wed, Mar 18 1998 "Laurence Tratt" wrote:

> > 5) Some slight mistakes in linking:
> >
> >    UserList/index.html, the first link to UserDict points to the class,
> >    whereas it should point to the module.
>
> This is, I think, one of the few (UserDict and UserList are basically
> the same page) places where this happens. The only solution is to

There are also linking errors in
Generic Operating System Services/index.html.

I also found errors in the xmllib pages. There are references to end_
and start_, but in the LaTeX version they are end_\var{tag} and
start_\var{tag}.

-- Sjoerd Mullender

From tratt@dcs.kcl.ac.uk Wed Mar 18 13:57:07 1998
From: tratt@dcs.kcl.ac.uk (Laurence Tratt)
Date: Wed, 18 Mar 1998 13:57:07 +0000
Subject: [DOC-SIG] Python Library Reference in new HTML form
Message-ID: <199803181357.NAA10332@helium.dcs.kcl.ac.uk>

>> Um, it's Acorn RISC OS, and if you've actually heard of it, I'd be
>> very impressed :)

> Sure has.
Never seen one in real life, though (IIRC, the Acorn > web browser is about the only one that can handle all flavours of > the PNG image format, including all kinds of transparency). Coo. I'm impressed :) And yes, I think you're right about the Acorn browser. It's still in beta at the moment, but it's coming along nicely. Andrew Hodgkinson and co. seem to be going pretty well on it at the moment. It is, in RISC OS terms, pretty memory hungry though, taking up nearly a megabyte even when it's not doing a great deal. Though that's insignificant when compared to, say, IE or Netscape :) >> I did guess that even if this is of no obvious help, it might give >> the next HTML generation of documentation some good ideas. > FWIW, it looks like it would be very little work to adapt your stuff > to Microsoft's new HTML Help utility (a nifty little utility which packs > tons of HTML pages into a single archive, adds advanced indexing and > free-text search facilities, etc). Might give it a try later this week. Feel free. If there's any changes that need to be done (and I can do them), don't hesitate to ask. Once I've got a couple of things de-platform-specificed (is that a word?) from the source so to speak, I'll probably go ahead and release it. It's not too large. The whole thing, including the HTML outputter and the StrongHelp outputter (which is the original one for Acorn machines) is just over 90Kb and it clocks in at around 3000 lines. It's not a particuarly good example of exemplary programming practice, but that's because I thought initially I'd only do a small crude conversion, when what I wound up with was a very big, crude conversion. It's not so large that people shouldn't be able to find their way around it, if they need to. >> Windows is horrendous with text that isn't relatively large, for example, >> because it doesn't antianialise text). > Footnote: it does, if you have the Plus! 
extensions and a 15-bit or better > screen (the anti-aliaser can also be grabbed from microsoft.com). But I > usually use the free Verdana font (designed for the screen by one of the > world's leading typeface designers) -- it looks excellent at almost any size. The last time I saw Plus, it could only anti-alias text that was *larger* than a certain size which rather defeats the point :) That might, however, have been an example of a dodgy machine setup. Believe me, once you're used to antianialised text everywhere (except in text editors... it looks out of place with monospaced fonts) it's hard to go over to non-anti anialised. Laurie _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From Fred L. Drake, Jr." References: <199803180938.JAA26459@helium.dcs.kcl.ac.uk> Message-ID: <13583.54362.476991.511056@weyr.cnri.reston.va.us> Laurence Tratt writes: > Um, it's Acorn RISC OS, and if you've actually heard of it, I'd be I have, but I've never had a chance to use it. > Yes, but I find it's annoying to have to wade through things to get > to what I want when really I want a front page where I can click on Editor-based help can be a big plus, so it makes a lot of sense to have a new conversion to do that. With the coming release, the Module Index is directly addressable by name, no "node####.html" cruft! That makes it bookmarkable, so eliminates a least having to wade through the Table of Contents. It's also a lot shorter, so it loads faster! > > I have mixed feelings about extensive links, but a lot of that has > > to do with the aweful presentation of web browsers. > > Well, I've kept to a fairly small subset of HTML. There's tables in > there (obviously), but that's as difficult as it gets. 
There's no I wasn't thinking so much of over-use of presentation markup as that links are always represented the same way (color change, maybe an underline...); having a bunch of colored links in the text makes it hard to read. I'd love to be able to have "implied" links to modules, functions, methods, etc., that didn't change color, but that were hot none-the-less. Not hard to do with CSS, but that's not supported in enough browsers yet to rely on it. > Hmmm, you should see the regular expression code I've got to do > links. It's not very long, perhaps 3 or 4Kb, but already Hm.. perhaps I shouldn't see it! > > more usable SGML from them. The intention of the SGML conversion > > project is to move toward SGML for the official documentation sources. ... > Could you explain this a little more? I'm guessing that at the moment > you're updating the LaTeX docs and once you've got those stable > you'll convert them all to SGML, and use those as the base That's correct. > Good. How much has been updated since the last release? From the standpoint of the markup, quite a bit. Way more if you're still relying on 1.5b2. There's only a limited change in the content. Most of the markup changes are to take advantage of the newer markup I've defined, shift the whole mess to a more modern LaTeX (version 2e instead of 2.09), and improve the general consistency of the markup. The drive behind releasing it at this point it that I've promised it to the people who wanted a LaTeX source release. Most of those requests really just want it on A4 paper. (Maybe I should just provide a PostScript version formatted for A4?) -Fred -- Fred L. Drake, Jr. 
fdrake@cnri.reston.va.us Corporation for National Research Initiatives 1895 Preston White Drive Reston, VA 20191 _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From tratt@dcs.kcl.ac.uk Wed Mar 18 14:27:04 1998 From: tratt@dcs.kcl.ac.uk (Laurence Tratt) Date: Wed, 18 Mar 1998 14:27:04 +0000 Subject: [DOC-SIG] Python Library Reference in new HTML form Message-ID: <199803181427.OAA15445@helium.dcs.kcl.ac.uk> > Editor-based help can be a big plus, so it makes a lot of sense to > have a new conversion to do that. > With the coming release, the Module Index is directly addressable by > name, no "node####.html" cruft! That makes it bookmarkable, so > eliminates a least having to wade through the Table of Contents. It's > also a lot shorter, so it loads faster! That will be better. As far as I can see, though, the only way to get editor help is to do the splitting up of files that I've done. Unless each module has one monolithic page you can't really do #### sort of stuff, but having everything on one page would be horrendous. >> Hmmm, you should see the regular expression code I've got to do >> links. It's not very long, perhaps 3 or 4Kb, but already > Hm.. perhaps I shouldn't see it! Well, it works and it would be a lot worse if I parsed it manually :) > Most of the markup changes are to take advantage of the > newer markup I've defined, shift the whole mess to a more modern LaTeX > (version 2e instead of 2.09), and improve the general consistency of > the markup. Eek. I bet it'll break my entire converter :) Is it a big markup change, or is it largely the same as before? > The drive behind releasing it at this point it that I've promised it > to the people who wanted a LaTeX source release. Most of those > requests really just want it on A4 paper. (Maybe I should just > provide a PostScript version formatted for A4?) 
Personally I can't imagine printing out this sort of stuff on A4 but that's because I could never justify the expense and I find it as easy to look at it on screen. Surely postscript would be easier for most people, but you'll always get a tiny minority who want it on A4.6 or some bizarre proprietary size, triple sided or something else which means they'll want the source and not the "binary". Laurie _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From Fred L. Drake, Jr." References: <199803181420.OAA15169@helium.dcs.kcl.ac.uk> Message-ID: <13583.56175.287434.422989@weyr.cnri.reston.va.us> Laurence Tratt writes: > But Strong is generally just Bold... The concern was over font size. > I tend to think of font size as something of last resort... I'm Yes; the intent of my solution was to avoid dealing with the font size at all. ;-) > It is quite confusing... There's anydbm and dumbdm, but I don't think > there's any others... Quite a few lib*.tex files document more than > one module within them, but that doesn't seem like a problem to me, The number of modules per .tex file should be entirely irrelevant, but those which describe more than one, with the UserDict/UserList and anydbm/dumbdbm exceptions, still use one \section{} for each module. That's where the problem creeps in. > I don't think I'd include it by default... It would be intimidating > to novice users and a great deal of other users. Python does not need > to be seen by the outside world to be "just another difficult to > learn, technically minded language". But it could be an option if Sounds good. -Fred -- Fred L. Drake, Jr. 
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive Reston, VA 20191

From: "Fred L. Drake, Jr."
References: <199803181427.OAA15445@helium.dcs.kcl.ac.uk>
Message-ID: <13583.56494.341567.325853@weyr.cnri.reston.va.us>

Laurence Tratt writes:
> Eek. I bet it'll break my entire converter :) Is it a big markup
> change, or is it largely the same as before?

Probably, but it's not likely to be significant. Mostly, a bunch of
things that were marked \code{...} are marked with other names. Some
of the index things have been changed (mostly new ones added), and a
couple of new environments have been defined. "\," has been evicted,
mostly to get LaTeX2HTML to get things right.

> most people, but you'll always get a tiny minority who want it on
> A4.6 or some bizarre proprietary size, triple sided or something else

Oh, I want some of that triple-sided paper! Think my employer will
spring for the fancy printers? ;-)

-Fred

--
Fred L. Drake, Jr. fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive Reston, VA 20191

From tratt@dcs.kcl.ac.uk Wed Mar 18 14:56:53 1998
From: tratt@dcs.kcl.ac.uk (Laurence Tratt)
Date: Wed, 18 Mar 1998 14:56:53 +0000
Subject: [DOC-SIG] Python Library Reference in new HTML form
Message-ID: <199803181456.OAA19991@helium.dcs.kcl.ac.uk>

>> But Strong is generally just Bold... The concern was over font size.
>> I tend to think of font size as something of last resort... I'm

> Yes; the intent of my solution was to avoid dealing with the font
> size at all. ;-)

Oh, OK :) The original point was that the font size wasn't big enough...
Perhaps it will come down to font size += ickiness.

> The number of modules per .tex file should be entirely irrelevant,
> but those which describe more than one, with the UserDict/UserList and
> anydbm/dumbdbm exceptions, still use one \section{} for each module.
> That's where the problem creeps in.

Agreed. I'd have to get rid of my code that copes with that at the moment
though ;)

Laurie

From tratt@dcs.kcl.ac.uk Wed Mar 18 14:59:32 1998
From: tratt@dcs.kcl.ac.uk (Laurence Tratt)
Date: Wed, 18 Mar 1998 14:59:32 +0000
Subject: [DOC-SIG] Python Library Reference in new HTML form
Message-ID: <199803181459.OAA20124@helium.dcs.kcl.ac.uk>

>> Eek. I bet it'll break my entire converter :) Is it a big markup
>> change, or is it largely the same as before?

> Probably, but it's not likely to be significant. Mostly, a bunch of
> things that were marked \code{...} are marked with other names. Some
> of the index things have been changed (mostly new ones added), and a
> couple of new environments have been defined. "\," has been evicted,
> mostly to get LaTeX2HTML to get things right.

What sort of things that were \code aren't \code any more? I have to
admit I couldn't discern any obvious rules for \ and such like, so some
rationalisation does seem sensible :)

Any added index commands are almost certainly a good thing. I think
there's a lot of scope for adding lots of things to the index.

>> most people, but you'll always get a tiny minority who want it on
>> A4.6 or some bizarre proprietary size, triple sided or something else

> Oh, I want some of that triple-sided paper! Think my employer will
> spring for the fancy printers? ;-)

It's Microsoft triple-sided paper. Guaranteed, and very user-friendly.
Laurie

From tratt@dcs.kcl.ac.uk Wed Mar 18 14:20:13 1998
From: tratt@dcs.kcl.ac.uk (Laurence Tratt)
Date: Wed, 18 Mar 1998 14:20:13 +0000
Subject: [DOC-SIG] Python Library Reference in new HTML form
Message-ID: <199803181420.OAA15169@helium.dcs.kcl.ac.uk>

>> Hmm... On my browser, H5 is relatively large. This is a problem; this
>> sort of thing is *so* browser dependent, I'm not too sure what the
>> best thing to do is. Perhaps H4?

> There is no browser-independent way to do this. If you want to
> support minority browsers, this is difficult. If you only need to
> support "major" browsers (IE, NS), it's not too bad, but results in poor
> markup ("tag abuse"): ... (or "+2", "+1",
> etc.). Perhaps the best is simply something like ...
> ;-).

But Strong is generally just Bold... The concern was over font size. I
tend to think of font size as something of last resort... I'm taking
suggestions though :) The problem is that no matter what I do, it'll
never look perfect on more than one browser; it's a question of getting
it acceptable on as many as possible, I suppose.

>> I'm not averse to using "_" although I think hard spaces
>> are a little easier on the eye if your OS supports them. Perhaps two
>> different versions, one with hard spaces and one with "_" should be
>> included, or would people prefer just a dashed version?

> Just decide on one and use it; don't have multiple versions. That
> just confuses life.

Well, I've used WinZip on one machine, and it coped with hard spaces OK,
and then on another it went bananas, sticking accented "a" characters in
everywhere :/ I like hard spaces, but it looks as if I'll need to use
dashes to maintain compatibility, which is not what I wanted, but could be
considered typical.
As a great deal of people use WinZip (I use SparkFS, some people will use gzip then tar, but not many as a percentage I suspect) on Windows machines, it looks as if I'll have to uses dashes :( >> This is, I think, one of the few (UserDict and UserLib are basically >> the same page) places where this happens. The only solution is to >> change these links after the conversion process. If it's any >> consolation, I'm not aware of any other mistakes in linking, >> although > There is one other place in the PLR where two modules share a > section, so a similar symptom might show up there. I'll move the > separation of modules in separate sections up on the priority list. It is quite confusing... There's anydbm and dumbdm, but I don't think there's any others... Quite a few lib*.tex files document more than one module within them, but that doesn't seem like a problem to me, at least. >> But I can't be sure where they will be stored on the users computer >> :( When I release the source, it should be easy to quickly alter it > I'm not sure I like the idea, though. It encourages reading the > source as an alternative to documentation, which leads to poor > documentation and too many users who misunderstand what's part of the > public interface and what isn't. I don't think I'd include it by default... It would be intimidating to novice users and a great deal of other users. Python does not need to be seen by the outside world to be "just another difficult to learn, technically minded language". But it could be an option if people want to make their own manuals... Laurie _______________ DOC-SIG - SIG for the Python Documentation Project send messages to: doc-sig@python.org administrivia to: doc-sig-request@python.org _______________ From Fred L. Drake, Jr." References: <199803181459.OAA20124@helium.dcs.kcl.ac.uk> Message-ID: <13583.57913.21500.471574@weyr.cnri.reston.va.us> Laurence Tratt writes: > What sort of things that were \code aren't \code any more? 
I have to > admit I couldn't discern any obvious rules for \ and such > like, so some rationalisation does seem sensible :) Now, there's \module, \function, \class, \cfunction, \method, \member, \exception, \keyword, \email, \url, \mimetype, .... There's also \manpage, which takes 2 parameters (name and section). This is used to avoid inconsistent presentation, and is easy to convert. \rfc, \program, \deprecated, .... \envvar is used for environment variables, and does magical indexing things. \*modindex now sets the initial "index subitem" automatically to "(in module #1)", and \renewcommand{\indexsubitem}{(stuff)} is now \setindexsubitem{(stuff)} -- there's a better chance of getting it to work with LaTeX2HTML. I'm hoping to eventually get rid of that kind of stuff, mostly so the SGML version is less polluted. There's also \modindex and \exmodindex, for use in documenting modules that aren't part of the standard library. For each \*modindex, there's a matching \ref*modindex, that adds an index entry. The \*modindex macros are now only used in the "defining" section for each module, and the page number is made bold in the index. There's a {classdesc} environment that looks just like the {funcdesc} (since that's what had been used for classes), but drops the "()" from the entry in the index, and uses an implied indexsubitem of "(class in )". I really should add {memberdesc} and {methoddesc} environments; probably will before the release, but won't get to converting content to use them yet. I've not decided on exactly the signatures they should have. \seemodule produces "hot" links in the HTML and PDF versions. If you haven't looked at the printed formats from the last release, they look a lot better. There's also some (preliminary) support for a smaller class of documents, which Andrew Kuchling and I are calling the "Python HOWTO" documents. There will be more about that in the release. > Any added index commands is almost certainly a good thing. 
> I think there's a lot of scope for adding lots of things to the index.

There are definitely more entries, but it's still a matter of adding
things as we think of them; we've not "written an index" at this point.

> It's Microsoft triple-sided paper. Guaranteed, and very
> user-friendly.

Oh, that kind. Maybe I'll pass; the third side isn't as useful if you
can't flip back to the first side to refer to something. ;-)

-Fred

--
Fred L. Drake, Jr. fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive Reston, VA 20191

From tratt@dcs.kcl.ac.uk Thu Mar 19 10:05:08 1998
From: tratt@dcs.kcl.ac.uk (Laurence Tratt)
Date: Thu, 19 Mar 1998 10:05:08 +0000
Subject: [DOC-SIG] Python Library Reference in new HTML form
Message-ID: <199803191005.KAA22970@helium.dcs.kcl.ac.uk>

>> I'm not averse to using "_" although I think hard spaces
>> are a little easier on the eye if your OS supports them. Perhaps two
>> different versions, one with hard spaces and one with "_" should be
>> included, or would people prefer just a dashed version?

> Why should one have to look at the filenames?

Because it can give you a very good clue as to what page you are going to
be whisked off to.
Say your cursor is over xrange, which will be a link in most
circumstances: most people won't know that it's a built-in function, but
if you look at the link name, it'll say:

"...Built-in functions/xrange.html"

which gives the game away :)

Laurie

From klm@python.org Thu Mar 19 23:38:32 1998
From: klm@python.org (Ken Manheimer)
Date: Thu, 19 Mar 1998 18:38:32 -0500 (EST)
Subject: [Doc-SIG] Test message for doc-sig
Message-ID: <199803192338.SAA28327@glyph.CNRI.Reston.Va.US>

This is just a test message for members of the doc-sig. I will also be
sending a separate message to the gui-sig, which should help you
distinguish your memberships.

To change your subscription for the doc-sig, visit:

http://www.python.org/mailman/listinfo/doc-sig

Ken Manheimer klm@python.org 703 620-8990 x268
(orporation for National Research |nitiatives

# If you appreciate Python, consider joining the PSA! #

From: "Fred L. Drake, Jr."

Well, I remember saying that I'd get the docs out this week, but I'm
being called out of town on fairly short notice for several days. I
think next weekend is reasonable, however.

The current state is that the HTML conversion has poorly formatted
indexes, but that will only take a couple of hours' work. There is a
weird glitch in the PDF files that I don't understand yet, so that's a
biggie. It may go out without new PDF files, with those following
along when I get the process working better. They look fine, but the
outline on the left doesn't always respond when clicked.

I'll post again when I have something to report.

-Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us Corporation for National Research Initiatives 1895 Preston White Drive Reston, VA 20191 From H.Jansen@math.tudelft.nl Mon Mar 30 21:38:17 1998 From: H.Jansen@math.tudelft.nl (Henk Jansen) Date: Mon, 30 Mar 1998 23:38:17 +0200 (METDST) Subject: [Doc-SIG] gendoc patch? Message-ID: <01IVANBU0QGI009S9M@TUDRNV.TUDelft.NL> (My) gendoc can't handle the package import structure of python 1.5. (doc_collect.py issues an error at line 533). As Mark Hammond wrote to this list on Thu, 19 Feb 1998: >... Currently gendoc has been patched >to work in 1.5 by it's author Daniel Larsson. I presume that a newer/patched version must be around somewhere. I can't download this, however (I can't access Daniel Larsson's starship/crew account). Can anyone point the to this? Thanks, Henk. -- ----------------------------------------------------------------------- | Henk Jansen hjansen@math.tudelft.nl | | Delft University of Technology P.O.Box 5031 (Mekelweg 4) | | > Information Technoloy and Systems (ITS) 2600 GA Delft | | >> Mathematics (TWI) The Netherlands | | >>> Applied Analysis (TA) phone: +31(0)15.278.7295 | | >>>> Large Scale Models (WAGM) fax: +31(0)15.278.7209 | ----------------------------------------------------------------------- From Daniel.Larsson@vasteras.mail.telia.com Tue Mar 31 12:58:13 1998 From: Daniel.Larsson@vasteras.mail.telia.com (Daniel Larsson) Date: Tue, 31 Mar 1998 14:58:13 +0200 Subject: [Doc-SIG] gendoc patch? References: <01IVANBU0QGI009S9M@TUDRNV.TUDelft.NL> Message-ID: <3520E865.65AF7459@vasteras.mail.telia.com> Hi Henk (and other interested...) Yes, there is a newer version on my Starship account. Now it even has a link on my pages... ;-)... Why can't you access it? Do you want me to email it to you? (Check http://starship.skyport.net/crew/danilo/) Henk Jansen wrote: > (My) gendoc can't handle the package import structure of python 1.5. > (doc_collect.py issues an error at line 533). 
As Mark Hammond wrote > to this list on Thu, 19 Feb 1998: > > >... Currently gendoc has been patched > >to work in 1.5 by it's author Daniel Larsson. > > I presume that a newer/patched version must be around somewhere. I > can't download this, however (I can't access Daniel Larsson's > starship/crew account). Can anyone point the to this? > > Thanks, From Edward Welbourne Tue Mar 31 01:56:22 1998 From: Edward Welbourne (Edward Welbourne) Date: Tue, 31 Mar 1998 02:56:22 +0100 (BST) Subject: [Doc-SIG] A documentation idea. Message-ID: Following earlier discussions about cross-reference [ notably Status of the documentation effort Information content and readability and subsequent, in our lovely new archive pages ;^] I've been thinking about what's possible and what seems like a good idea. The best way I can find to express this is as `what did I want' followed by a sketch of how. I hope they're useful ;^) Eddy -- Here's the `what I want' part: ways of indicating in the __doc__ which parts of a module (or class) I want documented, in which order: gendoc can throw in other parts of the module after the bits indicated by the __doc__, but IWBNI it could be told not to the means to let the __doc__ string refer to a variable in my namespace as a component of what is to be generated as the doc - so that part of the documentation of the module can appear further down the file than the start, just before the portion of the module to which it relates (I find docs get maintained better if they're next to the code they discuss ;^) [modulo choice of punctuation] to write (see [pack.mod.what]) in a doc string, at the end of which I indicate what [pack.mod.what] refers to: and I want to give a text (word or phrase) with that which will be used as the link-text off which the reference hangs: so the original becomes (see my choice of words) and, as is only proper for citations, the same choice of words is used in each place I've referred to [pack.mod.what]. 
The reason why I want that last is so that my doc string, prosaically,
tells its reader (the code maintainer) what entity's documentation it is
referring to, while the pretty form of it gives it a helpful name and
makes it a hyperlink. Don't force the code-author to make the doc
string's form of the x-ref be the actual name of a value in your
namespace, just make it a recommended practice - possibly supported by
some shorthands, such as allowing the bit where I say what pack.mod.what
x-refs are to just say `look it up in my namespace and work out where
you're documenting that'.

--
Here's a sketch of the best bits I can pilfer from the discussion as to
what the Xref form needs to be: illustrated by a putative file from which
nearly all the code is omitted.

"""A 'stat()'-wrapper Module.

Defines a private class used to wrap up the output of 'os.stat()' in a
convenient form, and a caching lookup function to obtain such data. The
objects returned support an 'uncache()' method which will tell this
module's export, [stat], to look the relevant file up afresh next time
it's asked for that file.

.. +[stat] local: the 'stat()' function
.. +[copyright] href \
   "http://tools.py.org/contrib/toolsets/oswrap/tools/copyright.html": \
   Copyleft, 1998, 2001, the python contribution project.
"""

import os, lazy

class _Stat(lazy.Lazy):
    """A wrapper class for 'stat' data.

    Emulates a sequence, just like the data returned by 'os.stat()' (see
    [os.stat]), but also allows you to read data such as 'self.ownerid',
    rather than needing to import the 'stat' class (see [stat]) and use
    its data as subscripts. Uses [Lazy] so that its namespace is only as
    populated as you've asked for: the value of 'self.ownerid' is computed
    the first time you access that attribute of an object 'self' and
    cached in the namespace of 'self'.

    .. + local name_space_doc: Available Namespace
    .. [Lazy] global lazy.Lazy: lazy evaluation
    .. [os.stat] global:
    """

    name_space_doc = """
    ownerid -- the user ID of the owner of this file
    owner -- an [User] describing this user
    groupid -- the ID of the group
    group -- an [Group] describing this group
    mode -- an [Mode] describing accessibility of the file

    .. [User] external tools.wrap.User: object
    .. [Group] external tools.wrap.Group: object
    .. [Mode] external tools.wrap.Mode: object
    """

    # definitions, including magic methods used by Lazy, some of which
    # (if and when it's needed) import tools.wrap and add it somewhere
    # out-of-the way in _Stat's namespace, for subsequent access to the
    # classes it exports.

def stat(file):
    """Namespace-decoded 'os.stat()'.

    Argument, 'file', is a filename, with the usual interpretation.
    Raises 'os.error' (see [Exception]) if there is no such file.
    Otherwise, returns an object which provides status information on the
    file: see [return], below. See also the related [tools.wrap].

    .. +[return] global _Stat: Return Type
    .. [tools.wrap] external: system wrapper toolset
    .. [Exception] builtin:
    """

--
So what have I used ?

Putting + just after the .. says: add the indicated thing to my doc. This
can have a [name] if the earlier parts of my doc want to refer to it, but
it needn't: if such a name is given, it'll be used as the name of an
anchor.

After the [] comes a word saying how to look up the reference: it's
followed by an expression that this lookup can decode, then a : and the
text to use for the anchor. If the anchor text or the lookup-expression
is omitted, it's the contents of the [].

Styles of lookup are:

  global, local, builtin -- indicates which of my namespaces I'd be using
    to look up the reference: if no lookup-style is given, it'll be done
    using the usual lookup order for python evaluation (possibly bending
    slightly, as suggested by David Ascher).
  external -- you're going to have to import tools.wrap to find its docs:
    though you'll probably have some idiom for telling gendoc to turn that
    into some-root/tools/wrap.html#Begin, or similar. This fills the need
    to refer to something not actually accessible to my namespace:
    notably, in a `see also' section, one may wish to refer to things
    based on the present module, which won't usually be in its namespace !

  href -- well, of course, we also need these. Given that they can be
    rather long, it's worth letting a .. entry be spread over many lines -
    though this probably doesn't want to be done with \, rather by
    noticing that the next line is more indented.

--
I'm presuming a reading of a string as its own `doc string'.

Just to be sure what I'm intending here, I'd want the resulting doc to
begin with the file's __doc__, under the heading "A 'stat()'-wrapper
module". That's followed by a subsection on the module's stat() function,
under the heading "Namespace-decoded 'os.stat()'.", with cross-references
to documentation of python's built-in exceptions and of a module called
tools.wrap. This, in turn, has a subsubsection describing the type of
stat()'s return (without the user documentation ever using the name
_Stat), under the heading "Return Type". That has a sub^3^section telling
us how to read the namespace of the object - and the string which does
this might be borrowed by other things needing to refer to the same -
furthermore, name_space_doc will show up in the namespace of each object
returned by stat(), so it's a self-documenting object ;^)

[A rather different use for a string like this is where it *isn't* a
variable of the class, it's a short essay in the namespace of the module,
to which the class doc refers rather than clutter up its body unduly.]

Note that the module's documentation (when rendered to HTML) ends with
the link text

    Copyleft, 1998, 2001, the python contribution project.
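To make the entry syntax above concrete, here is a minimal sketch of how a tool might pick a single `..` entry apart. It is not gendoc code: `XrefEntry`, `parse_entry` and the regular expression are names invented for this illustration, it is written for a current Python rather than 1.5, and it assumes any continuation lines have already been joined into one line.

```python
import re
from typing import NamedTuple, Optional


class XrefEntry(NamedTuple):
    """One parsed '..' entry (hypothetical structure, for illustration)."""
    include: bool          # True when the entry started with '+'
    name: Optional[str]    # contents of the [] if present
    style: Optional[str]   # global/local/builtin/external/href, or None
    expr: Optional[str]    # lookup expression; defaults to the [] contents
    text: Optional[str]    # anchor text; defaults to the [] contents


_ENTRY = re.compile(r'''
    ^\.\.\s*
    (?P<include>\+)?\s*                       # optional '+': add to my doc
    (?:\[(?P<name>[^\]]+)\])?\s*              # optional [name]
    (?:(?P<style>global|local|builtin|external|href)(?=[\s:]))?\s*
    (?P<expr>"[^"]*"|[^:\s]*)\s*              # quoted URL or bare expression
    :\s*
    (?P<text>.*)$                             # anchor text, may be empty
''', re.VERBOSE)


def parse_entry(line):
    """Return an XrefEntry for one '..' line, or None if it isn't one."""
    m = _ENTRY.match(line.strip())
    if m is None:
        return None
    name = m.group('name')
    # An omitted expression or anchor text falls back to the [] contents.
    expr = m.group('expr').strip('"') or name
    text = m.group('text').strip() or name
    return XrefEntry(m.group('include') is not None, name,
                     m.group('style'), expr, text)
```

Quoting the href expression matters because a URL contains a `:` of its own; unquoted expressions are simply read up to the first colon or space.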
--
Details:

Because tools.wrap doesn't get imported until it's needed, if at all, it
isn't accessible to _Stat during a gendoc run (which won't be
instantiating the class, let alone prompting it to read its data): so it
has to talk about tools.wrap as external.

Note that I said
    group -- an [Group] ...
despite the usual rules for a and an, because what it really says is
`an object ...', with a hyperlink to Group off `object'.
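For readers who haven't met the lazy-evaluation idiom the putative file leans on, here is a rough sketch of the behaviour described: an attribute is computed the first time it is read and then cached in the object's own namespace. The lazy module itself isn't shown in this thread, so the `_lazy_<name>` method convention and the `StatInfo` class are assumptions made up for the illustration, and the code targets a current Python rather than 1.5.

```python
import os


class Lazy:
    """Compute attributes on first access, then cache them on the instance."""

    def __getattr__(self, name):
        # __getattr__ only runs when normal lookup fails, i.e. when the
        # value has not been cached in the instance namespace yet.
        if name.startswith('_'):
            raise AttributeError(name)
        try:
            compute = getattr(type(self), '_lazy_' + name)
        except AttributeError:
            raise AttributeError(name) from None
        value = compute(self)
        setattr(self, name, value)   # later reads bypass __getattr__
        return value


class StatInfo(Lazy):
    """Hypothetical stand-in for the _Stat class sketched above."""

    def __init__(self, filename):
        self._filename = filename
        self._calls = 0              # counts real os.stat() calls

    def _lazy_raw(self):
        self._calls += 1
        return os.stat(self._filename)

    def _lazy_ownerid(self):
        return self.raw.st_uid


info = StatInfo(os.curdir)
owner = info.ownerid                 # first access: computed and cached
assert 'ownerid' in vars(info)       # now lives in the instance namespace
assert info.ownerid == owner and info._calls == 1
```

The same hook is where a class like this could defer an import such as tools.wrap until some attribute actually needs it, which is why a documentation run that never instantiates the class would not see that module.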