From scott@chronis.icgroup.com  Sat Nov  8 04:08:28 1997
From: scott@chronis.icgroup.com (Scott)
Date: Fri, 7 Nov 1997 23:08:28 -0500
Subject: [DOC-SIG] simple but flexible text document classes
Message-ID: <19971107230828.08885@chronis.icgroup.com>


Periodically I end up writing scripts or programs which need to
automate the production of fairly simple text documents.  After some time
pondering how to make an easy to use interface to these documents while
separating the documents themselves from the control flow of the program, I
started to use the following library.

It's simple, flexible, and about as efficient as I can imagine it being
without being coded in C.

Just thought I'd post it here in case anyone would want to use it one day.


# -*-Python-*-
"""
This module defines two classes and one function.  The first class, ddoc,
defines a document which can contain other documents.  Each ddoc may be
treated like a dictionary to some degree.  When you assign a key that belongs
to any subdocument, it will place the value where you intend it.  This
recursive structure allows for creating text documents with many different
parts put together in a structured manner.  The topmost document still allows
access and assignment to the variables in the inner documents.  The whole
thing is implemented using the format operator so it's pretty fast.

The listdoc class acts much the same way, but allows for repetition of a
single document with different values for the variables each time.  With a
listdoc, you simply give the variables in the document (accessed as keys)
a list as their value, and it does the rest of the work for you.

The load function facilitates storing the whole document in a separate file.
Each document has a method 'save' which stores the document together with the
small amount of information needed to reload it.  The load function will load
docs that have been saved with save methods.  This doesn't use pickle or
cPickle because part of the idea of saving the document in a separate file is
to be able to edit it without interfering with the program(s) that use it.
"""


import string


class ddoc:
    
    error = "error"

    def __init__(self, text="", dict=None):
	self.text = text
	if not dict:
	    self.dict = {}
	else:
	    self.dict = dict

    def __repr__(self):
	return self.text % self
    
    def __setitem__(self, key, doc):
	#
	# if key is in top level values, set that value and return
	#
	for k in self.dict.keys():
	    if key == k:
		self.dict[key] = doc
		return	

	#
	# if key is in top level text, set it at top level
	#
	if string.find(self.text, '%(' + key + ')') >= 0:
	    self.dict[key] = doc

	#
	# if key in keys of each sub doc, set that sub-doc's key to val and return
	#
	for c in self.children():
	    if key in c.dict.keys():
		c[key] = doc
		return
	    if string.find(c.text, '%(' + key + ')') >= 0:
		c[key] = doc
		return
	#
	# key isn't anywhere in doc, so create it at top level
	#
	self.dict[key] = doc
    
    # 
    # return any subdoc or subsubdoc, etc  or just text as value
    #
    def __getitem__(self, key):
	try:
	    return self.dict[key]
	except KeyError:
	    if string.find(self.text, "%(" + key + ")") >= 0:
		return ''
	    for c in self.children():
		try:
		    return c[key]
		except KeyError:
		    continue
	    raise KeyError, key

		    
    def __delitem__(self, item):
	del self.dict[item]

    # 
    # reset the object so that it looks like a newly created document
    #
    def clear(self):
	self.text = ""
	self.dict.clear()
    
    #
    # return all keys of self + all keys of subdocs, etc
    # because of setitem constraints, this should always be a list of unique elements
    #
    def keys(self):
	return  self.leaves() + self.namechildren()

	
    #
    # return all subdocs, even if subsubdocs, etc
    #
    def children(self):
	d = self.dict
	c = []
	for k in d.keys():
	    if type(d[k]) is type(self):
		c.insert(0, d[k])
		c = c + d[k].children()
	return c

    def namechildren(self):
	l = []
	for k in self.dict.keys():
	    if type(self.dict[k]) is type(self):
		l.insert(0, k)
		l = l + self.dict[k].namechildren()
	return l

    #
    # return only keys that are not documents
    #
    def leaves(self):
	d = self.dict
	l = []
	for k in d.keys():
	    if type(d[k]) is type(''):
		l.insert(0,k)
	    else:
		l = l + d[k].leaves()
	return l

    def insert(self, index, text):
	start = self.text[0:index]
	finish = self.text[index:]
	self.text = "%s%s%s" % (start, text, finish)

    def set_default(self, default, *keys):
	for k in keys:
	    self[k] = default

    def has_children(self):
	return len(self.namechildren()) != 0

    def raw_texts(self):
	# namechildren() already lists every nested sub-document, so no
	# further recursion is needed here.
	tlist = [("text", '', self.text)]
	for cn in self.namechildren():
	    tlist.append((cn, string.split(`self[cn].__class__`)[1], self[cn].text))
	return tlist

    def save(self, filename):
	fp = open(filename, "w")
	rtl = self.raw_texts()
	fp.write("\"\"\"" + rtl[0][2] + "\"\"\"\n\n")
	d = [("thisdoc", string.split(`self.__class__`)[1])]
	for n,c, rt in rtl[1:]:
	    fp.write("%s=\"\"\"%s\"\"\"\n\n\n" % (n,rt))
	    d.append((n,c))
	fp.write("\n#\n# document types -- do not edit if you don't know how\n#\n_dtypes=\\\n\t%s" % (`d`))
	fp.close()


class listdoc(ddoc):
    
    def __init__(self, text="", dict=None):
	ddoc.__init__(self, text, dict)
	self.dict['__iterator'] = 1
	self.__reg_vals = {}

    def __setitem__(self, item, val):
	if type(val) is type([]):
	    self.__reg_vals[item] = val
	    return
	ddoc.__setitem__(self, item, val)

    def __getitem__(self, item):
	if type(item) is type(0):
	    for k in self.__reg_vals.keys():
		self.dict[k] = self.__reg_vals[k][item]
	    return ddoc.__repr__(self)
	return ddoc.__getitem__(self, item)

    def __len__(self):
	max = 1
	for l in self.__reg_vals.values():
	    if len(l) > max:
		max = len(l)
	return max
	    
    def __repr__(self):
	res = ""
	for x in range(len(self)):
	    for k in self.__reg_vals.keys():
		self.dict[k] = self.__reg_vals[k][x]
	    res = "%s%s" % (res, ddoc.__repr__(self))
	return res


    def clear(self):
	ddoc.clear(self)
	self.dict['__iterator'] = 1
	self.__reg_vals.clear()


def load(filename):
    import sys
    pl = string.split(filename, "/")
    if len(pl) > 1:
	dir = string.join(pl[:-1], "/")
	sys.path.insert(0, dir)
    # strip a trailing ".py", whether or not a directory was given
    f = pl[-1]
    if f[-3:] == ".py":
	f = f[:-3]
    exec("import %s" % (f))
    exec("dtl = %s._dtypes" % (f))
    exec("thisdoc = %s()" % (dtl[0][1]))
    exec("thisdoc.text = %s.__doc__" % (f))
    for dn, dt in dtl[1:]:
	exec("%s = %s(); %s.text = %s.%s; thisdoc['%s'] = %s" % (dn,dt,dn,f,dn,dn,dn))
    return thisdoc
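
For readers on a current interpreter, here is a minimal sketch of the
same idea in present-day Python 3.  It is not the module above: the
names Doc and ListDoc are invented for this illustration, and only the
format-operator trick (a template rendered with "text % mapping", where
nested documents render themselves) is taken from the post.

```python
class Doc:
    """A template whose %(name)s slots hold strings or other Docs."""

    def __init__(self, text="", values=None):
        self.text = text
        self.values = dict(values or {})

    def _owner(self, key):
        # First document (self, then nested ones) that knows the key.
        if key in self.values or "%(" + key + ")" in self.text:
            return self
        for value in self.values.values():
            if isinstance(value, Doc):
                owner = value._owner(key)
                if owner is not None:
                    return owner
        return None

    def _store(self, key, value):
        self.values[key] = value

    def __setitem__(self, key, value):
        owner = self._owner(key)
        if owner is None or owner is self:
            self._store(key, value)
        else:
            owner[key] = value

    def __getitem__(self, key):
        # The % operator calls this once per %(key)s slot; unset
        # slots render as the empty string.
        return self.values.get(key, "")

    def __str__(self):
        return self.text % self


class ListDoc(Doc):
    """Repeats its template once per element of its list values."""

    def __init__(self, text="", values=None):
        super().__init__(text, values)
        self.columns = {}

    def _store(self, key, value):
        if isinstance(value, list):
            self.columns[key] = value
        else:
            self.values[key] = value

    def __str__(self):
        rows = max((len(c) for c in self.columns.values()), default=1)
        parts = []
        for i in range(rows):
            row = dict(self.values)
            for key, column in self.columns.items():
                row[key] = column[i] if i < len(column) else ""
            parts.append(str(Doc(self.text, row)))
        return "".join(parts)


page = Doc("Dear %(name)s,\n%(items)sRegards\n")
page["items"] = ListDoc("  * %(item)s: %(price)s\n")
page["name"] = "Scott"
page["item"] = ["tea", "milk"]      # routed to the nested ListDoc
page["price"] = ["2.50", "0.99"]
print(page)
```

Assigning through the topmost document is the point of the design: the
caller never needs to know which nested template actually uses a key.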


_______________
DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________

From papresco@technologist.com  Tue Nov 11 21:40:14 1997
From: papresco@technologist.com (Paul Prescod)
Date: Tue, 11 Nov 1997 16:40:14 -0500
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
References: <199711111835.NAA02624@lemur.magnet.com>
Message-ID: <3468D0BE.69C2052A@technologist.com>

Andrew Kuchling wrote:
> No markup is proposed for these features; that's been left for people
> such as Paul Prescod who actually know XML.  Probably HTML could be
> followed for these.

I agree. Or perhaps DocBook, which is a DTD designed for software
documentation and used by O'Reilly and Associates for their books.
DocBook is nice because there are already tools to convert it to HTML
and LaTeX, and of course because it makes it really easy to turn around
and hand ORA an SGML file for printing.

Now the next question is, XML or SGML. I am a long-time SGML user, but
also a member of the advisory group for XML development. Despite my
attachment to XML, it isn't clear that XML is better than Full SGML in
this context. Here are the major issues in my mind:

FULL SGML
=========
 + Minimizes typing (and "escaping") through a "tag minimization
feature"
 + (Non-Python) Tools already support it (primarily Emacs and the
commercial editors).
 + DocBook already exists
 + When XML takes over the world (in reality, not just rhetoric) we can
easily "convert to XML"
 - We depend on James Clark's C++ parsing engine "SP" to do the parsing
for us when we want to process the document using Python
 - DocBook is too complicated -- we probably want to make a subset
anyways

XML
===
 + There already exist Python parsers and these would be easy to write
if there didn't. We don't depend on anyone else (C++) to give us access
to our data.
 + We will be 100% buzz-word compliant
 - Maximizes typing (and escaping) :)
 - We must make our own DTD, or an XML-compatible variant of DocBook (or
wait for the DocBook maintainers to do so)
 
I think that when push comes to shove, whoever has to type this stuff
should vote in favour of SGML. XML means a <EMPH>lot</EMPH> of extra
typing, and SGML offers a <EMPH/lot/ of <>short cuts</>. The only real
question is whether we intend to spend a lot of time processing the
reference manual in Python and if so, whether it would be a big problem
to use a C++ program to help us. Python is a glue language after all.
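
To make the typing difference concrete, here is a rough sketch in
Python.  It assumes only the <EMPH/lot/ shorthand shown above (it does
not handle <>...</> or any other SGML minimization), and expand_net is a
name made up for this example.  It rewrites the short form into the
fully tagged form that XML requires:

```python
import re

# Matches the short form <NAME/content/ where content has no "/" or "<".
_NET = re.compile(r"<([A-Za-z][\w.-]*)/([^/<]*)/")


def expand_net(text):
    """Rewrite <NAME/content/ as <NAME>content</NAME>."""
    return _NET.sub(r"<\1>\2</\1>", text)


print(expand_net("SGML offers a <EMPH/lot/ of short cuts."))
```

Text that is already fully tagged passes through unchanged, since the
pattern only matches a tag name followed directly by a slash.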

Your list of features is a good start. I think that we should make a
DocBook subset that includes just those features. As we want to do more
sophisticated things, we may let it grow towards full DocBook, and
perhaps also extend it in Python-specific ways "subclass it" (in a rough
SGML-warped sense).

There also seem to be two other interesting documentation issues. 

--

The simpler one is the library reference: we may have to massively
extend DocBook for that. We also may have to do some custom programming
rather than relying on the existing tools. Neither is a big deal -- they
just mean more work for someone.

---
The more difficult issue (conceptually) is what to do about
"docstrings". 

 * Will we make them structured or leave them unstructured? 
 * Structured in 
	* SGML/XML?
	* some ad hoc language that is more "pretty"? 
	* using Python data structures?

 * How much access should the Python runtime have to that structure? 
 * How do we associate more than one "string attribute" (e.g. name vs.
description vs. see also) with each function/method/class. Maybe we need
a list of docstrings, or a dictionary.
 * How do we express emphasis, hypertext links, etc.
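
As one illustration of the dictionary option, here is a sketch in
present-day Python, where functions accept arbitrary attributes.  The
attribute name __docinfo__ and the helper document() are hypothetical,
not an existing convention:

```python
def document(**fields):
    """Attach named documentation fields to a function or class."""
    def decorate(obj):
        obj.__docinfo__ = dict(fields)
        # Keep the plain docstring usable by existing tools.
        if obj.__doc__ is None and "description" in fields:
            obj.__doc__ = fields["description"]
        return obj
    return decorate


@document(description="Return the larger of a and b.",
          see_also=["min", "sorted"])
def larger(a, b):
    return a if a >= b else b


print(larger.__docinfo__["see_also"])
```

The ordinary docstring stays a plain string, so anything that only knows
about __doc__ keeps working while richer tools read the dictionary.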

A related issue is whether we intend to have both a library reference
and structured docstrings? Or is the library reference just what you get
by concatenating the docstrings from the various modules? Are people
willing to make the source the documentation source too?
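
To show what "concatenating the docstrings" might look like in practice,
here is a small sketch using the standard inspect module of present-day
Python.  The function name docstring_reference and the output layout are
assumptions made for this example only:

```python
import inspect
import string


def docstring_reference(module):
    """Concatenate the docstrings of a module and its public members."""
    sections = []
    if module.__doc__:
        sections.append(module.__name__ + "\n"
                        + inspect.cleandoc(module.__doc__))
    # getmembers() returns (name, value) pairs sorted by name.
    for name, obj in inspect.getmembers(module):
        if name.startswith("_"):
            continue
        if not (inspect.isfunction(obj) or inspect.isclass(obj)):
            continue
        doc = inspect.getdoc(obj)
        if doc:
            sections.append(name + "\n" + doc)
    return "\n\n".join(sections)


print(docstring_reference(string)[:200])
```

Note that this sketch also picks up names a module merely imports, and
it can only be as complete as the docstrings themselves.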

 Paul Prescod

From fdrake@acm.org  Tue Nov 11 22:04:43 1997
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 11 Nov 1997 17:04:43 -0500
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
In-Reply-To: <3468D0BE.69C2052A@technologist.com>
References: <199711111835.NAA02624@lemur.magnet.com>
 <3468D0BE.69C2052A@technologist.com>
Message-ID: <199711112204.RAA22015@weyr.cnri.reston.va.us>

Paul Prescod writes:
 > I agree. Or perhaps DocBook, which is a DTD designed for software
 > documentation and used by O'Reilly and Associates for their books.
 > DocBook is nice because there are already tools to convert it to HTML
 > and LaTeX, and of course because it makes it really easy to turn around
 > and hand ORA an SGML file for printing.

  These are major benefits of DocBook in my mind.

 > Now the next question is, XML or SGML. I am a long-time SGML user, but
...
 > I think that when push comes to shove, whoever has to type this stuff
 > should vote in favour of SGML. XML means a <EMPH>lot</EMPH> of extra

  I'm in favor of SGML for this; XML should not be difficult to
generate if needed, but SGML allows all the flexibility we might think 
we need in the future.  At this point, it doesn't look like there's a
lot of experience with *large* XML documents, though I'm sure I've
overlooked something somewhere, and there are a few.  The ability to
use </> is a major win in my book.
  I expect initial conversion of the Library Reference could be done
using Python, and then "fixed up" manually using an editor like XEmacs 
(or Emacs) and PSGML mode.
  Regarding processing, I'd have no problems using SP to do this; a
Python interface to the generic interface would not be difficult to
create, if a little tedious.  I'm willing to do this, but it would be
evenings / weekends, and only if it'll get used.

 > Your list of features is a good start. I think that we should make a
 > DocBook subset that includes just those features. As we want to do more
 > sophisticated things, we may let it grow towards full DocBook, and
 > perhaps also extend it in Python-specific ways "subclass it" (in a rough
 > SGML-warped sense).

  Paul, do you know of any good material *about* DocBook for those not 
familiar with the DTD?  I've read the 2.4.1(?) reference manual (not
recently), but would really like to find something at a higher level
and for the current version.

 > The more difficult issue (conceptually) is what to do about
 > "docstrings". 
 > 
 >  * Will we make them structured or leave them unstructured? 
 >  * Structured in 
 > 	*SGML/XML?
 > 	* some-adhoc language that is more "pretty"? 
 > 	* using Python data structures?

  I'd like to see two things:  human parsable docstrings (*not* any
heavy markup), perhaps using a few conventions to distinguish things.
A number of people have tried to define conventions and write
documentation extractors, but I think it boils down to several things:

  - There's a real desire for some measure of structure when
    generating "book-like" documentation (the library reference), and
    people still want something like that to print.

  - There's a need for some quick-access documentation in the
    sources.

  - Nobody seems to agree on what should go into the docstrings,
    either in content or form.  (Content is usually less of an issue
    than form.)  What comes out is really irrelevant, since different
    tools can be written to get the "right stuff" as long as it went
    in to start with.

 > A related issue is whether we intend to have both a library reference
 > and structured docstrings? Or is the library reference just what you get
 > by concatenating the docstrings from the various modules? Are people
 > willing to make the source the documentation source too?

  I don't think that having both will be avoidable in the near term
(10 years) if we really want heavily structured data.


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434

From guido@CNRI.Reston.Va.US  Tue Nov 11 22:14:03 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Tue, 11 Nov 1997 17:14:03 -0500
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
In-Reply-To: Your message of "Tue, 11 Nov 1997 16:40:14 EST."
 <3468D0BE.69C2052A@technologist.com>
References: <199711111835.NAA02624@lemur.magnet.com>
 <3468D0BE.69C2052A@technologist.com>
Message-ID: <199711112214.RAA18254@eric.CNRI.Reston.Va.US>

> I think that when push comes to shove, whoever has to type this stuff
> should vote in favour of SGML. XML means a <EMPH>lot</EMPH> of extra
> typing, and SGML offers a <EMPH/lot/ of <>short cuts</>.

Let me request a reality check here, before you guys get all carried
away.

Either choice sounds really bad to me.  I've come to really hate the
idea of having to type raw SGML.  For me, SGML is great as an
intermediate format -- I can generate it and I can parse it.  But I
don't want to type it.  It sounds like XML is no better.

There is an existing standard for doc strings (although almost nobody
uses it), I believe it's called "stext", which minimizes markup.  For
me, personally, doc strings are usually just specialized comments, and
the more markup they contain, the less readable they are.  I don't
like to read raw HTML, and expect that raw XML would be just as bad.

Sorry, just my two pennies,

--Guido van Rossum (home page: http://www.python.org/~guido/)


From amk@magnet.com  Tue Nov 11 22:58:17 1997
From: amk@magnet.com (Andrew Kuchling)
Date: Tue, 11 Nov 1997 17:58:17 -0500 (EST)
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
In-Reply-To: <3468D0BE.69C2052A@technologist.com> (message from Paul Prescod
 on Tue, 11 Nov 1997 16:40:14 -0500)
Message-ID: <199711112258.RAA12126@lemur.magnet.com>

Paul Prescod <papresco@technologist.com> wrote:
>... Despite my
>attachment to XML, it isn't clear that XML is better than Full SGML in
>this context. Here are the major issues in my mind:
	...
	Hmm.  DocBook looks interesting
(http://www.oreilly.com/davenport/), but it is, as you say, pretty
darned big, and one would like to simplify things by switching from
LaTeX.  I'm also not sure if the tools to manipulate DocBook are
freeware, or if you're expected to use the DocBook DTD in commercial
SGML editing tools.

>A related issue is whether we intend to have both a library reference
>and structured docstrings? Or is the library reference just what you get
>by concatenating the docstrings from the various modules? Are people
>willing to make the source the documentation source too?

	In general I think docstrings and a reference manual are two
different things.  The LibRef is supposed to be fairly complete, and
may include sample code, stylistic comments ("You can override this
method, but it's a bad idea; do this instead...").  Docstrings are
intended to be displayed by class browsers and used as comments, and
every byte in a docstring is dragged around at runtime, so you don't
want them to get too bloated.  Making a LibRef out of docstrings will
result in either huge docstrings, or a too-terse LibRef.  

	This means we don't need to use XML/SGML markup in docstrings,
and it isn't necessary that everything can be done in pure Python
(though it would be nice, and probably better, IMHO, since you'd have
to install less software).


	Andrew Kuchling
	amk@magnet.com
	http://starship.skyport.net/crew/amk/


From amk@magnet.com  Tue Nov 11 23:08:39 1997
From: amk@magnet.com (Andrew Kuchling)
Date: Tue, 11 Nov 1997 18:08:39 -0500 (EST)
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
In-Reply-To: <199711112214.RAA18254@eric.CNRI.Reston.Va.US> (message from
 Guido van Rossum on Tue, 11 Nov 1997 17:14:03 -0500)
Message-ID: <199711112308.SAA12441@lemur.magnet.com>

Guido van Rossum <guido@CNRI.Reston.Va.US> wrote:
>Either choice sounds really bad to me.  I've come to really hate the
>idea of having to type raw SGML.  For me, SGML is great as an
>intermediate format -- I can generate it and I can parse it.  But I
>don't want to type it.  It sounds like XML is no better.

	Are you against SGML/XML just when used in docstrings, or also
if used for the LibRef/tutorial/etc?

	A better solution might be something similar to Perl's pod
format; something simple and easily parsed with regular expressions.
(For example, "=head1 Level 1 heading".)  Adding new markup keywords
would then require modifying Python code, but writing documents would
be much easier, and doable in plain text mode.
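
	A rough sketch of what such a parser could look like, in
present-day Python.  It assumes only the "=head1 text" command form from
the example above; it is not Perl's actual pod grammar, and the function
name parse and the ("para", text) blocks are invented for illustration:

```python
import re

# A command line looks like "=head1 Level 1 heading".
_COMMAND = re.compile(r"^=(\w+)(?:\s+(.*))?$")


def parse(source):
    """Split pod-like text into (command, text) pairs.

    Lines that are not commands are collected into ("para", text)
    blocks, separated by blank lines or by command lines.
    """
    items, para = [], []

    def flush():
        if para:
            items.append(("para", " ".join(para)))
            para.clear()

    for line in source.splitlines():
        match = _COMMAND.match(line)
        if match:
            flush()
            items.append((match.group(1), match.group(2) or ""))
        elif line.strip():
            para.append(line.strip())
        else:
            flush()
    flush()
    return items


sample = """=head1 Level 1 heading

Some text that
spans two lines.

=head2 Details
More text.
"""
for item in parse(sample):
    print(item)
```

A formatter would then map each command name to a handler, which is
where the Python code would have to change when a new keyword is added.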


	Andrew Kuchling
	amk@magnet.com
	http://starship.skyport.net/crew/amk/


From janssen@parc.xerox.com  Wed Nov 12 00:13:46 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Tue, 11 Nov 1997 16:13:46 PST
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
In-Reply-To: <3468D0BE.69C2052A@technologist.com>
References: <199711111835.NAA02624@lemur.magnet.com>
 <3468D0BE.69C2052A@technologist.com>
Message-ID: <koODGuoB0KGWMpnH8v@holmes.parc.xerox.com>

I seem to keep having to say this every couple of years...

Just use TIM
(ftp://ftp.parc.xerox.com/pub/ilu/2.0a11/manual-html/manual_21.html),
which is a Texinfo-based system that produces either text, Postscript,
or HTML.  It can be easily munged to produce XML instead of, or in
addition to, HTML, if desired.  In distinction to Texinfo, TIM supports
pictures and URLs just fine, and supports application-specific generic
markup (you can say @French{mais oui!} instead of @i{mais oui!}), it's
easier to type than XML (@emph{indeed!} instead of <EMPH>indeed</EMPH>),
it's free, and the tool chain already works.

Bill

From papresco@calum.csclub.uwaterloo.ca  Wed Nov 12 00:30:49 1997
From: papresco@calum.csclub.uwaterloo.ca (Paul Prescod)
Date: Tue, 11 Nov 1997 19:30:49 -0500 (EST)
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
In-Reply-To: <199711112214.RAA18254@eric.CNRI.Reston.Va.US> from "Guido van Rossum" at Nov 11, 97 05:14:03 pm
Message-ID: <199711120030.TAA09239@calum.csclub.uwaterloo.ca>

> 
> > I think that when push comes to shove, whoever has to type this stuff
> > should vote in favour of SGML. XML means a <EMPH>lot</EMPH> of extra
> > typing, and SGML offers a <EMPH/lot/ of <>short cuts</>.
> 
> Let me request a reality check here, before you guys get all carried
> away.
> 
> Either choice sounds really bad to me.  I've come to really hate the
> idea of having to type raw SGML.  For me, SGML is great as an
> intermediate format -- I can generate it and I can parse it.  But I
> don't want to type it.  It sounds like XML is no better.
> 
> There is an existing standard for doc strings (although almost nobody
> uses it), I believe it's called "stext", which minimizes markup.  

At the point that you have commented upon, we were discussing the library
reference, which is now in some TeX variant, right?

 Paul Prescod


From raf@comdyn.com.au  Wed Nov 12 02:37:46 1997
From: raf@comdyn.com.au (raf)
Date: Wed, 12 Nov 1997 13:37:46 +1100
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
Message-ID: <199711120237.NAA13832@mali.cd.comdyn.com.au>

>>A related issue is whether we intend to have both a library reference
>>and structured docstrings? Or is the library reference just what you get
>>by concatenating the docstrings from the various modules? Are people
>>willing to make the source the documentation source too?

>	In general I think docstrings and a reference manual are two
>different things.  The LibRef is supposed to be fairly complete, and
>may include sample code, stylistic comments ("You can override this
>method, but it's a bad idea; do this instead...").  Docstrings are
>intended to be displayed by class browsers and used as comments, and
>every byte in a docstring is dragged around at runtime, so you don't
>want them to get too bloated.  Making a LibRef out of docstrings will
>result in either huge docstrings, or a too-terse LibRef.  

>	This means we don't need to use XML/SGML markup in docstrings,
>and it isn't necessary that everything can be done in pure Python
>(though it would be nice, and probably better, IMHO, since you'd have
>to install less software).

Literate programming is essential, though, if you want any hope of the
documentation matching the implementation.

When Python code is written, the docstrings must be there, but they could
be stripped out when 'installing' the module, just as the code is
'stripped out' when producing the documentation.

Perhaps each docstring could have a small component that is kept at run-time.
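
One way to picture this, as a sketch only in present-day Python: keep the
first line of each docstring and drop the rest.  The helper name
trim_docstrings is hypothetical.  For comparison, CPython's -OO option
removes docstrings entirely rather than keeping a summary.

```python
import inspect


def trim_docstrings(namespace):
    """Cut each function or class docstring to its first line."""
    for obj in namespace.values():
        if inspect.isfunction(obj) or inspect.isclass(obj):
            doc = obj.__doc__
            if doc and doc.strip():
                obj.__doc__ = doc.strip().splitlines()[0]


def area(width, height):
    """Return the area of a rectangle.

    Longer discussion, examples and caveats would go here and
    would not be carried around at run time.
    """
    return width * height


trim_docstrings({"area": area})
print(area.__doc__)
```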

raf


From fdrake@acm.org  Wed Nov 12 03:49:20 1997
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 11 Nov 1997 22:49:20 -0500
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
In-Reply-To: <199711120237.NAA13832@mali.cd.comdyn.com.au>
References: <199711120237.NAA13832@mali.cd.comdyn.com.au>
Message-ID: <199711120349.WAA22713@weyr.cnri.reston.va.us>


raf writes:
 > Literate programming is essential, though, if you want any hope of the
 > documentation matching the implementation.

  While I think literate programming is an interesting, valid, and
useful approach, I don't think that it is "essential", as you assert.
It is *possible* to produce effective and accurate documentation
separately from the source code, and also possible to produce
incorrect documentation using literate programming techniques.  How
many times have you found code that doesn't match the comments and
docstrings which accompany the executable statements?  This has
certainly proven to be a problem in every large project I've had to
read the code for, and is very easy to have happen in a highly dynamic 
environment.  I think it safe to say that Python qualifies for that,
especially for those of us using it in research.

 > When python code is written, the docstrings must be there, but they could
 > be stripped out when 'installing' the module, just as the code is
 > 'stripped out' when producing the documentation.

  I find that I use docstrings primarily when attempting to understand 
code; I don't actually use it often in a running interpreter.  Perhaps 
as more development environments become available this will change,
but I'm not sure that the docstrings should be the sole or primary
source of documentation.  An environment which incorporates external
reference material through a well-designed hypertext model would go a
long way to providing the kind of support I'd like to see, but the
whole thing would need to be suitable for large projects before I'd
find it usable at all.  So my inclination is to support structured
documentation as a separate but accessible component of the system;
multiple forms of access can be provided to support multiple needs.


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434

From papresco@calum.csclub.uwaterloo.ca  Wed Nov 12 03:58:37 1997
From: papresco@calum.csclub.uwaterloo.ca (Paul Prescod)
Date: Tue, 11 Nov 1997 22:58:37 -0500 (EST)
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
In-Reply-To: <koODGuoB0KGWMpnH8v@holmes.parc.xerox.com> from "Bill Janssen" at Nov 11, 97 04:13:46 pm
Message-ID: <199711120358.WAA16054@calum.csclub.uwaterloo.ca>

> Just use TIM
> (ftp://ftp.parc.xerox.com/pub/ilu/2.0a11/manual-html/manual_21.html),
> which is a Texinfo-based system that produces either text, Postscript,
> or HTML.  It can be easily munged to produce XML instead of, or in
> addition to, HTML, if desired.  In distinction to Texinfo, TIM supports
> pictures and URLs just fine, and supports application-specific generic
> markup (you can say @French{mais oui!} instead of @i{mais oui!}), it's
> easier to type than XML (@emph{indeed!} instead <EMPH>indeed</EMPH>),
> it's free, and the tool chain already works.

SGML has all of the benefits that you describe:
 * Generic Markup
 * <emph/shortforms/
 * high quality free software (is there a TIM editor? how do I get TIM data into FrameMaker?)
 * SGML is not just trivially extensible in the TeX sense, but it
 has a language for enforcing the structure of your extensions for
 (e.g.) complex descriptions of library components or whatever else.

But more subtly, it is good to use SGML over TeX variants for the same reason
that it is good to use ILU over language-specific extension mechanisms
(where possible): it does the job, it is based on industry (or in
SGML's case, international) standards, and we can save time and money by
sticking to standards rather than using proprietary technologies. We've got
to stop reinventing wheels, rewriting parsers etc.

I am personally reluctant to get involved with Yet Another TeX variant. I
grieve the megabytes of documents stuck in these formats. If the TIM 
concepts and/or code are good, we can incorporate them into an SGML based 
system just as TeX and HTML are routinely incorporated. 

 Paul Prescod


From papresco@calum.csclub.uwaterloo.ca  Wed Nov 12 04:05:34 1997
From: papresco@calum.csclub.uwaterloo.ca (Paul Prescod)
Date: Tue, 11 Nov 1997 23:05:34 -0500 (EST)
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
In-Reply-To: <199711120030.TAA09239@calum.csclub.uwaterloo.ca> from "Paul Prescod" at Nov 11, 97 07:30:49 pm
Message-ID: <199711120405.XAA16377@calum.csclub.uwaterloo.ca>

> At the point that you have commented upon, we were discussing the library
> reference, which is now in some TeX variant, right?

Sorry, I meant "tutorial" above, not library reference.

 Paul Prescod


From papresco@technologist.com  Wed Nov 12 05:51:00 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 12 Nov 1997 00:51:00 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
Message-ID: <346943C3.91CCF8FC@technologist.com>

If we do decide to move the tutorial from LaTeX to SGML, then we must
choose a DTD.

TEILite
=======
TEILite is a subset of the Text Encoding Initiative's TEI DTD. This DTD
is designed for academic work -- scholarly text analysis, but TEI Lite
strips most of that junk out and leaves something at the semantic level
of LaTeX but with real, software-enforced structure. It still has things
we don't need (of course) so we could still make a subset, but it is in
the ballpark:

http://www-tei.uic.edu/orgs/tei/intros/teiu5.html

It wasn't designed specifically for software manuals, but it has enough
to handle what I have seen of the tutorial (keyword, example, etc.)

http://www-tei.uic.edu/orgs/tei/intros/teiu5.html#TECHDOC

I have a lot of experience with TEILite and am writing a computer book
in a variant of it. The most interesting free software for dealing with
TEI are the tools for creating print and web documents with the Jade
SGML processing engine. There are also Perl-based translators, just in
case those happen to do something better.

TEILite was painstakingly designed by smart people and I think it is
quite good.

DocBook
=======
DocBook is large and powerful. I don't know if a smaller subset exists,
but I'm looking into it. The most interesting free software for dealing
with DocBook are the tools for creating print and web documents with
Jade. 

http://www.berkshire.net/~norm/dsssl/

These are maintained by Norm Walsh, formerly of ORA, now at SGML vendor
ArborText (SGML people shuffle around a lot). He is also writing
"DocBook in a Nutshell". Maybe DocBook is too complicated for a first
cut at SGML documentation (or even a subsequent cut). We don't need
things like callouts, procedure lists, sidebars, and so forth.

Anyhow, the DocBook tutorial is at http://www.oreilly.com/davenport/

LinuxDoc/SGML-Tools
===================
This DTD was called "LinuxDoc" and has been renamed "SGML Tools". My
concern about it is that the people who maintain the "SGML Tools"
software package are a bunch of Perl/C/Awk hackers, and if that wasn't
enough to make you worry, they ignore what I consider to be the coolest
tool for SGML processing ever invented -- Jade -- which is the fastest
way to turn SGML documents into beautiful print pages and web pages.
Jade isn't "manly enough" because it uses Scheme as an expression
language and everyone knows that people only use those kinds of
languages in research labs. Since I think a Jade-based approach is 10
times easier to maintain, I would probably not use any of their software
unless it did exactly what we wanted right out of the box. They seem to
spend most of their time chasing down problems that Jade would solve for
them.

ANYHOW, the SGML Tools DTD is much like LaTeX redone in SGML. You can
decide for yourself if that is a good or bad thing. I see you guys have
considered SGML Tools before, in a thread going in exactly the same
direction this February. :) Also, SGML-Tools has a WYSIWYG editor in LyX
-- again you can decide for yourself if that is good or bad.

One other problem: the SGML Tools guys also had a VERY Unix-focussed
outlook last I checked. I don't think our documentation system should
depend on anything more than Python and SP, both of which run on Windows,
Unix and OS/2 (don't know about Mac).

http://www.sil.org/sgml/publicSW.html#linuxdoc

Conclusion
==========
Python wasn't built in a day. I think that TEILite is a nice, manageable
DTD that has all of the features we need. DocBook seems like it is
designed for what we want to do, but it looks like it is overkill for
today. Maybe it will be appropriate for the LibRef. Maybe it will still
be overkill.

>From TEILite we can immediately get:

 * TeX
 * FrameMaker MIF
 * Windows RTF
 * Postscript (from any of the above)
 * as new Jade back-ends are written, we get them "for free"

Those all come "for free" from a single Jade stylesheet. We can also get
HTML through a TEILite->HTML stylesheet that I am 70% of the way done
writing. (you can't get HTML for free because it is so different from
printed formats) I also have a Python parser for NSGMLS's output format
for when we want to do complex things with the docs.
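A parser for that output format does not need to be large. What follows
is only a minimal sketch, not the parser mentioned above: it assumes the
line-oriented output of nsgmls, where "(" and ")" open and close an
element, "-" carries character data, and "A" lines give the attributes of
the next start tag. The names parse_esis and unescape are made up for the
illustration, and only the "\n" and "\\" escapes are handled.

```python
def unescape(text):
    """Decode the two escapes this sketch handles: \\n and \\\\."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            if nxt == "n":
                out.append("\n")
            elif nxt == "\\":
                out.append("\\")
            else:
                # Unrecognised escapes are passed through unchanged.
                out.append(ch + nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def parse_esis(lines):
    """Yield ("start", (name, attrs)), ("data", text) and ("end", name)."""
    pending_attrs = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        cmd, rest = line[0], line[1:]
        if cmd == "A":
            # Attribute lines come before the start tag they belong to.
            parts = rest.split(" ", 2)
            if len(parts) < 2 or parts[1] == "IMPLIED":
                continue
            pending_attrs[parts[0]] = parts[2] if len(parts) > 2 else ""
        elif cmd == "(":
            yield ("start", (rest, pending_attrs))
            pending_attrs = {}
        elif cmd == ")":
            yield ("end", rest)
        elif cmd == "-":
            yield ("data", unescape(rest))
        # Every other command character is ignored in this sketch.
```

Feeding it the lines of an nsgmls run (for example from a pipe) gives a
flat event stream that a script can turn into whatever structure it needs.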

Nobody has yet done the task of SWIGging (or ILUing) SP which would
allow us higher performance access to its internal data. On Windows, you
can do all of that magic through OLE, but we obviously can't depend on
that.

I think we are further ahead than we were last February. There are at
least two hard-core SGML users in the group who can help to customize
the DTD and improve converters should we need them, there is more Python
software for dealing with SGML, and Jade/nsgmls have improved too.

Anyhow, I think the next step is to gather consensus on an SGML-based
plan. After that, we would all install Jade (which includes "nsgmls")
and perhaps the SGML extension for Emacs and start to convert the
tutorial to TEILite.

 Paul Prescod

From papresco@technologist.com  Wed Nov 12 05:55:22 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 12 Nov 1997 00:55:22 -0500
Subject: No subject
Message-ID: <346944CA.1FD8D804@technologist.com>

> Hmm. DocBook looks interesting 
> (http://www.oreilly.com/davenport/), but it is, as you say, pretty 
> darned big, and one would like to simplify things by switching from 
> LaTeX. I'm also not sure if the tools to manipulate DocBook are 
> freeware, or if you're expected to use the DocBook DTD in commercial 
> SGML editing tools.
 
No, the three most used SGML DTDs are TEI, DocBook and HTML and all
three have free processing tools -- mostly Jade stylesheets.
 
> In general I think docstrings and a reference manual are two different things. 

I agree. Let's not worry about docstrings right now. Reforming the
tutorial will be a big enough job.

 Paul Prescod

From papresco@technologist.com  Wed Nov 12 14:11:05 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 12 Nov 1997 09:11:05 -0500
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
References: <199711111835.NAA02624@lemur.magnet.com>
 <3468D0BE.69C2052A@technologist.com> <199711112204.RAA22015@weyr.cnri.reston.va.us>
Message-ID: <3469B8F9.32838B0@technologist.com>

Fred L. Drake wrote:
>   I expect initial conversion of the Library Reference could be done
> using Python, and then "fixed up" manually using an editor like XEmacs
> (or Emacs) and PSGML mode.

I think we should tackle the tutorial first, as it will require less
custom markup and programming.

>   Regarding processing, I'd have no problems using SP to do this; a
> Python interface to the generic interface would not be difficult to
> create, if a little tedious.  I'm willing to do this, but it would be
> evenings / weekends, and only if it'll get used.

If you do this, I would strongly encourage you to skip the Generic
Interface and move to the more powerful Grove Interface. On Windows, this
interface is already available through COM. I think it is relevant that
the grove is the only interface that James Clark explicitly chose to
publish for scripters (Windows scripters). Note also that the grove
interface is explicitly designed for "wrapping" as James did with COM.
The Perlers have also made a wrapper for the grove interface, so we've
got an "interface gap" there. The SGML Tools (was LinuxDoc) project is
moving to the Perl/Grove interface, I think. 

The grove interface is more powerful because it provides access to the
entire SGML document at once. This is an awesome amount of context and
it makes life much easier. Rather than squirreling away little bits of
information to be used somewhere else, you can make a function to look
them up in the grove. Also, it makes forward references a hundred times
easier. The days of running a program six times to resolve references
are past, I think.
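To make the forward-reference point concrete, here is a small sketch. It
is not the grove interface itself: it uses Python's xml.etree.ElementTree
as a stand-in for "the whole document in memory", and the element names
(doc, sect, title, ref) and the function resolve_refs are invented for
the example.

```python
import xml.etree.ElementTree as ET

DOC = """<doc>
  <sect id="intro"><title>Introduction</title>
    <p>Details follow in <ref target="grove"/>.</p>
  </sect>
  <sect id="grove"><title>The Grove</title>
    <p>See also <ref target="intro"/>.</p>
  </sect>
</doc>"""


def resolve_refs(xml_text):
    """Return (target id, section title) for every ref, in document order."""
    root = ET.fromstring(xml_text)
    # The whole tree is available, so a title can be looked up no matter
    # whether its section comes before or after the reference.
    titles = {s.get("id"): s.findtext("title") for s in root.iter("sect")}
    return [(r.get("target"), titles.get(r.get("target"), "??"))
            for r in root.iter("ref")]
```

The first reference points forward to a section that has not been seen
yet at that point in the text, and it still resolves in a single run.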

It might seem that this would take up too much RAM, but experience shows
it to be very efficient. On modern computers, it is probably much more
efficient than the multiple-pass processes of the past.

Using the grove interface is one of the major things that makes Jade so
much easier to use for processing SGML documents than Perl/Python. The
grove interface is one of the major arguments in favour of SGML over
something macro based. I've got a simple Python Library that implements
a subset of Jade's grove interface, but it is written entirely in Python
and it's slow. I could speed it up with optimization, but would rather
just make the grove wrapper when I get time to digest all of the
interfaces.

But anyhow, cool as the grove interface is, it isn't clear yet that we
need any interface for this particular project. Hopefully we can depend
on the existing tools (Jade and existing stylesheets). Once we want to
go beyond their capabilities, we must decide whether to extend Python to
be as powerful as Jade (through wrappers to Jade's internals) or just
use Jade/Scheme. My dream come true would be to subclass Jade's
interpreter object and allow it to read in Python classes and use them
to completely drive the processing (instead of the usual Scheme files).

 Paul Prescod

From guido@CNRI.Reston.Va.US  Wed Nov 12 14:41:43 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Wed, 12 Nov 1997 09:41:43 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: Your message of "Wed, 12 Nov 1997 00:51:00 EST."
 <346943C3.91CCF8FC@technologist.com>
References: <346943C3.91CCF8FC@technologist.com>
Message-ID: <199711121441.JAA00616@eric.CNRI.Reston.Va.US>

> If we do decide to move the tutorial from LaTeX to SGML, then we must
> choose a DTD.

I'm sorry, but I'm the only one who can decide to move the tutorial
anywhere.  I have already expressed my sentiments towards having to
enter SGML manually.  Please concentrate on the task at hand,
e.g. the library reference manual.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From papresco@technologist.com  Wed Nov 12 16:22:30 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 12 Nov 1997 11:22:30 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
References: <346943C3.91CCF8FC@technologist.com> <199711121441.JAA00616@eric.CNRI.Reston.Va.US>
Message-ID: <3469D7C5.F90F32F9@technologist.com>

Guido van Rossum wrote:
> 
> > If we do decide to move the tutorial from LaTeX to SGML, then we must
> > choose a DTD.
> 
> I'm sorry, but I'm the only one who can decide to move the tutorial
> anywhere.  I have already expressed my sentiments towards having to
> enter SGML manually.  Please concentrate on the task at hand,
> e.g. the library reference manual.

I must admit to being totally confused on several issues:

 * The title of the thread is "notes on the tutorial's markup" -- isn't
the tutorial then the task at hand? I'm having trouble reconstructing
the thread. I thought that was proposed as a test case by Andrew
Kuchling. I also thought that Andrew was the tutorial maintainer (from
some old messages in the thread). You can see how I came to the
conclusion that he had the (moral) authority to move it to SGML.

 * Don't you (Guido) also have to edit the library reference manual? If
SGML is not appropriate for the tutorial, wouldn't it be similarly
inappropriate for the reference manual? I would have thought that the
library reference manual is edited a lot more often than the tutorial,
in fact. Or are you proposing that we use SGML as a step in a change
from some TeX variant?

 * Does it make sense to think about the library reference in
isolation? Do we really plan to keep the two docs in different formats
indefinitely? I'm not sure I would promote SGML if it means that forever
after people will have to install two different tool chains to process
the various Python docs. Or do the same people usually not work on the
different documents?

 * What exactly is the concern about SGML? From what I have seen, SGML
markup can be fully as <emph/minimal/ as @prod{TeX} variants. I'm afraid
that XML and HTML give people the wrong impression that SGML must be
verbose and use redundant end tags.

 Paul Prescod

From fredrik@pythonware.com  Wed Nov 12 16:27:42 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 12 Nov 1997 17:27:42 +0100
Subject: SV: [DOC-SIG] Comparing SGML DTDs
Message-ID: <9711121628.AA05017@arnold.image.ivab.se>

>  * What exactly is the concern about SGML? From what I have seen, SGML
> markup can be fully as <emph/minimal/ as @prod{TeX} variants. I'm afraid
> that XML and HTML give people the wrong impression that SGML must be
> verbose and use redundant end tags.

Maybe someone could post some small examples of DocBook and whatever
the other DTD's are called.  The SGML and XML material I've seen tends to
spend very little time dealing with what real documents and stylesheets
look like...

And I assume there are tools around that make conversion to/from (la)tex
nearly automatic?

Cheers /F


From fredrik@pythonware.com  Wed Nov 12 16:38:29 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 12 Nov 1997 17:38:29 +0100
Subject: [DOC-SIG] Comparing SGML DTDs
Message-ID: <9711121638.AA05851@arnold.image.ivab.se>

> Maybe someone could post some small examples of DocBook and whatever
> the other DTD's are called.

Eh. I meant samples of documents using these DTD's...

Sorry /F


From papresco@technologist.com  Wed Nov 12 17:04:24 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 12 Nov 1997 12:04:24 -0500
Subject: [DOC-SIG] Non-Unix platforms
Message-ID: <3469E198.81B73241@technologist.com>

> Where can I obtain an OS/2 (or 16-bit Windows) version of Jade?

http://www.mulberrytech.com/dsssl/dssslist/archive/1257.html

Jade is written in the ISO C++ subset supported by GCC. I think it "just
compiles" on any platform with GCC, and also with some other compilers.

 Paul Prescod

From amk@magnet.com  Tue Nov 11 18:35:53 1997
From: amk@magnet.com (Andrew Kuchling)
Date: Tue, 11 Nov 1997 13:35:53 -0500 (EST)
Subject: [DOC-SIG] [XML] Notes on the Tutorial's markup
Message-ID: <199711111835.NAA02624@lemur.magnet.com>

[This shouldn't be restricted to psa-members, and belongs on the
doc-sig@python.org; please direct your followups there.]

Recently it was proposed to use an XML markup for the Python
documentation.  This is an excellent idea, so here are some notes on
the LaTeX features used by the tutorial, and by api.tex, ext.tex --
they're relatively straightforward documents which only use a little
simple formatting.  I've dodged the really complicated question, which
is the Library Reference.  

No markup is proposed for these features; that's been left for people
such as Paul Prescod who actually know XML.  Probably HTML could be
followed for these.

Markup for tut.tex et al.:
==========================
Header stuff:
	Document title
	Abstract
	Author / copyright 
	Date

Section organization:
	Chapters, sections, subsections, sub-subsections 
		(H1 through H4, in HTML terms)

Lists: Both enumerated (1,2,3...) & not (bullets).	

Text markup:
	Emphasized text
	Footnotes
	URLs, e-mail addresses
	Filenames: {\tt /usr/local/bin/python}
	Including Python code in running text: {\tt a+b}
	Verbatim inclusions of Python code in separate chunks:
		\bcode\begin{verbatim}
		>>> width = 20
		>>> height = 5*9
		>>> width * height
		900
		\end{verbatim}\ecode
	Ideally, you wouldn't have to escape too many characters
	inside these chunks. 

Indexing: 
	Need a way to indicate relevant indexing keywords for each
paragraph (indexing doesn't seem to be easily automated)


	Andrew Kuchling
	amk@magnet.com
	http://starship.skyport.net/crew/amk/

From papresco@technologist.com  Wed Nov 12 17:14:32 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 12 Nov 1997 12:14:32 -0500
Subject: SV: [DOC-SIG] Comparing SGML DTDs
References: <9711121628.AA05017@arnold.image.ivab.se>
Message-ID: <3469E3F8.8C4BAD63@technologist.com>

Fredrik Lundh wrote:
> Maybe someone could post some small examples of DocBook and whatever
> the other DTD's are called.  The SGML and XML material I've seen tends to
> spend very little time dealing with what real documents and stylesheets
> look like...

Good question. I can find examples, but they are all "normalized" (tag
expanded). Normalized SGML is often distributed to support a few broken
tools that don't know about minimization (primarily Panorama). I could
perhaps encode a few pages in a DocBook subset over the weekend as an
example.
 
> And I assume there are tools around that make conversion to/from (la)tex
> nearly automatic?

To TeX, yes, no problem.

>From (La)TeX? Not really. Parsing LaTeX is not only difficult, but
relatively undefined. There is no one language called "LaTeX"; it is
really a family of languages more or less defined by the Lamport book.
And lots of LaTeX documents mix generic structures and formatting
interchangeably. Finally, there is no easy way to figure out how to
handle macros. Should they be expanded to their TeX primitives (uck)? If
not, how do we know how to represent user defined macros in the target
DTD? If you make a "foobar" macro, what do I do with it in SGML?

Now you see why I really hate to see documents trapped in LaTeX -- it
mixes so many concepts -- markup and formatting, structure and macros,
etc. I think you almost have to write a converter specific to a specific
LaTeX subset, defined a) in prose text or b) by a specific set of
documents. We could analyze the LaTeX conventions used by the Python
documents and write a converter, but I wouldn't advise that as a long
term solution. There is no way to force people to stick to our LaTeX
conventions.
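For a sense of scale, a converter tied to one set of conventions could
start out like the sketch below. It handles only two constructs from the
tutorial notes posted earlier ({\tt ...} and the
\bcode\begin{verbatim} ... \end{verbatim}\ecode wrapper); the target
element names "code" and "eg" are placeholders rather than a choice of
DTD, and anything outside those two patterns passes through untouched,
which is exactly the weakness described above.

```python
import re

# Inline typewriter text such as {\tt a+b}; nested braces are not handled.
TT = re.compile(r"\{\\tt\s+([^{}]*)\}")
# Verbatim chunks wrapped in the \bcode ... \ecode convention.
VERBATIM = re.compile(
    r"\\bcode\\begin\{verbatim\}\n(.*?)\\end\{verbatim\}\\ecode", re.S)


def escape(text):
    """Protect the two characters that are special in SGML content."""
    return text.replace("&", "&amp;").replace("<", "&lt;")


def convert(latex):
    """Rewrite the two known constructs; leave everything else alone."""
    def block(match):
        return "<eg>\n" + escape(match.group(1)) + "</eg>"

    def inline(match):
        return "<code>" + escape(match.group(1)) + "</code>"

    return TT.sub(inline, VERBATIM.sub(block, latex))
```

Any macro the script has never seen simply survives into the output, so
a real converter would have to reject or report such input rather than
guess.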

 Paul Prescod

From guido@CNRI.Reston.Va.US  Wed Nov 12 17:22:11 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Wed, 12 Nov 1997 12:22:11 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: Your message of "Wed, 12 Nov 1997 11:22:30 EST."
 <3469D7C5.F90F32F9@technologist.com>
References: <346943C3.91CCF8FC@technologist.com> <199711121441.JAA00616@eric.CNRI.Reston.Va.US>
 <3469D7C5.F90F32F9@technologist.com>
Message-ID: <199711121722.MAA01255@eric.CNRI.Reston.Va.US>

[me]
> > I'm sorry, but I'm the only one who can decide to move the tutorial
> > anywhere.  I have already expressed my sentiments towards having to
> > enter SGML manually.  Please concentrate on the task at hand,
> > e.g. the library reference manual.

[paul prescod]
> I must admit to being totally confused on several issues:
> 
>  * The title of the thread is "notes on the tutorial's markup" -- isn't
> the tutorial then the task at hand? I'm having trouble reconstructing
> the thread. I thought that was proposed as a test case by Andrew
> Kuchling. I also thought that Andrew was the tutorial maintainer (from
> some old messages in the thread). You can see how I came to the
> conclusion that he had the (moral) authority to move it to SGML.

Andrew is not the tutorial maintainer -- he has helped me tremendously
with reorganizing, but I still consider myself the author, and I think
that Andrew agrees with this view.

The thread started out in the PSA list with the subject "Python
Documentation Idea"; that thread definitely started off with doc
strings in mind (for example, there's a whole discussion on whether
gendoc is or isn't broken in that thread).

You were the first to bring up XML/SGML in that thread:

| Subject: Re: [PSA MEMBERS] Python Documentation Idea
| From: Paul Prescod <papresco@technologist.com>
| To: "psa-members@python.org" <psa-members@python.org>
| Date: Sat, 08 Nov 1997 07:06:24 -0500
| 
| Jeff Rush wrote:
| > 
| > It sounds like people want very different things.  Some want Tex output to
| > print (John S.  Zhao), some want lots of HTML files and some want an
| > interactive, command-line browser system (S. Hoon Yoon).  Actually, I'm
| > closer to Yoon's idea, but more of a hypertext reference manual and less
| > like a traditional class browser.
| 
| It is important to note that there are two standards designed
| specifically to address these three problems (and more). They are SGML,
| the international standard, and XML, the simple subset designed
| specifically for the World Wide Web by W3C, the same people who design
| new versions of HTML and other Web specifications. As an SGML/XML
| consultant and advocate, I naturally think that XML should be adopted,
| but I also think that XML makes perfect sense to someone who looks at
| the system in an unbiased way.
[...]

This still clearly refers to the library manual if you ask me; Andrew
Kuchling replied the same day with

| 	Writing new library reference documentation is something of a
| pain, because there's no special formatting for optional arguments (in
| general), default values, or keyword arguments.  Usually the
| information about default arguments and the like is simply placed in
| the text, but it's not very interesting to write or to read, and isn't
| understandable at a glance.  An XML-based scheme would make this
| information available for special formatting, and to programming
| environments.

Later (on Tue Nov 11) Andrew changed direction and somehow chose to
change the subject to "[XML] Notes on the Tutorial's
markup".  This was the first crosspost to the doc sig.  I must have
missed the change of subject -- in my mind it was still about the
library and doc strings.

>  * Don't you (Guido) also have to edit the library reference manual? If
> SGML is not appropriate for the tutorial, wouldn't it be similarly
> inappropriate for the reference manual? I would have thought that the
> library reference manual is edited a lot more often than the tutorial,
> in fact. Or are you proposing that we use SGML as a step in a change
> from some TeX variant?

The difference is that the library *really* needs more structure than
latex can provide, so I'm open to suggestions.  It also has many
contributed chapters -- ideally, anybody contributing source code
should also contribute a corresponding library section.

I think that SGML is not fit to be typed by humans -- especially since
it has so many special characters that conflict with characters that
are significant in Python code.  (SGML was designed to be typed by
humans in the age of punched cards.)  Latex has the same problem
(especially the underscore is painful).  I think something else should
be used that can be converted to SGML (or XML for all I care).  TIM,
which has only one magic character (@, which isn't used in Python)
fits the bill -- it did one or two years ago when I looked into it, and
it's only because of inertia (and a lot of other things that needed to
happen sooner) that I haven't started using it.

>  * Does it make sense to think about the library reference in
> isolation? Do we really plan to keep the two docs in different formats
> indefinitely? I'm not sure I would promote SGML if it means that forever
> after people will have to install two different tool chains to process
> the various Python docs. Or do the same people usually not work on the
> different documents?

Yes, I think the library reference is a separate project from the
tutorial.  I am planning to do the tutorial in FrameMaker because it
gives me as an author the best user interface for editing and the most
freedom to create nice layout, and because it is essentially a
one-author document it's no problem that not everybody can afford
FrameMaker (as long as I can generate HTML and PostScript, which I can
-- and there's even a version of Frame that can generate SGML although
I don't have it).  (Now that I've got a PC at home I may switch to MS
Word too -- that's surely democratic :-)

>  * What exactly is the concern about SGML? From what I have seen, SGML
> markup can be fully as <emph/minimal/ as @prod{TeX} variants. I'm afraid
> that XML and HTML give people the wrong impression that SGML must be
> verbose and use redundant end tags.

I just don't like the fact that SGML makes characters that occur
frequently in Python source code like "<" and "/" special.  Also the
fact that SGML parsers that support the full syntax are either costly
in money or in resources (few sites that I know have an SGML parser
installed already; sgmllib.py doesn't cut it).  TIM, on the other
hand, was *designed* to be trivial to parse, so you can quickly write
a small Python script that converts it to any format you like.
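As an illustration of how small such a script can be, here is a sketch
for a markup with a single escape character. It does not implement TIM's
actual grammar; it assumes a hypothetical @name{text} form, modelled on
the @prod{TeX} snippet quoted above, with no nesting, and the function
name tokenize is made up.

```python
import re

# One command form only: @name{body}, with no braces inside the body.
COMMAND = re.compile(r"@(\w+)\{([^{}]*)\}")


def tokenize(text):
    """Split text into ("text", s) runs and (command name, body) pairs."""
    tokens = []
    pos = 0
    for match in COMMAND.finditer(text):
        if match.start() > pos:
            tokens.append(("text", text[pos:match.start()]))
        tokens.append((match.group(1), match.group(2)))
        pos = match.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens
```

Characters such as "<", "/" and "_" need no special treatment here, so
Python code can sit inside a command body unchanged.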

--Guido van Rossum (home page: http://www.python.org/~guido/)

From amk@magnet.com  Wed Nov 12 17:13:35 1997
From: amk@magnet.com (Andrew Kuchling)
Date: Wed, 12 Nov 1997 12:13:35 -0500 (EST)
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <3469D7C5.F90F32F9@technologist.com> (message from Paul Prescod
 on Wed, 12 Nov 1997 11:22:30 -0500)
Message-ID: <199711121713.MAA28481@lemur.magnet.com>

Paul Prescod <papresco@technologist.com> wrote:
> * The title of the thread is "notes on the tutorial's markup" -- isn't
>the tutorial then the task at hand? I'm having trouble reconstructing
>the thread. I thought that was proposed as a test case by Andrew
>Kuchling. 

	The tutorial, "Extending & Embedding", and API documents are
simpler than the LibRef, because they just need some simple
typesetting capability: chapters, sections, and code.  They're a lot
simpler than the library reference, so I chickened out from
considering the latter.

	For an example of the problems to be confronted in the LibRef,
look at the documentation for SocketServer.py, at
http://grail.cnri.reston.va.us/python/1.5a4/lib/node176.html; note the
several ad hoc sections for instance variables, class variables,
external methods, overridable internal methods, and utility methods.
The LaTeX setup, as it currently stands, can't document a complex
framework well; it's more suited to modules that are simply
collections of functions with an object or two.  (Even that's getting
tough; there's no typographical indication for default arguments or
keyword arguments.)  

	This can probably be solved by thinking up a typographical
notation for these features and adding LaTeX macros for them.  Perhaps
that band-aid solution would be good enough, though I don't know where
that leaves people who want to produce WinHelp, or OS/2 help, or
WhateverHelp files.
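	To show the kind of per-argument information that has no markup
today, here is a sketch of the record one would want for each parameter.
It relies on inspect.signature from a present-day Python, which is an
assumption of this example and not something the current tools provide;
describe and serve are made-up names.

```python
import inspect


def describe(func):
    """Return one record per parameter: name, keyword-only flag, default."""
    records = []
    for param in inspect.signature(func).parameters.values():
        record = {
            "name": param.name,
            "keyword_only": param.kind is inspect.Parameter.KEYWORD_ONLY,
        }
        if param.default is not inspect.Parameter.empty:
            # Keep the default as source text so it can be typeset as code.
            record["default"] = repr(param.default)
        records.append(record)
    return records


def serve(host, port=8000, *, timeout=None):
    """Hypothetical function used only to exercise describe()."""
```

	Whatever notation is chosen, LaTeX macros or otherwise, it would need
a slot for each of these fields.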

> I also thought that Andrew was the tutorial maintainer (from 
>some old messages in the thread). You can see how I came to the
>conclusion that he had the (moral) authority to move it to SGML.

	No, and I'm sorry if I gave that impression.  While I'm
tweaking the tutorial for 1.5, I haven't taken it over permanently.

> * Does it make sense to think about the library reference in
>isolation? Do we really plan to keep the two docs in different formats

	The language reference is already in FrameMaker, but document
formats shouldn't be multiplied.  If GvR rules SGML/XML out, that's
that, and we have to consider the other options: just add some LaTeX
macros, or use TIM (which is built on top of Texinfo, a format which I
quite like), or invent some pod-like format.


	Andrew Kuchling
	amk@magnet.com
	http://starship.skyport.net/crew/amk/

From klm@python.org  Wed Nov 12 17:27:54 1997
From: klm@python.org (Ken Manheimer)
Date: Wed, 12 Nov 1997 12:27:54 -0500 (EST)
Subject: [DOC-SIG] [XML] Notes on the Tutorial's markup
In-Reply-To: <199711111835.NAA02624@lemur.magnet.com>
Message-ID: <Pine.GSO.3.96.971112122140.9046K-100000@glyph.cnri.reston.va.us>

[Sorry - python.org sendmail got wedged, and reissued copies of one of
andrew kuchling's messages, having this message's subject line.  It's
possible a few other messages got the same treatment, but it doesn't
look that way.  In any case, sorry about the noise...]

Ken Manheimer		klm@cnri.reston.va.us	    703 620-8990 x268
	    (orporation for National Research |nitiatives

		   # Thanks for joining the PSA!  #
		    # http://www.python.org/psa/ #


From papresco@technologist.com  Wed Nov 12 18:08:31 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 12 Nov 1997 13:08:31 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
References: <346943C3.91CCF8FC@technologist.com> <199711121441.JAA00616@eric.CNRI.Reston.Va.US>
 <3469D7C5.F90F32F9@technologist.com> <199711121722.MAA01255@eric.CNRI.Reston.Va.US>
Message-ID: <3469F09F.8DCBB65A@technologist.com>

Guido van Rossum wrote:
> I think that SGML is not fit to be typed by humans 

Hundreds of thousands of HTML page authors would be surprised to hear
you say that!

> (SGML was designed to be typed by humans in the age of punched cards.)  

SGML was standardized in *1986*. This is around the same time Sun
Microsystems was becoming a Very Large Company and Bjarne Stroustrup was
starting to flog C++. GML, the older precursor to SGML, used a totally
different syntax and even then, I don't think anyone ever used punched
cards with it. Mainframe terminals, yes. Punched cards, no.

Furthermore, SGML was very carefully designed to be typeable:

"The markup rigorously expresses the hierarchy by identifying the
beginning and end of each element in classical left list order. No
additional information is needed to interpret the structure, and it would
be possible to implement support by the simple scheme of macro
invocation discussed earlier. The price of this simplicity, though, is
that an end-tag must be present for every element.

The price would be totally unacceptable had the user to enter all of the
tags himself. He knows that the start of a paragraph, for example,
terminates the previous one, so he would be reluctant to go to the
trouble and expense of entering an explicit end-tag for every single
paragraph just to share his knowledge with the system....With SGML,
however, it is possible to omit much markup....After using tag
minimization there has been a 40% reduction in markup, since the
end-tags for three of the elements are no longer needed.

The document type definition enables SGML to minimize the user's text
entry effort without reliance on a 'smart' editing program or word
processor. This maximizes the portability of the document because it can
be understood and revised by humans using any of the millions of
existing 'dumb' keyboards."

> Latex has the same problem
> (especially the underscore is painful).  I think something else should
> be used that can be converted to SGML (or XML for all I care).  TIM,
> which has only one magic character (@, which isn't used in Python)
> fits the bill -- it did one or two years ago when I looked into it, and
> it's only because of inertia (and a lot of other things that needed to
> happen sooner) that I haven't started using it.
>...
> I just don't like the fact that SGML makes characters that occur
> frequently in Python source code like "<" and "/" special.  

SGML has two magic characters. "<" and "&". "/" is only magical when it
is either in a tag, or is used with that special tag <emph/minimization/
that I showed you. If you want to put a slash, you just don't use that
minimization: <code>a/b=c</>.

As far as the other two special chars: SGML has three solutions to
putting Python code in a document, depending on your needs. You can make
an element with CDATA declared content like this:

<eg>
c=a<<b/d
</eg>

The only character string the contents of the EG cannot contain is "</".
If you need to include that character string, then you must do this:

<eg>
<![CDATA[
 a = j</5
]]>
</eg>
 
I can't think of a context where you would need that in Python, though.
Finally you can use entities:

This is some inline Python code <CODE/a=b&gt;c/.

Note also that SGML allows you to change *all of* the delimiter
characters though that is a fairly drastic step (and I wouldn't usually
advise it).
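
To make the entity approach concrete, here is a minimal Python sketch that
replaces the two magic characters before code is pasted into a document.
The function name sgml_escape is made up for illustration, and the sketch
assumes the DTD declares the "amp" and "lt" entities (HTML does):

```python
def sgml_escape(source):
    """Replace SGML's two magic characters with entity references.

    Assumes the DTD declares the `amp` and `lt` entities.
    """
    # Escape "&" first so the "&" in "&lt;" is not escaped a second time.
    return source.replace("&", "&amp;").replace("<", "&lt;")


print(sgml_escape("c = a << b & mask"))
```

A slash passes through untouched, since "/" is only special inside a tag.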

> Yes, I think the library reference is a separate project from the
> tutorial.  I am planning to do the tutorial in FrameMaker because it
> gives me as an author the best user interface for editing and the most
> freedom to create nice layout, and because it is essentially a
> one-author document it's no problem that not everybody can afford
> FrameMaker (as long as I can generate HTML and PostScript, which I can
> -- and there's even a version of Frame that can generate SGML although
> I don't have it).  (Now that I've got a PC at home I may switch to MS
> Word too -- that's surely democratic :-)

There isn't a version of Frame that can generate SGML. There is a
version of Frame that can edit SGML. There is a subtle but important
difference. Once you start out in Frame *not* using Frame+SGML, there is
nothing that constrains you to using structures that have meaning in a
particular SGML DTD (including HTML). FrameMaker thus cannot infer
structure from your "nice layout".

I will be very curious to see how good the HTML output is, and how much
"freedom" Frame offers you without totally destroying the consistency of
your HTML output. If you use hot-pink on green to represent important
notes, how is that going to be represented in a document that makes
sense to Lynx? How will you know which FrameMaker features translate
properly into HTML and which do not? Trial and error?

Personally, I think you would be better off using Frame+SGML right off
the bat, because then you will have total control over the output, but I
will be curious to see what you get out of ordinary FrameMaker anyhow --
converting arbitrary MIF to HTML is sort of an AI project and I like to
see what's the state of the art in AI. :)
 
> Also the
> fact that SGML parsers that support the full syntax are either costly
> in money or in resources (few sites that I know have an SGML parser
> installed already; sgmllib.py doesn't cut it).  

I don't see how James Clark's SGML parser is expensive in either money
or resources. On Windows, it takes up about 3.5 MB with the Jade SGML
conversion tool, the OLE automation library, and 3 other related SGML
tools. It is trivial to install and compile. It is actually distributed
fairly widely as an HTML checker.

> TIM, on the other
> hand, was *designed* to be trivial to parse, so you can quickly write
> a small Python script that converts it to any format you like.

Great. But using Jade, I can convert to 4 formats (RTF, MIF, TeX,
PostScript) with a single "small script" (not Python, alas). If I do
want to use Python, my script will be just as simple, but will depend on
nsgmls. And as more formats arise, they will similarly be supported. But
more important -- I shouldn't have to write the small script at all,
because it has already been written.

How does TIM enforce the proper organization of document macros? Will it
complain if I put an @messageDef{} inside of an @argDef{}? Doesn't this
type of enforcement seem useful in a situation where many people around
the world are working on a document?
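
The kind of check meant here can be sketched in a few lines of Python.
This is only an illustration: the rule table is invented, the function
name check_nesting is hypothetical, and it is not how TIM or an SGML
parser works internally. It also ignores escapes such as @@, @{ and @}.

```python
import re

# Hypothetical content model: macro name -> macros that may NOT appear inside it.
FORBIDDEN_INSIDE = {"argDef": {"messageDef"}}

# Matches either an opening "@name{" or a closing "}".
TOKEN = re.compile(r"@(\w+)\{|\}")


def check_nesting(text):
    """Return complaints about macros nested where the model forbids them."""
    problems = []
    stack = []  # names of the macros that are currently open
    for match in TOKEN.finditer(text):
        name = match.group(1)
        if name is None:
            # A closing brace ends the innermost open macro, if there is one.
            if stack:
                stack.pop()
            continue
        for parent in stack:
            if name in FORBIDDEN_INSIDE.get(parent, ()):
                problems.append("@%s{} not allowed inside @%s{}" % (name, parent))
        stack.append(name)
    return problems


print(check_nesting("@argDef{x @messageDef{y}}"))
```

An SGML parser gets the equivalent table from the DTD, so nobody has to
write or maintain a checker like this by hand.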

 Paul Prescod

_______________
DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________

From jim.fulton@digicool.com  Wed Nov 12 18:26:25 1997
From: jim.fulton@digicool.com (Jim Fulton)
Date: Wed, 12 Nov 1997 13:26:25 -0500
Subject: [DOC-SIG] Documentation formats
References: <346943C3.91CCF8FC@technologist.com> <199711121441.JAA00616@eric.CNRI.Reston.Va.US>
 <3469D7C5.F90F32F9@technologist.com> <199711121722.MAA01255@eric.CNRI.Reston.Va.US> <3469F09F.8DCBB65A@technologist.com>
Message-ID: <3469F4D1.5B5B@digicool.com>

Paul Prescod wrote:
> 
> Guido van Rossum wrote:
> > I think that SGML is not fit to be typed by humans

I agree a lot.

I also think TeX and its variants are not fit to be typed
by humans.

> Hundreds of thousands of HTML page authors would be surprised to hear
> you say that!

I wouldn't be surprised.  That doesn't make Guido's statement incorrect.
 
I'm putting my $0.02 in response to this message for no particular
reason.  It seemed like as good a place as any. :-)

 - With regard to doc strings, I think it is *very* important
   that they be very readable in raw form.  I think that one can
   go a long way with tools like structured text to produce reasonably
   rich output while retaining readability of source text.

   This was discussed at length in the early days of the DOC sig. 
   I'm sure the archives contain this discussion.
   
 - With regard to Python manuals and documentation not generated from 
   docstrings, I have another suggestion.  I don't know for sure that
   this suggestion is viable, or if someone has suggested this before.

   IMO in an ideal world, people would author documentation in a modern
   word processor like Frame or Word and people could share
   documentation files using some neutral format.  I don't know if
   such a neutral format exists, although I seem to remember that at
   one point, Frame had a tool for working with SGML in Framemaker.
   I don't know what happened with that tool, but if it is still around, 
   maybe people who hate editing SGML could use Frame or some other format
   that supports SGML and other folks could hack SGML or use tools that
   convert between their favorite editing environment and SGML.

Jim

-- 
Jim Fulton            jim@digicool.com
Technical Director    540.371.6909                Python Powered!
Digital Creations     http://www.digicool.com/    http://www.python.org/


From amk@magnet.com  Wed Nov 12 19:06:31 1997
From: amk@magnet.com (Andrew Kuchling)
Date: Wed, 12 Nov 1997 14:06:31 -0500 (EST)
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711121722.MAA01255@eric.CNRI.Reston.Va.US> (message from
 Guido van Rossum on Wed, 12 Nov 1997 12:22:11 -0500)
Message-ID: <199711121906.OAA02865@lemur.magnet.com>

Guido van Rossum <guido@CNRI.Reston.Va.US> wrote:
>be used that can be converted to SGML (or XML for all I care).  TIM,
>which has only one magic character (@, which isn't used in Python)

	{ } are also special, aren't they?  (TIM is built on top of
Texinfo, which provides output in the form of GNU Info format and .dvi
files; there are also texi2nroff and texi2html converters.)
	
>fits the bill -- it did one or two years when I looked into it, and
>it's only because of inertia (and a lot of other things that needed to
>happen sooner) that I haven't started using it.

	Aha!  What prevented you from moving to TIM?  Just the work
required to convert everything, or are there pieces still missing?
For the record, I also really like TIM; it's simple enough to be
easily processed, but you can escape into TeX if required.  TIM, via
Texinfo, provides functions for defining class methods and the like:

@defmethod @r{hashing objects} copy ()
Return a separate copy of this hashing object.  An @code{update} to this
  copy won't affect the original object.
@end defmethod


	Andrew Kuchling
	amk@magnet.com
	http://starship.skyport.net/crew/amk/



From da@skivs.ski.org  Wed Nov 12 19:31:16 1997
From: da@skivs.ski.org (David Ascher)
Date: Wed, 12 Nov 1997 11:31:16 -0800 (PST)
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711121906.OAA02865@lemur.magnet.com>
Message-ID: <Pine.SUN.3.96.971112111954.1010A-100000@skivs.ski.org>

> 	Aha!  What prevented you from moving to TIM?  Just the work
> required to convert everything, or are there pieces still missing?
> For the record, I also really like TIM; it's simple enough to be
> easily processed, but you can escape into TeX if required.  TIM, via
> Texinfo, provides functions for defining class methods and the like:
>
> @defmethod @r{hashing objects} copy ()
> Return a separate copy of this hashing object.  An @code{update} to this
>   copy won't affect the original object.
> @end defmethod

For the record, while we're at it -- TIM is what I used for the Numeric
Tutorial (which will be updated, promise, someday).  It worked pretty
well.  It's not all that different from LaTeX as far as the user's
experience is concerned, except that it's a better match to Python (e.g.
underscores, etc.).

I didn't try hard to get all the references, indexing, etc., right -- I
certainly didn't try to get the @node system working well, since I don't
think "real" info use was going to happen.  I'm not sure how easy it would
be to make it do all we'd need.  It'd be good to investigate how to
generate WinHelp files (better than the current solution, which creates a
lousy table of contents). 

Overall, once I got it working (I had some configuration problems Bill
helped me with), it worked well.  The Numeric tutorial is available in
HTML, tex/dvi/postscript, as well as in a very readable text only form,
which I think is quite pleasant.  See

	http://starship.skyport.net/~da/Python/Numeric

for the "published versions" and

	http://starship.skyport.net/~da/Python/Numeric/array.tim

for one of the TIM source files, if you want a look.

--david



From mclay@smtp.erols.com  Wed Nov 12 19:31:50 1997
From: mclay@smtp.erols.com (Michael McLay)
Date: Wed, 12 Nov 1997 14:31:50 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711121713.MAA28481@lemur.magnet.com>
References: <3469D7C5.F90F32F9@technologist.com>
 <199711121713.MAA28481@lemur.magnet.com>
Message-ID: <199711121931.OAA24333@fermi.eeel.nist.gov>


This bounced on my first try.  Sorry if it is a repeat.

Several additional messages on the subject have arrived since I started
looking at TIM.  Seems like we need to define the requirements before
we can pick between LaTeX, TIM, XML, FRAME or any other approach to
generating documentation.  Is TIM too simple?  Is XML too new?  Is SGML 
too complex?  Is a proprietary tool detrimental to contributions of
documentation?  Don't all these questions require a set of requirements 
against which they can be evaluated?

Andrew Kuchling writes:
> 	The language reference is already in FrameMaker, but document
 > formats shouldn't be multiplied.  If GvR rules SGML/XML out, that's
 > that, and we have to consider the other options: just add some LaTeX
 > macros, or use TIM (which is built on top of Texinfo, a format which I
 > quite like), or invent some pod-like format.

Since the benevolent dictator and Bill Janssen suggest TIM then why
don't we take a closer look at it and discuss what switching to TIM
would fail to support.  In reviewing the TIM manual page at
ftp://ftp.parc.xerox.com/pub/ilu/2.0a11/manual-html/manual_21.html
several features make TIM look like a good option:


  1) Concise syntax that is easy to integrate with Python examples
  2) TIM works
  3) TIM was written in Python:-) (only about 820 lines of code)
  4) It looks like a markup that would be much easier to convert to
     XML than Latex.  (My guess is that XML will eventually become the 
     standard for WYSIWYG editors so the ugly tagging issue will go away.)
  5) Restricted set of tags, which makes it fairly easy to learn to use

Downside:

  1) Heavy dependence on external programs which may not be on every platform
	MAKEINFO = '/usr/bin/makeinfo'
	TEX = '/usr/bin/tex'
	TEXINDEX = '/usr/bin/texindex'
	DVIPS = '/usr/bin/dvips'

  2) May require some work to get the reference manual indexing
     working with the new tools.
  3) Restricted set of tags, which makes it fairly hard to extend
     (except by using macros.)
  4) Mixes macro language with markup.  Is this really a problem?
     The TIM macros seem to primarily be used to declare context names
     which are then translatable to generic typographic codes.  This 
     should make it easier to move the tagged text to meaningful XML
     tags.
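
On downside 1: a quick way to see whether a given machine has the external
programs is to look them up on the PATH.  The sketch below is not part of
TIM; the tool names come from the paths listed above, and missing_tools is
a made-up helper name.

```python
import shutil

# Tool names taken from the hard-coded paths in the TIM script.
REQUIRED_TOOLS = ["makeinfo", "tex", "texindex", "dvips"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all external programs found")
```

Looking the programs up by name rather than by absolute path would also
avoid depending on everything living in /usr/bin.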

Defining domain-specific markup commands isn't documented.  The
documentation says it is [TBD].  I grepped for usage and found the
following.  Looks pretty simple to use.
   ilu-macros.tim:@timmacro var                   code
   ilu-macros.tim:@timmacro metavar               var
   ilu-macros.tim:@timmacro C                     code
   ilu-macros.tim:@timmacro C++                   code
   ilu-macros.tim:@timmacro command               code
   ilu-macros.tim:@timmacro constant              code
   ilu-macros.tim:@timmacro codeexample           example
   ilu-macros.tim:@timmacro dfn                   i
   ilu-macros.tim:@timmacro cl                    code
   ilu-macros.tim:@timmacro class                 code
   ilu-macros.tim:@timmacro exception             code
   ilu-macros.tim:@timmacro fn                    code
   ilu-macros.tim:@timmacro interface             code
   ilu-macros.tim:@timmacro java                  code
   ilu-macros.tim:@timmacro isl                   code
   ilu-macros.tim:@timmacro kwd                   code
   ilu-macros.tim:@timmacro language              asis
   ilu-macros.tim:@timmacro m3                    code
   ilu-macros.tim:@timmacro macro                 code
   ilu-macros.tim:@timmacro message               code

Would TIM make a good starting point?  If so, should it be modernized
to use re instead of regex and then developed into a more
full-featured markup language for Python? 
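
As a rough idea of what re-based processing could look like, the sketch
below rewrites simple "@name{text}" constructs as XML-style tags.  It is
not taken from the TIM source; the pattern and the name tim_to_tags are
assumptions, nested braces and the @@, @{ and @} escapes are not handled,
and the text is not escaped for XML.

```python
import re

# One non-nested "@name{text}" construct.
INLINE = re.compile(r"@(\w+)\{([^{}@]*)\}")


def tim_to_tags(text):
    """Rewrite each simple @name{text} as <name>text</name>."""
    return INLINE.sub(r"<\1>\2</\1>", text)


print(tim_to_tags("An @code{update} to this copy"))
```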

An example of a TIM file is attached.  The example is a snippet from
the ILU Python Tutorial.  Looks pretty readable to me.

@setfilename ilu-tutorial.info
@settitle Using ILU with Python:  A Tutorial
@finalout
@c $Id: tutpython.tim,v 1.8 1996/03/19 04:11:10 janssen Exp $
@ifclear largerdoc
@titlepage
@title Using ILU with Python:  A Tutorial
@author Bill Janssen @code{<janssen@@parc.xerox.com>}
@sp
Formatted @today{}.
@sp
Copyright @copyright{} 1995 Xerox Corporation@*
All Rights Reserved.
@end titlepage
@ifinfo
@node Top, ,(dir),(dir)
@top Using ILU with Python
@end ifinfo
@end ifclear

@syncodeindex pg cp

@section Introduction

This tutorial will show how to use the @system{ILU} system with the programming language @language{Python},
both as a way of developing software libraries, and as a way
of building distributed systems.
In an extended example, we'll build an @system{ILU} module that implements a simple
four-function calculator, capable of addition, subtraction,
multiplication, and division.  It will signal an error if
the user attempts to divide by zero.  The example demonstrates
how to specify the interface for the module; how to implement the module in @language{Python};
how to use that implementation as a simple library; how to provide the module as a remote service;
how to write a client of that remote service; and how to use subtyping to extend an object type
and provide different versions of a module.  We'll also demonstrate how to use @language{OMG IDL}
with @system{ILU}, and discuss the notion of network garbage collection.

Each of the programs and files referenced in this tutorial is available
as a complete program
in a separate appendix to this document; parts of programs are quoted
in the text of the tutorial.

@page
@section Specifying the Interface

Our first task is to specify more exactly what it is we're trying
to provide.  A typical four-function calculator lets a user enter
a value, then press an operation key, either +, -, /, or *,
then enter another number, then press = to actually have
the operation happen.  There's usually a CLEAR button to press
to reset the state of the calculator.  We want to provide something like
that.

We'll recast this a bit more formally as the @dfn{interface}
of our module; that is, the way the module will
appear to clients of its functionality.  The interface
typically describes a number of function calls which can be
made into the module, listing their arguments and return types,
and describing their effects.  @system{ILU} uses @dfn{object-oriented}
interfaces, in which the functions in the interface are grouped
into sets, each of which applies to an @dfn{object type}.  These
functions are called @dfn{methods}.

For example, we can think of the calculator as an object type,
with several methods:  Add, Subtract, Multiply, Divide, Clear, etc.
@system{ILU} provides a standard notation to write this down with,
called @dfn{ISL} (which stands for ``Interface Specification Language'').
@language{ISL} is a declarative language which can be processed
by computer programs.  It allows you to define object types (with methods),
other non-object types, exceptions, and constants.

The interface for our calculator would be written in ISL as:
@codeexample
INTERFACE Tutorial;

EXCEPTION DivideByZero;

TYPE Calculator = OBJECT
  METHODS
    SetValue (v : REAL),
    GetValue () : REAL,
    Add (v : REAL),
    Subtract (v : REAL),
    Multiply (v : REAL),
    Divide (v : REAL) RAISES DivideByZero END
  END;
@end codeexample
This defines an interface @isl{Tutorial}, an exception @isl{DivideByZero},
and an object type @isl{Calculator}.  Let's consider these one by one.

The interface, @isl{Tutorial}, is a way of grouping a number of type
and exception definitions.  This is important to prevent collisions
between names defined by one group and names defined by another group.
For example, suppose two different people had defined two different
object types, with different methods, but both called @isl{Calculator}!
It would be impossible to tell which calculator was meant.  By
defining the @isl{Calculator} object type within the scope of the
@isl{Tutorial} interface, this confusion can be avoided.

The exception, @isl{DivideByZero}, is a formal name for a particular
kind of error, division by zero.  Exceptions in @system{ILU} can specify
an @dfn{exception-value type}, as well, which means that real errors
of that kind have a value of the exception-value type associated with them.
This allows the error to contain useful information about why it might
have come about.  However, @isl{DivideByZero} is a simple exception,
and has no exception-value type defined.  We should note that the full
name of this exception is @isl{Tutorial.DivideByZero}, but for this
tutorial we'll simply call our exceptions and types by their short name.

The object type, @isl{Calculator} (again, really @isl{Tutorial.Calculator}),
is a set of six methods.  Two of those methods, @isl{SetValue} and
@isl{GetValue}, allow us to enter a number into the calculator object,
and ``read'' the number.  Note that @isl{SetValue} takes a single
argument, @metavar{v}, of type @type{REAL}.  @type{REAL} is a
built-in @language{ISL} type, denoting a 64-bit floating point number.
Built-in @language{ISL} types are things like @type{INTEGER} (32-bit
signed integer), @type{BYTE} (8-bit unsigned byte), and @type{CHARACTER}
(16-bit Unicode character).  Other more complicated types are
built up from these simple types using @language{ISL} @dfn{type constructors},
such as @isl{SEQUENCE OF}, @isl{RECORD}, or @isl{ARRAY OF}.

Note also that @isl{SetValue} does not return a value,
and neither do @isl{Add}, @isl{Subtract}, @isl{Multiply},
or @isl{Divide}.  Rather,
when you want to see what the current value of the calculator
is, you must call @isl{GetValue}, a method which has no arguments,
but which returns a @type{REAL} value, which is the value of the
calculator object.  This is an arbitrary decision on our part;
we could have written the interface differently, say as
@codeexample
TYPE NotOurCalculator = OBJECT
  METHODS
    SetValue () : REAL,
    Add (v : REAL) : REAL,
    Subtract (v : REAL) : REAL,
    Multiply (v : REAL) : REAL,
    Divide (v : REAL) : REAL RAISES DivideByZero END
  END;
@end codeexample
@noindent
-- but we didn't.

Our list of methods on @type{Calculator} is bracketed by the two
keywords @isl{METHODS} and @isl{END}, and the elements are separated
from each other by commas.  This is pretty standard in @language{ISL}:
elements of a list are separated by commas; the keyword @isl{END}
is used when an explicit list-end marker is needed (but not when it's
not necessary, as in the list of arguments to a method); the list often
begins with some keyword, like @isl{METHODS}.
The @dfn{raises clause} (the list of exceptions which a method
might raise) of the method @isl{Divide} provides another example
of a list, this time with only one member, introduced by the keyword
@isl{RAISES}.

Another standard
feature of @language{ISL} is separating a name, like @isl{v},
from a type, like @type{REAL}, with a colon character.  For example,
constants are defined with syntax like
@codeexample
CONSTANT Zero : INTEGER = 0;
@end codeexample
@noindent
Definitions, of interface, types, constants, and exceptions, are
terminated with a semicolon.

We should expand our interface a bit by adding more documentation
on what our methods actually do.  We can do this with the @dfn{docstring}
feature of @language{ISL}, which allows the user to add arbitrary
text to object type definitions and method definitions.  Using
this, we can write
@codeexample
INTERFACE Tutorial;

EXCEPTION DivideByZero
  "this error is signalled if the client of the Calculator calls
the Divide method with a value of 0";

TYPE Calculator = OBJECT
  COLLECTIBLE
  DOCUMENTATION "4-function calculator"
  METHODS
    SetValue (v : REAL) "Set the value of the calculator to `v'",
    GetValue () : REAL  "Return the value of the calculator",
    Add (v : REAL)      "Adds `v' to the calculator's value",
    Subtract (v : REAL) "Subtracts `v' from the calculator's value",
    Multiply (v : REAL) "Multiplies the calculator's value by `v'",
    Divide (v : REAL) RAISES DivideByZero END
      "Divides the calculator's value by `v'"
  END;
@end codeexample
@noindent
Note that we can use the @isl{DOCUMENTATION} keyword on object types
to add documentation about the object type, and can simply add documentation
strings to the end of exception and method definitions.  These docstrings
are passed on to the @language{Python} docstring system, so that they are available
at runtime from @language{Python}.  Documentation
strings cannot currently be used for non-object types.

@system{ILU} provides a program, @program{islscan}, which can be used
to check the syntax of an @language{ISL} specification.  @program{islscan}
parses the specification and summarizes it to standard output:
@transcript
% @userinput{islscan Tutorial.isl}
Interface "Tutorial", imports "ilu"
  @{defined on line 1
   of file /tmp/tutorial/Tutorial.isl (Fri Jan 27 09:41:12 1995)@}

Types:
  real                       @{<built-in>, referenced on 10 11 12 13 14 15@}

Classes:
  Calculator                 @{defined on line 17@}
    methods:
      SetValue (v : real);                          @{defined 10, id 1@}
        "Set the value of the calculator to `v'"
      GetValue () : real;                           @{defined 11, id 2@}
        "Return the value of the calculator"
      Add (v : real);                               @{defined 12, id 3@}
        "Adds `v' to the calculator's value"
      Subtract (v : real);                          @{defined 13, id 4@}
        "Subtracts `v' from the calculator's value"
      Multiply (v : real);                          @{defined 14, id 5@}
        "Multiplies the calculator's value by `v'"
      Divide (v : real) @{DivideByZero@};             @{defined 16, id 6@}
        "Divides the calculator's value by `v'"
    documentation:
      "4-function calculator"
    unique id:  ilu:cigqcW09P1FF98gYVOhf5XxGf15

Exceptions:
  DivideByZero               @{defined on line 5, refs 15@}
%
@end transcript

@noindent
@program{islscan} simply lists the types defined in the interface, separating
out object types (which it calls ``classes''), the exceptions, and
the constants.  Note that for the @type{Calculator} object type,
it also lists something called its @dfn{unique id}.  This is a 160-bit
number (expressed in base 64) that @system{ILU} assigns automatically
to every type, as a way of distinguishing them.  While
it might be interesting to know that it exists (:-),
the @system{ILU} user never has to know what it is; @program{islscan}
supplies it for the convenience of the @system{ILU} implementors, who
sometimes do have to know it.
@page
@section Implementing the True Module

After we've defined an interface, we then need to supply an implementation
of our module.  Implementations can be done in any language supported by
@system{ILU}.  Which language you choose often depends on what sort
of operations have to be performed in implementing the specific functions
of the module.  Different languages have specific advantages and disadvantages
in different areas.  Another consideration is whether you wish to use the
implementation mainly as a library, in which case it should probably be done
in the same language as the rest of your applications, or mainly as
a remote service, in which case the specific implementation language
is less important.

We'll demonstrate an implementation of the @type{Calculator}
object type in @system{Python}, which is one of the most capable
of all the @system{ILU}-supported languages.  This is just a matter
of defining a @language{Python} class, corresponding to the @type{Tutorial.Calculator} type.  Before we do that,
though, we'll explain how the names and signatures of the @language{Python} functions
are arrived at.

@subsection What the Interface Looks Like in Python

For every programming language
supported by @system{ILU}, there is a standard @dfn{mapping} defined
from @language{ISL} to that programming language.  This mapping defines
what @language{ISL} type names, exception names, method names,
and so on look like
in that programming language.

The mapping for @language{Python} is straightforward.  For type names,
such as @isl{Tutorial.Calculator}, the @language{Python} name
of the @language{ISL} type @isl{Interface.Name}
is @Python{Interface.Name}, with any hyphens replaced by underscores.  That is, the name of the interface in @language{ISL}
becomes the name of the module in @language{Python}.
So the name of our @type{Calculator} type in @language{Python}
would be @Python{Tutorial.Calculator}, which is really the name of a @language{Python} class.

The @language{Python} mapping for a method name such as @isl{SetValue}
is the method name, with any hyphens replaced by underscores.
The return type of this @language{Python} method is whatever is specified
in the @language{ISL} specification for the method, or @Python{None} if
no type is specified.  The arguments for the @language{Python} method are the
same as specified in the @language{ISL}; their types are the
@language{Python} types corresponding to the @language{ISL} types, @emph{except}
that one extra argument is added to the beginning of each @language{Python}
version of an @language{ISL} method; it is an @dfn{instance} of the object type
on which the method is defined.  An instance is simply a value of that
type.  Thus the @language{Python} method corresponding
to our @language{ISL} @isl{SetValue} would have the prototype signature
@codeexample
   def SetValue (self, v):
@end codeexample
@noindent

Similarly, the signatures for the other methods, in @language{Python}, are
@codeexample
   def GetValue (self):

   def Add (self, v):

   def Subtract (self, v):

   def Multiply (self, v):

   def Divide (self, v):
@end codeexample
@noindent
Note that even though the @isl{Divide} method can raise an exception,
the signature looks like those of the other methods.  This is because
the normal @language{Python} exception signalling mechanism is used to
signal exceptions back to the caller.
The mapping of exception names is similar to the mapping used for types.
So the exception @isl{Tutorial.DivideByZero}
would also have the name @Python{Tutorial.DivideByZero}, in @language{Python}.

One way to see what all the @language{Python} names for an interface
look like is to run the program @program{python-stubber}.  This program
reads an @language{ISL} file, and generates the necessary @language{Python}
code to support that interface in @language{Python}.  One of the files
generated is @file{@metavar{Interface}.py}, which contains the definitions
of all the @language{Python} types for that interface.
@transcript
% @userinput{python-stubber Tutorial.isl}
client stubs for interface "Tutorial" to Tutorial.py ...
server stubs for interface "Tutorial" to Tutorial__skel.py ...
%
@end transcript
@page
@subsection Building the Implementation

To provide an implementation of our interface, we @dfn{subclass} the
generated @language{Python} class for our @class{Calculator} class:

@codeexample
# CalculatorImpl.py

import Tutorial, Tutorial__skel

class Calculator (Tutorial__skel.Calculator):

        def __init__ (self):
                self.the_value = 0.0

        def SetValue (self, v):
                self.the_value = v

        def GetValue (self):
                return self.the_value

        def Add (self, v):
                self.the_value = self.the_value + v

        def Subtract (self, v):
                self.the_value = self.the_value - v

        def Multiply (self, v):
                self.the_value = self.the_value * v

        def Divide (self, v):
                try:
                        self.the_value = self.the_value / v
                except ZeroDivisionError:
                        raise Tutorial.DivideByZero
@end codeexample

Each instance of a @Python{CalculatorImpl.Calculator} object
inherits from @Python{Tutorial__skel.Calculator}, which in turn
inherits from @Python{Tutorial.Calculator}.  Each has an instance
variable called @Python{the_value}, which maintains a running total
of the `accumulator' for that instance.  We can create an instance
of a @isl{Tutorial.Calculator} object by simply calling @Python{CalculatorImpl.Calculator()}.

@page
So, a very simple program to use the @isl{Tutorial} module might be
the following:

@codeexample
# simple1.py, a simple program that demonstrates the use of the
#  Tutorial true module as a library.
#
# run this with the command "python simple1.py NUMBER [NUMBER...]"
#

import Tutorial, CalculatorImpl, string, sys

# A simple program:
#  1)  make an instance of Tutorial.Calculator
#  2)  add all the arguments by invoking the Add method
#  3)  print the resultant value.

def main (argv):

        c = CalculatorImpl.Calculator()
        if not c:
                error("Couldn't create calculator")

        # clear the calculator before using it

        c.SetValue (0.0)

        # now loop over the arguments, adding each in turn

        for arg in argv[1:]:
                v = string.atof(arg)
                c.Add (v)

        # and print the result

        print "the sum is", c.GetValue()
        sys.exit(0)

main(sys.argv)
@end codeexample

@noindent
This program would be run as follows:
@transcript
% @userinput{python simple1.py 34.9 45.23111 12}
the sum is 92.13111
%
@end transcript

@noindent
This is a completely self-contained use of the @isl{Tutorial}
implementation; when a method is called, it is the true method
that is invoked.  The use of @system{ILU} in this program adds
some overhead in terms of included code, but has almost
the same performance as a version of this program that does not
use @system{ILU}.
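
(Outside the TIM sample.)  The calculator above can be tried without ILU
installed.  The sketch below is a Python 3 rendering under stated
assumptions: the ILU-generated Tutorial and Tutorial__skel modules are
replaced by plain stand-in classes, so it shows only the method logic and
none of the ILU machinery.

```python
import sys


class DivideByZero(Exception):
    """Stand-in for Tutorial.DivideByZero; the ILU-generated module is not used."""


class Calculator:
    """Plain stand-in for CalculatorImpl.Calculator, without the ILU base class."""

    def __init__(self):
        self.the_value = 0.0

    def SetValue(self, v):
        self.the_value = v

    def GetValue(self):
        return self.the_value

    def Add(self, v):
        self.the_value = self.the_value + v

    def Subtract(self, v):
        self.the_value = self.the_value - v

    def Multiply(self, v):
        self.the_value = self.the_value * v

    def Divide(self, v):
        try:
            self.the_value = self.the_value / v
        except ZeroDivisionError:
            # Translate the built-in error into the interface's exception.
            raise DivideByZero from None


def main(argv):
    c = Calculator()
    c.SetValue(0.0)
    for arg in argv[1:]:
        c.Add(float(arg))  # float() replaces string.atof()
    print("the sum is", c.GetValue())  # print is a function in Python 3


if __name__ == "__main__":
    main(sys.argv)
```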



From papresco@calum.csclub.uwaterloo.ca  Wed Nov 12 19:40:28 1997
From: papresco@calum.csclub.uwaterloo.ca (Paul Prescod)
Date: Wed, 12 Nov 1997 14:40:28 -0500 (EST)
Subject: [DOC-SIG] Documentation formats
In-Reply-To: <3469F4D1.5B5B@digicool.com> from "Jim Fulton" at Nov 12, 97 01:26:25 pm
Message-ID: <199711121940.OAA12494@calum.csclub.uwaterloo.ca>

>    IMO in an ideal world, people would author documentation in a modern
>    word processor like Frame or Word and people could share
>    documentation files using some neutral format.  I don't know if
>    such a neutral format exists, although I seem to remember that at
>    one point, Frame had a tool for working with SGML in Framemaker.
>    I don't know what happened with that tool, but if it is still around, 
>    maybe people who hate editing SGML could use Frame or some other format
>    that supports SGML and other folks could hack SGML or use tools that
>    convert between their favorite editing environment and SGML.

That's right.

There are more tools for allowing you to create SGML documents without
typing tags than there are for TeX, LaTeX and TIM. For a while I worked
in the source code bowels of one (extending it, not creating it). And
because SGML is an international standard, there are always more tools being
created that allow you to do so.

Still, in the interest of truth in advertising, I should mention that in
my opinion, the idea that you will one day just create documents in a WYSIWYG
editor without worrying about the structure is a fantasy. Computers cannot
infer structure. The user interface must reflect the structure that you 
want in your SGML files. If you want elements like "class", "method", 
and "hyperlink", then you must be aware of their availability.

 Paul Prescod



From jim.fulton@digicool.com  Wed Nov 12 20:02:32 1997
From: jim.fulton@digicool.com (Jim Fulton)
Date: Wed, 12 Nov 1997 15:02:32 -0500
Subject: [DOC-SIG] Documentation formats
References: <199711121940.OAA12494@calum.csclub.uwaterloo.ca>
Message-ID: <346A0B58.320E@digicool.com>

Paul Prescod wrote:
>
> Still, in the interest of truth in advertising, I should mention that in
> my opinion, the idea that you will one day just create documents in a WYSIWYG
> editor without worrying about the structure is a fantasy. Computers cannot
> infer structure. The user interface must reflect the structure that you
> want in your SGML files. If you want elements like "class", "method",
> and "hyperlink", then you must be aware of their availability.

Both Frame and Word let you create documents based on structural
elements.  So you can define and preserve structure while working in a
WYSIWYG environment.

-- 
Jim Fulton            jim@digicool.com
Technical Director    540.371.6909                Python Powered!
Digital Creations     http://www.digicool.com/    http://www.python.org/


From papresco@technologist.com  Wed Nov 12 22:27:19 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 12 Nov 1997 17:27:19 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
References: <3469D7C5.F90F32F9@technologist.com>
 <199711121713.MAA28481@lemur.magnet.com> <199711121931.OAA24333@fermi.eeel.nist.gov>
Message-ID: <346A2D47.718C60DC@technologist.com>

Michael McLay wrote:
> Downside:
> 
>   1) Heavy dependence on external programs which may not be on every platform
>         MAKEINFO = '/usr/bin/makeinfo'
>         TEX = '/usr/bin/tex'
>         TEXINDEX = '/usr/bin/texindex'
>         DVIPS = '/usr/bin/dvips'
> 
>   2) May require some work to get the reference manual indexing
>      working with the new tools.
>   3) Restricted set of tags, which makes it fairly hard to extend
>      (except by using macros.)
>   4) Mixes macro language with markup.  Is this really a problem?
>      The TIM macros seem to primarily be used to declare context names
>      which are then translatable to generic typographic codes.  This
>      should make it easier to move the tagged text to meaningful XML
> tags.

    5) Mixes formatting ("@page", "@noindent") with structure
       ("@codeexample")
    6) Does not seem to allow restrictions on macro roles to be
       expressed
    7) There are no editors that will help you to create TIM documents
       correctly (and there probably never will be)
    8) We will have to develop new output formats from scratch, whereas
       with SGML/Jade they are reused across an industry.
    9) FrameMaker cannot import or export TIM, so we will have written
       off a great WYSIWYG typesetting tool.

More important, to me: using TIM would generally contribute to the
"multiplication of documentation formats". The rest of the software
industry is about to rally around SGML's XML incarnation. Bill Gates
says it's the greatest thing since sliced bread. Marc Andreessen agrees
with him (first time in history!). Adobe is also on board. I know many
Cygnus people are interested in SGML and are working to move Cygnus
tools over.

XML is not exactly what we want, but SGML is at least still in the same
ballpark -- the same parsers and other tools will usually support
either. The same DTDs can support both. Python software that we develop
to support SGML will be used across many different projects. Whereas
software to support TIM will probably be used for the library reference,
ILU and nothing else. Great SGML support in Python would actually
attract new users. I know I've turned some people onto Python via SGML
and so have others...there are even books on SGML that discuss Python.
One day there may be a book on SGML processing in Python.

In short, I think that subscribing to standards is the Right Thing
unless they are flawed. IMO, nobody has yet made a serious case against
SGML in this regard. SGML addressed everything that everyone has
complained about ("too verbose", "no tools", "too many delimiter
chars").

If we want, we can also define an SGML subset simple enough to be parsed
with Python tools alone. We will just take XML and add back in the
shortcuts we like from SGML and fix up sgmllib.py to support them. There
is no reason that SGML should be harder to parse than TIM if we restrict
ourselves to a subset. Really the only thing that's very hard about
generic SGML is automatic tag omission. If we forgo that (as TIM does)
then SGML is not really hard to parse.
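
A minimal sketch of how small such a parser could be, assuming a subset
with explicit start and end tags only (no tag omission, attributes,
entities or comments).  It is written for a current Python 3 and does not
use sgmllib.py; the names TOKEN and parse are invented for illustration.

```python
import re

# Matches a start tag, an end tag, or a run of character data.
# Attributes, entities, comments and tag omission are deliberately left out.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9._-]*)>|([^<]+)")


def parse(text):
    """Parse fully tagged markup into nested (name, children) tuples."""
    root = ("#document", [])
    stack = [root]
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError("unexpected character at offset %d" % pos)
        closing, name, data = m.groups()
        if data is not None:
            # Character data belongs to the innermost open element.
            stack[-1][1].append(data)
        elif closing:
            # Every end tag must name the innermost open element.
            if len(stack) == 1 or stack[-1][0] != name:
                raise ValueError("unexpected end tag </%s>" % name)
            stack.pop()
        else:
            node = (name, [])
            stack[-1][1].append(node)
            stack.append(node)
        pos = m.end()
    if len(stack) != 1:
        raise ValueError("missing end tag for <%s>" % stack[-1][0])
    return root
```

With tag omission ruled out, a single stack is enough to recover the
element tree, which is the point being made about parsing difficulty.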

 Paul Prescod


From janssen@parc.xerox.com  Wed Nov 12 22:29:51 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 14:29:51 PST
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711121722.MAA01255@eric.CNRI.Reston.Va.US>
References: <346943C3.91CCF8FC@technologist.com> <199711121441.JAA00616@eric.CNRI.Reston.Va.US>
 <3469D7C5.F90F32F9@technologist.com>
 <199711121722.MAA01255@eric.CNRI.Reston.Va.US>
Message-ID: <woOWrTgB0KGW8pnPh7@holmes.parc.xerox.com>

Excerpts from ext.python: 12-Nov-97 Re: [DOC-SIG] Comparing SGM.. Guido
van Rossum@CNRI.Re (6473)

> TIM,
> which has only one magic character (@, which isn't used in Python)
> fits the bill -- it did one or two years ago when I looked into it, and
> it's only because of inertia (and a lot of other things that needed to
> happen sooner) that I haven't started using it.

Since then, I believe, the TIM front-end has been re-written in Python,
as well.

Bill


From janssen@parc.xerox.com  Wed Nov 12 22:33:20 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 14:33:20 PST
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <3469F09F.8DCBB65A@technologist.com>
References: <346943C3.91CCF8FC@technologist.com> <199711121441.JAA00616@eric.CNRI.Reston.Va.US>
 <3469D7C5.F90F32F9@technologist.com> <199711121722.MAA01255@eric.CNRI.Reston.Va.US>
 <3469F09F.8DCBB65A@technologist.com>
Message-ID: <woOWukkB0KGW0pnQA1@holmes.parc.xerox.com>

Excerpts from ext.python: 12-Nov-97 Re: [DOC-SIG] Comparing SGM.. Paul
Prescod@technologis (6620*)

> How does TIM enforce the proper organization of document macros. Will it
> complain if I put an @messageDef{} inside of an @argDef{}? Doesn't this
> type of enforcement seem useful in a situation where many people around
> the world are working on a document?

It has no extra sanity checks for this; it uses whatever the Texinfo
checks are -- which isn't great, because Texinfo was originally a
collection of TeX hacks.  TIM is clearly not as powerful as some
frameworks could be that were built in XML; however, it seems to be
powerful enough for a broad range of documents.
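
For comparison, the kind of check being asked about can be sketched in a
few lines of Python 3.  This is not something TIM provides; the content
model below is invented for illustration (only the names messageDef and
argDef come from the question above), and the tree is assumed to be nested
(name, children) tuples.

```python
# Hypothetical content model: which elements each element may directly contain.
ALLOWED_CHILDREN = {
    "messageDef": {"argDef", "para"},
    "argDef": {"para"},
    "para": set(),
}


def check_nesting(node, allowed=ALLOWED_CHILDREN, path=()):
    """Return a list of complaints for a (name, children) tree."""
    name, children = node
    here = path + (name,)
    problems = []
    for child in children:
        if isinstance(child, str):
            continue  # character data is always accepted in this sketch
        child_name = child[0]
        if child_name not in allowed.get(name, set()):
            problems.append(
                "%s not allowed inside %s" % (child_name, "/".join(here))
            )
        problems.extend(check_nesting(child, allowed, here))
    return problems
```

A DTD expresses the same table declaratively, so a validating parser can
report the error without any project-specific code.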

Bill


From janssen@parc.xerox.com  Wed Nov 12 22:35:36 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 14:35:36 PST
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <Pine.SUN.3.96.971112111954.1010A-100000@skivs.ski.org>
References: <Pine.SUN.3.96.971112111954.1010A-100000@skivs.ski.org>
Message-ID: <QoOWwsUB0KGW4pnQlG@holmes.parc.xerox.com>

Excerpts from ext.python: 12-Nov-97 Re: [DOC-SIG] Comparing SGM.. David
Ascher@skivs.ski.o (1841*)

> I certainly didn't try to get the @node system working well, since I don't
> think "real" info use was going to happen.

Yes; this is currently an unaddressed major pain inherited from Texinfo.
I'm planning on (someday) adding automatic node generation based on
@section, etc.

Bill


From janssen@parc.xerox.com  Wed Nov 12 22:37:42 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 14:37:42 PST
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <Pine.SUN.3.96.971112111954.1010A-100000@skivs.ski.org>
References: <Pine.SUN.3.96.971112111954.1010A-100000@skivs.ski.org>
Message-ID: <goOWyq0B0KGW8pnRFz@holmes.parc.xerox.com>

Interested parties might also look at
http://www.parc.xerox.com/http-ng/architectural-model.html, which is
auto-generated from TIM source and illustrates the use of pictures and
URLs in TIM.

Bill


From fredrik@pythonware.com  Wed Nov 12 22:40:17 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 12 Nov 1997 23:40:17 +0100
Subject: [DOC-SIG] Comparing SGML DTDs
Message-ID: <9711122240.AA15058@arnold.image.ivab.se>

> There is no reason that SGML should be harder to parse than TIM if
> we restrict ourselves to a subset. Really the only thing that's very hard
> about generic SGML is automatic tag omission. If we forgo that (as
> TIM does) then SGML is not really hard to parse.

I'd say the most important issue here is whether it's hard to write
or not.  I don't think so, but I haven't dug into any serious DTD
yet...

Has anyone looked at RTF<->SGML conversion?  Guess that could
allow people to use Frame or Word  (FWIW, I'm writing my book
in Word, with an RTF template created by Frame, and the resulting
files are converted to SGML by the ORA wizards... don't ask me
how they do it, though).

Or is the Emacs SGML mode good enough?

(On the other hand, I'm sure I'll have to pay for "voting against" the
benevolent dictator...  and I've had enough flames in my mailbox
today ;-)

arrogantly-and-simple-minded-ly y'rs /F


From janssen@parc.xerox.com  Wed Nov 12 22:44:19 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 14:44:19 PST
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711121931.OAA24333@fermi.eeel.nist.gov>
References: <3469D7C5.F90F32F9@technologist.com>
 <199711121713.MAA28481@lemur.magnet.com>
 <199711121931.OAA24333@fermi.eeel.nist.gov>
Message-ID: <AoOX53sB0KGWIpnRlo@holmes.parc.xerox.com>

Just a few notes...

Excerpts from ext.python: 12-Nov-97 Re: [DOC-SIG] Comparing SGM..
Michael McLay@smtp.erols (21835*)

>   3) TIM was written in Python:-) (only about 820 lines of code)

TIM itself is just a macro front end to Texinfo that provides generic
markup, picture support, and URL support.  That's what's written in
Python.

>   4) It looks like a markup that would be much easier to convert to
>      XML than Latex.  (My guess is that XML will eventually become the 
>      standard for WYSIWYG editors so the ugly tagging issue will go away.)

Yes.  The current Perl script timdif2html provides HTML output; a
variant of that, or another Python script, would be used to produce XML.

>   1) Heavy dependence on external programs which may not be on every platform
> 	MAKEINFO = '/usr/bin/makeinfo'
> 	TEX = '/usr/bin/tex'
> 	TEXINDEX = '/usr/bin/texindex'
> 	DVIPS = '/usr/bin/dvips'

`makeinfo' and (I believe) `texindex' are part of the GNU Texinfo
package.  TeX is freely available from Stanford (I think).  `dvips' is a
commercial product used to convert TeX DVI to Postscript -- I'm not sure
if there's a freely available version.
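
Whether a given platform has these programs can be checked up front rather
than discovered halfway through a build.  A small sketch in current
Python 3, assuming only that the tools are looked up on PATH instead of at
the fixed /usr/bin locations quoted above; the function name is made up.

```python
import shutil

# The four external programs named in the quoted configuration.
TOOLS = ("makeinfo", "tex", "texindex", "dvips")


def missing_tools(names=TOOLS, which=shutil.which):
    """Return the tool names that cannot be found on the current PATH."""
    return [name for name in names if which(name) is None]
```

Calling missing_tools() with no arguments returns an empty list when all
four programs are installed.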

>   3) Restricted set of tags, which makes it fairly hard to extend
>      (except by using macros.)

You are restricted to the base tag set supported by the Texinfo tools. 
However, arbitrary TIM renamings for these are available.

>   4) Mixes macro language with markup.  Is this really a problem?
>      The TIM macros seem to primarily be used to declare context names
>      which are then translatable to generic typographic codes.  This 
>      should make it easier to move the tagged text to meaningful XML tags.

That's correct.  At some point a TIM parser should be written which
provides a parse tree that preserves the generic markup.

Bill



From klm@python.org  Wed Nov 12 17:27:54 1997
From: klm@python.org (Ken Manheimer)
Date: Wed, 12 Nov 1997 12:27:54 -0500 (EST)
Subject: [DOC-SIG] [XML] Notes on the Tutorial's markup
In-Reply-To: <199711111835.NAA02624@lemur.magnet.com>
Message-ID: <Pine.GSO.3.96.971112122140.9046K-100000@glyph.cnri.reston.va.us>

[Sorry - python.org sendmail got wedged, and reissued copies of one of
andrew kuchling's messages, having this message's subject line.  It's
possible a few other messages got the same treatment, but it doesn't
look that way.  In any case, sorry about the noise...]

Ken Manheimer		klm@cnri.reston.va.us	    703 620-8990 x268
	    (orporation for National Research |nitiatives

		   # Thanks for joining the PSA!  #
		    # http://www.python.org/psa/ #



From janssen@parc.xerox.com  Wed Nov 12 21:47:51 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 13:47:51 PST
Subject: [DOC-SIG] comments on Python mapping
Message-ID: <97Nov12.144751pdt."404702"@watson.parc.xerox.com>

Martin, some comments on the Python mapping:

1)  How about putting a version # on it so that we can keep track of
which version we're looking at?

2)  Python keywords:  an underscore suffix is valid OMG IDL, so it shouldn't
be used to discriminate keywords.  The current ILU mapping uses an underscore
prefix on Python keywords.

3)  Long double should use something like the thing in ILU, *not* be mapped
(with loss of information) to a Python floating point, unless that floating
point can in fact represent an OMG IDL long double.

4)  char should be mapped as an integer, to be consistent with wchar.

5)  I found the "fixed" example a bit confusing, because of the use of
"a" as a parameter in the first bulleted item.  How about saying
"fixed<foo,bar>", or some such?

6)  Can't we just use "None" for NIL objects?

7)  I'd like the "create_request" operation to take the repository ID of the
interface somehow, possibly as a keyword parameter.  The CORBA notion of
just passing the method name is inherently broken.

8)  The POA inheritance-based impl described seems to break one of the
most cherished parts of ILU, the ability to use true classes directly in
an application.  Am I wrong?  Also, I'd suggest M__POA, instead of POA_M.

9)  Is it necessary to say, ``A class may implement multiple interfaces
only if those interfaces are in a strict inheritance relationship.''
Why do we care, so long as it implements the interfaces it claims to?
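
To make point 2 concrete, here is a sketch of the underscore-prefix rule
in current Python 3.  It only illustrates the rule as stated above; the
function name is invented, and the keyword list consulted is that of the
running interpreter, not the 1997 one.

```python
import keyword


def python_name(idl_name):
    """Map an IDL identifier to a Python one, prefixing keywords with "_"."""
    if keyword.iskeyword(idl_name):
        return "_" + idl_name
    return idl_name
```

A suffix rule would be ambiguous because "class_" is itself a valid OMG
IDL identifier, which is the objection raised in that point.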

Bill


From janssen@parc.xerox.com  Wed Nov 12 23:02:54 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 15:02:54 PST
Subject: [DOC-SIG] comments on Python mapping
In-Reply-To: <97Nov12.144751pdt."404702"@watson.parc.xerox.com>
References: <97Nov12.144751pdt."404702"@watson.parc.xerox.com>
Message-ID: <goOXKSQB0KGWEpnKJg@holmes.parc.xerox.com>

Ooops.  Serves me right trying to do two conversations at once.  I've
resent this to the DO-sig.

Bill


From papresco@technologist.com  Wed Nov 12 21:55:15 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 12 Nov 1997 16:55:15 -0500
Subject: [DOC-SIG] Documentation formats
References: <199711121940.OAA12494@calum.csclub.uwaterloo.ca> <346A0B58.320E@digicool.com>
Message-ID: <346A25C3.1106CAEE@technologist.com>

Jim Fulton wrote:
> Both Frame and Word let you create documents based on structural
> elements.

That's true, but those structural elements cannot nest and their
occurrences cannot be restricted (e.g. emph in emph).

> So you can define and preserve structure while working in a WYSIWYG
> environment.

I didn't dispute that. I pointed out that you must still think about
structure. In fact, you must think about it not just as much as you
would in an SGML editor, but more, because the editor gives you no help
in proper usage. This seems like the worst of all possible worlds to me.
More work, more thinking, no more freedom (which is what we usually
expect of a WYSIWYG editor).

The best of all possible worlds (for structured documentation) is a high
quality SGML-specific word processor. Frame+SGML comes close.

 Paul Prescod



From papresco@calum.csclub.uwaterloo.ca  Thu Nov 13 03:12:15 1997
From: papresco@calum.csclub.uwaterloo.ca (Paul Prescod)
Date: Wed, 12 Nov 1997 22:12:15 -0500 (EST)
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <9711122240.AA15058@arnold.image.ivab.se> from "Fredrik Lundh" at Nov 12, 97 11:40:17 pm
Message-ID: <199711130312.WAA01130@calum.csclub.uwaterloo.ca>

> I'd say the most important issue here is whether it's hard to write
> or not.  I don't think so, but I haven't dug into any serious DTD
> yet...

Don't try to pin that 'serious DTD' rap on SGML. :) If TIM is as structured
as anyone wants, then SGML can be as unstructured as TIM. It can also be
as tag minimized as TIM.

> Has anyone looked at RTF<->SGML conversion?  Guess that could
> allow people to use Frame or Word  (FWIW, I'm writing my book
> in Word, with an RTF template created by Frame, and the resulting
> files are converted to SGML by the ORA wizards... don't ask me
> how they do it, though).

You can convert RTF to SGML if you have no interest in taking advantage 
of SGML's greatest feature. :) I've been trying to make this point but have 
obviously not been having much success. SGML's greatest feature (which,
admittedly it shares with TIM) is that it allows you to develop new 
abstractions and tag them. RTF (and Word) is an abstraction killer. Its most 
sophisticated abstraction is the "paragraph". If we are going to start
from RTF then there is very little value in using SGML at any stage in the
process.

SGML's second greatest feature (which it does not share with TIM) is that it
is an international (and soon W3C) standard with hundreds of tools and
tens of thousands of users and sites. I guess it is vaguely possible that
one of those tools will be useful with our "RTF-Demented SGML" but it isn't
likely. SGML is designed to be a source format, not a converted-to format.

But if you don't want to type tags at all, there is Frame+SGML and even
(uck) "SGML Author for Word". You still have to think about *structure* but
you don't have to type tags. In my experience, however, changing styles from
one to the other is no easier than typing tags. You can bind your styles
to a hotkey, but you can do the same with tags in Emacs.

> Or is the Emacs SGML mode good enough?

I think so. My only concern is that I have found it slow with huge DTDs on
slow machines. At home (P100, 32MB) it is quite good. With reasonably sized
DTDs (e.g. DocBook subsets) it is also quite good.

 Paul Prescod



From fdrake@acm.org  Thu Nov 13 15:13:34 1997
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 13 Nov 1997 10:13:34 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <346A2D47.718C60DC@technologist.com>
References: <3469D7C5.F90F32F9@technologist.com>
 <199711121713.MAA28481@lemur.magnet.com>
 <199711121931.OAA24333@fermi.eeel.nist.gov>
 <346A2D47.718C60DC@technologist.com>
Message-ID: <199711131513.KAA24509@weyr.cnri.reston.va.us>


Paul Prescod writes:
 > In short, I think that subscribing to standards is the Right Thing

  I concur.

 > If we want, we can also define an SGML subset simple enough to be parsed
 > with Python tools alone. We will just take XML and add back in the
 > shortcuts we like from SGML and fix up sgmllib.py to support them. There
 > is no reason that SGML should be harder to parse than TIM if we restrict
 > ourselves to a subset. Really the only thing that's very hard about
 > generic SGML is automatic tag omission. If we forgo that (as TIM does)
 > then SGML is not really hard to parse.

  The SGMLParser class from Grail is much better about SGML shortcuts
in "strict" mode (the non-strict mode is intended to support Web-style 
HTML, i.e., invalid, and is not interesting for us).  It supports
<emph/null/ end tags, <emph>empty</> end tags, and I think <>empty
start tags</> are tolerably o.k., but I'm less convinced I understand
the correct behavior, and haven't had any time to really validate it
against SP.
  I remember reading something that indicated the null end tags should
be discouraged.  Can you fill us in on the SGML community's current
attitude on this?  Does this only apply in the presence of SGML
editors like FM+SGML or should the avoidance also apply to manually
applied & revised markup?
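
  For readers who have not met these shortcuts, the empty end tag is the
easy one to normalize: </> closes whichever element is currently open.  A
sketch in current Python 3, not taken from the Grail SGMLParser; it
ignores attributes, null end tags and empty start tags, and the function
name is invented.

```python
import re

# Matches <name>, </name>, </> and <> (the last is rejected below).
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9._-]*)?>")


def expand_empty_end_tags(text):
    """Rewrite each empty end tag </> as the full end tag it stands for."""
    stack = []   # names of the currently open elements
    out = []
    pos = 0
    for m in TAG.finditer(text):
        out.append(text[pos:m.start()])
        closing, name = m.group(1), m.group(2)
        if not closing:
            if name is None:
                raise ValueError("empty start tags <> are not handled here")
            stack.append(name)
            out.append(m.group(0))
        else:
            if not stack:
                raise ValueError("end tag with no open element")
            open_name = stack.pop()
            if name is not None and name != open_name:
                raise ValueError("</%s> does not close <%s>" % (name, open_name))
            out.append("</%s>" % open_name)
        pos = m.end()
    out.append(text[pos:])
    return "".join(out)
```

Markup normalized this way contains only explicit start and end tags, so
it can be handed to a parser that knows nothing about the shortcut.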


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From fdrake@acm.org  Thu Nov 13 16:14:31 1997
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 13 Nov 1997 11:14:31 -0500
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
In-Reply-To: <3469B8F9.32838B0@technologist.com>
References: <199711111835.NAA02624@lemur.magnet.com>
 <3468D0BE.69C2052A@technologist.com>
 <199711112204.RAA22015@weyr.cnri.reston.va.us>
 <3469B8F9.32838B0@technologist.com>
Message-ID: <199711131614.LAA24610@weyr.cnri.reston.va.us>


Paul Prescod writes:
 > I think we should tackle the tutorial first, as it will require less
 > custom markup and programming.

  As Guido pointed out, there are non-technical reasons not to mess
with this one.  I think the primary advantages of SGML/XML come about
when dealing with something interesting like the Library Reference,
which offers a need for heavily structured data and substantial
sections of prose.

 > >   Regarding processing, I'd have no problems using SP to do this; a
 > > Python interface to the generic interface would not be difficult to
 > > create, if a little tedious.  I'm willing to do this, but it would be
 > > evenings / weekends, and only if it'll get used.
 > 
 > If you do this, I would strongly encourage you to skip the Generic
 > Interface and move to the more powerful Grove Interface. On Windows, this

  When I last looked at the interfaces, jade and the grove interface
were new.  I'll take a look at the grove interface when I get a
chance; it does sound like it would be more useful.

 > But anyhow, cool as the grove interface is, it isn't clear yet that we
 > need any interface for this particular project. Hopefully we can depend
 > on the existing tools (Jade and existing stylesheets). Once we want to
 > go beyond their capabilities, we must decide whether to extend Python to

  For static output there should be no need to use anything other than 
jade & a collection of stylesheets.  I was thinking more along the
lines of run-time access to the library reference, which could be used 
to support interactive help systems and the like.  It may be that
static generation of a more Python-friendly format would be
appropriate; perhaps a shelf as a repository for small documentation
objects that could then be rendered for display at runtime based on
user preferences or context.
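
  A rough sketch of the shelf idea, using the standard shelve module in
current Python 3.  The record layout (a "signature" and a "summary" per
entry) and the function names are assumptions made up for this example;
nothing here reflects an existing tool.

```python
import shelve


def build_shelf(path, entries):
    """Store {name: {"signature": ..., "summary": ...}} records in a shelf."""
    with shelve.open(path) as shelf:
        for name, record in entries.items():
            shelf[name] = record


def render(path, name, verbose=False):
    """Look up one entry at run time and format it for display."""
    with shelve.open(path) as shelf:
        record = shelf[name]
    if verbose:
        return "%s\n    %s" % (record["signature"], record["summary"])
    return record["signature"]
```

The shelf would be generated once from the marked-up source, and the
verbose flag stands in for whatever user preference or context decides
how much of an entry is shown.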


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From fdrake@acm.org  Thu Nov 13 16:21:18 1997
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 13 Nov 1997 11:21:18 -0500
Subject: SV: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <3469E3F8.8C4BAD63@technologist.com>
References: <9711121628.AA05017@arnold.image.ivab.se>
 <3469E3F8.8C4BAD63@technologist.com>
Message-ID: <199711131621.LAA24617@weyr.cnri.reston.va.us>

Paul Prescod writes:
 > From (La)TeX? Not really. Parsing LaTeX is not only difficult, but
 > relatively undefined. There is no one language called "LaTeX" it is
 > really a family of languages more or less defined by the Lamport book.

  Python at one point (fairly recently) included a script that
converted LaTeX from the library reference to texinfo, so it's
actually not too painful as measured in development time, but would
need to be manually fixed up afterwards.  I'd expect this to be a
one-time-only conversion, but a tool is probably the right way to do
it.  The library reference is mostly pretty well structured.

 > And lots of LaTeX documents mix generic structures and formatting
 > interchangeably. Finally, there is no easy way to figure out how to
 > handle macros. Should they be expanded to their TeX primitives (uck)? If
 > not, how do we know how to represent user defined macros in the target
 > DTD? If you make a "foobar" macro, what do I do with it in SGML?

  In the Python documentation, macro definitions are largely reserved
for the mystyle.sty file, and everything else uses those macros.  So
it's much less ad hoc than general LaTeX.  I don't expect a general
tool for TeX->SGML can be developed in a finite time.  I certainly have
no intention to try!


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From fdrake@acm.org  Thu Nov 13 16:29:24 1997
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 13 Nov 1997 11:29:24 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711121722.MAA01255@eric.CNRI.Reston.Va.US>
References: <346943C3.91CCF8FC@technologist.com>
 <199711121441.JAA00616@eric.CNRI.Reston.Va.US>
 <3469D7C5.F90F32F9@technologist.com>
 <199711121722.MAA01255@eric.CNRI.Reston.Va.US>
Message-ID: <199711131629.LAA24629@weyr.cnri.reston.va.us>


Guido van Rossum writes:
 > I just don't like the fact that SGML makes characters that occur
 > frequently in Python source code like "<" and "/" special.  Also the

  As Paul pointed out, this is pretty bogus.  The only sort of
conflict I can see which could cause legal Python code to be
interpreted as an SGML or XML construct would be something like this:

	ok = ok&flag; print ok
	       ^^^^^^

  This is legal Python, but ugly as hell, and I don't think I've ever
seen the "&" operator used without spaces.  So I'm not concerned.
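
  If anyone wants to check existing examples for this, a scan is easy to
write.  A sketch in current Python 3; the pattern is only an
approximation of what a parser treats as a general entity reference, and
the names are invented.

```python
import re

# "&", a name, then ";" with no spaces in between.
ENTITY_REF = re.compile(r"&[A-Za-z][A-Za-z0-9.-]*;")


def accidental_entities(source):
    """Return substrings of Python source that look like entity references."""
    return ENTITY_REF.findall(source)
```

Run over the example above it reports "&flag;", and it reports nothing
once the operator is written with spaces around it.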

 > fact that SGML parsers that support the full syntax are either costly
 > in money or in resources (few sites that I know have an SGML parser

  Again, as Paul pointed out, SP and jade are free and substantially
cross platform as long as a solid C++ compiler is available.  (gcc
counts.)  If you're worried about having to install this stuff at
CNRI, know that jade 1.0 has been installed for a while.  ;-)  I think 
1.1 is out; if so I'll upgrade our installation.
  The tools are not unreasonable, they're just not written in Python.


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From janssen@parc.xerox.com  Thu Nov 13 21:14:41 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Thu, 13 Nov 1997 13:14:41 PST
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711130312.WAA01130@calum.csclub.uwaterloo.ca>
References: <199711130312.WAA01130@calum.csclub.uwaterloo.ca>
Message-ID: <ooOqr10B0KGWApnIsG@holmes.parc.xerox.com>

Excerpts from ext.python: 12-Nov-97 Re: [DOC-SIG] Comparing SGM.. Paul
Prescod@calum.csclu (2299*)

> RTF (and Word) is an abstraction killer. Its most 
> sophisticated abstraction is the "paragraph". If we are going to start
> from RTF then there is very little value in using SGML at any stage in the
> process.

I completely agree.  RTF is the wrong direction.  As is HTML or Texinfo,
for the same reason -- too concrete.  TIM and XML are attempts at
removing that problem from Texinfo and HTML, respectively.

Bill


From amk@magnet.com  Thu Nov 13 21:30:49 1997
From: amk@magnet.com (Andrew Kuchling)
Date: Thu, 13 Nov 1997 16:30:49 -0500 (EST)
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <ooOqr10B0KGWApnIsG@holmes.parc.xerox.com> (message from Bill
 Janssen on Thu, 13 Nov 1997 13:14:41 PST)
Message-ID: <199711132130.QAA11204@lemur.magnet.com>

All this discussion of SGML/XML is moot if we can't fix the basic
problem of SGML's special characters conflicting with Python's.  Is
that problem solvable?  Paul, you mentioned that it's possible to use
characters other than <> in SGML, but it's not commonly done.  Why
not?  What about XML?  Would it be possible to write a DTD that looked
like TIM, instead of looking like HTML?


	Andrew Kuchling
	amk@magnet.com
	http://starship.skyport.net/crew/amk/


From papresco@technologist.com  Fri Nov 14 01:08:29 1997
From: papresco@technologist.com (Paul Prescod)
Date: Thu, 13 Nov 1997 20:08:29 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
References: <199711132130.QAA11204@lemur.magnet.com>
Message-ID: <346BA48D.401C61AF@technologist.com>

Andrew Kuchling wrote:
> 
> All this discussion of SGML/XML is moot if we can't fix the basic
> problem of SGML's special characters conflicting with Python's.  Is
> that problem solvable?  Paul, you mentioned that it's possible to use
> characters other than <> in SGML, but it's not commonly done.  Why
> not?  What about XML?  Would it be possible to write a DTD that looked
> like TIM, instead of looking like HTML?

There is no basic problem. SGML/XML has two basic markup-starting
delimiters, "&" and "<". TIM has three, it seems: "@", "{" and "}". As far
as I can tell, the TIM characters appear in Python docs just as often as
the SGML ones. Both languages have techniques for suppressing markup
recognition ("escaping").
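
A minimal sketch of the escaping point in Python. It assumes the
standard-library helpers xml.sax.saxutils.escape and unescape, which are
not mentioned in this thread, and the sample string is made up:

```python
from xml.sax.saxutils import escape, unescape

# A made-up line of Python documentation that contains both
# SGML/XML markup-starting delimiters, "&" and "<".
source = "if a < b & mask: print(a)"

# escape() replaces "&", "<" and ">" with entity references so that
# a parser does not treat them as the start of markup.
escaped = escape(source)
print(escaped)

# unescape() reverses the substitution, so nothing is lost.
restored = unescape(escaped)
print(restored == source)
```

The same idea applies to TIM: its delimiters would need their own
escape convention wherever they occur in running text.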

 Paul Prescod

From papresco@technologist.com  Fri Nov 14 02:15:13 1997
From: papresco@technologist.com (Paul Prescod)
Date: Thu, 13 Nov 1997 21:15:13 -0500
Subject: [DOC-SIG] SGML for Python Documentation
Message-ID: <346BB431.52CA5300@technologist.com>

I spent the evening on a miniature version of the process of moving the
Python library docs into SGML. I think this will demonstrate SGML's
suitability to the task. The actual document I moved over is a subset of
the TIM document posted yesterday (using ILU and Python). In particular
I:

 * Whipped up a mini-DTD that blended DocBook and the ILU special
abstractions ("metavar", "language", etc.)
 * Encoded some of the TIM docs in my new SGML-based language
 * Wrote a quickie mapping from the ILU abstractions to built-in DocBook
elements
 * Ran the result through Norm Walsh's DocBook DSSSL stylesheets for
print and HTML
 * Loaded the resulting RTF file into Word
 * Made a PostScript file (warning -- Word PS files are "funny" -- I
wouldn't copy them directly to a PS printer if I were you)

All of this was done with free tools except for making the PostScript
file. Theoretically I could have also done that with Microsoft's free
"RTF Viewer" or with TeX. All of the source and result files are at:
http://itrc.uwaterloo.ca/~papresco/ilusgml.zip. All you need to run the
stylesheets is Jade, from http://www.jclark.com/jade. It compiles easily
on every platform I have tried.

I think that the resulting PostScript and HTML files are beautiful. If
there is some aspect that is not beautiful, we can upgrade the
stylesheets quite easily. The complete source is in the zip file on my
website.

I include the contents of the SGML in this email because people on the
list were curious what SGML/DocBook looks like. I could have actually
made the SGML much smaller, but I stuck to a very simple subset of SGML.
I suspect it is the exact subset that sgmllib.py can parse, so it is
already "Python compatible". Were I to use completely idiomatic SGML, I
could reduce the markup by quite a bit, but then sgmllib.py would not be
able to parse it anymore.

I believe that I have thus refuted the arguments that SGML is verbose,
too hard to parse, too expensive and otherwise not appropriate for the
task.

 Paul Prescod

<!DOCTYPE BOOK SYSTEM "ilubook.dtd">
<book>
    <title>Using ILU with Python:  A Tutorial</>

  <bookinfo>
    <author>Bill Janssen</author>
    <copyright><year>1995 </> <holder>Xerox Corporation</></>
  </bookinfo>

<chapter>
    <title>Introduction</>

<para>This tutorial will show how to use the <system/ILU/
system with the programming language <language/python/, both as a way of
developing software libraries, and as a way of building distributed
systems. In an extended example, we'll build an <system/ILU/ module that
implements a simple four-function calculator, capable of addition,
subtraction, multiplication, and division.  It will signal an error if
the user attempts to divide by zero.  The example demonstrates how to
specify the interface for the module; how to implement the module in
<language/python/; how to use that implementation as a simple library;
how to
provide the module as a remote service;  how to write a client of that
remote service; and how to use subtyping to extend an object type and
provide different versions of a module.  We'll also demonstrate how to
use <language/OMG IDL/ with <system/ILU/, and discuss the notion of
network
garbage collection.</>

<para>Each of the programs and files referenced in this tutorial is
available as a complete program in a separate appendix to this
document; parts of programs are quoted in the text of the tutorial.</>
</chapter>

<chapter>
    <title>Specifying the Interface</>
<para>Our first task is to specify more exactly what it is we're
trying to provide.  A typical four-function calculator lets a user
enter a value, then press an operation key, either +, -, /, or *, then
enter another number, then press = to actually have the operation
happen.  There's usually a CLEAR button to press to reset the state of
the calculator.  We want to provide something like that.</>

<para>We'll recast this a bit more formally as the <FirstTerm/interface/
of our module; that is, the way the module will
appear to clients of its functionality.  The interface
typically describes a number of function calls which can be
made into the module, listing their arguments and return types,
and describing their effects.  <system/ILU/ uses
<FirstTerm/object-oriented/
interfaces, in which the functions in the interface are grouped
into sets, each of which applies to an <FirstTerm/object type/.  These
functions are called <FirstTerm/methods/.</>

<para>For example, we can think of the calculator as an object type,
with several methods:  Add, Subtract, Multiply, Divide, Clear, etc.
<system/ILU/ provides a standard notation to write this down with,
called <FirstTerm/ISL/ (which stands for ``Interface Specification
Language'').
<language/ISL/ is a declarative language which can be processed
by computer programs.  It allows you to define object types (with
methods),
other non-object types, exceptions, and constants.</>

<para>The interface for our calculator would be written in ISL as:</>

<programlisting>
INTERFACE Tutorial;

EXCEPTION DivideByZero;

TYPE Calculator = OBJECT
  METHODS
    SetValue (v : REAL),
    GetValue () : REAL,
    Add (v : REAL),
    Subtract (v : REAL),
    Multiply (v : REAL),
    Divide (v : REAL) RAISES DivideByZero END
  END;
</programlisting>

<para>This defines an interface <isl/Tutorial/, an exception
<isl/DivideByZero/, and an object type <isl/Calculator/.  Let's
consider these one by one. The interface, <isl/Tutorial/, is a way of
grouping a number of type and exception definitions.  This is
important to prevent collisions between names defined by one group and
names defined by another group. For example, suppose two different
people had defined two different object types, with different methods,
but both called <isl/Calculator/! It would be impossible to tell which
calculator was meant.  By defining the <isl/Calculator/ object type
within the scope of the <isl/Tutorial/ interface, this confusion can
be avoided.</>

<para>The exception, <isl/DivideByZero/, is a formal name for a
particular
kind of error, division by zero.  Exceptions in <system/ILU/ can specify
an <FirstTerm/exception-value type/, as well, which means that real
errors
of that kind have a value of the exception-value type associated with
them.
This allows the error to contain useful information about why it might
have come about.  However, <isl/DivideByZero/ is a simple exception,
and has no exception-value type defined.  We should note that the full
name of this exception is <isl/Tutorial.DivideByZero/, but for this
tutorial we'll simply call our exceptions and types by their short
name.</>

<para>The object type, <isl/Calculator/ (again, really
<isl/Tutorial.Calculator/),
is a set of six methods.  Two of those methods, <isl/SetValue/ and
<isl/GetValue/, allow us to enter a number into the calculator object,
and ``read'' the number.  Note that <isl/SetValue/ takes a single
argument, <metavar/v/, of type <type/REAL/.  <type/REAL/ is a
built-in <language/ISL/ type, denoting a 64-bit floating point number.
Built-in <language/ISL/ types are things like <type/INTEGER/ (32-bit
signed integer), <type/BYTE/ (8-bit unsigned byte), and <type/CHARACTER/
(16-bit Unicode character).  Other more complicated types are
built up from these simple types using <language/ISL/ <FirstTerm/type
constructors/,
such as <isl/SEQUENCE OF/, <isl/RECORD/, or <isl/ARRAY OF/.</>

<para>Note also that <isl/SetValue/ does not return a value,
and neither do <isl/Add/, <isl/Subtract/, <isl/Multiply/,
or <isl/Divide/.  Rather,
when you want to see what the current value of the calculator
is, you must call <isl/GetValue/, a method which has no arguments,
but which returns a <type/REAL/ value, which is the value of the
calculator object.  This is an arbitrary decision on our part;
we could have written the interface differently, say as</>

<programlisting>
TYPE NotOurCalculator = OBJECT
  METHODS
    SetValue () : REAL,
    Add (v : REAL) : REAL,
    Subtract (v : REAL) : REAL,
    Multiply (v : REAL) : REAL,
    Divide (v : REAL) : REAL RAISES DivideByZero END
  END;
</programlisting>

<para>-- but we didn't.</>

<Para>Our list of methods on <type/Calculator/ is bracketed by the two
keywords <isl/METHODS/ and <isl/END/, and the elements are separated
from each other by commas.  This is pretty standard in <language/ISL/:
elements of a list are separated by commas; the keyword <isl/END/ is
used when an explicit list-end marker is needed (but not when it's not
necessary, as in the list of arguments to a method); the list often
begins with some keyword, like <isl/METHODS/. The <FirstTerm/raises
clause/
(the list of exceptions which a method might raise) of the method
<isl/Divide/ provides another example of a list, this time with only
one member, introduced by the keyword <isl/RAISES/.</>

<para>Another standard feature of <language/ISL/ is separating a name,
like <isl/v/, from a type, like <type/REAL/, with a colon character.
For example, constants are defined with syntax like</>

<programlisting>
CONSTANT Zero : INTEGER = 0;
</>

<para>Definitions, of interface, types, constants, and exceptions, are
terminated with a semicolon.
</>

<para>We should expand our interface a bit by adding more documentation
on what our methods actually do.  We can do this with the
<FirstTerm/docstring/
feature of <language/ISL/, which allows the user to add arbitrary
text to object type definitions and method definitions.  Using
this, we can write</>

<programlisting>
INTERFACE Tutorial;

EXCEPTION DivideByZero
  "this error is signalled if the client of the Calculator calls
the Divide method with a value of 0";

TYPE Calculator = OBJECT
  COLLECTIBLE
  DOCUMENTATION "4-function calculator"
  METHODS
    SetValue (v : REAL) "Set the value of the calculator to `v'",
    GetValue () : REAL  "Return the value of the calculator",
    Add (v : REAL)      "Adds `v' to the calculator's value",
    Subtract (v : REAL) "Subtracts `v' from the calculator's value",
    Multiply (v : REAL) "Multiplies the calculator's value by `v'",
    Divide (v : REAL) RAISES DivideByZero END
      "Divides the calculator's value by `v'"
  END;
</programlisting>

<para>Note that we can use the <isl/DOCUMENTATION/ keyword on object
types to add documentation about the object type, and can simply add
documentation strings to the end of exception and method definitions.
These docstrings are passed on to the <language/python/ docstring
system, so
that they are available at runtime from <language/python/. 
Documentation
strings cannot currently be used for non-object types.</>
</chapter>
</book>
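
For readers who want to see the interface above in Python terms, here is
a rough, hand-written sketch. It is not the binding that ILU would
generate; the class layout and the private _value attribute are
assumptions made purely for illustration, and only the method and
exception names come from the ISL text:

```python
class DivideByZero(Exception):
    """Signalled if the client calls Divide with a value of 0."""


class Calculator:
    """4-function calculator (illustrative stand-in, not an ILU binding)."""

    def __init__(self):
        # Hypothetical internal state; the ISL interface does not name it.
        self._value = 0.0

    def SetValue(self, v):
        """Set the value of the calculator to v."""
        self._value = float(v)

    def GetValue(self):
        """Return the value of the calculator."""
        return self._value

    def Add(self, v):
        """Add v to the calculator's value."""
        self._value += v

    def Subtract(self, v):
        """Subtract v from the calculator's value."""
        self._value -= v

    def Multiply(self, v):
        """Multiply the calculator's value by v."""
        self._value *= v

    def Divide(self, v):
        """Divide the calculator's value by v; raise DivideByZero if v is 0."""
        if v == 0:
            raise DivideByZero()
        self._value /= v


calc = Calculator()
calc.SetValue(10.0)
calc.Add(5.0)
calc.Divide(3.0)
print(calc.GetValue())
```

As in the ISL version, only GetValue returns anything; the arithmetic
methods change the stored value and return nothing.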

From janssen@parc.xerox.com  Fri Nov 14 02:29:41 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Thu, 13 Nov 1997 18:29:41 PST
Subject: [DOC-SIG] SGML for Python Documentation
In-Reply-To: <346BB431.52CA5300@technologist.com>
References: <346BB431.52CA5300@technologist.com>
Message-ID: <soOvSJ0B0KGWQpnTxi@holmes.parc.xerox.com>

Excerpts from ext.python: 13-Nov-97 [DOC-SIG] SGML for Python D.. Paul
Prescod@technologis (10796*)

>  * Ran the result through Norm Walsh's DocBook DSSSL stylesheets for
> print and HTML
>  * Loaded the resulting RTF file into Word
>  * Made a PostScript file (warning -- Word PS files are "funny" -- I
> wouldn't copy them directly to a PS printer if I were you)

Yes, it's this kind of somewhat-defective tool chain that makes me
mistrust most current SGML-based solutions that I've seen.

My requirements:

    - must be able to produce good plain text, Postscript or PDF, and
    HTML versions of any document encoded in any new documentation
    format;
    - must be able to produce those automatically from the input, using
    a script, not through any tools that require user interaction;
    - tool chain must run on both UNIX and Windows

Unless the SGML tool chain satisfies those requirements, I'd keep looking.

Bill

From papresco@technologist.com  Fri Nov 14 02:48:01 1997
From: papresco@technologist.com (Paul Prescod)
Date: Thu, 13 Nov 1997 21:48:01 -0500
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
References: <199711111835.NAA02624@lemur.magnet.com>
 <3468D0BE.69C2052A@technologist.com>
 <199711112204.RAA22015@weyr.cnri.reston.va.us>
 <3469B8F9.32838B0@technologist.com> <199711131614.LAA24610@weyr.cnri.reston.va.us>
Message-ID: <346BBBE1.EC4BEB6B@technologist.com>

Fred L. Drake wrote:
>   For static output there should be no need to use anything other than
> jade & a collection of stylesheets.  I was thinking more along the
> lines of run-time access to the library reference, which could be used
> to support interactive help systems and the like.  It may be that
> static generation of a more Python-friendly format would be
> appropriate; perhaps a shelf as a repository for small documentation
> objects that could then be rendered for display at runtime based on
> user preferences or context.

That would be cool. It would also be pretty trivial on Windows right
now. Getting a grove from SP takes like two lines of code:

import groveoa
groveoa.GroveBuilder().parse( "libref-ch3.sgm" )

I only mention this because one of my memes is that things should be as
easy on Unix as under Windows -- in other words there should be good
tools for making ILU bindings and people should write ILU bindings
instead of language-specific bindings whenever possible.

 Paul Prescod


From papresco@technologist.com  Fri Nov 14 02:55:37 1997
From: papresco@technologist.com (Paul Prescod)
Date: Thu, 13 Nov 1997 21:55:37 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
References: <3469D7C5.F90F32F9@technologist.com>
 <199711121713.MAA28481@lemur.magnet.com>
 <199711121931.OAA24333@fermi.eeel.nist.gov>
 <346A2D47.718C60DC@technologist.com> <199711131513.KAA24509@weyr.cnri.reston.va.us>
Message-ID: <346BBDA9.997D6D62@technologist.com>

Fred L. Drake wrote:
>   The SGMLParser class from Grail is much better about SGML shortcuts
> in "strict" mode (the non-strict mode is intended to support Web-style
> HTML, i.e., invalid, and is not interesting for us).  It supports
> <emph/null/ end tags, <emph>empty</> end tags, and I think <>empty
> start tags</> are tolerably o.k., but I'm less convinced I understand
> the correct behavior, and haven't had any time to really validate it
> against SP.

I think it's easy to understand, but it's not a feature I use. We would
probably avoid it for our subset.

>   I remember reading something that indicated the null end tags should
> be discouraged.  Can you fill us in on the SGML community's current
> attitude on this?  Does this only apply in the presence of SGML
> editors like FM+SGML or should the avoidance also apply to manually
> applied & revised markup?

I don't know of a problem with null end tags, but I very rarely use SGML
tools other than nsgmls, jade and emacs. Still, as you have pointed out,
they are very easy to implement. Erik Naggum was always the most picky
about proper markup and I don't remember him saying anything against
NET. I guess you could run into a problem with <abc/1/2/, but you'd
probably notice.
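
To make the NET case concrete, here is a small Python sketch that
rewrites null end tags into full tags with a regular expression. It is a
deliberate simplification and not how SP or the Grail SGMLParser work;
the names NET_PATTERN and expand_net are invented for this example:

```python
import re

# <name/content/ : the first "/" closes the start tag and the next "/"
# acts as the (null) end tag.  Content may not contain "/" or "<".
NET_PATTERN = re.compile(r"<([A-Za-z][A-Za-z0-9.-]*)/([^/<]*)/")


def expand_net(text):
    """Rewrite <name/content/ as <name>content</name>."""
    return NET_PATTERN.sub(r"<\1>\2</\1>", text)


print(expand_net("uses <system/ILU/ with <language/python/"))

# The problem case: the element ends at the first "/" after the tag,
# so the "2/" is left outside it.
print(expand_net("<abc/1/2/"))
```

The second call shows why <abc/1/2/ is a trap: the element closes after
"1", which is rarely what the author meant.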

 Paul Prescod


From fredrik@pythonware.com  Fri Nov 14 12:00:17 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Fri, 14 Nov 1997 13:00:17 +0100
Subject: [DOC-SIG] SGML for Python Documentation
Message-ID: <01bcf0f4$df736760$6fadb4c1@fl-pc.image.ivab.se>

> * Made a PostScript file (warning -- Word PS files are "funny" -- I
>wouldn't copy them directly to a PS printer if I were you)

Depends on how you configure your printer, really.  Here's a
trick I'm using to get perfectly portable files:

1. install the QMS 810 driver (a good ole PostScript level 1 printer)
2. rename it as "Plain PostScript" or something, and route it
   to the FILE device.

The files you get will print on virtually everything, especially
if you avoid non-standard fonts.

> I believe that I have thus refuted the arguments that SGML is verbose,
> too hard to parse, too expensive and otherwise not appropriate for the
> task.

I'm convinced! ;-)

Thanks /F

PS. To please users without access to either Word or Frame, is there
some way to go from RTF -> PostScript out there?  Perhaps via groff?


From papresco@technologist.com  Fri Nov 14 14:27:42 1997
From: papresco@technologist.com (Paul Prescod)
Date: Fri, 14 Nov 1997 09:27:42 -0500
Subject: [DOC-SIG] SGML for Python Documentation
References: <01bcf0f4$df736760$6fadb4c1@fl-pc.image.ivab.se>
Message-ID: <346C5FDE.2C485F3B@technologist.com>

Fredrik Lundh wrote:
> 
> PS. To please users without access to either Word or Frame, is there
> some way to go from RTF -> PostScript out there?  Perhaps via groff?

RTF->Postscript is easy on Windows, with or without Word. I don't know
of a way to do it on other platforms.

I do know you can use the same DocBook stylesheet to go
SGML--[Jade]-->TeX--[TeX]-->dvi--[dvips]-->postscript, but you have to
install some TeX macro packages.

Oh yeah, and you can also go SGML--[Jade]-->TeX--[Texpdf]-->PDF
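
A rough sketch of scripting the PostScript route in Python. The command
names and flags are assumptions about a typical installation (jade with
its TeX backend, a jadetex command provided by the TeX macro package,
and dvips); the function names and file names are hypothetical:

```python
import subprocess


def postscript_commands(sgml_file, stylesheet):
    """Return the command lines for SGML -> TeX -> dvi -> PostScript."""
    base = sgml_file.rsplit(".", 1)[0]
    return [
        # Jade applies the DSSSL stylesheet and writes TeX output.
        ["jade", "-t", "tex", "-d", stylesheet, "-o", base + ".tex", sgml_file],
        # Assumed name of the TeX format that understands Jade's output.
        ["jadetex", base + ".tex"],
        # dvips turns the resulting dvi file into PostScript.
        ["dvips", "-o", base + ".ps", base + ".dvi"],
    ]


def build_postscript(sgml_file, stylesheet):
    """Run each step in order and stop at the first failure."""
    for command in postscript_commands(sgml_file, stylesheet):
        subprocess.run(command, check=True)


for command in postscript_commands("tutorial.sgm", "docbook.dsl"):
    print(" ".join(command))
```

Nothing here needs user interaction, so the same script could be run
unattended wherever the three tools are installed.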

 Paul Prescod

From papresco@technologist.com  Fri Nov 14 14:40:15 1997
From: papresco@technologist.com (Paul Prescod)
Date: Fri, 14 Nov 1997 09:40:15 -0500
Subject: [DOC-SIG] SGML for Python Documentation
References: <346BB431.52CA5300@technologist.com> <soOvSJ0B0KGWQpnTxi@holmes.parc.xerox.com>
Message-ID: <346C62CF.1F2E3061@technologist.com>

Bill Janssen wrote:
> Yes, it's this kind of somewhat-defective tool chain that makes me
> mistrust most current SGML-based solutions that I've seen.
> 
> My requirements:
> 
>     - must be able to produce good plain text, Postscript or PDF, and
>     HTML versions of any document encoded in any new documentation
>     format;
>     - must be able to produce those automatically from the input, using
>     a script, not through any tools that require user interaction;
>     - tool chain must run on both UNIX and Windows
> 
> Unless the SGML tool chain satisfies those requirements, I'd keep looking.

The SGML tool chain can satisfy these requirements by going through TeX
instead of RTF. On my machine RTF is simpler because I don't have a
modern TeX installed. If TIM's output formats look especially
interesting for some particular project (texinfo for Emacs?) we could go
through sgmllib.py->TIM as well.

 Paul Prescod

From Edward Welbourne <eddyw@lsl.co.uk>  Fri Nov 14 16:49:57 1997
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Fri, 14 Nov 1997 16:49:57 GMT
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711132130.QAA11204@lemur.magnet.com>
References: <ooOqr10B0KGWApnIsG@holmes.parc.xerox.com>
 <199711132130.QAA11204@lemur.magnet.com>
Message-ID: <9711141649.AA29880@lslr6g.lsl.co.uk>

> All this discussion of SGML/XML is moot if we can't fix the basic
> problem of SGML's special characters conflicting with Python's.

In the main pages of documentation, the manuals &c., we don't have to
worry about all this because we can use fancy tools to do our document
generation for us, so the fact that some folk find typing raw SGML
tedious doesn't raise itself as an objection.  Use of Frame+SGML,
(modern) emacs SGML mode or one of the better WYSI-more-or-less-WYG SGML
editors will suffice.  The only place where there's a problem with SGML
is in the labour of writing and the ugliness of reading doc strings.

In a doc string, the only special characters are \, % and the quote
character used to delimit the string (right ?).  Furthermore,
doc-strings are triple-quoted, so quote characters only matter if they
happen in triplicate: which doesn't happen in HTML.  I don't see HTML
using \.  And % only presents a problem if we want to use doc-strings as
format strings: I don't.

The fact that <EM>outside strings</EM> python has special readings of <,
>, &, / and ; is not an issue: we only intend to write our
XML/TIM/... inside doc strings or in files which aren't python code.
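
A quick illustration, with a made-up function, of markup characters
sitting inside a triple-quoted doc string without any escaping:

```python
def add(a, b):
    """Return <EM>a + b</EM>; the characters <, >, &, / and ; are plain text.

    A literal % is harmless too, e.g. 100% safe, unless the doc string
    is later used as a format string.
    """
    return a + b


# The doc string comes back exactly as written, markup included.
doc = add.__doc__
print(doc)
```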

I see no conflict in need of a fix here.
Enlighten me if I've missed the point.

	Eddy.

From guido@CNRI.Reston.Va.US  Sat Nov 15 17:50:03 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Sat, 15 Nov 1997 12:50:03 -0500
Subject: [DOC-SIG] Doc strings debate
Message-ID: <199711151750.MAA19469@eric.CNRI.Reston.Va.US>

There seem to be a number of different questions here; I'll try to
discuss them separately.  This message pertains to doc strings.  A
separate message will discuss the library reference manual.

As I see it, it's up to each project to decide what to do about doc
strings.  Some choices are:

- no doc strings

- terse text only doc strings, for quick on-line reminders only

- longer text only doc strings, mostly for reference by the programmer
  who is reading the source code (i.e. doc strings are just syntactic
  sugar for comments)

- longer doc strings with some markup (e.g. stext or a very limited
  HTML subset) that could be used to generate on-line documentation and
  printed documentation

- full "literate programming" doc strings, with elaborate markup; a
  preprocessor may be needed to extract Python source with smaller doc
  strings
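
As a small illustration of the terse, text-only style and of the kind of
tool that can display it, here is a sketch in Python. The function and
class are made up, and inspect.getdoc is just one standard-library way
to read a doc string:

```python
import inspect


def area(width, height):
    """Return the area of a width-by-height rectangle."""
    return width * height


class Shape:
    """Base class for drawable shapes."""

    def draw(self):
        """Render the shape (a no-op in this sketch)."""


def summarize(obj):
    """Return the first line of obj's doc string, or '' if it has none."""
    doc = inspect.getdoc(obj)
    return doc.splitlines()[0] if doc else ""


# A very small "browser": one summary line per object.
for obj in (area, Shape, Shape.draw):
    print(obj.__name__, "-", summarize(obj))
```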

The choice depends on the goals of the project as well as on the
availability of tools.  There seem to be some tools but they all seem
to have some shortcomings. The debate on what the tools should do is
endless; one of the reasons is that the project goals differ.

In my own style of working, I prefer terse or longer text-only doc
strings, since I am not interested in generating printed documentation
from the doc strings.  This means I don't have much use for tools (the
only tool that makes sense would be some kind of class browser that
has good support for displaying doc strings).  I don't think that the
availability of other tools would affect my style of working much; but
I realize it's a personal choice and I don't want to impose it on the
Python community as the only way to use doc strings.  However, I will
continue to use this style for the standard Python library.  Given
that I am doing most of the work here I think I have that prerogative.

I hope tools become available that give the authors of other projects
more choice -- as it stands, if gendoc doesn't do what you want,
you're basically forced to write your own tools.  Some of the existing
"hacks" deserve to be refined into more generally useful tools.  The
doc sig can contribute here by discussing the requirements for a
number of different tools, for projects with different ambitions.

--Guido van Rossum (home page: http://www.python.org/~guido/)

From Daniel.Larsson@vasteras.mail.telia.com  Sat Nov 15 18:47:58 1997
From: Daniel.Larsson@vasteras.mail.telia.com (Daniel Larsson)
Date: Sat, 15 Nov 1997 19:47:58 +0100
Subject: [DOC-SIG] Doc strings debate
Message-ID: <01bcf1f7$0e5ca520$25bc43c3@Daniel.telia.com>


-----Original Message-----
From: Guido van Rossum <guido@CNRI.Reston.Va.US>
To: doc-sig@python.org <doc-sig@python.org>
Date: 15 November 1997 19:29
Subject: [DOC-SIG] Doc strings debate


>There seem to be a number of different questions here; I'll try to
>discuss them separately.  This message pertains to doc strings.  A
>separate message will discuss the library reference manual.
>
>As I see it, it's up to each project to decide what to do about doc
>strings.  Some choices are:
>
>- no doc strings
>
>- terse text only doc strings, for quick on-line reminders only
>
>- longer text only doc strings, mostly for reference by the programmer
>  who is reading the source code (i.e. doc strings are just syntactic
>  sugar for comments)
>
>- longer doc strings with some markup (e.g. stext or a very limited
>  HTML subset) that could be used to generate on-line documentation and
>  printed documentation
>
>- full "literate programming" doc strings, with elaborate markup; a
>  preprocessor may be needed to extract Python source with smaller doc
>  strings
>
>The choice depends on the goals of the project as well as on the
>availability of tools.  There seem to be some tools but they all seem
>to have some shortcomings. The debate on what the tools should do is
>endless; one of the reasons is that the project goals differ.
>
>In my own style of working, I prefer terse or longer text-only doc
>strings, since I am not interested in generating printed documentation
>from the doc strings.  This means I don't have much use for tools (the
>only tool that makes sense would be some kind of class browser that
>has good support for displaying doc strings).  I don't think that the
>availability of other tools would affect my style of working much; but
>I realize it's a personal choice and I don't want to impose it on the
>Python community as the only way to use doc strings.  However, I will
>continue to use this style for the standard Python library.  Given
>that I am doing most of the work here I think I have that prerogative.
>
>I hope tools become available that give the authors of other projects
>more choice -- as it stands, if gendoc doesn't do what you want,
>you're basically forced to write your own tools.

There is actually one other action you might want to consider if gendoc
doesn't fit your needs: propose what it should do, and perhaps we can evolve
gendoc towards that.

What we could do is figure out an API for extracting information out of
doc strings, which we can then use for different kinds of tools, such as
doc-generating tools (I don't like writing reference manuals separately, so
I want tools that generate printed documents), class browsers, etc.
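
A minimal sketch of what such an extraction API might look like, written in
present-day Python and assuming the standard inspect module is an acceptable
basis. The names extract_docs and Sample are hypothetical; nothing here is
part of gendoc or of any API agreed on by the doc-sig.

```python
import inspect


def extract_docs(obj):
    """Return a dict mapping public member names to cleaned doc strings.

    Hypothetical helper for illustration only: it is not gendoc's API.
    """
    docs = {}
    for name, member in inspect.getmembers(obj):
        if name.startswith("_"):
            continue  # skip private and special members
        if inspect.isclass(member) or inspect.isroutine(member):
            doc = inspect.getdoc(member)  # strips common indentation
            if doc:
                docs[name] = doc
    return docs


class Sample:
    """A tiny class used only to demonstrate the extractor."""

    def greet(self):
        """Return a greeting."""
        return "hello"

    def undocumented(self):
        return None
```

A doc generator and a class browser could both be built on the dictionary
this returns, each choosing its own presentation.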

>Some of the existing
>"hacks" deserve to be refined into more generally useful tools.  The
>doc sig can contribute here by discussing the requirements for a
>number of different tools, for projects with different ambitions.
>
>--Guido van Rossum (home page: http://www.python.org/~guido/)

-- 
Daniel Larsson, Software Engineer
ABB Industrial Systems AB; LKD
Västerås, Sweden
http://starship.skyport.net/crew/danilo
Daniel.Larsson@vasteras.mail.telia.com



From guido@CNRI.Reston.Va.US  Sat Nov 15 20:30:49 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Sat, 15 Nov 1997 15:30:49 -0500
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>

Some SGML extremists have started lobbying for SGML or XML, which has
brought up quite a religious debate (maybe started by my remark that
SGML is not fit for humans to type :-).  I feel that we're not getting
anywhere unless we face some of the facts, so here's a reality check
followed by some opinions.

I hope I've moved the doc string discussion to a separate thread.  I
don't think the library manual should be tied in with doc strings in
any way, so it can be discussed separately.

The first problem is that the library manual is currently done in
LaTeX.  I would guess that 99% of the markup is structural -- the only
place where physical markup is used in a significant way is in the
use of 'strong' and 'emphasis' to mean a number of different things
(e.g. warnings, notes, implementation restrictions, etc.).  There are
a few places where physical markup is used to overcome some formatting
weirdnesses, but I've always tried to keep these to a minimum.

Any proposed solution that doesn't take into account how to convert
the existing library manual is a trivial reject.

I see a number of problems with the use of LaTeX -- but the fact that
"it's not SGML" is not one of them.  Perhaps the biggest problem is
that LaTeX and TeX are losing popularity.  TeX may still be the
standard for respectable and somewhat conservative publications like
the Astrophysical Journal, but most publishers nowadays are just as
happy to accept MS Word or other popular wordprocessors.

I would say that the one remaining reason to use TeX or LaTeX for some
groups is that TeX does mathematics better than anything else; however
that's not relevant for the Python community.  From experience, I
would say that LaTeX does computer documentation rather poorly
(witness the many hacks in the myformat.sty file), and I haven't even
dealt properly with optional or keyword arguments, let alone classes
and methods and inheritance.

The decreasing popularity of LaTeX is a problem because it means that
potential contributors are discouraged -- many simply don't know
LaTeX, and even those that do know it may not have access to an
implementation any more.  Installing LaTeX is a major undertaking, and
one is less and less likely to find installations that already have it
installed, outside central Unix servers at academic institutions.  (I
did a web search on LaTeX for Windows 95; one of the pages,
http://www2.eece.maine.edu/~dprice/Latex/latex.htm, which seems to
have a lot of useful info, leaves me with the impression that one
needs to be *very* motivated to bring this to a good end.  It ends
with the admonition "Good Luck! You're gonna need it...")

Another problem, caused by this, is that there are few LaTeX hackers
around who can help with the creation of new macros (e.g. for keyword
arguments).

On the plus side, there is truth in the old saying "don't fix it if it
ain't broken."  I personally have access to a working LaTeX
installation, the latex2html converter produces adequate HTML (I still
need to work on the translation for a few of the environments
introduced by myformat.sty, but that shouldn't be too hard), and I
haven't heard too many complaints yet from people who would like to
contribute documentation but don't know LaTeX -- they pick it up
pretty easily from the template I provide.

*** The real problem seems to be how to get people to contribute at all! ***

If using SGML or XML would make more people eager to contribute, I
might be convinced; but somehow I doubt it.  At the moment, both the
learning curve and the installation effort for SGML or XML tools
appear to be still steeper than for LaTeX.

There has been some debate on SGML vs. XML.  It seems that SGML can be
made easy to type, at the cost of making it much harder to parse
correctly.  XML appears to be designed mostly as a transport format
(one page with XML info I found made the explicit point that being
easy to type was *not* a design criterion).  Anyway, once a decision
to use either is made, conversion between the two is probably easy,
especially since XML is a true subset of SGML.

Finally, TIM has been brought up.  It's a bit easier to type and more
pleasing to my eye than shorthand SGML (e.g. SGML <title>whatever</>
vs. TIM @title{whatever}) and it's a lot easier to parse.  It uses
structural markup and has a simple macro language to add new
structural elements.  This makes it relatively easy to convert to
SGML, as long as the TIM authors adhere to reasonable structuring
constraints (i.e. don't abuse constructs for different purposes).
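
As a rough illustration of how mechanical the TIM-to-SGML direction could be,
here is a small sketch in present-day Python. It assumes that TIM elements
always take the form @name{text}, which is the only form shown above, and it
ignores escaping and macro definitions entirely. The function name
tim_to_sgml is hypothetical and this is not TIM's actual parser.

```python
import re

# Assumed TIM form: @name{text}, where text contains no further braces.
TIM_ELEMENT = re.compile(r"@(\w+)\{([^{}]*)\}")


def tim_to_sgml(text):
    """Rewrite each @name{text} as <name>text</name>, innermost first."""
    previous = None
    # Repeat until nothing changes so that nested elements are converted
    # from the inside out.
    while previous != text:
        previous = text
        text = TIM_ELEMENT.sub(r"<\1>\2</\1>", text)
    return text
```

For example, tim_to_sgml("@title{whatever}") gives "<title>whatever</title>".
A real back end would also need to map TIM names onto the element names of
the chosen DTD.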

TIM's primary weakness at the moment seems to be its toolchain, which
starts good (the parser is written in Python) but quickly runs into
problems on non-Unix platforms: for HTML generation it uses a Perl4
script, and for PostScript it goes through texinfo and hence through
tex.  For Unix, TIM's toolchain is perfect, however, and I like the
simplicity of its approach -- it should be simple enough to rewrite
the TIM-to-HTML converter in Python (maybe using HTMLgen?).

For Windows, it just *may* be possible that Word 97 will actually
parse the HTML generated by TIM so as to make it possible to generate
Postscript on Windows platforms with commonly available tools; in any
case, a prospective TIM author on a Windows platform would only need
the HTML generating part of the toolchain for on-screen previewing.

I'd love this discussion to come to an end.  I think that we would be
in good shape with TIM, *if* we solve two outstanding problems.  One
should be easy: rewriting the TIM-to-HTML tool in Python.

The other one is much hairier: conversion of the existing LaTeX source
to TIM!  This needs to be a high quality conversion, e.g. ideally it
should maintain comments and other aspects of source formatting
(like line breaks) that don't affect the generated pages but do
affect the human reader of the source, because the output of the
conversion will be edited manually henceforth.  On the other hand,
this only needs to be done once, so a small amount of manual tweaking
is acceptable.  The old conversion script (partparse.py) which I still
have lying around somewhere is probably able to do this with some
small changes (I sure hope those changes are small, because this is
one horrible piece of code... good for a one-off job though).

Those who want SGML or XML should be able to convert TIM to their
favorite DTD using a different back end for the TIM front end.  I
would love specific feedback on the structural capabilities of TIM;
ideally, TIM should map directly onto a real SGML DTD as far as
document structure is concerned.  However, I don't want to compromise
TIM to make it possible to parse it with a generic SGML scanner; the
efforts to move HTML towards strict SGML scanner compatibility have
taught me a valuable lesson.

One final note: I looked at Perl's POD (Plain Old Documentation) for a
few seconds.  It's more limited than TIM and uses physical markup
(e.g. B<words in bold>), but has one feature that I like: a block of
indented text offset by blank lines (I believe) is automatically
interpreted as a code sample block (verbatim in LaTeX terms,
@codeexample in TIM).  This makes POD source remarkably readable.  I
presume that it would be trivial to add this to the TIM front-end.  (I
particularly like this idea because it's the same convention that I
used in the Python FAQ wizard. :-)
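
The indented-block convention described above can be sketched in a few lines
of present-day Python. The rule assumed here is the one stated in the
paragraph: a block set off by blank lines counts as a code sample when every
line in it is indented. The function name split_blocks is hypothetical, and
this is neither POD's nor TIM's actual front-end code.

```python
def split_blocks(text):
    """Split text on blank lines and tag each block as 'code' or 'text'.

    A block counts as code when every one of its lines starts with a
    space or a tab.
    """
    blocks = []
    current = []
    # The extra empty string flushes the final block.
    for line in text.splitlines() + [""]:
        if line.strip():
            current.append(line)
        elif current:
            is_code = all(ln[0] in " \t" for ln in current)
            blocks.append(("code" if is_code else "text", "\n".join(current)))
            current = []
    return blocks
```

One limitation of this sketch: a code sample that itself contains a blank
line is split into two code blocks, which a real front end would probably
want to merge again.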

--Guido van Rossum (home page: http://www.python.org/~guido/)


From papresco@technologist.com  Sun Nov 16 01:06:30 1997
From: papresco@technologist.com (Paul Prescod)
Date: Sat, 15 Nov 1997 20:06:30 -0500
Subject: [DOC-SIG] Library reference manual debate
References: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>
Message-ID: <346E4716.E06699EC@technologist.com>

Guido van Rossum wrote:
> 
> Any proposed solution that doesn't take into account how to convert
> the existing library manual is a trivial reject.

Although this is definitely important, I don't see how that would argue
in favour of one solution or another. The output format of such a
process seems easy to handle -- it's the input that will give us a
headache.
 
> I see a number of problems with the use of LaTeX -- but the fact that
> "it's not SGML" is not one of them.  Perhaps the biggest problem is
> that LaTeX and TeX are losing popularity.  

? Popularity is a problem for TeX, but TIM, which has no popularity to
speak of and is *built on top of* TeX does not have the same problem?

> TeX may still be the
> standard for respectable and somewhat conservative publications like
> the Astrophysical Journal, but most publishers nowadays are just as
> happy to accept MS Word or other popular wordprocessors.

Right, and SGML can create those trivially. TIM cannot. I am currently
writing a book where I will give them a MIF file which they will
beautify for me.
 
> (I
> did a web search on LaTeX for Windows 95; one of the pages,
> http://www2.eece.maine.edu/~dprice/Latex/latex.htm, which seems to
> have a lot of useful info, leaves me with the impression that one
> needs to be *very* motivated to bring this to a good end.  It ends
> with the admonition "Good Luck! You're gonna need it...")

I'm not advocating that we stay with LaTeX, but I found MIKTEX to be
quite easy to install. It doesn't seem to come with all of the latest
and greatest LaTeX add-ons, but otherwise it is quite good and comes
with a good installation program.

> At the moment, both the
> learning curve and the installation effort for SGML or XML tools
> appear to be still steeper than for LaTeX.

I don't think there is anything confusing... especially if you are using
Windows. Here are the steps:

Jade installation:

1. Download Jade binary or source
2. Unzip
3. Type "make" if you downloaded the source.
4. Copy binaries to some directory in your path

Python-doc package installation:

1. Download pythondoc zip file.
2. Unzip

The python-doc package contains stylesheets, sample chapters, chapter
template and DTD.

Python-doc chapter creation:

cp chapter-template.sgm my-chapter.sgm
vi my-chapter.sgm

Jade use:

jade -t tex -d style/pythondoc2print my-chapter.sgm
tex "&jadetex" my-chapter.tex
dvips my-chapter.dvi

OR

jade -t rtf -d style/pythondoc2print my-chapter.sgm
winword my-chapter.rtf

OR

jade -t mif -d style/pythondoc2print my-chapter.sgm
frame my-chapter.mif

OR

jade -t sgml -d style/pythondoc2html my-chapter.sgm
lynx my-chapter.html
netscape my-chapter.html

As far as truth in advertising, I should point out that if your LaTeX is
out of date, you will need to download a few style and font files here
and there to use the JadeTex package. That's why things are not QUITE as
easy on Unix as on Windows (but isn't that always the case?).

> There has been some debate on SGML vs. XML.  It seems that SGML can be
> made easy to type, at the cost of making it much harder to parse
> correctly.  

I don't see any evidence of that. If we stick to the conventions
supported by sgmllib.py, then SGML is as easy to parse as TIM, and we
already have the parser implemented. That parser is only 400 lines of
code (including test harness) and seems to handle the file I emailed a
few days ago perfectly.

> Finally, TIM has been brought up.  It's a bit easier to type and more
> pleasing to my eye than shorthand SGML (e.g. SGML <title>whatever</>
> vs. TIM @title{whatever}) and it's a lot easier to parse.  

I dunno, I consider <title/whatever/ easier to type than
@title{whatever} (look at the keyboard positions), but I acknowledge
that TIM will allow a little bit less markup in some cases. And I
certainly can't argue with pleasing to your eye. Everyone's eyes are
different...

> TIM's primary weakness at the moment seems to be its toolchain, which
> starts good (the parser is written in Python) but quickly runs into
> problems on non-Unix platforms: for HTML generation it uses a Perl4
> script, and for PostScript it goes through texinfo and hence through
> tex.  For Unix, TIM's toolchain is perfect, however, and I like the
> simplicity of its approach -- it should be simple enough to rewrite
> the TIM-to-HTML converter in Python (maybe using HTMLgen?).

So we would rather rewrite this than use the existing HTML
converters for DocBook? Despite the fact that the DocBook converters
already handle print properly on Windows (RTF/TeX/Frame) and
Unix (TeX/Frame)?
 
> For Windows, it just *may* be possible that Word 97 will actually
> parse the HTML generated by TIM so as to make it possible to generate
> Postscript on Windows platforms with commonly available tools; in any
> case, a prospective TIM author on a Windows platform would only need
> the HTML generating part of the toolchain for on-screen previewing.

Printing HTML documents seems like a solution of last resort. 
 
> However, I don't want to compromise
> TIM to make it possible to parse it with a generic SGML scanner; the
> efforts to move HTML towards strict SGML scanner compatibility have
> taught me a valuable lesson.

I'm not sure what you mean by this. HTML has been SGML compatible since
version 2.0. As far as I know, it lost no useful features in the
changeover.

I can't dispute the point that SGML doesn't look nice to your eyes.
Beauty is in the eye of the beholder. That might be enough to tip the
balance in TIM's favour, but I still feel it is my responsibility to
point out that SGML is not hard to type, need not be hard to parse and
does not require difficult or expensive tools to use. If those
complaints are going to be factors in the decision then they should be
substantiated.
 
 Paul Prescod


From cjr@bound.xs4all.nl  Sun Nov 16 01:33:39 1997
From: cjr@bound.xs4all.nl (Case Roole)
Date: Sun, 16 Nov 1997 01:33:39 +0000 (WET)
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711152030.PAA19793@eric.CNRI.Reston.Va.US> from "Guido van Rossum" at Nov 15, 97 03:30:49 pm
Message-ID: <199711160133.BAA22071@axiom.bound.xs4all.nl>


Guido wrote:

> One final note: I looked at Perl's POD (Plain Old Documentation) for a
> few seconds.  It's more limited than TIM and uses physical markup
> (e.g. B<words in bold>), but has one feature that I like: a block of
> indented text offset by blank lines (I believe) is automatically
> interpreted as a code sample block (verbatim in LaTeX terms,
> @codeexample in TIM).  This makes POD source remarkably readable.  I
> presume that it would be trivial to add this to the TIM front-end.  (I
> particularly like this idea because it's the same convention that I
> used in the Python FAQ wizard. :-)

Just wondering: for HTML generation I use "megatags", non-HTML tags in
documents that are otherwise HTML. An SGML parser (derived from the one
in sgmllib) lets pure HTML pass, but fetches and processes the data 
embedded in these megatags (example below). This is decidedly not pure 
SGML or pure HTML, but the *code is extremely readable*. Is this what 
everybody is using the SGMLParser for, is it irrelevant for the matter 
discussed here, or is this a good idea?

cjr

------------------------------------------------------------
Example:

For my curriculum vitae I wanted a list of labels and values. It seemed
best to me to use a table with labels represented as table headers and 
values as table descriptions. Both are aligned to the center of the table
which is a default that can be changed by setting the attributes
'left_align' and 'right_align'. Attributes of the table can be set by using,
e.g. 'table_border'. (This approach is entirely derived from Pmw's naming 
conventions.)

Here is what I write:

<ENTRYFORM>
naam                  =  Cornelis Jan Roele
email                 =  <MAILTAG email="cjr@bound.xs4all.nl">
geboortedatum         =  9 januari 1967
geboorteplaats        =  Doetinchem,
straat en nummer      =  Spitsbergenstraat 67
postcode/woonplaats   =  1013 CL &nbsp; AMSTERDAM
telefoon              =  020-684.62.95
</ENTRYFORM>

NB the mailtag is another "megatag".
And this is what the parser generates:

<!--   Spin.EntryForm MegaTag -->
<TABLE border="0" align="center">
  <TR>
    <TH align="right">naam:  </TH>
    <TD align="left">Cornelis Jan Roele</TD>
  </TR>
  <TR>
    <TH align="right">email:  </TH>
    <TD align="left">&lt;<A HREF="mailto:cjr@bound.xs4all.nl">cjr@bound.xs4all.nl</A>&gt;</TD>
  </TR>
  <TR>
    <TH align="right">geboortedatum:  </TH>
    <TD align="left">9 januari 1967</TD>
  </TR>
  <TR>
    <TH align="right">geboorteplaats:  </TH>
    <TD align="left">Doetinchem,</TD>
  </TR>
  <TR>
    <TH align="right">straat en nummer:  </TH>
    <TD align="left">Spitsbergenstraat 67</TD>
  </TR>
  <TR>
    <TH align="right">postcode/woonplaats:  </TH>
    <TD align="left">1013 CL &nbsp; AMSTERDAM</TD>
  </TR>
  <TR>
    <TH align="right">telefoon:  </TH>
    <TD align="left">020-684.62.95</TD>
  </TR>
</TABLE>
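
The expansion shown above can be approximated in present-day Python. This
sketch swaps in the standard html.parser module for sgmllib, which current
Python no longer ships, and the names MegaTagParser and expand are
hypothetical. It is an illustration of the idea, not the Spin code that
produced the output above, and it hard-codes the default alignments rather
than reading 'left_align' or 'table_border' attributes.

```python
from html.parser import HTMLParser


class MegaTagParser(HTMLParser):
    """Pass ordinary HTML through; expand <ENTRYFORM> and <MAILTAG>."""

    def __init__(self):
        # Keep entities such as &nbsp; unconverted so they can be re-emitted.
        super().__init__(convert_charrefs=False)
        self.out = []      # pieces of the generated document
        self.form = None   # buffered ENTRYFORM body, or None outside one

    def emit(self, text):
        # Inside an ENTRYFORM we buffer; elsewhere we write straight through.
        (self.out if self.form is None else self.form).append(text)

    def handle_starttag(self, tag, attrs):
        if tag == "entryform":
            self.form = []
        elif tag == "mailtag":
            addr = dict(attrs).get("email", "")
            self.emit('&lt;<A HREF="mailto:%s">%s</A>&gt;' % (addr, addr))
        else:
            self.emit(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "entryform" and self.form is not None:
            body, self.form = "".join(self.form), None
            self.emit(self.render_table(body))
        else:
            self.emit("</%s>" % tag)

    def handle_data(self, data):
        self.emit(data)

    def handle_entityref(self, name):
        self.emit("&%s;" % name)

    def handle_charref(self, name):
        self.emit("&#%s;" % name)

    def handle_comment(self, data):
        self.emit("<!--%s-->" % data)

    @staticmethod
    def render_table(body):
        # Each "label = value" line becomes one table row.
        rows = ['<TABLE border="0" align="center">']
        for line in body.splitlines():
            if "=" not in line:
                continue
            label, value = line.split("=", 1)
            rows.append("  <TR>")
            rows.append('    <TH align="right">%s:  </TH>' % label.strip())
            rows.append('    <TD align="left">%s</TD>' % value.strip())
            rows.append("  </TR>")
        rows.append("</TABLE>")
        return "\n".join(rows)


def expand(source):
    """Return source with megatags expanded and other markup passed through."""
    parser = MegaTagParser()
    parser.feed(source)
    parser.close()
    return "".join(parser.out)
```

Note that html.parser reports tag names in lower case, so passed-through end
tags come out as, for example, </p> even when the source had </P>.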


-- 
Case Roole <cjr@bound.xs4all.nl>



From fdrake@acm.org  Sun Nov 16 04:51:32 1997
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Sat, 15 Nov 1997 23:51:32 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>
References: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>
Message-ID: <199711160451.XAA29095@weyr.cnri.reston.va.us>


Guido van Rossum writes:
 > Some SGML extremists have started lobbying for SGML or XML, which has

  Ouch!

 > The first problem is that the library manual is currently done in
 > LaTeX.  I would guess that 99% of the markup is structural -- the only

  I don't see that this will be too difficult a conversion, actually,
primarily due to the care with which most of the markup was
performed.

 > I see a number of problems with the use of LaTeX -- but the fact that
 > "it's not SGML" is not one of them.  Perhaps the biggest problem is

  Agreed; that's not a relevant issue.

 > would say that LaTeX does computer documentation rather poorly
 > (witness the many hacks in the myformat.sty file), and I haven't even
 > dealt properly with optional or keyword arguments, let alone classes

  LaTeX isn't designed for it; it's supposed to be much more general
than that.  However, as myformat.sty shows, the appropriate markup can 
be created.  With a bit more work, the remaining semantic constructs
can be created, but they may contain a moderate amount of formatting
within the macros.  I think this is the most substantial problem with
a TeX-based solution; myformat.sty has to be completely rewritten to
change the output; the markup is not defined separately from the
processing (formatting) steps.

 > The decreasing popularity of LaTeX is a problem because it means that
 > potential contributors are discouraged -- many simply don't know

  This may have some pertinence, but the relevance is small; it's
still better than most if not all of the more popular systems.
SGML/XML is better due to the separation of semantic relations from
the processing specifications.

 > LaTeX, and even those that do know it may not have access to an
 > implementation any more.  Installing LaTeX is a major undertaking, and

  Agreed, outside some Linux distributions, LaTeX is probably a pain
unless you're willing to spring for a commercial version.  There are a 
few for PCs, but I've not followed them.

 > Another problem, caused by this, is that there are few LaTeX hackers
 > around who can help with the creation of new macros (e.g. for keyword
 > arguments).

  If that's the problem and no superior solution can be agreed upon, I
can help with that.

 > ain't broken."  I personally have access to a working LaTeX
 > installation, the latex2html converter produces adequate HTML (I still

  It's out of date and should be updated, but does work for the Python 
documentation.  I have found very reasonable LaTeX2e documents that
can't be formatted correctly using the CNRI installation.

 > There has been some debate on SGML vs. XML.  It seems that SGML can be
 > made easy to type, at the cost of making it much harder to parse
 > correctly.  XML appears to be designed mostly as a transport format

  I've seen nothing to indicate that SGML is more difficult to parse
correctly in any reasonable interpretation, and if the examples Paul
presented on shortcuts are what you're referring to, the work's already
been done in Grail's SGMLParser module.

 > structural markup and has a simple macro language to add new
 > structural elements.  This makes it relatively easy to convert to

  The last time I looked at it, the only "structural" elements which
could be added were alternate names for the character-styling controls 
(bold, italic, etc.) and not for larger structural components.  Has
this changed?  It might have.

 > Those who want SGML or XML should be able to convert TIM to their

  This is not an interesting issue; if you choose to use TIM for the
authoritative version, there should be no reason to convert to
SGML/XML except to have access to powerful formatting tools (DSSSL and 
jade in particular).
  It sounds as if you're convinced that there should be either no
change or conversion to TIM.  If this is the case and you won't
consider other alternatives seriously, please just say so.  I think
all of us advocating other approaches are doing so in good conscience
and not just to waste bandwidth.  If we're wasting time, we can find
more enjoyable ways to do so.
  If, on the other hand, there's a real possibility of switching at
all, the advocates / experts for each format / technology / whatever
you want to call it should start to develop sample processes and
converted segments of the Python documentation to allow all of us to
see the behavior of each in practice.  I don't think many of us have
actually tried to use all of the techniques being discussed.  This may 
be more productive than just shouting "Use FOOsplatter!" at each
other.  But we do need to know that we're not wasting our time at it
before we bother.


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From papresco@technologist.com  Sun Nov 16 05:11:57 1997
From: papresco@technologist.com (Paul Prescod)
Date: Sun, 16 Nov 1997 00:11:57 -0500
Subject: [DOC-SIG] Library reference manual debate
References: <199711160133.BAA22071@axiom.bound.xs4all.nl>
Message-ID: <346E809D.D48F5E4F@technologist.com>

Case Roole wrote:
> Just wondering: for HTML generation I use "megatags", non-HTML tags in
> documents that are otherwise HTML. An SGML parser (derived from the one
> in sgmllib) lets pure HTML pass, but fetches and processes the data
> embedded in these megatags (example below). This is decidedly not pure
> SGML or pure HTML, but the *code is extremely readable*. Is this what
> everybody is using the SGMLParser for, is it irrelevant for the matter
> discussed here, or is this a good idea?

SGML was explicitly designed to allow this and has features to do this
sort of thing for you. A full SGML parser can interpret your "=" symbol
and even your newlines as tags. This is very convenient for typists. I
think that for novice users it will probably be quite confusing,
however, because people are used to all SGML markup being in clearly
marked tags, not in ordinary-looking characters. Also to parse something
like this in Python we would either have to complicate sgmllib or
introduce another layer of parsing.
 
 Paul Prescod


From cjr@bound.xs4all.nl  Sun Nov 16 10:08:13 1997
From: cjr@bound.xs4all.nl (Case Roole)
Date: Sun, 16 Nov 1997 10:08:13 +0000 (WET)
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <346E809D.D48F5E4F@technologist.com> from "Paul Prescod" at Nov 16, 97 00:11:57 am
Message-ID: <199711161008.KAA02434@axiom.bound.xs4all.nl>

Paul Prescod wrote:

> 
> Case Roole wrote:
> > Just wondering: for HTML generation I use "megatags", non-HTML tags in
> > documents that are otherwise HTML. An SGML parser (derived from the one
> > in sgmllib) lets pure HTML pass, but fetches and processes the data
> > embedded in these megatags (example below). This is decidedly not pure
> > SGML or pure HTML, but the *code is extremely readable*. Is this what
> > everybody is using the SGMLParser for, is it irrelevant for the matter
> > discussed here, or is this a good idea?
> 
> SGML was explicitly designed to allow this and has features to do this
> sort of thing for you. A full SGML parser can interpret your "=" symbol
> and even your newlines as tags. This is very convenient for typists. I
> think that for novice users it will probably be quite confusing,
> however, because people are used to all SGML markup being in clearly
> marked tags, not in ordinary-looking characters. Also to parse something
> like this in Python we would either have to complicate sgmllib or
> introduce another layer of parsing.

In short:
1. What's wrong with introducing another layer of parsing?
2. I have reason to doubt that a mixed format will be confusing.

In detail:

ad 1.)  I haven't looked at SGML for years and have forgotten much of what I
   once learned. I'll take your word for it that a full SGML parser can
   interpret all kinds of tokens not embedded in '<'..'>'. If we were using a
   Python WYSIWYG editor based on a DTD for these docs, the proposed mixed
   format would require a complication of sgmllib.
   I have the impression that the consensus is that we are to use a 
   non-wysiwyg editor for the time being, so this doesn't apply. 
   Thus I end up with the other option, which, fortunately, is what I was 
   thinking of in the first place: introduce another layer of parsing.
   I can think of no other penalty for this than that the computer works
   a little longer when doing the one-time job of converting the dirty-
   but-readable manual format into something standard tools can further
   process.

ad 2.)   I doubt the validity of your assessment of the degree to which a
   mixed format is "confusing".

  "This is very convenient for typists." -- Indeed, that's what Guido was 
  referring to when he started this thread.

  "I think that for novice users it will probably be quite confusing,
  however, because people are used to all SGML markup being in clearly
  marked tags, not in ordinary-looking characters." -- Given that we are
  talking about the Python documentation here, I don't see who those
  "novice users" are, who are "used to all SGML markup being in clearly
  marked tags". 
  We all get along with not closing Python statements with ';' and not
  enclosing blocks in '{'..'}'. I guess that those who write the
  documentation will catch up quickly if the documentation is to be written
  in some mixed format that looks good, even if it would take an advanced
  SGML parser to interpret it in a single step.

cjr

-- 
Case Roole <cjr@bound.xs4all.nl>



From richardf@redbox.net  Sun Nov 16 10:04:28 1997
From: richardf@redbox.net (Richard Folwell)
Date: Sun, 16 Nov 1997 10:04:28 -0000
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <01BCF277.07931140.richardf@redbox.net>

I would like to make a small addition to the Python library documentation [1]. 
 It will not be real soon, say at the start of 1998.  What format should I 
produce it in?

I am familiar with both SGML and LaTeX, but do not currently have either 
installed.  I have access to some tools (including FrameMaker + SGML) for SGML, 
but would have to get hold of TeX.  The working platform is NT.

Access to a Unix box for processing material would be possible, but would be 
both a real pain (I would have to set up a machine specially) and would make it 
almost impossible for me to interest any of my colleagues in the toolset.

I am interested in having a structured text system for my own use (at work we 
use Word, for all the usual wrong reasons).  It would be nice to use a system 
that was in regular use by other people for similar material.

Richard Folwell

[1] Some extra information for people writing sockets code under NT - the 
existing material more or less assumes that you are using Unix.



From papresco@technologist.com  Sun Nov 16 13:27:05 1997
From: papresco@technologist.com (Paul Prescod)
Date: Sun, 16 Nov 1997 08:27:05 -0500
Subject: [DOC-SIG] Library reference manual debate
References: <199711161008.KAA02434@axiom.bound.xs4all.nl>
Message-ID: <346EF4A9.57A7EC7A@technologist.com>

Case Roole wrote:
>   "I think that for novice users it will probably be quite confusing,
>   however, because people are used to all SGML markup being in clearly
>   marked tags, not in ordinary-looking characters." -- Given that we are
>   talking about the python documentation here, I don't see who those
>   "novice users" are, who are "used to all SGML markup being in clearly
>   marked tags".

Anybody familiar with HTML (in other words, almost everybody). I'm not
dead-set against the idea. If it will help the SGML solution to be more
palatable, then let's do it. I just usually try to avoid inventing my
own language because inevitably some tools (e.g. emacs psgml,
FrameMaker+SGML) will not support it properly, and I have to add more
transformation layers to my publishing process. I find that this is
usually not worth the few keystrokes saved, but intelligent people can
differ on that issue. 

My biggest concern would be that these extra layers would be construed
as "extra SGML complications" whereas TIM, having no real popularity at
all, can be extended in an ad hoc manner and thus could be seen to be
more "flexible" than SGML. By that argument, a language I invent
tomorrow would be more "flexible" than Python because it has no
installed base and thus I can change it to be whatever I want. This
"flexibility" leads to an infinite number of contrived, incompatible
languages. So yes, I would rather bite the bullet and use SGML in this
way than invent Yet Another Markup Language (what are we up to, 30, 40
of them?).

 Paul Prescod


From papresco@technologist.com  Sun Nov 16 13:36:21 1997
From: papresco@technologist.com (Paul Prescod)
Date: Sun, 16 Nov 1997 08:36:21 -0500
Subject: [DOC-SIG] Library reference manual debate
References: <199711161008.KAA02434@axiom.bound.xs4all.nl>
Message-ID: <346EF6D5.E3552998@technologist.com>

Case Roole wrote:
>   "I think that for novice users it will probably be quite confusing,
>   however, because people are used to all SGML markup being in clearly
>   marked tags, not in ordinary-looking characters." -- Given that we are
>   talking about the python documentation here, I don't see who those
>   "novice users" are, who are "used to all SGML markup being in clearly
>   marked tags".

Anybody familiar with HTML (in other words, almost everybody). 

I'm not dead-set against the idea. If it will help the SGML solution to
be more palatable, then let's do it. I just usually try to avoid
inventing my own delimiter language because inevitably some tools (e.g.
emacs psgml, perhaps FrameMaker+SGML) will not support it properly, and
I have to add more transformation layers to my publishing process. I
find that this is usually not worth the few keystrokes saved, but
intelligent people can differ on that issue. 

My biggest concern would be that these tool incompatibilities (or
partial compatibilities) would be construed as "extra SGML
complications" whereas TIM, having no real popularity at all, can be
extended in an ad hoc manner and thus could be seen to be more
"flexible" than SGML. By that argument, a language I invent tomorrow
would be more "flexible" than Python because it has no installed base
and thus I can change it to be whatever I want, but lose the support of
a community and a set of existing tools. This "flexibility" leads to an
infinite number of contrived, incompatible languages. So yes, I would
rather bite the bullet and invent our own delimiter conventions within
SGML than invent Yet Another Markup Language. 

But just be aware that it will probably cost us in tool compatibility at
some point, and force us to do some extra transformations to a simpler
SGML subset.

 Paul Prescod


From papresco@technologist.com  Sun Nov 16 13:39:57 1997
From: papresco@technologist.com (Paul Prescod)
Date: Sun, 16 Nov 1997 08:39:57 -0500
Subject: [DOC-SIG] Library reference manual debate
References: <01BCF277.07931140.richardf@redbox.net>
Message-ID: <346EF7AD.E165401C@technologist.com>

Richard Folwell wrote:
> 
> I would like to make a small addition to the Python library documentation [1].
>  It will not be real soon, say at the start of 1998.  What format should I
> produce it in?

I don't think anybody knows yet (well, maybe Guido).
 
> I am familiar with both SGML and LaTeX, but do not currently have either
> installed.  I have access to some tools (including FrameMaker + SGML) for SGML,
> but would have to get hold of TeX.  The working platform is NT.
> 
> Access to a Unix box for processing material would be possible, but would be
> both a real pain (I would have to set up a machine specially) and would make it
> almost impossible for me to interest any of my colleagues in the toolset.
> 
> I am interested in having a structured text system for my own use (at work we
> use Word, for all the usual wrong reasons).  It would be nice to use a system
> that was in regular use by other people for similar material.

This all sounds like a vote for SGML to me. :)

 * Works on NT
 * Easy to install (just unzip Jade and our stylesheet package)
 * Structured text
 * In regular use by other people

 Paul Prescod


From skip@calendar.com  Sun Nov 16 13:46:28 1997
From: skip@calendar.com (Skip Montanaro)
Date: Sun, 16 Nov 1997 08:46:28 -0500 (EST)
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <346EF7AD.E165401C@technologist.com>
References: <01BCF277.07931140.richardf@redbox.net>
 <346EF7AD.E165401C@technologist.com>
Message-ID: <199711161346.IAA11571@dolphin.automatrix.com>


> I would like to make a small addition to the Python library documentation
> [1].  It will not be real soon, say at the start of 1998.  What format
> should I produce it in?

Sorry I missed this before.  I zapped an entire chain in this thread without
reading them.  (Gotta get through my mail somehow...) I only noticed it as a
quote in a later message.

You should most definitely use LaTeX.  In the .../Doc directory is a
template (libtemplate.tex) for documenting an individual module.  Its
comments are quite clear, so it's pretty easy to get things right.  Even if
you don't have direct access to LaTeX to check your work, I'm sure Guido or
others who do would much rather start with your rough input than a blank
template.

Skip Montanaro    | Musi-Cal: http://concerts.calendar.com/
skip@calendar.com | Python: http://www.python.org/
(518)372-5583     | XEmacs: http://www.automatrix.com/~skip/xemacs/tip.html



From guido@CNRI.Reston.Va.US  Sun Nov 16 15:54:00 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Sun, 16 Nov 1997 10:54:00 -0500
Subject: [DOC-SIG] What I don't like about SGML
Message-ID: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>

Here's the background of my dislike for SGML.  To confine this
highly flammable material :-), I'm spawning another thread.

First, while SGML may have been standardized in the swinging '80s, it
definitely has its roots in the '70s -- it takes many years to become
an international standard (look at C++!), and it started its life, as
"GML", long before standardization started.  Undoubtedly some of the
worst features in SGML were designed to be backwards compatible
(again, very much like C++...).

I am well aware that HTML is SGML conformant since HTML 2.0, and this
is precisely the reason for my concern.

99.9% of the time, HTML is parsed by relatively simple handwritten
parsers, not by generic SGML scanners.  There are lots of programs out
there that have to parse HTML -- preprocessors, web browsers, web
spiders, etc.  Why don't these just link to an existing SGML scanner?
Because SGML scanners are *huge*.  They need to be big to scan generic
SGML, which is a very complex language.  But most of this power isn't
needed to scan HTML, so people roll their own parser.

Before HTML had a version number, I wrote an HTML scanner in Python.
It was very simple.  Look for < or </ followed by a letter, then scan
up to a > character, etc.  HTML was simple to scan by design: Tim
Berners-Lee wanted HTML and HTTP to be so simple that almost anybody
could write programs that would immediately interoperate with the rest
of the web as it then existed.  There is no doubt that this is the
reason that the web took off at all.
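
As a rough illustration of the scanning approach just described, here
is a minimal sketch in present-day Python.  It is not the original
scanner: the function name scan_tags and the token tuples are made up
for this example, and attributes, comments and entities are
deliberately ignored.

```python
import string


def scan_tags(text):
    """Split text into ('text', s), ('start', name) and ('end', name) tokens.

    Hypothetical helper: a tag is "<" or "</" followed by a letter,
    running up to the next ">".  Anything else is plain text.
    """
    tokens = []
    pos = 0       # start of plain text not yet emitted
    search = 0    # where to look for the next "<"
    n = len(text)
    while True:
        lt = text.find("<", search)
        if lt < 0:
            break
        j = lt + 1
        closing = text.startswith("/", j)
        if closing:
            j += 1
        if j < n and text[j] in string.ascii_letters:
            gt = text.find(">", j)
            if gt < 0:
                break  # unterminated tag: leave the rest as text
            if lt > pos:
                tokens.append(("text", text[pos:lt]))
            # Keep only the tag name; attributes are dropped in this sketch.
            name = text[j:gt].split()[0].lower()
            tokens.append(("end" if closing else "start", name))
            pos = search = gt + 1
        else:
            search = lt + 1  # a bare "<" is ordinary text
    if pos < n:
        tokens.append(("text", text[pos:]))
    return tokens
```

Calling scan_tags("a < b <B>bold</B>!") yields a text token, a start
tag, a text token, an end tag and a final text token; the bare "<" in
"a < b" stays part of the text because no letter follows it.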

But Berners-Lee made one mistake: he made HTML look a bit like SGML
(which he had seen once or twice from a distance :-).  Almost
immediately HTML was targeted by the SGML lobby for full compliance.
Here's what was added; all of this made my parser much more
complicated than I think it ought to be (look at how complicated
sgmllib.py is).  Note that most of what was added doesn't add
functionality.  In one or two cases it even takes away functionality!
It just complicates the scanning process in order to be compatible
with the extremely complicated scanning rules designed for SGML on
punched cards in the 70s.

    - A second special character '&' for entity references (original HTML
    used <lt> to escape "<").

    - Character references like &#32; or &#SPACE;.

    - Comments in the form of <!--.....-->, truly the most atrocious
    comment convention invented (and I believe it's worse -- officially,
    "--" may not occur inside a comment but "-- --" may, or something like
    that; but who cares, as almost no handwritten parser seems to get this
    right).

    - Special stuff to be ignored, starting with <!...>, where it is
    tricky to determine what the end is (since sometimes "<" or ">" may
    occur inside).

    - Special stuff to be ignored, starting with <?...>.

    - Short tags, <word/.../, which are still mostly outlawed because of
    compatibility reasons with older HTML processors, but which have to be
    recognized if you want to claim the elusive "full compliance".

    - It is not possible to turn off processing completely.  There used to
    be an HTML tag <LISTING> (?) which switched to literal copying of the
    text until </LISTING> was found.  This is impossible to do in SGML --
    the best you can do is to switch to literal mode until </ followed by
    a letter is seen, and you can't turn off &ref; processing either.
    Of course, with a handwritten parser it is no problem to switch to a
    mode that scans for </LISTING> exclusively...

    - Why do I have to put quotes around the URL in <A
    HREF="http://www.python.org"> ???

    - Other restrictions on what you can do with attributes; apparently
    there's a semantic rule that says that if two unrelated tags have an
    attribute with the same name, it must have the same "type".

    - A content model, which nobody asked for, and which few people check
    for, but which still allows HTML purists to tell you that your HTML
    page is "non-conformant" when you place an <H4> heading inside a <LI>
    list item (okay, so I made that up).

    - Probably a few other things that nobody asked for, such as the
    DTD declaration and SGML's approach to character sets (which is
    probably broken -- I believe there is a way to switch character
    sets in mid-stream...).
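
The literal-copy mode mentioned in the <LISTING> item above can be
sketched in a few lines of Python.  This is only an illustration under
stated assumptions: the helper name split_listing and its return
convention are invented here, and the sketch handles just the first
literal block in a string.

```python
import re


def split_listing(text, tag="LISTING"):
    """Return (before, literal, after) around the first <tag>...</tag>.

    Hypothetical helper: once the opening tag is seen, the only thing
    searched for is the exact closing tag, so "<", "</B>" and "&amp;"
    inside the block are copied through untouched.
    """
    open_re = re.compile("<%s>" % re.escape(tag), re.IGNORECASE)
    close_re = re.compile("</%s>" % re.escape(tag), re.IGNORECASE)
    m_open = open_re.search(text)
    if m_open is None:
        return text, None, ""
    m_close = close_re.search(text, m_open.end())
    if m_close is None:
        # No closing tag: everything after the opening tag is literal.
        return text[:m_open.start()], text[m_open.end():], ""
    return (text[:m_open.start()],
            text[m_open.end():m_close.start()],
            text[m_close.end():])
```

A handwritten scanner would call something like this when it meets the
opening tag and resume normal tag scanning on the "after" part.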

Of course, SGML aficionados will claim that all this was necessary so
that HTML could be processed with SGML, the most powerful and flexible
text processing mechanism available.  However, 99% of all HTML written
will never be processed by SGML; it is intended for throw-away
content.  Serious SGML users have two other recourses available to
them:

(1) Write everything in SGML and generate HTML from that; I believe
Jade can do this.

(2) Write a simple HTML scanner and convert it to SGML, by hook or by
crook.  I believe this is being done too.

So my claim remains that the requirement of SGML conformance is for
99% just a nuisance for parser writers.  Of course I'm biased, since
I'm a parser writer myself...  So see for yourself what you think of
this argument.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Sun Nov 16 16:19:57 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 16 Nov 1997 17:19:57 +0100
Subject: [DOC-SIG] What I don't like about SGML
Message-ID: <9711161627.AA02002@arnold.image.ivab.se>

> So my claim remains that the requirement of SGML conformance is for
> 99% just a nuisance for parser writers.

Isn't this the reason they've developed XML?  To come up with
a small and simple subset, so that anyone writing an application
can get things right?  (Not that they need to, really.  It seems as
if all major environments will include built-in parsers before long.
And if you need your own, there are plenty of free implementations
to choose from...)

> Of course I'm biased, since I'm a parser writer myself...  So see for
> yourself what you think of this argument.

FWIW, I've had similar experiences with scripting languages...

I started using scripting languages to glue things together in the
early eighties, and developed about a dozen languages of various
flavours. They all had serious limitations, mainly because there
was a lot of stuff that would have taken a lot of effort to get right,
or would have turned out way too slow (you cannot look up names all
the time, can you?), or bloated. Finally, I stumbled upon Python,
and realized that I would never have to write another scripting language,
since someone else had already created something powerful enough
for all my needs, and provided a great implementation for free...

And for some odd reason, I've just experienced the same thing with
text markup languages...  Instead of spending more time on edroff
and all the other pod-like stuff I've invented through the years, I
decided to throw them all out and go for SGML/XML, since someone
else had already created something powerful enough for all my needs,
and provided a great implementation for free... (www.jclark.com)

Cheers /F


From guido@CNRI.Reston.Va.US  Sun Nov 16 16:27:43 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Sun, 16 Nov 1997 11:27:43 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: Your message of "Sun, 16 Nov 1997 08:36:21 EST."
 <346EF6D5.E3552998@technologist.com>
References: <199711161008.KAA02434@axiom.bound.xs4all.nl>
 <346EF6D5.E3552998@technologist.com>
Message-ID: <199711161627.LAA20965@eric.CNRI.Reston.Va.US>

Paul Prescod:
> My biggest concern would be that these tool incompatibilities (or
> partial compatibilities) would be construed as "extra SGML
> complications" whereas TIM, having no real popularity at all, can be
> extended in an ad hoc manner and thus could be seen to be more
> "flexible" than SGML. By that argument, a language I invent tomorrow
> would be more "flexible" than Python because it has no installed base
> and thus I can change it to be whatever I want, but lose the support of
> a community and a set of existing tools. This "flexibility" leads to an
> infinite number of contrived, incompatible languages. So yes, I would
> rather bite the bullet and invent our own delimiter conventions within
> SGML than invent Yet Another Markup Language. 
> 
> But just be aware that it will probably cost us in tool compatibility at
> some point, and force us to do some extra transformations to a simpler
> SGML subset.

Okay, now we're talking.  The issue of layering tools is real.  I
expect that no matter which way we go, we will have to craft some
tools of our own.  I'm using latex now, and the tools I have crafted
so far are in myformat.sty.  In a sense, this is equivalent to a DTD
extension in SGML plus a style sheet.  When using TIM, the same thing
is done using a macro file.

Let me try to explain once more why I am hesitant to adopt SGML
(apart from my hang-ups about the lexer, which I discuss in a separate
thread -- they aren't particularly relevant).

I believe that part of Python's success lies in the fact that it has
few dependencies on other tools.  For example, it's written in C
rather than C++, and in fact until very recently I made sure that it
was compilable with a K&R C compiler as well as with a Standard C
compiler.  What's the advantage of C over C++?  When I started Python
as a mostly Unix tool, C++ compilers were still under heavy
development.  I expected that many prospective users of the language
would not have a compatible C++ compiler already installed on their
system, and I expected that having to find one that was compatible
with their hardware and O/S would be enough of a deterrent that they
would never use Python unless they were *very* motivated.  So I used a
lowest-common-denominator language, K&R C, which at the time came
bundled with every Unix version.  I suppose that in 1997 the
availability of C++ compilers is no longer a problem (for example on
the Windows and Mac platforms all C compilers are really C++
compilers) -- but my choice for C was definitely the right one until
recently.  A second reason was programmer availability -- again, until
recently, if I had been using C++, it would have been harder for Joe
Average to change a few lines in the Python source to fix a bug and
to send me the diffs.

I am worried that SGML tools are still in a state similar to that of
C++ eight years ago: they exist, but they don't come bundled with any
O/S, and it takes time to track down the right tools for your platform
and then to install them, and you may or may not be successful
depending on what other software you have available.  I'm kind of
worried too because the only tool that is used as an existence proof
(Jade) seems to be a one-person project.  And of course the XML tools
are still almost completely in the vaporware category.

It has been mentioned that TIM is in the same situation: it's not
widely known or used.  However, the one big difference is that all of
TIM consists of three scripts, one of which is already written in
Python (and the other ones could easily be rewritten in Python).  So
instead of adding a dependency on external tools, as with the
adoption of SGML, I would become *independent* of external tools if
I were to adopt TIM.  (This is exactly the same reason why the Perl
people did their own, POD.)

I believe that using an adaptation of TIM, it will be possible to
generate HTML *without downloading any additional tools*.  I think
this is a huge win, as HTML is all that's needed to preview one's
changes to the manual.  To generate PostScript will still require TeX
(and LaTeX and texinfo), but since the existing solution also requires
that, things don't get worse, and of course those who have a need to
use SGML can contribute a translator from TIM to SGML (or, more
likely, to XML, once the XML vaporware solidifies into software).
(Besides, it seems that to get PostScript out of SGML one generally
*also* has to go through TeX, at least on Unix.)

Note that adoption of SGML doesn't mean that we *don't* have to craft
our own tools -- we'll have to come up with a set of definitions (a
DTD extension, is that the right term?) so we can conveniently format
manual entries for functions, classes and methods with default
argument values, keyword arguments, specify argument types, and so on
(which is what the myformat.sty macros are about -- it defines
convenient ways to enter the information about function and method
prototypes).  I expect that this particular effort will be about the
same, whether we're using TIM or SGML.

The difference will be that when using TIM, we're encouraging Python
hackers to extend our tool set, while when using SGML, we're
encouraging SGML hackers to extend our tool set.  I won't try to guess
which type of hacker is predominant in the world at large; but in the
Python community, I'd say there's no doubt :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Sun Nov 16 16:59:30 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 16 Nov 1997 17:59:30 +0100
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <9711161700.AA31749@arnold.image.ivab.se>

> The difference will be that when using TIM, we're encouraging Python
> hackers to extend our tool set, while when using SGML, we're
> encouraging SGML hackers to extend our tool set.  I won't try to guess
> which type of hacker is predominant in the world at large; but in the
> Python community, I'd say there's no doubt :-)

Well, I'd guess that few Python hackers work on text markup languages
and formatting engines, and of those who do, quite a few seem to
be working with SGML:

	http://www.w3.org/XML/9705/hacking
	http://www.sil.org/sgml/mcgrathParseDesc.html

and so on...

Since I'm working on SGML/XML for our company's projects, at least I
would much rather contribute to an SGML/XML based effort, than to
hack on TIM. Others' mileage may vary, as usual ;-)

Cheers	/F


From guido@CNRI.Reston.Va.US  Sun Nov 16 18:54:52 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Sun, 16 Nov 1997 13:54:52 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: Your message of "Sun, 16 Nov 1997 17:59:30 +0100."
 <9711161700.AA31749@arnold.image.ivab.se>
References: <9711161700.AA31749@arnold.image.ivab.se>
Message-ID: <199711161854.NAA21167@eric.CNRI.Reston.Va.US>

Fredrik Lundh:
> Since I'm working on SGML/XML for our company's projects, at least I
> would much rather contribute to an SGML/XML based effort, than to
> hack on TIM. Others' mileage may vary, as usual ;-)

Fredrik, I'm afraid that you're already overcommitted -- I'd hate to
see the schedule for your book jeopardized.  (I think it is your
highest priority from the Python community's point of view.)

Otherwise, I'd challenge you to get started -- I'm sure you'd do a
great job.  Here's the challenge anyway -- maybe someone else can pick
it up.  I'm tired of hearing what *I* should do.  I've already hinted
at what I *would* do if I had to do it.  I'm more interested in
hearing from people who have done something that I (and the rest of
the Python community) can use.  "Use SGML" is not a productive
approach; "this is what I did using SGML" is.

What would be needed, at least at the proof of concept level, is a
tool that does the one-time conversion of a library manual section or
chapter to SGML, plus entries in Doc/Makefile that automatically
produce PostScript and HTML from the SGML.  Since these are the output
formats that are currently supported, it makes sense to require that
they are both supported by any proposed new system before it is
judged.  Knowing in the abstract that SGML can be converted to HTML
and PostScript isn't enough -- I want to see the generated HTML and
PostScript so that I (and others) can judge how good it is and what
still needs to be done.

As a concrete test, the Python library manual is full of sections like
this one:

    \begin{funcdesc}{sub}{pattern\, repl\, string\optional{, count=0}}
    Return the string obtained by replacing the leftmost non-overlapping
    occurrences of \var{pattern} in \var{string} by the replacement
    \var{repl}, which can be a string or the function that returns a
    string.  If the pattern isn't found, \var{string} is returned
    unchanged. The pattern may be a string or a regexp object; if you need
    to specify regular expression flags, you must use a regexp object, or
    use embedded modifiers in a pattern string; e.g.
    %
    \bcode\begin{verbatim}
    sub("(?i)b+", "x", "bbbb BBBB") returns 'x x'.
    \end{verbatim}\ecode
    %
    The optional argument \var{count} is the maximum number of pattern
    occurrences to be replaced; count must be a non-negative integer, and
    the default value of 0 means to replace all occurrences.

    Empty matches for the pattern are replaced only when not adjacent to a
    previous match, so \code{sub('x*', '-', 'abc')} returns '-a-b-c-'.
    \end{funcdesc}

How should this be translated to SGML?  Which DTD should be used?  I'm
not particularly happy with the way the argument list has to be
formatted in LaTeX, especially when optional or keyword arguments are
present -- can SGML do better?
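
To make the question concrete, here is a small Python sketch of the
first step such a one-time converter might take: turning the
\begin{funcdesc} header line into tagged markup.  Everything about the
output is an assumption made for illustration.  The element names
<signature>, <name> and <arg> and the optional="yes" attribute are
invented here; they do not come from DocBook or any agreed DTD, and
the function convert_header is hypothetical.

```python
import re
from xml.sax.saxutils import escape

# Matches: \begin{funcdesc}{NAME}{ARGS}
HEADER = re.compile(
    r"\\begin\{funcdesc\}\{(?P<name>[^}]*)\}\{(?P<args>.*)\}\s*$")
# Matches the first \optional{...} group inside the argument list.
OPTIONAL = re.compile(r"\\optional\{(?P<body>[^}]*)\}")


def _names(chunk):
    """Split an argument chunk on "\\," or "," and drop empty pieces."""
    return [a.strip() for a in re.split(r"\\?,", chunk) if a.strip()]


def convert_header(line):
    """Convert one funcdesc header line to made-up SGML-style markup."""
    m = HEADER.match(line.strip())
    if m is None:
        raise ValueError("not a funcdesc header: %r" % line)
    args = m.group("args")
    opt = OPTIONAL.search(args)
    required = args[:opt.start()] if opt else args
    optional = opt.group("body") if opt else ""
    parts = ["<signature><name>%s</name>" % escape(m.group("name"))]
    parts += ["<arg>%s</arg>" % escape(a) for a in _names(required)]
    parts += ['<arg optional="yes">%s</arg>' % escape(a)
              for a in _names(optional)]
    parts.append("</signature>")
    return "".join(parts)
```

Fed the header line of the sub() entry above, this produces one <name>
element, three plain <arg> elements and one optional <arg> holding
"count=0".  The body text, \var{} markup and verbatim blocks would
need their own rules, which this sketch does not attempt.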

For comparison, here's an example of a complex function description in
TIM:

    @deffn Function ORB_init (@metavar{argv}=(), @metavar{orb_id}='ilu')
    @ftindex CORBA.ORB_init (Python LSR function)

    Returns an instance of @class{@Python{CORBA.ORB}} with the specified
    @metavar{orb_id} (currently only the ORB ID @Python{'ilu'} is
    supported).  The arguments which may be passed in via @metavar{argv}
    are ignored.
    @end deffn

Note that one feature I like is that the LaTeX {funcdesc} environment
automatically creates the index entry for the function; it combines it
with some information provided earlier in the file:

    \renewcommand{\indexsubitem}{(in module re)}

I see that this is done manually in TIM (although I'm not sure why).
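
The combination of the two pieces of information can be sketched in
Python as follows.  This only illustrates the idea of carrying the
most recent \indexsubitem text forward to each later funcdesc; the
function name index_entries and the "name (subitem)" entry format are
assumptions, not what myformat.sty actually emits.

```python
import re

SUBITEM = re.compile(
    r"\\renewcommand\{\\indexsubitem\}\{(?P<text>[^}]*)\}")
FUNCDESC = re.compile(r"\\begin\{funcdesc\}\{(?P<name>[^}]*)\}")


def index_entries(lines):
    """Collect one index entry per funcdesc, using the latest subitem."""
    subitem = ""
    entries = []
    for line in lines:
        m = SUBITEM.search(line)
        if m:
            # Remember the subitem text for the functions that follow.
            subitem = m.group("text")
            continue
        m = FUNCDESC.search(line)
        if m:
            entries.append(("%s %s" % (m.group("name"), subitem)).strip())
    return entries
```

Given the \renewcommand line above followed by the funcdesc header for
sub, the result is the single entry "sub (in module re)".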

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Sun Nov 16 20:08:11 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 16 Nov 1997 21:08:11 +0100
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <9711162009.AA20858@arnold.image.ivab.se>

> Otherwise, I'd challenge you to get started -- I'm sure you'd do a
> great job.

Well, the thing is that I have to do this anyway (not that bad,
since I get paid to do it); if I can get you on the "right track",
I might be able to contribute without having to work extra
shifts ;-)

> I'm more interested in hearing from people who have done some-
> thing that I (and the rest of the Python community) can use.  "Use
> SGML" is not a productive approach; "this is what I did using SGML"
> is.

Okay, folks.  Time to:

1. Settle on a DTD.  Can we use DocBook as is?  What extensions
   are needed?  (Paul?  Fred?)

2. Write a small "howto" document; maybe just a sample page
   showing how to format a typical libref chapter.   maybe also
   a "howto" on how to efficiently use emacs' SGML mode.

3. Hack a customized TeX to SGML converter (does anyone have any
   code for this?)

4. (initially) use Jade for the initial conversions to RTF/PS and
   HTML, using Norm Walsh's DSSSL stylesheets (Paul?)

Now, since learning Scheme (DSSSL) is more than I have time
for, I'll also propose the following projects:

5. write an SGML to XML converter using Grail's SGMLparser
   (in the meantime, we can use James Clark's "sx" tool)
6. write an XML parser (at least a tokenizer) that some day
   could be included in the standard Python distribution (almost
   done!)
7. write an XML to HTML tool based on (6) and a "Python style
   sheet" (almost done!)
8. write an XML to PostScript tool based on (6), the printer
   formatter from edroff, and PIL's PSDraw (or maybe we could
   use html2ps?)
9. write an XSL stylesheet for XML-aware browsers.
10. etc.
11. etc.
12. etc.

Or maybe fuse 5 and 6.  But dealing with XML is much easier;
an XML parser written in C could be added to Python without
anyone noticing...

And given modules 5-7, we'll end up with the 100% pure python
solution that I guess we all would prefer...

So, what do you think?

Cheers	/F



From scott@chronis.icgroup.com  Mon Nov 17 02:56:36 1997
From: scott@chronis.icgroup.com (Scott)
Date: Sun, 16 Nov 1997 21:56:36 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <9711162009.AA20858@arnold.image.ivab.se>; from Fredrik Lundh on Sun, Nov 16, 1997 at 09:08:11PM +0100
References: <9711162009.AA20858@arnold.image.ivab.se>
Message-ID: <19971116215636.23680@chronis.icgroup.com>


Seems like there's a lot of great work getting underway here.  I'm certain
it would/will add a lot to what Python can do.

As far as my own opinion on what format the library reference should
take,  I have two major concerns.  First is that the first beta release of
Python 1.5 not be significantly delayed by whatever is decided should be done
to change the current docs.  There's a lot of work in 1.4 out there that would
benefit greatly from a public 1.5 release (and I'm tired of wanting to write
code that will only work in 1.5 when in production it will have to run in 1.4).

Second is that html be really easy to access.  For the library reference in
particular, I find that I'd much rather click search a couple of times than
page through sheet after sheet of hardcopy.

Finally, though I'm not all that familiar with the particulars of parsing
tim/sgml/xml/latex, I would like to point out that I've made some progress 
towards easily producing generally efficient parsers in python (Lex/Yacc/Bison
style). Should anyone working on any python documentation projects want such a
tool before it's ready for its first public release, there is pre-release info
on the string sig and under
http://starship.skyport.net/crew/scott/projects.html.  Using the pre-release code
toward this end, or offering suggestions as to how it could better accommodate
this end, is most welcome.

scott   


On Sun, Nov 16, 1997 at 09:08:11PM +0100, Fredrik Lundh wrote:
| > Otherwise, I'd challenge you to get started -- I'm sure you'd do a
| > great job.
| 
| Well, the thing is that I have to do this anyway (not that bad,
| since I get paid to do it); if I can get you on the "right track",
| I might be able to contribute without having to work extra
| shifts ;-)
| 
| > I'm more interested in hearing from people who have done some-
| > thing that I (and the rest of the Python community) can use.  "Use
| > SGML" is not a productive approach; "this is what I did using SGML"
| > is.
| 
| Okay, folks.  Time to:
| 
| 1. Settle on a DTD.  Can we use DocBook as is?  What extensions
|    are needed?  (Paul?  Fred?)
| 
| 2. Write a small "howto" document; maybe just a sample page
|    showing how to format a typical libref chapter.   maybe also
|    a "howto" on how to efficiently use emacs' SGML mode.
| 
| 3. Hack a customized Tex to SGML converter (anyone has any
|    code for this?)
| 
| 4. (initially) use Jade for the initial conversions to RTF/PS and
|    HTML, using Norm Walsh's DSSSL stylesheets (Paul?)
| 
| Now, since learning scheme (DSSSL) is more than I have time
| for, I'll also propose the following projects:
| 
| 5. write an SGML to XML converter using Grail's SGMLparser
|    (in the meantime, we can use James Clark's "sx" tool)
| 6. write an XML parser (at least a tokenizer) that some day
|    could be included in the standard Python distribution (almost
|    done!)
| 7. write an XML to HTML tool based on (6) and a "Python style
|    sheet" (almost done!)
| 8. write an XML to PostScript tool based on (6), the printer
|    formatter from edroff, and PIL's PSDraw (or maybe we could
|    use html2ps?)
| 9. write an XSL stylesheet for XML-aware browsers.
| 10. etc.
| 11. etc.
| 12. etc.
| 
| Or maybe fuse 5 and 6.  But dealing with XML is much easier;
| an XML parser written in C could be added to Python without
| anyone noticing...
| 
| And given modules 5-7, we'll end up with the 100% pure python
| solution that I guess we all would prefer...
| 
| So, what do you think?
| 
| Cheers	/F
| 

From guido@CNRI.Reston.Va.US  Mon Nov 17 02:57:38 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Sun, 16 Nov 1997 21:57:38 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: Your message of "Sun, 16 Nov 1997 21:08:11 +0100."
 <9711162009.AA20858@arnold.image.ivab.se>
References: <9711162009.AA20858@arnold.image.ivab.se>
Message-ID: <199711170257.VAA21457@eric.CNRI.Reston.Va.US>

> Well, the thing is that I have to do this anyway (not that bad,
> since I get paid to do it); if I can get you on the "right track",
> I might be able to contribute without having to work extra
> shifts ;-)

OK, you have the green light!  (While you're at it, could you design a
set of macros for api.tex too?  That's my next big project at the
moment.)

> 6. write an XML parser (at least a tokenizer) that some day
>    could be included in the standard Python distribution (almost
>    done!)

Sjoerd Mullender has written one already.  Sjoerd, would you mind
announcing your xml parser somewhere?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Mon Nov 17 04:49:30 1997
From: fdrake@acm.org (Fred L. Drake)
Date: Sun, 16 Nov 1997 23:49:30 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <9711162009.AA20858@arnold.image.ivab.se>
References: <9711162009.AA20858@arnold.image.ivab.se>
Message-ID: <199711170449.XAA29856@weyr.cnri.reston.va.us>


Fredrik Lundh writes:
 > 1. Settle on a DTD.  Can we use DocBook as is?  What extensions
 >    are needed?  (Paul?  Fred?)

  Unless Paul has some specific objections, I think we should at least 
start with the standard docbook DTD.  We can adjust if we find
problems with toolsets or complexity.

 > 3. Hack a customized Tex to SGML converter (anyone has any
 >    code for this?)

  I can work on this, as I've whacked around in the old partparse.py
somewhat.  I'll look for other alternatives before I start whacking on 
it again.

 > 5. write an SGML to XML converter using Grail's SGMLparser
 >    (in the meantime, we can use James Clark's "sx" tool)

  This shouldn't be too onerous.

 > 8. write an XML to PostScript tool based on (6), the printer
 >    formatter from edroff, and PIL's PSDraw (or maybe we could
 >    use html2ps?)

  html2ps.py would probably be a good approach if we want to use
Python-only tools, though I suspect a jade->TeX->dvips conversion
would look better.  There's still a lot of things html2ps doesn't
support, and it's already quite slow.  As much as I think I can
improve it, it's just not there for this kind of thing.  (And I do
have ways to convert serious multi-page HTML documents using html2ps,
with a lot of the frills.  Trust me, it's just not in there to replace 
TeX for formatting these things.)


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From fdrake@acm.org  Mon Nov 17 04:57:19 1997
From: fdrake@acm.org (Fred L. Drake)
Date: Sun, 16 Nov 1997 23:57:19 -0500
Subject: [DOC-SIG] What I don't like about SGML
In-Reply-To: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
References: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
Message-ID: <199711170457.XAA29863@weyr.cnri.reston.va.us>


Guido van Rossum writes:
 > First, while SGML may have been standardized in the swinging '80s, it
 > definitely has its roots in the '70s -- it takes many years to become
 > an international standard (look at C++!), and it started its life, as
 > "GML", long before standardization started.  Undoubtedly some of the
 > worse features in SGML were designed to be backwards compatible

  Have you used GML?  I have.  It was probably nice when it was new,
but certainly was showing problems by the time I used it.  Script/VS
(the processor I used) also allowed "control words" which looked a lot 
like troff dot-commands.  I ended up using a lot of these because the
mechanisms for defining new logical markup were very poorly documented 
as far as I could tell.  I had to define macros on top of the
Script/VS control words.
  The SGML I see now has definitely evolved a long way from those
roots, though the better aspects of GML are still there (structure).
I don't think the GML background of SGML can be meaningfully held up
as a problem with SGML; I think Goldfarb learned a lot from GML's
failures by the time SGML was defined.


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From papresco@technologist.com  Mon Nov 17 08:59:49 1997
From: papresco@technologist.com (Paul Prescod)
Date: Mon, 17 Nov 1997 03:59:49 -0500
Subject: [DOC-SIG] Library reference manual debate
References: <9711162009.AA20858@arnold.image.ivab.se> <199711170449.XAA29856@weyr.cnri.reston.va.us>
Message-ID: <34700785.E8472EE6@technologist.com>

Fred L. Drake wrote:
> 
> Fredrik Lundh writes:
>  > 1. Settle on a DTD.  Can we use DocBook as is?  What extensions
>  >    are needed?  (Paul?  Fred?)
> 
>   Unless Paul has some specific objections, I think we should at least
> start with the standard docbook DTD.  We can adjust if we find
> problems with toolsets or complexity.

My only concern is with trying to do too much at once. If we can just
get the book into some SGML variant, no matter how bizarre, then our
life becomes easier because we can use either Python or Jade for further
transformations. Once we are there, we can aim for DocBook.
 
>   I can work on this, as I've whacked around in the old partparse.py
> somewhat.  I'll look for other alternatives before I start whacking on
> it again.

Right, this again seems to argue in favour of doing the TeX->SGML step
separate from the SGML->DocBook step. You can do TeX->SGML with no help
and without consulting the DocBook DTD. Your only constraint is "don't
lose information." I can do the SGML->DocBook as part of a collaborative
project with discussion on the features and markup we need.
 
 Paul Prescod


From Sjoerd.Mullender@cwi.nl  Mon Nov 17 09:58:20 1997
From: Sjoerd.Mullender@cwi.nl (Sjoerd Mullender)
Date: Mon, 17 Nov 1997 10:58:20 +0100
Subject: [DOC-SIG] ANNOUNCE: XML parser library
Message-ID: <UTC199711170958.KAA03747.sjoerd@bireme.cwi.nl>

I have written an XML (eXtensible Markup Language) parser module for
Python.  This module is derived from the SGML parser (sgmllib) and has 
a similar flavor.

xmllib is available from the following sites:

ftp://ftp.cwi.nl/pub/sjoerd/xmllib.tar.gz
http://www.cwi.nl/ftp/sjoerd/xmllib.tar.gz
ftp://ftp.starship.skyport.net/pub/crew/sjoerd/xmllib.tar.gz

In all places there is also a file xmllib.README (which is also part
of the distribution).

Since the module uses the new re module, it will only work if you have 
that already.  The re module is standard in Python 1.5alpha.

For information on XML see <URL:http://www.w3.org/TR/WD-xml>.

-- Sjoerd Mullender <Sjoerd.Mullender@cwi.nl>
   <URL:http://www.cwi.nl/~sjoerd/>


From skaller@zip.com.au  Mon Nov 17 15:13:30 1997
From: skaller@zip.com.au (John Skaller)
Date: Tue, 18 Nov 1997 02:13:30 +1100
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <1.5.4.32.19971117151330.0092f458@zip.com.au>

It would seem to me the first step to improve documentation
would be to create a mechanism for people to submit and
retrieve it.

My opinion is that fixing a format is the best way to exclude
most potential submitters. But if a format has to be
picked, it had better be ordinary old HTML, so it can
be put up on a website and used immediately by everyone.

The tree and subtrees should be available compressed.
That can be done automatically by some newer ftp servers.
Not everyone is online all the time!

I want to click on a link, download the whole
thing, unpack it into my web server, add a link to my
home page, and I get a mirror.

HTML is little use for typesetting books, but individual
authors are NOT going to agree on any standard for print
media. They're going to use whatever method they have that
works and their publisher is happy with.

I need to convert my "text-for-printmedia"
into something people can browse. So I'm trying to get
LaTeX2HTML running. It complains about my fancy packages.
It can somehow take "snapshots" -- by magic it seems to me --
of bits it can't understand, but this will only 
work on MY system. So the only person who can convert
my source to HTML is ME.

I really _have_ to get that working. I can't write HTML
at all.

---------------------------------------------------------------

I can envisage a much more sophisticated system, which
accepts all kinds of documents and converts them
to other formats as required.

Where are we going to get programmers who can do this
work without the documentation for them to learn Python?

WHO is going to convert submitted LaTeX to HTML?
So, I write a doc using Guido's latex style.
How long until someone converts it and posts it
to the website?

To start off, why not accept documents in
_several_ formats. HTML, Postscript, dvi, and perhaps
a Guido-restricted LaTeX -- assuming Guido is
willing to do the conversion. No? Then we can't
accept that format.
-------------------------------------------------------
John Skaller    email: skaller@zip.com.au
		http://www.zip.com.au/~skaller
		phone: 61-2-6600850
		snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia



From Edward Welbourne <eddyw@lsl.co.uk>  Mon Nov 17 14:09:13 1997
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Mon, 17 Nov 1997 14:09:13 GMT
Subject: [DOC-SIG] doc strings
Message-ID: <9711171409.AA27032@lslr6g.lsl.co.uk>

OK, I accept that it's better to stick with the gendoc `structured text'
approach rather than using *ML, TIM or anything else in python doc
folds, even if that does mean I've now got a fair slice of HTML to
retro-convert in existing code.  I'll come back (if I remember) to how
we might be able to improve on this ...

First off, I want the doc string extractor to be able to decipher `type'
and `default' information from the centre-piece of every doc string I've
written - the argument list.  Example (in gendoc's form):

Arguments:
 file -- file-name string.  The name of the file to open.

 [mode] -- ('r') I/O mode string.  ...

 [bufsize] -- (-1) integer.  Buffer size to use for I/O, if bufsize >
 1: a value of 1 requests line buffering, 0 requests unbuffered I/O and
 any negative value requests the implementation's default.

It might be worth recognising the word `argument' or `arguments', along
similar lines to `example', if we can think of a common format for
`default and type' information, possibly exploiting the fact that this
will take the form of the list item's first `sentence'.  If default is
there, it's the first thing after -- and is enclosed in ().  Next comes
type information, up to `end of sentence'.  The rest of the paragraph
might be worth typesetting as a separate paragraph in the <DD>, as if
(in the HTML output to be generated from the above)

<P>Arguments:</P><DL>
<DT>file<DD> file-name string. <P> The name of the file to
open. </P></DD>
...
</DL>

Note that if -- is followed by (...) and ... happens to contain a match
to the pattern being used to detect `end of sentence', it shouldn't be
counted as such because it's inside the default spec.
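
A minimal sketch of how an extractor might pick one such entry apart,
assuming the layout described above: an optional [] around the name, then
--, then an optional default in (), then the type up to the first end of
sentence.  The name parse_arg_entry and the end-of-sentence pattern (a full
stop followed by whitespace or end of text) are assumptions made for
illustration; nothing here is taken from gendoc.

```python
import re

# Hypothetical entry layout: "name -- rest" or "[name] -- rest".
_ENTRY = re.compile(r"^\s*(\[)?(\w+)\]?\s+--\s+(.*)$", re.S)
# Assumed "end of sentence": a full stop followed by whitespace or end of text.
_SENTENCE_END = re.compile(r"\.(?:\s+|$)")


def parse_arg_entry(text):
    """Split one argument entry into name, optional flag, default, type, description."""
    match = _ENTRY.match(text)
    if match is None:
        raise ValueError("not an argument entry: %r" % text)
    bracket, name, rest = match.groups()
    default = None
    if rest.startswith("("):
        # Peel off the default spec first, so a full stop inside (...) is
        # never mistaken for the end of the first sentence.
        # Limitation: nested parentheses in the default are not handled.
        close = rest.index(")")
        default = rest[1:close]
        rest = rest[close + 1:].lstrip()
    end = _SENTENCE_END.search(rest)
    if end is None:
        type_info, description = rest.strip(), ""
    else:
        type_info = rest[:end.start()].strip()
        # Re-join the remaining lines of the paragraph into one string.
        description = " ".join(rest[end.end():].split())
    return {
        "name": name,
        "optional": bracket is not None,
        "default": default,
        "type": type_info,
        "description": description,
    }


entry = parse_arg_entry(" [mode] -- ('r') I/O mode string.  The mode.")
print(entry["name"], entry["default"], entry["type"])
```

The type and the description come back as separate fields, so a formatter
could put the description in its own paragraph inside the <DD>.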


Here's another sample of a docstring (gde.process.execute.__doc__,
wrapping os.execv and os.execve up) to illustrate use of gendoc's
`structured text' for the uninitiated.  Note that the method's name and
the fact that it's a method (of a class called gde._Process, as it
happens, but it should be documented as a method of the value
gde.process) can be extracted by the tools which crawl over the
namespace digging out the doc-strings and gluing them together.  Note
that the method also has a `self' argument which doesn't appear here.

"""Replaces the current process.

Required argument, file, is the pathname of a file, which must be
executable, to be executed.  The resulting process will replace the
current process.  Optional arguments:

 args -- ([]) list of strings.  The arguments to be passed to file.

 env -- (None) dictionary, mapping strings to strings.  If omitted (or
 given as None), the new process will run in the same environment as the
 old one; otherwise, env gives the environment in which the new process
 is to run.

*Does not return.*"""

	Eddy.


From guido@CNRI.Reston.Va.US  Mon Nov 17 14:36:02 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Mon, 17 Nov 1997 09:36:02 -0500
Subject: [DOC-SIG] What I don't like about SGML
In-Reply-To: Your message of "Sun, 16 Nov 1997 23:57:19 EST."
 <199711170457.XAA29863@weyr.cnri.reston.va.us>
References: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
 <199711170457.XAA29863@weyr.cnri.reston.va.us>
Message-ID: <199711171436.JAA22253@eric.CNRI.Reston.Va.US>

Fred Drake:
>   Have you used GML?  I have.  It was probably nice when it was new,
> but certainly was showing problems by the time I used it.  Script/VS
> (the processor I used) also allowed "control words" which looked a lot 
> like troff dot-commands.  I ended up using a lot of these because the
> mechanisms for defining new logical markup were very poorly documented 
> as far as I could tell.  I had to define macros on top of the
> Script/VS control words.
>   The SGML I see now has definitely evolved a long way from those
> roots, though the better aspects of GML are still there (structure).
> I don't think the GML background of SGML can be meaningfully held up
> as a problem with SGML; I think Goldfarb learned a lot from GML's
> failures by the time SGML was defined.

Hm, I'm not sure if we're talking about the same GML then.  According
to Goldfarb's home page (http://www.sgmlsource.com/):

- For history buffs, some reliable papers on the early history of SGML
  and its precursor, GML. I invented SGML in 1974, and led the technical
  efforts of several hundred people for a dozen years that developed it
  into its present form as an International Standard. You can read some
  of that story in the SGML History Niche.

Anyway, this was just in response to Paul Prescod.  I claimed (and
still claim) that SGML's input methods have its roots in punched
cards.  Paul responded that it was standardized in 1986, when PCs were
common.  Goldfarb's remark indicates that SGML is much older than
that...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From Edward Welbourne <eddyw@lsl.co.uk>  Mon Nov 17 14:45:46 1997
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Mon, 17 Nov 1997 14:45:46 GMT
Subject: [DOC-SIG] doc strings could be in a variety of formats
Message-ID: <9711171445.AA14694@lslr6g.lsl.co.uk>

I said earlier that ...
> I'll come back (if I remember) to how we might be able to improve on
> this ...

Doc strings are principally important to the author of the python code
in which they appear.  Their secondary importance is that they can be
extracted by a uniform toolset (in python) into, at least, HTML.

Imagine a standard set of classes which describe a common form into
which our doc strings are to be parsed by the common toolset.  We can
write simple parsers from a few variants on the doc-string format into
this internal form.  If a module or class sets its __docform__ tag to an
object with a .parse(string) method, the tools crawling the namespace to
extract docs can notice this and use that __docform__ as the parser for
the doc strings in the module or class.  The onus of supporting a new
doc-string format falls on those who depart from the fold, but they get
the option if they're prepared to take that effort.

Does gendoc contain classes which represent the parsed strings ?  Does
it provide such a __docform__ object which might serve as the default ?

With this sort of setup, those of us who like HTML can write our doc
strings in HTML (provided we're willing to write ourselves a parser for
it) for use in place of the gendoc one: as for HTML, so for any of the
myriad of doc forms out there in the world.  Furthermore, it shouldn't
be <EM>too</EM> hard for someone fed up (in interactive sessions) with
decoding some other contributor's doc strings to write something which
turns the parsed doc-strings back into their own preferred format of
doc-strings for display (choose your own __str__ method for the object
produced by a __docform__).
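
A rough sketch of the lookup this implies, under stated assumptions: the
common form is reduced to a synopsis plus body, and every name below except
__docform__ and .parse(string) (ParsedDoc, PlainDocForm, HtmlDocForm,
parse_doc) is hypothetical.  It is not based on anything gendoc is known to
provide.

```python
import re


class ParsedDoc:
    """Hypothetical common form: a synopsis line plus the remaining body text."""

    def __init__(self, synopsis, body):
        self.synopsis = synopsis
        self.body = body

    def __str__(self):
        # A reader could swap this for their own preferred rendering.
        return self.synopsis if not self.body else self.synopsis + "\n\n" + self.body


class PlainDocForm:
    """Assumed default parser: first line is the synopsis, the rest is the body."""

    def parse(self, string):
        lines = string.strip().splitlines()
        synopsis = lines[0].strip() if lines else ""
        body = "\n".join(line.strip() for line in lines[1:]).strip()
        return ParsedDoc(synopsis, body)


class HtmlDocForm:
    """Toy parser for doc strings written as '<P>synopsis</P> body'."""

    def parse(self, string):
        match = re.match(r"\s*<P>(.*?)</P>\s*(.*)", string, re.S)
        if match is None:
            return DEFAULT_DOCFORM.parse(string)
        return ParsedDoc(match.group(1).strip(), match.group(2).strip())


DEFAULT_DOCFORM = PlainDocForm()


def parse_doc(obj, inherited=DEFAULT_DOCFORM):
    """Parse obj.__doc__ with obj.__docform__ if set, else the inherited parser."""
    docform = getattr(obj, "__docform__", inherited)
    doc = getattr(obj, "__doc__", None)
    return docform.parse(doc) if doc else None


class Widget:
    "<P>A widget.</P> Longer <EM>HTML</EM> description."
    __docform__ = HtmlDocForm()


print(parse_doc(Widget))
```

A crawler would pass the enclosing module's or class's parser as inherited
when it descends into members that set no __docform__ of their own.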

With a scheme like this, we can have our cake and eat it.
Please, someone, tell me what I've missed ;^>

	Eddy.


From guido@CNRI.Reston.Va.US  Mon Nov 17 14:52:45 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Mon, 17 Nov 1997 09:52:45 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: Your message of "Mon, 17 Nov 1997 03:59:49 EST."
 <34700785.E8472EE6@technologist.com>
References: <9711162009.AA20858@arnold.image.ivab.se> <199711170449.XAA29856@weyr.cnri.reston.va.us>
 <34700785.E8472EE6@technologist.com>
Message-ID: <199711171452.JAA22293@eric.CNRI.Reston.Va.US>

Paul Prescod <papresco@technologist.com>:

> Right, this again seems to argue in favour of doing the TeX->SGML step
> separate from the SGML->DocBook step. You can do TeX->SGML with no help
> and without consulting the DocBook DTD. Your only constraint is "don't
> lose information." I can do the SGML->DocBook as part of a collaborative
> project with discussion on the features and markup we need.

What is the use of SGML without a DTD?  Can one still create HTML and
Postscript from it?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Mon Nov 17 14:55:05 1997
From: fdrake@acm.org (Fred L. Drake)
Date: Mon, 17 Nov 1997 09:55:05 -0500
Subject: [DOC-SIG] What I don't like about SGML
In-Reply-To: <199711171436.JAA22253@eric.CNRI.Reston.Va.US>
References: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
 <199711170457.XAA29863@weyr.cnri.reston.va.us>
 <199711171436.JAA22253@eric.CNRI.Reston.Va.US>
Message-ID: <199711171455.JAA00324@weyr.cnri.reston.va.us>


Guido van Rossum writes:
 > Hm, I'm not sure if we're talking about the same GML then.  According
 > to Goldfarb's home page (http://www.sgmlsource.com/):

  This is the same one.  The machinery for defining processing
separately from the abstract markup was there, but you pretty much had 
to be an IBM insider to get enough information about how to use it.
That's why the Script/VS control words got used as much as they did.
I agree; the original format of the markup would have been better left 
on the punched cards.  But it was sufficient for me to write about 150 
pages of a software manual (user info. and configuration).  I had more 
problems dealing with XEdit than the markup itself, but more of
the markup ended up being process-oriented than should have been.
Very tedious stuff, indeed.
  A lot like LaTeX in many ways, but it's much easier to extend the
LaTeX markup than the GML markup.


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From fdrake@acm.org  Mon Nov 17 15:02:39 1997
From: fdrake@acm.org (Fred L. Drake)
Date: Mon, 17 Nov 1997 10:02:39 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711171452.JAA22293@eric.CNRI.Reston.Va.US>
References: <9711162009.AA20858@arnold.image.ivab.se>
 <199711170449.XAA29856@weyr.cnri.reston.va.us>
 <34700785.E8472EE6@technologist.com>
 <199711171452.JAA22293@eric.CNRI.Reston.Va.US>
Message-ID: <199711171502.KAA00350@weyr.cnri.reston.va.us>


Guido van Rossum writes:
 > What is the use of SGML without a DTD?  Can one still create HTML and
 > Postscript from it?

  There must be a DTD, even if it doesn't get written down.  (That's
often referred to as "well-formed" XML. ;)
  Paul is referring to a long-standing convention of converting
between document types in incremental steps, to allow each step to be
simple to implement and check.  It should not be too hard to do; my
main concern is that the DTDs for intermediate stages must be
sufficiently well understood that the conversions don't lose
information by accident.  This will probably mean, if the DTDs aren't
written and at least partially documented, that one person will handle 
the entire process until the target DTD is reached.  Perhaps this is
O.K., perhaps not.  Others should be able to repeat the process by
running the same sequence of scripts over the same input data.
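
A toy sketch of such an incremental conversion, to show the shape of the
idea only.  The two steps, the element names and the crude "no words lost"
check are all assumptions made for illustration; a real converter would
need far more than \emph.

```python
import re


def tex_to_raw_sgml(text):
    """Toy step 1: turn \\emph{...} into an ad hoc <emph> element."""
    return re.sub(r"\\emph\{([^{}]*)\}", r"<emph>\1</emph>", text)


def raw_sgml_to_target(text):
    """Toy step 2: rename the ad hoc element to a (hypothetical) target name."""
    return text.replace("<emph>", "<emphasis>").replace("</emph>", "</emphasis>")


def words(text):
    """Crude information check: the words left once markup is stripped."""
    return re.sub(r"<[^>]*>|\\\w+|[{}]", " ", text).split()


def run_pipeline(text, steps):
    """Apply each step in order, checking that no words are lost on the way.

    Returns every intermediate stage so each step can be inspected.
    """
    stages = [text]
    for step in steps:
        result = step(stages[-1])
        if words(result) != words(stages[-1]):
            raise ValueError("%s lost information" % step.__name__)
        stages.append(result)
    return stages


stages = run_pipeline(r"Use \emph{SGML} now.", [tex_to_raw_sgml, raw_sgml_to_target])
for stage in stages:
    print(stage)
```

Keeping the steps in one ordered list is what lets someone else repeat the
process over the same input.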


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From Edward Welbourne <eddyw@lsl.co.uk>  Mon Nov 17 16:09:22 1997
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Mon, 17 Nov 1997 16:09:22 GMT
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711160451.XAA29095@weyr.cnri.reston.va.us>
References: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>
 <199711160451.XAA29095@weyr.cnri.reston.va.us>
Message-ID: <9711171609.AA35590@lslr6g.lsl.co.uk>

Guido:
>> ain't broken."  I personally have access to a working LaTeX
>> installation, the latex2html converter produces adequate HTML (I still
Fred:
>  It's out of date and should be updated, but does work for the Python 
> documentation.  I have found very reasonable LaTeX2e documents that
> can't be formatted correctly using the CNRI installation.

Yup, I used to be a TeXnician but it's so long since I've had a
non-fragile installation that I've given up on it.  I endure LaTeX for
legacy reasons, but I don't write in it except where I have to.
This is genuinely a major problem with LaTeX: once a site has got a few
hacked .sty files, you can forget about the portability of LaTeX.  The
entire TeX system is too brittle, for this it will die.
Pity: I still have a lot of fondness for it.
It was a programmer's documentation form.

	Eddy.


From fdrake@acm.org  Mon Nov 17 16:26:19 1997
From: fdrake@acm.org (Fred L. Drake)
Date: Mon, 17 Nov 1997 11:26:19 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711161854.NAA21167@eric.CNRI.Reston.Va.US>
References: <9711161700.AA31749@arnold.image.ivab.se>
 <199711161854.NAA21167@eric.CNRI.Reston.Va.US>
Message-ID: <199711171626.LAA00457@weyr.cnri.reston.va.us>


Guido van Rossum writes:
 > hearing from people who have done something that I (and the rest of
 > the Python community) can use.  "Use SGML" is not a productive
 > approach; "this is what I did using SGML" is.

Guido,
  I did want to comment on this.  So far, I haven't seen anyone write
that you should do the work to switch over to SGML or anything else.
I think that Paul and I, perhaps with additional collaborators if
anyone is interested and can squeeze out the time, can muster the
technical expertise to do the work.
  The issue is:  Are you willing to consider using an SGML/XML based
solution as the canonical form for the documentation if handed to you 
on a silver platter and it meets the requirements?  Are you willing to 
help us review the requirements to be sure we aren't leaving anything
out that's in there now, or that really needs to be in there?  This is 
a question that needs to be answered.  From another message you wrote, 
this may be the case, but I'm not certain I didn't misinterpret.


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From fdrake@acm.org  Mon Nov 17 16:33:31 1997
From: fdrake@acm.org (Fred L. Drake)
Date: Mon, 17 Nov 1997 11:33:31 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711161627.LAA20965@eric.CNRI.Reston.Va.US>
References: <199711161008.KAA02434@axiom.bound.xs4all.nl>
 <346EF6D5.E3552998@technologist.com>
 <199711161627.LAA20965@eric.CNRI.Reston.Va.US>
Message-ID: <199711171633.LAA00494@weyr.cnri.reston.va.us>


Guido van Rossum writes:
 > worried too because the only tool that is used as an existence proof
 > (Jade) seems to be a one-person project.  And of course the XML tools
 > are still almost completely in the vaporware category.

  SP, the parser underlying Jade, is James Clark's second complete
SGML parser.  From the discussions in comp.text.sgml, I'd say it's
well regarded as a world-class piece of software which is being used
in commercial software as well as all sorts of ad-hoc applications.
Clark is also the author of groff, the GNU roff/troff tool.  I don't
think there's any reason to be concerned about the source of this
software.


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From guido@CNRI.Reston.Va.US  Mon Nov 17 16:37:04 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Mon, 17 Nov 1997 11:37:04 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: Your message of "Mon, 17 Nov 1997 11:26:19 EST."
 <199711171626.LAA00457@weyr.cnri.reston.va.us>
References: <9711161700.AA31749@arnold.image.ivab.se> <199711161854.NAA21167@eric.CNRI.Reston.Va.US>
 <199711171626.LAA00457@weyr.cnri.reston.va.us>
Message-ID: <199711171637.LAA22663@eric.CNRI.Reston.Va.US>

Fred Drake:

> Guido,
>   I did want to comment on this.  So far, I haven't seen anyone write
> that you should do the work to switch over to SGML or anything else.
> I think that Paul and I, perhaps with additional collaborators if
> anyone is interested and can squeeze out the time, can muster the
> technical expertise to do the work.
>   The issue is:  Are you willing to consider using an SGML/XML based
> solution as the canonical form for the documentation if handed to you 
> on a silver platter and it meets the requirements?  Are you willing to 
> help us review the requirements to be sure we aren't leaving anything
> out that's in there now, or that really needs to be in there?  This is 
> a question that needs to be answered.  From another message you wrote, 
> this may be the case, but I'm not certain I didn't misinterpret.

I am not rejecting SGML unseen.  If it gets handed to me on a silver
platter I will review it.  Until very recently I hadn't heard anyone
volunteer anything, just a lot of arguing (including my own :-).

This has changed.  I am still skeptical about how easy it will be for
Joe Random Contributor to contribute documentation (this means both
the input format and the tools to produce at least one of HTML or
PostScript so they can review what they are contributing) so I think
that's where the pudding's proof will have to be.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From papresco@technologist.com  Mon Nov 17 16:58:46 1997
From: papresco@technologist.com (Paul Prescod)
Date: Mon, 17 Nov 1997 11:58:46 -0500
Subject: [DOC-SIG] docstrings for args
Message-ID: <347077C6.46C3F735@technologist.com>

Would it be useful to be able to do this:

def open( file		"The filename",
          mode="r" 	"The mode" ):

	"Open the file"

 Paul Prescod


From fdrake@acm.org  Mon Nov 17 17:08:06 1997
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 17 Nov 1997 12:08:06 -0500
Subject: [DOC-SIG] docstrings for args
In-Reply-To: <347077C6.46C3F735@technologist.com>
References: <347077C6.46C3F735@technologist.com>
Message-ID: <199711171708.MAA00794@weyr.cnri.reston.va.us>

Paul Prescod writes:
 > Would it be useful to be able to do this:
 > 
 > def open( file		"The filename",
 >           mode="r" 	"The mode" ):
		  ^^^^^^^^^^^^^^^^

  This is equivalent to mode="rThe mode" due to the string catenation
rule.  Some other form of separation would be necessary.
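
A minimal sketch of the catenation rule, using only ordinary string
literals. The name open_sketch is made up for illustration, and only the
mode half of the proposed signature is shown, since a bare literal after
a parameter with no default is not accepted by the parser at all:

```python
# Adjacent string literals are joined at compile time, so a default
# value followed by a "docstring" literal silently becomes one string.
mode = "r" "The mode"
print(mode)                      # rThe mode


def open_sketch(file, mode="r" "The mode"):
    """Hypothetical stand-in for the proposed signature."""
    return mode


print(open_sketch("spam.txt"))   # rThe mode
```
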
  A few sets of conventions exist for formatting the docstring such
that it can be picked apart.  I think Guido suggested most recently:

	def open(file, mode="r"):
	    """A short synopsis first.

	    file -- The filename
	    mode -- The mode

	    More descriptive text here...."""

  This is fairly readable and can be dealt with fairly easily by an
automatic extractor.  I have a class that parses docstrings like this; 
I'll try and clean up a few rough edges and document it over the next
day or two.
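
For illustration only, here is one way such an extractor might look. It
is a sketch under stated assumptions, not the class mentioned above:
parse_docstring and open_sketch are hypothetical names, and an argument
entry is assumed to be a single line of the form "name -- description".

```python
import re

# One argument entry per line: a name, a double dash, then a description.
ITEM = re.compile(r"^(\w+)\s+--\s+(.*)$")


def parse_docstring(doc):
    """Split a docstring into (synopsis, {arg: description}, other text)."""
    lines = [line.strip() for line in doc.strip().splitlines()]
    synopsis = lines[0] if lines else ""
    args = {}
    rest = []
    for line in lines[1:]:
        match = ITEM.match(line)
        if match:
            args[match.group(1)] = match.group(2)
        elif line:
            rest.append(line)
    return synopsis, args, " ".join(rest)


def open_sketch(file, mode="r"):
    """A short synopsis first.

    file -- The filename
    mode -- The mode

    More descriptive text here...."""


synopsis, args, rest = parse_docstring(open_sketch.__doc__)
print(synopsis)  # A short synopsis first.
print(args)      # {'file': 'The filename', 'mode': 'The mode'}
print(rest)      # More descriptive text here....
```

Multi-line descriptions would need a continuation rule (for example,
deeper indentation), which this sketch leaves out.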


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From Robin.K.Friedrich@USAHQ.UnitedSpaceAlliance.com  Mon Nov 17 22:40:15 1997
From: Robin.K.Friedrich@USAHQ.UnitedSpaceAlliance.com (Friedrich, Robin K)
Date: Mon, 17 Nov 1997 16:40:15 -0600
Subject: [DOC-SIG] doc strings
Message-ID: <c=US%a=_%p=United_Space_All%l=USAHOUM2-971117224015Z-35507@uhqmail1.unitedspacealliance.com>

To follow up on Ed's doc string comments, let me restate the latest
structured text proposal. I will not comment on an API for this
meta-information; that's a subject for another thread once we have agreed
to the contents of doc strings. This discussion follows from the
need for doc strings substantive enough to allow a reasonable user's
guide to be automatically generated with tools such as gendoc. 
That places it roughly in Guido's 4th category of doc string use.

For those who have not read the DOC SIG web page recently this posting
will try to encapsulate all current structured text capabilities as
working in gendoc 0.6 as well as some proposed enhancements.
[but please do read http://www.python.org/sigs/doc-sig/]


"""This is a one line description of the functionality.

A more detailed discussion of the object's function and purpose
may follow. Otherwise the author may choose to jump right into
the object's prototype information. I will identify optional key
text used for parsing in [bracketed] notation. As it stands now,
brackets have no special meaning. Subordinate paragraphs are
simply indented. Structured text currently has some implied rules
for determining what paragraphs are tagged as headers. (I for one
would like to see some more attention paid to that detail.) Any 
special characters like '<' for HTML need not be escaped in any
way as the renderer will be responsible for that.

	*Note* that "hyperlinks" are delimited by double quotes. In effect
	what this does is cause the string in the quotes to be saved off for
	further comparison to the URL lines at the end of the doc string. For
	an HTML rendering the quotes themselves would be removed and any
	quoted text which doesn't match exactly will be left alone. 
	
	Text on a single line surrounded by asterisks is tagged as
	'emphasis' (italic in HTML), while text wanting to be **loud** is
	surrounded by double asterisks and is tagged 'strong' (bold in
	"HTML" viewers).
	
	Class objects should only document the class interface and leave the
	method doc to those individual doc strings. (This is not a hard rule
	as I've seen many coders insist on placing everything in the class
	doc.)

The following is new structure to support identification of function
prototype information.

[Required] Argument[s]:
	arg1 -- String containing a source file name.
	arg2 -- String containing a target file name.

Optional Argument[s]:
	arg3 -- (-1) Integer defaulted to -1 as shown by the parentheses.
		Other text not having a double dash will be appended to the
		preceding line's string; otherwise it would signify a new def
		list item. Ed marked optional arguments with brackets. Since
		any argument with a default value is optional this may not be
		necessary.
	arg4 -- ('') String. This is identified as new list item without
		having to have a blank line separating them. For long lists
		this is important. Note also that single quotes indicate
		literal (code) text and great care must be taken in the 
		parser to get this right. We might discuss alternative
		notation for literal strings. 

Keyword Argument[s]:
	opt1 -- (1) Defaults for python keywords are implemented in code so
		they cannot be extracted from the function declaration.
	opt2 -- (None) These lists can get mighty long.

Return Value:
	Tuple pair (perigee, apogee).

Example Usage:
	Any line ending in a colon containing the string 'example' will flag
	the following indented paragraphs as preformatted code until
	indentation returns to the next leftward level. This is not
	structured text protocol currently but is just an idea. This differs
	from the single quote usage which is just intended for short literal
	text not spanning lines.

* Bulleted lists can appear at any level of indentation and can be
	identified by either a '*', 'o', or a '-' as the first nonwhite
	character on the line. 
	* Indented bullet paragraphs are rendered accordingly.
	* Bulleted list items need not be separated by blank lines.
	* This is another item that's not easy to parse as a paragraph may
	start with *emphasized text*. 

.. "hyperlinks" http://www.python.org/sigs/doc-sig/
.. "HTML" http://www.w3c.org/
"""

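A rough sketch of how a renderer might treat the inline conventions
described in the doc string above (emphasis, strong, hyperlinks, and
escaping of '<'). This is not gendoc's implementation; render_inline is
a hypothetical name, single-quoted literal text is deliberately left
out, and an opening asterisk followed by a space is assumed to be a
bullet rather than emphasis.

```python
import html
import re

# A trailing line of the form:  .. "name" URL
LINK_LINE = re.compile(r'^\.\. "([^"]+)" (\S+)$', re.MULTILINE)


def render_inline(doc):
    """Render the inline conventions of a doc string as HTML."""
    # Collect the link lines, then drop them from the body.
    links = dict(LINK_LINE.findall(doc))
    body = LINK_LINE.sub("", doc).strip()
    # The renderer, not the author, escapes characters such as '<'.
    text = html.escape(body, quote=False)
    # Strong is handled before emphasis so '**' is not split in two.
    text = re.sub(r"\*\*(\S[^*\n]*?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(\S[^*\n]*?)\*", r"<em>\1</em>", text)

    def link(match):
        name = match.group(1)
        if name in links:
            return '<a href="%s">%s</a>' % (links[name], name)
        return match.group(0)  # quoted text with no URL line is left alone

    return re.sub(r'"([^"\n]+)"', link, text)


sample = '''*Note* the "hyperlinks" and **loud** <tags>, but "others" stay.

.. "hyperlinks" http://www.python.org/sigs/doc-sig/
'''
print(render_inline(sample))
```
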


From janssen@parc.xerox.com  Mon Nov 17 21:43:07 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Mon, 17 Nov 1997 13:43:07 PST
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>
References: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>
Message-ID: <EoQ=df4B0KGW8pnO9D@holmes.parc.xerox.com>

Excerpts from ext.python: 15-Nov-97 [DOC-SIG] Library reference.. Guido
van Rossum@CNRI.Re (7617)

> it should be simple enough to rewrite
> the TIM-to-HTML converter in Python (maybe using HTMLgen?).

Probably make more sense to do a TIM-to-XML script in Python...

Excerpts from ext.python: 15-Nov-97 [DOC-SIG] Library reference.. Guido
van Rossum@CNRI.Re (7617)

> The other one is much hairier: conversion of the existing LaTeX source
> to TIM!

The first thing we'd need is a definition of the various markup terms we
wanted to use.  For example, is it better to say "\code" and assume
Python code, or "\python" so that we can also say "\C"?  Should we say
(as the lib sources currently do) "\code" for everything, or do we want
to distinguish the names of functions, say, from the names of modules by
using "\module" and "\function"?  Next, we could start by converting all
the strings of the form "\foo{" to "@foo{", which accomplishes a
remarkable amount of the work.  Then we'd need to replace various common
phrases like

	@renewcommand{@indexsubitem}{(<data,exception,...> in module <module>)}
	@begin{<data,func,exc>desc}{<name>}
	...
	@end{<name>}

with an appropriate TIM construct:

	@def{tp,fn,exc...} <name>
	@{tt,et,vt...}index <module>.<name> (<data,exception,...> in module <module>)
	...
	@end def{tp,fn,exc...}

Most of this can be accomplished with a few Emacs macros.
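
The first substitution could equally be scripted. A minimal sketch in
Python, assuming command names consist of ASCII letters only;
latex_to_tim_commands is a made-up name, and the sample line is
illustrative rather than taken from the library sources:

```python
import re


def latex_to_tim_commands(text):
    r"""Rewrite every '\name{' as '@name{'; everything else is untouched."""
    return re.sub(r"\\([A-Za-z]+)\{", r"@\1{", text)


sample = r"\begin{funcdesc}{match} Use \code{re.match}. \end{funcdesc}"
print(latex_to_tim_commands(sample))
# @begin{funcdesc}{match} Use @code{re.match}. @end{funcdesc}
```

Commands that are not followed by an opening brace are left alone by
this rule and would need a separate pass.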

> One final note: I looked at Perl's POD (Plain Old Documentation) for a
> few seconds.  It's more limited than TIM and uses physical markup
> (e.g. B<words in bold>), but has one feature that I like: a block of
> indented text offset by blank lines (I believe) is automatically
> interpreted as a code sample block (verbatim in LaTeX terms,
> @codeexample in TIM).  This makes POD source remarkably readable.  I
> presume that it would be trivial to add this to the TIM front-end.  (I
> particularly like this idea because it's the same convention that I
> used in the Python FAQ wizard. :-)

I'll put it on my list.
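
A sketch of the convention, not of the TIM front-end itself: a
paragraph set off by blank lines counts as a code sample when every one
of its lines is indented. The name split_blocks and the (kind, content)
pairs are assumptions made for this example.

```python
import re


def split_blocks(text):
    """Return (kind, content) pairs, kind being 'text' or 'code'."""
    blocks = []
    for chunk in re.split(r"\n\s*\n", text.strip("\n")):
        if not chunk.strip():
            continue
        lines = chunk.split("\n")
        # A paragraph is code only if every line starts with whitespace.
        indented = all(line[:1] in (" ", "\t") for line in lines)
        kind = "code" if indented else "text"
        if kind == "code" and blocks and blocks[-1][0] == "code":
            # A blank line inside a code sample does not end the sample.
            blocks[-1] = ("code", blocks[-1][1] + "\n\n" + chunk)
        else:
            blocks.append((kind, chunk))
    return blocks


sample = """Open a file like this:

    f = open("spam.txt")
    print(f.read())

Then close it."""

for kind, content in split_blocks(sample):
    print(kind, repr(content))
```
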


From janssen@parc.xerox.com  Mon Nov 17 21:56:53 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Mon, 17 Nov 1997 13:56:53 PST
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711161854.NAA21167@eric.CNRI.Reston.Va.US>
References: <9711161700.AA31749@arnold.image.ivab.se>
 <199711161854.NAA21167@eric.CNRI.Reston.Va.US>
Message-ID: <goQ=qZkB0KGW4pnOh0@holmes.parc.xerox.com>

Excerpts from ext.python: 16-Nov-97 Re: [DOC-SIG] Library refer.. Guido
van Rossum@CNRI.Re (3835)

> Note that one feature I like is that the LaTeX {funcdesc} environment
> automatically creates the index entry for the function; it combines it
> with some information provided earlier in the file:

>     \renewcommand{\indexsubitem}{(in module re)}

> I see that this is done manually in TIM (although I'm not sure why).

I didn't like the automatically-generated index terms (just "ORB_init"
-- I wanted "CORBA.ORB_init (Python LSR function)") that the default
rules for "deffn" provided, so I decided to explicitly specify what I
wanted in the index.  Perhaps the right way to fix this would have been
to extend the TIM front-end to understand an "index context string",
which would have been added to index entries automagically, but to give
us complete control over the entries I decided to use a separate line.
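
To make the "index context string" idea concrete, a hypothetical helper
(not part of TIM; the function name and its parameters are invented
here) could build the index term from the bare name plus a module and a
context string:

```python
def index_entry(name, module=None, context=None):
    """Build one index term from a name plus optional module and context."""
    term = "%s.%s" % (module, name) if module else name
    return "%s (%s)" % (term, context) if context else term


print(index_entry("ORB_init"))
# ORB_init
print(index_entry("ORB_init", module="CORBA", context="Python LSR function"))
# CORBA.ORB_init (Python LSR function)
```
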

Bill


From papresco@technologist.com  Tue Nov 18 11:46:24 1997
From: papresco@technologist.com (Paul Prescod)
Date: Tue, 18 Nov 1997 06:46:24 -0500
Subject: [DOC-SIG] Library reference SGML plan
References: <9711162009.AA20858@arnold.image.ivab.se>
Message-ID: <34718010.FCDB7B5F@technologist.com>

Before I start -- how is all of this going to play out with 1.5 and
updates and so forth? Is now the right time to do a documentation
changeover?

If so, let me propose a reorganization of Fredrik's steps:

1. Hack the conversion from TeX to PyLibRef-SGML. Don't worry about
DocBook yet. We'll see how far PyLibRefSGML is from DocBook once we've
got an SGML document. (I can do this, but not for a few days)

2. Write the DTD (or at least a first draft) based on our observations
from 1. Work towards DocBook compatibility if possible because of the
benefits of standardization and code reuse. (I can do this after I
finish step 1).

3. Do the conversions to Print and HTML for our "demo".

4. Write a "howto" document on the entire system.

As for these:

> 5. write an SGML to XML converter using Grail's SGMLparser
>    (in the meantime, we can use James Clark's "sx" tool)
> 6. write an XML parser (at least a tokenizer) that some day
>    could be included in the standard Python distribution (almost
>    done!)

I don't know that we need these two steps. I like XML, but it doesn't
seem relevant to the task at hand. Also, why don't we have a single
parser for sgml/xml? Then we're agnostic. (I'm defining SGML here as XML
plus a few minimizations).

> 7. write an XML to HTML tool based on (6) and a "Python style
>    sheet" (almost done!)

This step makes a lot of sense, but I would say that we could be
SGML/XML agnostic here too.

> 8. write an XML to PostScript tool based on (6), the printer
>    formatter from edroff, and PIL's PSDraw (or maybe we could
>    use html2ps?)

This is where I get worried. You'll have to tell me more about edroff to
convince me that we're not taking on a humungous job here. I can't find
anything about it on the Web! Even if edroff is the easiest tool in the
world to connect to, we should note that PostScript is not necessarily
the best print delivery format in the world. I much prefer to receive a
PDF or RTF file. With Jade, it seems we can deliver any of these -- PS,
PDF, RTF or TeX (not to mention MIF and whatever else someone adds
next). I'm reluctant to throw away the amazing job James has done
unifying those output formats so that support for any of them gives you
support for all of them. We would also be wasting the effort the
typesetting/wordprocessing tool vendors have put into line breaking
algorithms, etc. 

I'm not convinced that we should get hung up about the entire printing
process being in Python. If anyone doesn't want to install Jade, then
they can just view the HTML output for proofing purposes. Why should
every author's desktop be able to serve as a publishing hub?

Also, I would rather spend effort integrating Jade and Python to make a
more powerful, flexible publishing solution than currently exists,
rather than replacing Jade with a less powerful copycat written in a
hurry.

> Or maybe fuse 5 and 6.  But dealing with XML is much easier;
> an XML parser written in C could be added to Python without
> anyone noticing...

So could an SGML parser that does XML plus a few minimization
conventions.
 
 Paul Prescod



From papresco@technologist.com  Tue Nov 18 11:45:27 1997
From: papresco@technologist.com (Paul Prescod)
Date: Tue, 18 Nov 1997 06:45:27 -0500
Subject: [DOC-SIG] What I don't like about SGML
References: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
Message-ID: <34717FD7.386B724E@technologist.com>

Guido van Rossum wrote:
> 
> First, while SGML may have been standardized in the swinging '80s, it
> definitely has its roots in the '70s -- it takes many years to become
> an international standard (look at C++!), and it started its life, as
> "GML", long before standardization started.  Undoubtedly some of the
> worse features in SGML were designed to be backwards compatible
> (again, very much like C++...).

I don't doubt that SGML has some backwards compatible features, but it
is *not* backwards compatible with GML. The backwards compatibility
features mostly exist for people who think that something like TIM is
the greatest thing in the world and want to remake SGML in its image. 

Anyhow TeX, and thus TeXInfo and thus TIM also have their "roots in the
70s." Big deal. As far as I'm concerned, Python has its roots in the 70s
too.
 
> 99.9% of the time, HTML is parsed by relatively simple handwritten
> parsers, not by generic SGML scanners.  There are lots of programs out
> there that have to parse HTML -- preprocessors, web browsers, web
> spiders, etc.  Why don't these just link to an existing SGML scanner?
> Because SGML scanners are *huge*.  They need to be big to scan generic
> SGML, which is a very complex language.  But most of this power isn't
> needed to scan HTML, so people roll their own parser.

That's true. That's why we should stick to an SGML subset. I propose XML
+ minimizations.
 
> But Berners-Lee made one mistake: he made HTML look a bit like SGML
> (which he had seen once or twice from a distance :-).  

Berners-Lee's only mistake is that he didn't research SGML enough before
making HTML so that he had a lot of trouble bringing it back into the
SGML fold later.

> Almost
> immediately HTML was targeted by the SGML lobby for full compliance.

This is not true. Dan Connolly was the first person to propose an SGML
DTD for HTML. He is hardly in the "SGML Lobby" (talk to him about it
sometime, he has plenty of complaints about SGML) and the SGMLization of
HTML happened long before the SGML lobby really even understood the web.
Tim *hired* Dan to work with W3C and complete the work. In other words,
SGML was always Tim's idea. It goes back at least as far as 1993.

http://www.w3.org/MarkUp/draft-ietf-iiir-html-01.txt

I don't know about you, but I don't recall there being much of a web to
"lobby" in 1993. Face it, Tim and Dan thought SGML was neat and they
implemented it. They have had a love/hate relationship ever since (as do
many people) but they have been moving towards SGML at every step (cf.
XML).

> Here's what was added; all of this made my parser much more
> complicated than I think it ought to be (look at how complicated
> sgmllib.py is).  Note that most of what was added doesn't add
> functionality.  In one or two cases it even takes away functionality!

I feel that there is an important point you are missing. SGML offers
lots of extra functionality beyond what HTML takes advantage of. If the
browser vendors (esp. Netscape) had not been explicitly SGML-hostile
(sound familiar?), the web would be much further ahead. But they have
fought tooth and nail to keep the useful features out.

> It just complicates the scanning process in order to be compatible
> with the extremely complicated scanning rules designed for SGML on
> punched cards in the 70s.

I don't know where you get this "punch card" stuff. GML was invented at
about the same time as C and UNIX, and after Simula 67. Goldfarb
invented it to be part of an *interactive document database system*.
Anyhow, this 70s/90s thing is only interesting if we've learned a lot
about markup in the intervening 20 years. This doesn't seem to be the
case. TeXInfo, HTML and TIM really didn't introduce anything special
that SGML lacks. It seems the only thing we have learned since the
standardization of SGML is that some of its features are not as
important as we thought they would be. Fair enough -- let's not use them.
 
>     - A second special character '&' for entity references (original HTML
>     used <lt> to escape "<").

Big deal. Different markup for different things. Entity references can
go in attribute values and element content. They are NOT structural
sub-elements and should not be confused with them.

>     - Character references like &#32; or &#SPACE;.

How else are you going to include a Unicode character by number or name?
Are you going to claim that this isn't an "increase in functionality?"
If you need to input a greek character you might disagree.
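
As an illustration of what numeric references buy you, Python's
standard html module can resolve them. This uses the HTML rules built
into that module rather than a general SGML parser, so the named form
&#SPACE; is not covered here:

```python
import html

# Decimal character references resolve to the character with that number.
print(repr(html.unescape("&#32;")))     # ' '
print(html.unescape("&#945;&#946;"))    # αβ
# Named entity references resolve through a table of names.
print(html.unescape("&alpha;"))         # α
```
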
 
>     - Comments in the form of <!--.....-->, truly the most atrocious
>     comment convention invented (and I believe it's worse -- officially,
>     "--" may not occur inside a comment but "-- --" may, or something like
>     that; but who cares, as almost no handwritten parser seems to get this
>     right).

Comments could be simpler and smaller, but it really doesn't seem like a
big deal to me.

>     - Special stuff to be ignored, starting with <!...>, where it is
>     tricky to determine what the end is (since sometimes "<" or ">" may
>     occur inside.

"<" or ">" can only occur inside *in quotes*. This is like complaining
that the following Python statement is confusing because of the two
colons:

if a=="j:b":

Big deal -- string literal context is different from program context (or
markup context, in SGML).
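
A minimal sketch of the quoting rule, assuming a declaration with no
bracketed subset and no embedded comments. The name
find_declaration_end and the sample declaration are made up for the
example:

```python
def find_declaration_end(text, start=0):
    """Return the index just past the '>' closing the declaration.

    A '>' inside a quoted literal does not count.  Returns -1 if the
    declaration is never closed.
    """
    quote = None
    for i in range(start, len(text)):
        ch = text[i]
        if quote:
            if ch == quote:
                quote = None          # closing quote: back to markup
        elif ch in "\"'":
            quote = ch                # opening quote: enter literal
        elif ch == ">":
            return i + 1
    return -1


decl = '<!ENTITY arrow "->"> trailing text'
end = find_declaration_end(decl)
print(decl[:end])   # <!ENTITY arrow "->">
```
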

>     - Special stuff to be ignored, starting with <?...>.

What's so hard or complicated about that?
 
>     - Short tags, <word/.../, which are still mostly outlawed because of
>     compatibility reasons with older HTML processors, but which have to be
>     recognized if you want to claim the elusive "full compliance".

Obviously sgmllib.py will never have full SGML compliance. Presumably
the reason you implemented those short cuts is actually because they are
useful and convenient.

I feel that your negative feelings about a particular process have
spilled over onto SGML. If the browser vendors had done their job
correctly in the first place, these short cuts would be allowed, would
always have been allowed, and would be usable today. You can hardly
blame their SGML-noncompliance on SGML! I might as well blame a
particular Unix's POSIX incompatibilities on Unix!

>     - It is not possible to turn off processing completely.  There used to
>     be an HTML tag <LISTING> (?) which switched to literal copying of the
>     text until </LISTING> was found.  This is impossible to do in SGML --
>     the best you can do is to switch to literal mode until </ followed by

That is not true. The *DTD* cannot turn off processing completely as
with the LISTING tag. The *author* can turn off processing completely
with a marked section:

<![CDATA[
<<<<>>>>><<<<<>>>>>&&&&&&
]]>

The end of the marked section is indicated by "]]>." But this is going
to be VERY rarely required in Python documentation. The only Python code
that has a </ in it is code talking explicitly about SGML. So once in
every 30 listings, you'll have to use the syntax above. Note that this
syntax is one of the things that the HTML browsers have neglected to
implement, although it is VERY important as you point out. Don't blame
SGML, blame them.
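
XML kept the CDATA marked section, so the effect can be tried with the
XML parser in Python's standard library. This is a stand-in for a full
SGML parser, and the element name "listing" is arbitrary:

```python
import xml.etree.ElementTree as ET

# Everything between <![CDATA[ and ]]> is passed through untouched.
source = "<listing><![CDATA[<<<<>>>>><<<<<>>>>>&&&&&&]]></listing>"
element = ET.fromstring(source)
print(element.text)   # <<<<>>>>><<<<<>>>>>&&&&&&
```
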

>     a letter is seen, and you can't turn off &ref; processing either.

That isn't true. You can turn that type of processing off using either a
CDATA content element or a CDATA marked section.

>     - Why do I have to put quotes around the URL in <A
>     HREF="http://www.python.org"> ???

Attribute values are string literals, just like in Python. You put them
in quotes to differentiate them from the surrounding whitespace, markup
delimiters, etc.
 
>     - Other restrictions on what you can do with attributes; apparently
>     there's a semantic rule that says that if two unrelated tags have an
>     attribute with the same name, it must have the same "type".

That isn't true.

>     - A content model, which nobody asked for, and which few people check
>     for, but which still allows HTML purists to tell you that your HTML
>     page is "non-conformant" when you place an <H4> heading inside a <LI>
>     list item (okay, so I made that up).

I must admit, I'm shocked to hear you say that. It was exactly *for* the
content model that Tim Berners-Lee and Dan Connolly moved HTML to be an
SGML document type. Please tell me what Grail should do with this
document:

<HTML>
<H1>Here's a rather STRANGE HTMLish DOCUMENT</H1>
<TITLE>This is a title</TITLE>
<TITLE>This is another title</TITLE>
<TITLE>This is a third</TITLE>
<TITLE>Strange to have so many!</TITLE>
<TITLE>But without a content model</TITLE>
<TITLE>This is perfectly legal</TITLE>

<TABLE><LI><TD><TR>Here's a rather odd table</TD></TR></LI>
<P>Curiouser and Curiouser
</TABLE>
</HTML>

Without the concept of a content model, this is a perfectly legal
document, and Grail would have to handle it and do something reasonable
with it (what's the title of this document? what does the table
structure look like?) Without DTDs and content models, you have no basis
for an information system. The fact that HTML authors ignore SGML rules
is a sad commentary on the Web, not on HTML. Those who are building the
web today -- browser vendors and standardizers alike, have asked that
XML be extra strict because they recognize that the current HTML
situation is a mess *in spite of* SGML's strictures (and *because of*
widespread SGML ignorance).
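
To show what checking against a content model amounts to, here is a toy
validator. It is a sketch only: the models below are invented for the
example and are far simpler than the HTML DTD, each model is written as
a regular expression over the names of an element's children, and the
input must be well-formed enough for an XML parser.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical content models: element name -> pattern that the
# space-separated list of its child element names must match.
CONTENT_MODELS = {
    "html": r"title( (h1|p|ul|table))*",
    "ul": r"li( li)*",
    "table": r"tr( tr)*",
    "tr": r"td( td)*",
}


def validate(element, errors=None):
    """Collect a message for every element whose children break its model."""
    if errors is None:
        errors = []
    model = CONTENT_MODELS.get(element.tag)
    children = " ".join(child.tag for child in element)
    if model is not None and not re.fullmatch(model, children):
        errors.append("<%s> may not contain: %s" % (element.tag, children))
    for child in element:
        validate(child, errors)
    return errors


good = ("<html><title>t</title><h1>x</h1>"
        "<table><tr><td>1</td></tr></table></html>")
bad = ("<html><h1>x</h1><title>a</title><title>b</title>"
       "<table><li>odd</li></table></html>")

print(validate(ET.fromstring(good)))   # []
for message in validate(ET.fromstring(bad)):
    print(message)
# <html> may not contain: h1 title title table
# <table> may not contain: li
```
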

If you think it is reasonable to put H4s in LIs, then talk to Dan
Connolly. He can make it possible (in consultation with W3C members). If
you want to make it possible to put ANY element in ANY other element, he
could make that possible too. SGML can allow anything anywhere just like
TIM or LaTeX. But he wouldn't -- he knows that constraints on element
occurrences are crucial. Removing them would be akin to asking Python
parsers to handle any random combination of operators and delimiters:

if ( def a(): class b(): pass )
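
The comparison is easy to check with the built-in compile() function.
In this small sketch is_valid_python is a made-up helper name:

```python
def is_valid_python(source):
    """Report whether the compiler accepts the source text."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# The grammar rejects statements nested where an expression is expected.
print(is_valid_python("if ( def a(): class b(): pass )"))   # False
# A colon inside a string literal is no problem at all.
print(is_valid_python("if a == 'j:b': pass"))               # True
```
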

>     - Probably a few other things that nobody asked for, such as the
>     DTD declaration and SGML's approach to character sets (which is
>     probably broken -- I believe there is a way to switch character
>     sets in mid-stream...).

The DTD is an important part of the documentation for HTML and also an
important implementation tool for many vendors. I don't know what your
problem is with it.

I don't know that SGML's approach to character sets is broken. Could you
be more specific? And perhaps you could describe how TIM's "approach to
character sets" is superior.
 
> So my claim remains that the requirement of SGML conformance is for
> 99% just a nuisance for parser writers.  Of course I'm biased, since
> I'm a parser writer myself...  So see for yourself what you think of
> this argument.

Of course compliance with any standard is a nuisance. It is always
easier to hack up what you need as you go along. Because of powerful
anti-SGML politics, HTML never took advantage of much of SGML's power.
For instance one of SGML's most basic facilities is the ability to reuse
content in the same document or across documents. But HTML can't do it.
Blame the browser vendors.

Most of the points in your flame seem to me to be more of an indictment
of anti-SGML bias than of SGML itself. It is as if someone tried out the
famed Posix compatibility mode in NT and then claimed that Unix was
broken based on it. Obviously that environment is not a true reflection
of Unix itself, because its creators were not trying to allow access to
the power of Unix. HTML was supposed to allow access to the power of
SGML, but then Marc took over the web and forward progress ground to a
halt in favour of <BLINK> and <CENTER>.

 Paul Prescod



From guido@CNRI.Reston.Va.US  Tue Nov 18 14:48:33 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Tue, 18 Nov 1997 09:48:33 -0500
Subject: [DOC-SIG] What I don't like about SGML
In-Reply-To: Your message of "Tue, 18 Nov 1997 06:45:27 EST."
 <34717FD7.386B724E@technologist.com>
References: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
 <34717FD7.386B724E@technologist.com>
Message-ID: <199711181448.JAA24109@eric.CNRI.Reston.Va.US>

I think your attitude towards the web is just as unproductive as you
think mine towards SGML is.  Let's get on with it.

> The fact that HTML authors ignore SGML rules
> is a sad commentary on the Web, not on HTML.

I think you have a lot to learn.  The customer is always right.  I
would never say anything like that of Python users.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Tue Nov 18 14:52:42 1997
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 18 Nov 1997 09:52:42 -0500
Subject: [DOC-SIG] What I don't like about SGML
In-Reply-To: <34717FD7.386B724E@technologist.com>
References: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
 <34717FD7.386B724E@technologist.com>
Message-ID: <199711181452.JAA04006@weyr.cnri.reston.va.us>


Paul Prescod writes:
 > "<" or ">" can only occur inside *in quotes*. This is like complaining
 > that the following Python statement is confusing because of the two

Paul,
  I think Guido is referring to the markup declaration subset:

<!doctype blat "fooplace" [
  <!element ... - o empty>
  ]>

  Material like this is hard to deal with in off-the-cuff parsers such 
as the ones in Grail and sgmllib.  This is more of a problem of cheap
implementation than anything else; standards compliance is rare in
that environment.
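
A sketch of what even a cheap scanner has to track to skip such a
declaration: quoted literals and the bracketed subset. This is not the
code in Grail or sgmllib; find_doctype_end is a hypothetical name, and
comments inside the subset are not handled.

```python
def find_doctype_end(text, start=0):
    """Return the index just past the '>' that closes the declaration.

    '>' characters inside quoted literals or inside the bracketed
    declaration subset are skipped.  Returns -1 if no end is found.
    """
    quote = None
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if quote:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "[":
            depth += 1                # entering the declaration subset
        elif ch == "]":
            depth -= 1                # leaving it again
        elif ch == ">" and depth == 0:
            return i + 1
    return -1


doc = '<!doctype blat "fooplace" [\n  <!element ... - o empty>\n  ]>\n<blat>'
end = find_doctype_end(doc)
print(doc[end:].strip())   # <blat>
```
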

 > SGML document type. Please tell me what Grail should do with this
 > document:

  [devious example elided]

  Roll over and play dead.  But not gracefully.  Even I think that the 
"what Grail should do" aspect of this is irrelevant; Grail is
targeted primarily toward the Web that's deployed, and only
secondarily to make my life tolerable.  I really don't expect many
people will use the "strict mode", esp. since only a few things are
improved and pages are shown to be broken.


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From guido@CNRI.Reston.Va.US  Tue Nov 18 15:08:35 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Tue, 18 Nov 1997 10:08:35 -0500
Subject: [DOC-SIG] Library reference SGML plan
In-Reply-To: Your message of "Tue, 18 Nov 1997 06:46:24 EST."
 <34718010.FCDB7B5F@technologist.com>
References: <9711162009.AA20858@arnold.image.ivab.se>
 <34718010.FCDB7B5F@technologist.com>
Message-ID: <199711181508.KAA24186@eric.CNRI.Reston.Va.US>

> Before I start -- how is all of this going to play out with 1.5 and
> updates and so forth. Is now the right time to do a documentation
> changeover?

I think it's too late to get it all sorted out by the end of the year,
which is the planned release date for the final version of Python 1.5.
But then you might surprise me...!  Anyway I don't think there's a
need to synchronize the documentation effort that closely with the
source release effort -- I expect that more and more people get the
documentation (in PostScript or HTML) off the web site, so it's okay
if the documentation for 1.5 improves after 1.5 is out...!

Note that *if* you start doing the conversion based on the 1.5a4
release, you better be prepared to re-convert some documents that have
been modified (or added) since then.  Fred knows where they are.

> 4. Write a "howto" document on the entire system.

I think this may in the end be the most important document of all.
Very few people are capable of reverse engineering the rules properly
from the appearance of the printed documentation...!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Tue Nov 18 15:14:27 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 18 Nov 1997 16:14:27 +0100
Subject: [DOC-SIG] What I don't like about SGML
Message-ID: <01bcf434$a9317a00$6fadb4c1@fl-pc.image.ivab.se>

>The fact that HTML authors ignore SGML rules is a sad commentary on
>the Web, not on HTML.

You mean "HTML editors", don't you?  I don't know how many times
I've seen Tim B-L say that the whole idea behind the Web is that
*anyone* could publish their stuff.  You cannot really blame the millions
of people that have done so for not reading a standardization document
that's not even freely available on the Web...

Cheers /F



From papresco@technologist.com  Tue Nov 18 16:02:11 1997
From: papresco@technologist.com (Paul Prescod)
Date: Tue, 18 Nov 1997 11:02:11 -0500
Subject: [DOC-SIG] What I don't like about SGML
References: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
 <34717FD7.386B724E@technologist.com> <199711181448.JAA24109@eric.CNRI.Reston.Va.US>
Message-ID: <3471BC03.DC4CAA43@technologist.com>

Guido van Rossum wrote:
> 
> > The fact that HTML authors ignore SGML rules
> > is a sad commentary on the Web, not on HTML.
> 
> I think you have a lot to learn.  The customer is always right.  

This has very little to do with the customers. The customers want to get
it right. The browser vendors have consistently undermined them in this
effort by silently accepting wrong documents without even providing a
*mode* that would give a hint that their documents are broken. 

Most HTML users are *amazed* when they are told how many mistakes there
are in their documents. They will typically respond: "But I saw that
broken construct on Netscape's web site!"

> I  would never say anything like that of Python users.

I didn't say anything bad about HTML users. I said something bad about
the Web as an information system. As a parser author, you know as well
as I do how broken it is in the area of document consistency.

If Python didn't give error messages when scripts were broken then the
state of the Python source base would similarly be a horrible mess. SGML
defined a concept of validity to prevent systems from getting into this
state. Browser vendors ignored that and have put themselves in a living
hell of ad hoc parser writing (so much so that they have inconvenienced
everybody by making XML over-strict to compensate).

Fredrik said:
> You mean "HTML editors", don't you?  

I think the browsers are more culpable. We've known for several years
that most HTML authors would not use editors and would use browsers as
their primary validation tool. You're right that HTML editors often
create bad HTML, however. They share the blame.

> You cannot really blame the millions
> of people that have done so for not reading a standardization document
> that's not even freely available on the Web...

The HTML spec is on the web and most documents do not conform with it.
It has been a relatively "standalone" spec. for quite a while.

Fred said:
>  Roll over and play dead.  But not gracefully.  Even I think that the 
> "what Grail should do" aspect of this is irrelevant; Grail is
> targeted primarily toward the Web that's deployed, and only
> secondarily to make my life tolerable. 

Could you clarify what you are saying here? Are you arguing that Grail
already has to handle any tag in any place so the concept of standardizing
them in a DTD is a waste of time? That even bothering to have a concept
of "correct HTML" isn't worth the effort? If the web is going to be an
information system then documents must conform to some minimum standard
of consistency. SGML seems to me the best tool to describe that
standard. Grail was just an example tool that makes up part of the
information system.

 Paul Prescod

From lemburg@uni-duesseldorf.de  Tue Nov 18 18:42:36 1997
From: lemburg@uni-duesseldorf.de (M.-A. Lemburg)
Date: Tue, 18 Nov 1997 19:42:36 +0100
Subject: [DOC-SIG] Library reference SGML plan
References: <9711162009.AA20858@arnold.image.ivab.se> <34718010.FCDB7B5F@technologist.com> <199711181508.KAA24186@eric.CNRI.Reston.Va.US>
Message-ID: <3471E19C.77A53967@uni-duesseldorf.de>

Guido van Rossum wrote:
> I expect that more and more people get the
> documentation (in PostScript or HTML) off the web site,...

Which is very convenient :) Say, wouldn't it make sense to
break the distribution into a program source and a documentation
source part ?! The preformatted versions of the docs are just so much
simpler to install and use than the source versions. I mean
honestly: how many users do really want to get latex or jade
running .before. seeing any of the neat manuals explaining
Python ? Getting Python to install is a piece of cake
compared to that...

-- 
Marc-Andre Lemburg


From guido@CNRI.Reston.Va.US  Tue Nov 18 19:02:31 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Tue, 18 Nov 1997 14:02:31 -0500
Subject: [DOC-SIG] Library reference SGML plan
In-Reply-To: Your message of "Tue, 18 Nov 1997 19:42:36 +0100."
 <3471E19C.77A53967@uni-duesseldorf.de>
References: <9711162009.AA20858@arnold.image.ivab.se> <34718010.FCDB7B5F@technologist.com> <199711181508.KAA24186@eric.CNRI.Reston.Va.US>
 <3471E19C.77A53967@uni-duesseldorf.de>
Message-ID: <199711181902.OAA24978@eric.CNRI.Reston.Va.US>

[me]
> > I expect that more and more people get the
> > documentation (in PostScript or HTML) off the web site,...

[MA Lemburg]
> Which is very convenient :) Say, wouldn't it make sense to
> break the distribution into a program source and a documentation
> source part ?! The preformatted versions of the docs are just so much
> simpler to install and use than the source versions. I mean
> honestly: how many users do really want to get latex or jade
> running .before. seeing any of the neat manuals explaining
> Python ? Getting Python to install is a piece of cake
> compared to that...

Many packages these days come with preformatted HTML as the only
documentation, and it's mighty convenient.  I could do this, and
create separate bundles for the PostScript and latex.  Unfortunately,
it's not a space saver: e.g. the library manual HTML, tarred and
gzipped, is about 840K, while the latex is less than 240K (also tarred
and gzipped).  Take similar expansion factors for the other manual
pages, and you may see the tarred, gzipped Python distribution grow
from 5.4 Meg to 6.5 Meg -- after unzipping it will probably add 2 or 3
Meg.  Is this acceptable?  (On the other hand, it's easy enough for
most people to download the HTML separately if they need it, and it
*is* called a source distribution...)
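
As a rough check of the figures above, here is a small sketch in
Python.  The variable names are illustrative only, and since the latex
is "less than 240K" the real expansion factor is at least the one
computed here:

```python
# Figures quoted in the message above; names are illustrative only.
html_kb = 840.0     # library manual HTML, tarred and gzipped
latex_kb = 240.0    # latex source ("less than 240K", so an upper bound)

expansion = html_kb / latex_kb        # HTML is at least this many times larger

dist_before_mb = 5.4                  # current tarred, gzipped distribution
dist_after_mb = 6.5                   # estimate with HTML for all manuals
growth_mb = dist_after_mb - dist_before_mb
growth_pct = 100.0 * growth_mb / dist_before_mb

print("expansion factor: %.1f" % expansion)
print("download growth:  %.1f Meg (about %.0f%%)" % (growth_mb, growth_pct))
```

That works out to an expansion factor of 3.5 and a download that is
about 1.1 Meg, or roughly 20%, larger.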

--Guido van Rossum (home page: http://www.python.org/~guido/)


From richardf@redbox.net  Tue Nov 18 20:07:14 1997
From: richardf@redbox.net (Richard Folwell)
Date: Tue, 18 Nov 1997 20:07:14 -0000
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
Message-ID: <01BCF45D.921A1510.richardf@redbox.net>

On Tuesday, November 18, 1997 7:03 PM, Guido van Rossum 
[SMTP:guido@CNRI.Reston.Va.US] wrote:
> Many packages these days come with preformatted HTML as the only
> documentation, and it's mighty convenient.

I agree.  We do this.  We do also provide PDF, mainly because it is a better 
format if you want printed output.

> I could do this, and
> create separate bundles for the PostScript and latex.  Unfortunately,
> it's not a space saver: e.g. the library manual HTML, tarred and
> gzipped, is about 840K, while the latex is less than 240K (also tarred
> and gzipped).

I do not think that this sort of size increase is an issue nowadays, so long as 
there is an optional way of getting the distribution without it.

> Take similar expansion factors for the other manual
> pages, and you may see the tarred, gzipped Python distribution grow
> from 5.4 Meg to 6.5 Meg -- after unzipping it will probably add 2 or 3
> Meg.  Is this acceptable?

With current hard disk prices, I think that this is completely acceptable.

Richard Folwell

RedBox Technologies, The Media Router Company, http://www.redbox.net
Email: richardf@redbox.net
Voice: +44 181 585 8565             Fax: +44 181 585 8665


From janssen@parc.xerox.com  Tue Nov 18 23:20:03 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Tue, 18 Nov 1997 15:20:03 PST
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
In-Reply-To: <01BCF45D.921A1510.richardf@redbox.net>
References: <01BCF45D.921A1510.richardf@redbox.net>
Message-ID: <goQW_X8B0KGWQpnPgW@holmes.parc.xerox.com>

Excerpts from ext.python: 18-Nov-97 RE: [DOC-SIG] Library refer..
Richard Folwell@redbox.n (1366*)

> I agree.  We do this.  We do also provide PDF, mainly because it is a better 
> format if you want printed output.

Yes, I'd rather have PDF instead of Postscript:  it has a good model of
page-formatted documents, as Postscript does; it supports links, which
Postscript doesn't; it allows cut and paste (well, mainly just cut)
operations on text in the document, which Postscript doesn't;
`ghostscript' works quite well on it, as with Postscript.

Bill

From fdrake@acm.org  Tue Nov 18 23:34:18 1997
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 18 Nov 1997 18:34:18 -0500
Subject: [DOC-SIG] Re: Python Documentation
In-Reply-To: <64snke$k8c@netaxs.com>
References: <3469C0FA.B1123B09@technologist.com>
 <64cq92$dpk@netaxs.com>
 <87en4inu83.fsf@cyteen.Helsinki.FI>
 <64nqq1$ht9@netaxs.com>
 <87k9e7yxud.fsf@cyteen.Helsinki.FI>
 <64snke$k8c@netaxs.com>
Message-ID: <199711182334.SAA11535@weyr.cnri.reston.va.us>

Michael W. Ryan writes:
 > I keep repeating parts of this.  Please keep the following things in mind:

  OK, I'll bite.  I just pulled down 0.99.20 and read the latest
incarnation of the DTD.  I'll try and explain why I don't think the
SGML-Tools suite is good for the project at hand (the Python Library
Reference).

 > 1) SGML-Tools isn't just for documenting Linux.  It's designed for doing
 > technical documentation.

  This was never an issue.

 > 2) SGML-Tools isn't just a DTD.  It's an entire set of tools.

  Yes, but they are still tightly tied to the DTD; I followed the list 
for a long time and the basic assumptions being made just didn't seem
to scale.  Perhaps this has changed, but the cast of characters
appears to have remained quite stable since I last followed the
mailing list.

 > 3) There has been discussion of structuring the package such that other
 > DTDs can be used.

  Some discussion?  Other tools have been designed for this from the
start; that counts quite a bit in my book.

 > 4) SGML-Tools has undergone a lot of changes over the original
 > Linuxdoc-SGML.  These changes include a more logical style of markup and
 > becoming more markup-driven as opposed to backend-driven (which was an

  I agree, it has improved.  But it's very basic and doesn't support
the highly structured information in the Python Library Reference.
Sure, it could be extended to do so.  I think using an existing,
tested, and supported DTD provides significant benefits over having to 
(largely) start over.
  I'm not knocking what the SGML-Tools crew has done, I just don't
think it's as far along as it could be, or as other tools are.
  Please direct follow-ups to doc-sig@python.org, as the primary
discussion has moved to that forum.


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434

From mryan@netaxs.com  Wed Nov 19 02:08:07 1997
From: mryan@netaxs.com (Michael W. Ryan)
Date: Tue, 18 Nov 1997 21:08:07 -0500 (EST)
Subject: [DOC-SIG] Re: Python Documentation
In-Reply-To: <199711182334.SAA11535@weyr.cnri.reston.va.us>
Message-ID: <Pine.LNX.3.95.971118210409.286A-100000@luinil.netaxs.com>

On Tue, 18 Nov 1997, Fred L. Drake wrote:

>   OK, I'll bite.  I just pulled down 0.99.20 and read the latest
> incarnation of the DTD.  I'll try and explain why I don't think the
> SGML-Tools suite is good for the project at hand (the Python Library
> Reference).

These are arguments against that I can accept.

>  > 3) There has been discussion of structuring the package such that other
>  > DTDs can be used.
> 
>   Some discussion?  Other tools have been designed for this from the
> start; that counts quite a bit in my book.

That's a good point.  I don't object to those other tools as long as they
are accessible and reasonably easy to use.

>   I agree, it has improved.  But it's very basic and doesn't support
> the highly structured information in the Python Library Reference.
> Sure, it could be extended to do so.  I think using an existing,
> tested, and supported DTD provides significant benefits over having to 
> (largely) start over.
>   I'm not knocking what the SGML-Tools crew has done, I just don't
> think it's as far along as it could be, or as other tools are.
>   Please direct follow-ups to doc-sig@python.org, as the primary
> discussion has moved to that forum.

If you're on the doc-sig list and want to reply to me directly, please
email me.  I'm not on the sig list.  I'm thinking of joining, but it
depends on the average volume.

Michael W. Ryan
Email:  mryan@netaxs.com           WWW:  http://www.netaxs.com/~mryan/

PGP fingerprint: 7B E5 75 7F 24 EE 19 35  A5 DF C3 45 27 B5 DB DF
PGP public key available by fingering mryan@unix.netaxs.com (use -l opt)


From guido@CNRI.Reston.Va.US  Wed Nov 19 04:36:27 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Tue, 18 Nov 1997 23:36:27 -0500
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
In-Reply-To: Your message of "Tue, 18 Nov 1997 15:20:03 PST."
 <goQW_X8B0KGWQpnPgW@holmes.parc.xerox.com>
References: <01BCF45D.921A1510.richardf@redbox.net>
 <goQW_X8B0KGWQpnPgW@holmes.parc.xerox.com>
Message-ID: <199711190436.XAA25794@eric.CNRI.Reston.Va.US>

> From: Bill Janssen <janssen@parc.xerox.com>

> Yes, I'd rather have PDF instead of Postscript:  it has a good model of
> page-formatted documents, as Postscript does; it supports links, which
> Postscript doesn't; it allows cut and paste (well, mainly just cut)
> operations on text in the document, which Postscript doesn't;
> `ghostscript' works quite well on it, as with Postscript.

I'm in an inflammatory mood this week.  It must be my hormones :-)

While I agree that PDF has advantages over Postscript, what drives me
crazy is people putting up documents on the web in PDF format for
online viewing.  What's wrong with this?  It puts PAGE BREAKS in the
middle of paragraphs.  Page breaks are a print media feature.  I don't
want page breaks in a document I'm viewing on line -- the printed paper
page size has no relationship to my screen.

Please use PDF for printable files -- not for on line documents!

--Guido van Rossum (home page: http://www.python.org/~guido/)

From janssen@parc.xerox.com  Wed Nov 19 05:37:20 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Tue, 18 Nov 1997 21:37:20 PST
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
In-Reply-To: <199711190436.XAA25794@eric.CNRI.Reston.Va.US>
References: <01BCF45D.921A1510.richardf@redbox.net>
 <goQW_X8B0KGWQpnPgW@holmes.parc.xerox.com>
 <199711190436.XAA25794@eric.CNRI.Reston.Va.US>
Message-ID: <YoQbgE4B0KGWApnSA2@holmes.parc.xerox.com>

Excerpts from ext.python: 18-Nov-97 Re: [DOC-SIG] Library refer.. Guido
van Rossum@CNRI.Re (1146)

> While I agree that PDF has advantages over Postscript, what drives me
> crazy is people putting up documents on the web in PDF format for
> online viewing.  What's wrong with this?  It puts PAGE BREAKS in the
> middle of paragraphs.  Page breaks are a print media feature.  I don't
> want page breaks in a document I'm viewing on line -- the printed paper
> page size has no relationship to my screen.

While I agree with this in spirit, there is an incredibly long and
illustrious history of page-oriented rhetorical science (art?).  PDF is
a good Net way to make use of this science, though I agree that it's
irritating.  I'd rather have things adapt to the size and shape I give
them, if it doesn't matter to the author.

Reminds me again of the Halasz split between "card sharks" (people who
like card (and page) oriented hypertext) and "holy scrollers" (people
who like endless-scrolled-document style hypertext).  HTML has made us
all holy scrollers whether we like it or not...

Bill

From guido@CNRI.Reston.Va.US  Wed Nov 19 05:49:56 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Wed, 19 Nov 1997 00:49:56 -0500
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
In-Reply-To: Your message of "Tue, 18 Nov 1997 21:37:20 PST."
 <YoQbgE4B0KGWApnSA2@holmes.parc.xerox.com>
References: <01BCF45D.921A1510.richardf@redbox.net> <goQW_X8B0KGWQpnPgW@holmes.parc.xerox.com> <199711190436.XAA25794@eric.CNRI.Reston.Va.US>
 <YoQbgE4B0KGWApnSA2@holmes.parc.xerox.com>
Message-ID: <199711190549.AAA26189@eric.CNRI.Reston.Va.US>

[me]
> > While I agree that PDF has advantages over Postscript, what drives me
> > crazy is people putting up documents on the web in PDF format for
> > online viewing.  What's wrong with this?  It puts PAGE BREAKS in the
> > middle of paragraphs.  Page breaks are a print media feature.  I don't
> > want page breaks in a document I'm viewing on line -- the printed paper
> > page size has no relationship to my screen.

[Bill]
> While I agree with this in spirit, there is an incredibly long and
> illustrious history of page-oriented rhetorical science (art?).  PDF is
> a good Net way to make use of this science, though I agree that it's
> irritating.  I'd rather have things adapt to the size and shape I give
> them, if it doesn't matter to the author.
> 
> Reminds me again of the Halasz split between "card sharks" (people who
> like card (and page) oriented hypertext) and "holy scrollers" (people
> who like endless-scrolled-document style hypertext).  HTML has made us
> all holy scrollers whether we like it or not...

I'm not a big fan of endless scrolling (in fact this is at the moment
my main gripe with my own Python website :-) but if I get separations
I want the separations to be semantically meaningful.  Page oriented
hypertext presumably does this.  The typical PDF document however is
created by using a wordprocessor to produce an endless scrolling
document, automatically producing a page lay-out for a particular
paper size, and then extracting PDF...

--Guido van Rossum (home page: http://www.python.org/~guido/)

From richardf@redbox.net  Wed Nov 19 09:25:21 1997
From: richardf@redbox.net (Richard Folwell)
Date: Wed, 19 Nov 1997 09:25:21 -0000
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
Message-ID: <01BCF4CD.107E7FD0.richardf@redbox.net>

On Wednesday, November 19, 1997 4:36 AM, Guido van Rossum 
[SMTP:guido@CNRI.Reston.Va.US] wrote:
> > From: Bill Janssen <janssen@parc.xerox.com>
> > Yes, I'd rather have PDF instead of Postscript:  it has a good model of
> > page-formatted documents, as Postscript does; it supports links, which
>
> While I agree that PDF has advantages over Postscript, what drives me
> crazy is people putting up documents on the web in PDF format for
> <snip>
> Please use PDF for printable files -- not for on line documents!

I agree with this completely.  PDF is atrocious for online viewing, for a 
number of reasons [1].  Overall I prefer Postscript, but PDF has the big 
practical advantage that we can distribute a viewer with it, whereas many 
people are not in a position to view/print Postscript [2].  For online reading 
we install HTML, and this seems to be the best for the moment [3].

Richard Folwell

RedBox Technologies, The Media Router Company, http://www.redbox.net
Email: richardf@redbox.net
Voice: +44 181 585 8565             Fax: +44 181 585 8665

[1] Page breaks, as Guido said.  The generally poor quality of displayed type 
makes it harder to read than the native fonts used for HTML.  Poor use of available display 
space.  Awkward re-sizing controls.  No control over the general display and 
layout by the viewer (though an increasing amount of published HTML breaks this 
important quality of good HTML).  Line breaking done at distill time, not 
display time.

[2] Yes, I know about GhostScript, but it is not as simple to use, nor as 
polished as the Acrobat viewer.

[3] I also like WinHelp, but understand that a few people use systems that 
cannot read it ;-)


From fredrik@pythonware.com  Wed Nov 19 10:05:06 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 19 Nov 1997 11:05:06 +0100
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
Message-ID: <01bcf4d2$9c67f760$6fadb4c1@fl-pc.image.ivab.se>

Guido wrote:
>Please use PDF for printable files -- not for on line documents!

But for the publisher, there's no difference, is there?  The only way
to prevent people (using a capable browser) from looking at the PDF
document is to compress it...  which is pretty stupid since PDF 3
supports ZIP compression itself...  And at least it gives you the
possibility to preview the document you're about to print.

But of course, if someone could easily have created HTML, it's
a pretty lousy idea to publish only a PDF.  But it could be worse.
They could publish Word documents only ;-)

Bill wrote:
> Reminds me again of the Halasz split between "card sharks" (people who
> like card (and page) oriented hypertext) and "holy scrollers" (people
> who like endless-scrolled-document style hypertext).  HTML has made us
> all holy scrollers whether we like it or not...

Just read an article where some information guru pointed out that
very few users actually used the scrollbar, unless the page contained
some really, really, important information...

Cheers /F


From papresco@technologist.com  Wed Nov 19 10:10:55 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 19 Nov 1997 05:10:55 -0500
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
References: <01BCF4CD.107E7FD0.richardf@redbox.net>
Message-ID: <3472BB2F.5A85FDF8@technologist.com>

Richard Folwell wrote:
> I agree with this completely.  PDF is atrocious for online viewing, for a
> number of reasons [1].  Overall I prefer Postscript, but PDF has the big
> practical advantage that we can distribute a viewer with it, whereas many
> people are not in a position to view/print Postscript [2].  

Yes, PDF is very seldom appropriate for online display. Theoretically
you can make scrollable PDF, but nobody goes to that effort. 

I should also mention (responding now to Bill) that it is possible to
make card-like HTML that is much more intelligent than a page breaking
algorithm intended for print. It is tricky to know where to make the
page breaks, since we all run at different resolutions, but even a naive
algorithm can do better than a print-oriented one.

> For online reading  we install HTML, and this seems to be the best 
> for the moment [3].
> [3] I also like WinHelp, but understand that a few people use systems that
> cannot read it ;-)

Note that WinHelp is being replaced by an HTML variant. Hopefully you
will just have to supply a few specially organized (XML?) index files
and TOC files to turn an ordinary HTML page collection into a winhelp
file. I haven't investigated enough to know for sure.

 Paul Prescod

From lemburg@uni-duesseldorf.de  Wed Nov 19 12:03:08 1997
From: lemburg@uni-duesseldorf.de (M.-A. Lemburg)
Date: Wed, 19 Nov 1997 13:03:08 +0100
Subject: [DOC-SIG] Library reference SGML plan
References: <01BCF4CD.107E7FD0.richardf@redbox.net> <3472BEE6.749C3FCA@uni-duesseldorf.de>
Message-ID: <3472D57C.1127BDE6@uni-duesseldorf.de>

Guido van Rossum wrote:
> 
> Many packages these days come with preformatted HTML as the only
> documentation, and it's mighty convenient.  I could do this, and
> create separate bundles for the PostScript and latex.  Unfortunately,
> it's not a space saver: e.g. the library manual HTML, tarred and
> gzipped, is about 840K, while the latex is less than 240K (also tarred
> and gzipped).  Take similar expansion factors for the other manual
> pages, and you may see the tarred, gzipped Python distribution grow
> from 5.4 Meg to 6.5 Meg -- after unzipping it will probably add 2 or 3
> Meg.

I wasn't thinking of space savings here... for first-time contact
with Python I think HTML is the best way to get people interested.
Then make PDF available from the web site and they'll get really
happy :)

> Is this acceptable?

Yep.

> (On the other hand, it's easy enough for
> most people to download the HTML separately if they need it, and it
> *is* called a source distribution...)

It's for you to decide. As far as I'm concerned I never used the
Doc/-files in all the years -- couldn't get them to compile the
first time and after that never touched them again. Instead I
download your HTML distribution every time a new release is out.
[So for Joe Average like me, you'd have to add the two above
figures :]

Richard Folwell wrote:
> 
> Overall I prefer Postscript, but PDF has the big
> practical advantage that we can distribute a viewer with it, whereas many
> people are not in a position to view/print Postscript [2].  For online reading
> we install HTML, and this seems to be the best for the moment [3].

As Guido said: PDF is good for printing, in fact it is the easiest
way to get printed manuals in a portable way, since not everybody
can enjoy ghostscript or has access to a postscript printer --
the acrobat reader provides a nice document to printer interface
here. Depending on how you convert XYZ to PDF it also allows
fulltext searching, which comes in handy too...

For online viewing I very much prefer HTML. A micro HTTP-
server with searching capabilities written totally in
Python would be a good idea here too. This could be specialized
to Pythondoc searching, e.g. use a generated index (I believe
with the highly structured format the lib ref is in, this shouldn't
be much of a problem). Anybody out there with experience in
this field ? The HTTP-server is already there...HintHintHint ;-)
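
One possible shape for such a search server, as a minimal sketch only.
It is written against the http.server module of present-day Python 3,
and the index contents, the "q" query parameter and the names search,
DocSearchHandler and serve are all assumptions made up for the example.
It only lists matching links; it does not serve the manual pages
themselves:

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical generated index: documented name -> page within the HTML docs.
INDEX = {
    "string.split": "lib/module-string.html#split",
    "string.join": "lib/module-string.html#join",
    "os.path.join": "lib/module-os.path.html#join",
}

def search(term, index=INDEX):
    """Return sorted (name, page) pairs whose name contains `term`."""
    term = term.lower()
    return sorted((name, page) for name, page in index.items()
                  if term in name.lower())

class DocSearchHandler(BaseHTTPRequestHandler):
    """Answer GET /?q=term with an HTML list of matching index entries."""

    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        term = query.get("q", [""])[0]
        items = "".join(
            '<li><a href="%s">%s</a></li>'
            % (html.escape(page, quote=True), html.escape(name))
            for name, page in search(term))
        body = ("<ul>%s</ul>" % items).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8000):
    """Run the search server on localhost until interrupted."""
    HTTPServer(("localhost", port), DocSearchHandler).serve_forever()
```

Calling serve() and visiting http://localhost:8000/?q=join in a browser
would list the two "join" entries from the sample index.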

Some other ideas[1]:
- the index could be used for an online help system: instead
  of looking at a docstring (most of which are still missing),
  you'd type help('str') and through the Netscape remote control
  interface the right manual page pops up... or maybe a Tk/Tcl
  manual browser... or just plain old text in a more-like
  browser
- install the docs in the same place the libs live in; in a
  multi-user environment this saves headaches and aids in things
  like coding a help system
- add some kind of way in which extensions can hook into this
  system -- 'make install' (using Makefile.pre.in) could also
  install the docs for the extension; this is easy to do for
  HTML docs, since adding a few links to a special extension
  page is simple (you don't even need to recompile the docs
  for this to work -- only add the new entries to the index).
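
The first and third ideas above could be sketched roughly as follows.
This is an illustration under assumptions, not an existing interface:
the index layout, the install root and the names doc_url and
register_extension are all hypothetical, and handing the resulting URL
to a browser is left out:

```python
import posixpath

# Hypothetical generated index: documented name -> (HTML page, anchor).
DOC_INDEX = {
    "str": ("lib/builtin-funcs.html", "str"),
    "string.split": ("lib/module-string.html", "split"),
}

def doc_url(name, index=DOC_INDEX, root="/usr/local/lib/python/doc"):
    """Return a file: URL for the manual entry on `name`, or None."""
    entry = index.get(name)
    if entry is None:
        return None
    page, anchor = entry
    return "file://" + posixpath.join(root, page) + "#" + anchor

def register_extension(entries, index=DOC_INDEX):
    """Hook for extensions: add index entries without rebuilding the docs."""
    index.update(entries)

# An extension's install step might add its own entries like this
# (the extension name and page are made up):
register_extension({"myext.frobnicate": ("ext/myext.html", "frobnicate")})
print(doc_url("str"))
print(doc_url("myext.frobnicate"))
```

A help('str') command would then only need to look up doc_url('str')
and pass the result to whatever viewer is available.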

[1] Sorry, don't have the time to look any further into this :(

-- 
Marc-Andre Lemburg


From Edward Welbourne <eddyw@lsl.co.uk>  Wed Nov 19 12:36:38 1997
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Wed, 19 Nov 1997 12:36:38 GMT
Subject: [DOC-SIG] Library reference SGML plan
In-Reply-To: <199711181902.OAA24978@eric.CNRI.Reston.Va.US>
References: <9711162009.AA20858@arnold.image.ivab.se>
 <34718010.FCDB7B5F@technologist.com>
 <199711181508.KAA24186@eric.CNRI.Reston.Va.US>
 <3471E19C.77A53967@uni-duesseldorf.de>
 <199711181902.OAA24978@eric.CNRI.Reston.Va.US>
Message-ID: <9711191236.AA14468@lslr6g.lsl.co.uk>

M-A Lemburg:
> Say, wouldn't it make sense to break the distribution into a program
> source and a documentation source part ?!
Guido:
> (On the other hand, it's easy enough for most people to download the
> HTML separately if they need it, and it *is* called a source
> distribution...)

So split the standard `source' distribution into code and doc, to get
the following as installable lumps:

 * The (source) code distribution
 * The pre-built binaries
 * (The python-in-a-box setup the advocates have discussed)

 * The (source) doc distribution
 * The pre-formatted docs (in each supported format)

(Quick bit of doc-sig advocacy - notice the ease with which a UL can be
typed in `structured text'.)  Then we have the Full Source (code and
doc) available, plus `ready-to-use' forms thereof, and we can pick and
mix - choosing which bits we want in full source form and which in
ready-mixed form.  Eg: personally, I would probably download the source
code and pre-built HTML docs.

This takes up more space on python.org's machines (just as pre-built
binaries do) but will mean python users only use the space needed for
the package components we're using.  And just as pre-built binaries
spare folk the need to have make, cc, ... installed, the ready-made docs
will spare folk the need to tame a LaTeX installation (&c).

	Eddy.

From Edward Welbourne <eddyw@lsl.co.uk>  Wed Nov 19 12:51:16 1997
From: Edward Welbourne <eddyw@lsl.co.uk> (Edward Welbourne)
Date: Wed, 19 Nov 1997 12:51:16 GMT
Subject: [DOC-SIG] Library reference SGML plan / size of supplied
 documentation
In-Reply-To: <01bcf4d2$9c67f760$6fadb4c1@fl-pc.image.ivab.se>
References: <01bcf4d2$9c67f760$6fadb4c1@fl-pc.image.ivab.se>
Message-ID: <9711191251.AA34760@lslr6g.lsl.co.uk>

> Just read an article where some information guru pointed out that very
> few users actually used the scrollbar, unless the page contained some
> really, really, important information...

(T)read with care.
90% of all pages are recognisable as junk within the first screen-full.
90% of users are just skimming.
90% of pages are less information-rich than the python doc-set.
88.2% of all statistics are made up on the spot.
Naive reasoning from statistical information is nearly always flawed.
`very few users' (of the web) use python.
Disproportionately many visitors to python doc pages will be using
scroll-bars.

The decomposition of an on-line document into pages should be done on
semantic grounds, not `how big is a page' grounds.  (How big is half a
hole in the ground ?)  Authors shouldn't worry about page-size, beyond
the boring practicality of finding some semantic excuse for cutting a
document into modest-sized lumps.  Authors should never assume - let
alone impose - anything at all about (on) the window dimensions or
font-set being used in the browser.

	Eddy.


From guido@CNRI.Reston.Va.US  Wed Nov 19 14:37:41 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Wed, 19 Nov 1997 09:37:41 -0500
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
In-Reply-To: Your message of "Wed, 19 Nov 1997 09:25:21 GMT."
 <01BCF4CD.107E7FD0.richardf@redbox.net>
References: <01BCF4CD.107E7FD0.richardf@redbox.net>
Message-ID: <199711191437.JAA27608@eric.CNRI.Reston.Va.US>

> From: Richard Folwell <richardf@redbox.net>

> Overall I prefer Postscript, but PDF has the big 
> practical advantage that we can distribute a viewer with it, whereas many 
> people are not in a position to view/print Postscript [2].

Really?  If we distribute HTML for online viewing, does PDF still
have any advantages over PostScript for printing?  I've rarely had
complaints about the PostScript I've been distributing so far.  Having
to download the viewer is actually a deterrent for those people who
haven't already got one.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@pythonware.com  Wed Nov 19 14:52:58 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 19 Nov 1997 15:52:58 +0100
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
Message-ID: <01bcf4fa$d37087e0$6fadb4c1@fl-pc.image.ivab.se>

>> Overall I prefer Postscript, but PDF has the big 
>> practical advantage that we can distribute a viewer with it, whereas many 
>> people are not in a position to view/print Postscript [2].
>
>Really?  If we distribute HTML for online viewing, does PDF still
>have any advantages over PostScript for printing?  I've rarely had
>complaints about the PostScript I've been distributing so far.  Having
>to download the viewer is actually a deterrent for those people who
>haven't already got one.

Definitely; not all Windows users have postscript printers,
for example.  And PDF files are usually much smaller, which
is good for modem users.

Cheers /F



From jim.fulton@digicool.com  Wed Nov 19 14:59:27 1997
From: jim.fulton@digicool.com (Jim Fulton)
Date: Wed, 19 Nov 1997 09:59:27 -0500
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
References: <01BCF4CD.107E7FD0.richardf@redbox.net> <199711191437.JAA27608@eric.CNRI.Reston.Va.US>
Message-ID: <3472FECF.5939@digicool.com>

Guido van Rossum wrote:
> 
> > From: Richard Folwell <richardf@redbox.net>
> 
> > Overall I prefer Postscript, but PDF has the big
> > practical advantage that we can distribute a viewer with it, whereas many
> > people are not in a position to view/print Postscript [2].
> 
> Really?  If we distribute HTML for online viewing, does PDF still
> have any advantages over PostScript for printing? 

Yes, for those of us without Postscript printers.

> I've rarely had
> complaints about the PostScript I've been distributing so far.  Having
> to download the viewer is actually a deterrent for those people who
> haven't already got one.

And when you do download the PDF viewer, you find it basically sucks.

Jim

-- 
Jim Fulton            jim@digicool.com
Technical Director    540.371.6909                Python Powered!
Digital Creations     http://www.digicool.com/    http://www.python.org/


From richardf@redbox.net  Wed Nov 19 16:13:54 1997
From: richardf@redbox.net (Richard Folwell)
Date: Wed, 19 Nov 1997 16:13:54 -0000
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
Message-ID: <010842EDED34D1119DC300A0C943C49D0F1D0F@LONEXCH>

Sorry, "we" in this context meant "RedBox Technologies".  I was
referring to the factors that affected our decision for an application
distributed on CD.  Apologies if this has resulted in a red herring.
For Python I think that PostScript is better (produces, to my eye,
better quality printouts).

Richard

> -----Original Message-----
> From:	Guido van Rossum [SMTP:guido@CNRI.Reston.Va.US]
> Sent:	Wednesday, November 19, 1997 2:38 PM
> To:	doc-sig@python.org
> Subject:	Re: [DOC-SIG] Library reference SGML plan / size of
> supplied documentation
> 
> > From: Richard Folwell <richardf@redbox.net>
> 
> > Overall I prefer Postscript, but PDF has the big
> > practical advantage that we can distribute a viewer with it, whereas many
> > people are not in a position to view/print Postscript [2].
> 
> Really?  If we distribute HTML for online viewing, does PDF still
> have any advantages over PostScript for printing?  I've rarely had
> complaints about the PostScript I've been distributing so far.  Having
> to download the viewer is actually a deterrent for those people who
> haven't already got one.
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> 
> 


From guido@CNRI.Reston.Va.US  Wed Nov 19 16:43:34 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Wed, 19 Nov 1997 11:43:34 -0500
Subject: [DOC-SIG] Postscript vs. PDF
Message-ID: <199711191643.LAA02786@eric.CNRI.Reston.Va.US>

OK, it seems that I should provide HTML, PostScript and PDF.  I don't
mind using extra disk space on python.org (though some mirror sites
might?), but I do mind the work involved in creating all the different
distributions (I don't believe that this can be fully automated --
enough things change between distributions that the automation would
have to be changed each time...)

So I propose the following set of distributions:

    - Full source distribution, containing C source, LaTeX doc source, and
    the standard library

    - Small source distribution, containing C source and the standard library

    - HTML distribution

    - PostScript distribution

    - PDF distribution

    - Platform specific Unix binary distributions; these don't include
    HTML nor the standard library, only the python binary and dynamically
    loaded extensions (if applicable)

    - Addendum for platform specific Unix binary distributions, containing
    the standard library and the HTML docs
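
A minimal sketch of the proposed set as a Python table, to show which
download would carry which component.  The distribution keys and the
component names are hypothetical labels chosen for illustration; they are
not file names or anything else taken from the proposal.

```python
# Hypothetical manifest of the proposed distributions.  Keys and component
# names are illustrative assumptions, not real archive or file names.
DISTRIBUTIONS = {
    "full-source": {"c-source", "latex-doc-source", "stdlib"},
    "small-source": {"c-source", "stdlib"},
    "html": {"html-docs"},
    "postscript": {"postscript-docs"},
    "pdf": {"pdf-docs"},
    "unix-binary": {"python-binary", "dynamic-extensions"},
    "unix-binary-addendum": {"stdlib", "html-docs"},
}


def distributions_with(component):
    """Return the sorted names of the distributions that ship `component`."""
    return sorted(name for name, parts in DISTRIBUTIONS.items()
                  if component in parts)


def missing_for(wanted, chosen):
    """Return the components in `wanted` that the `chosen` downloads lack."""
    have = set()
    for name in chosen:
        have |= DISTRIBUTIONS[name]
    return sorted(set(wanted) - have)
```

For example, missing_for({"python-binary", "stdlib"}, ["unix-binary"])
reports that someone who fetched only a binary distribution would still
need the standard library from the addendum.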

BTW I believe that in order to create PDF one normally needs to buy an
Adobe program.  Is there a free alternative?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From johnm@magnet.com  Wed Nov 19 17:13:21 1997
From: johnm@magnet.com (John Mitchell)
Date: Wed, 19 Nov 1997 12:13:21 -0500 (EST)
Subject: [DOC-SIG] Postscript vs. PDF
In-Reply-To: <199711191643.LAA02786@eric.CNRI.Reston.Va.US>
Message-ID: <Pine.SGI.3.96.971119115849.22717b-100000@lemur.magnet.com>

On Wed, 19 Nov 1997, Guido van Rossum wrote:

> BTW I believe that in order to create PDF one normally needs to buy an
> Adobe program.  Is there a free alternative?
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

Yes and no.  GNU Ghostscript(*) includes a "ps2pdf".  Alas, if the
PostScript font used is unknown, each character is rendered separately (!)
and pasted together into the PDF version. 

That is, the output tends to look terrible.  Perhaps dorking with the 'gs'
settings would produce better results -- I only gave it the once-over.

- j


*: Ghostscript 5.03 info is at:
	http://www.cs.wisc.edu/~ghost/



From papresco@technologist.com  Wed Nov 19 17:42:27 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 19 Nov 1997 12:42:27 -0500
Subject: [DOC-SIG] Postscript vs. PDF
References: <199711191643.LAA02786@eric.CNRI.Reston.Va.US>
Message-ID: <34732503.9FEC0B3C@technologist.com>

Guido van Rossum wrote:
> BTW I believe that in order to create PDF one normally needs to buy an
> Adobe program.  Is there a free alternative?

PDFTEX. It came with my free miktex installation. Once it is correctly
installed, this should be all you need to do:

pdftex foo.tex

pdftex is another reason to keep a TeX variant in the documentation loop.
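
If the build were driven from a Python script, the call might be wrapped
roughly as follows.  This is only a sketch: the helper names are made up
for illustration, and the -interaction=nonstopmode option (which stops
pdftex from waiting at a prompt when it hits an error) is an addition
that is not part of the command quoted above.

```python
import shutil
import subprocess
from pathlib import Path


def pdftex_command(tex_file):
    """Build the pdftex command line for `tex_file`.

    The interaction flag is an assumption added for unattended runs.
    """
    return ["pdftex", "-interaction=nonstopmode", str(tex_file)]


def build_pdf(tex_file):
    """Run pdftex on `tex_file`.

    Returns the path of the resulting PDF, or None if pdftex is not on
    the PATH or the run did not produce a PDF.
    """
    tex_file = Path(tex_file)
    if shutil.which("pdftex") is None:
        return None
    # Run inside the document's directory so the output lands beside it.
    result = subprocess.run(pdftex_command(tex_file.name),
                            cwd=tex_file.parent)
    pdf = tex_file.with_suffix(".pdf")
    return pdf if result.returncode == 0 and pdf.exists() else None
```

Returning None rather than raising lets a release script fall back to
shipping PostScript only when no pdftex is available.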

 Paul Prescod


From scott@chronis.icgroup.com  Wed Nov 19 18:14:34 1997
From: scott@chronis.icgroup.com (Scott)
Date: Wed, 19 Nov 1997 13:14:34 -0500
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
In-Reply-To: <199711191437.JAA27608@eric.CNRI.Reston.Va.US>; from Guido van Rossum on Wed, Nov 19, 1997 at 09:37:41AM -0500
References: <01BCF4CD.107E7FD0.richardf@redbox.net> <199711191437.JAA27608@eric.CNRI.Reston.Va.US>
Message-ID: <19971119131434.52216@chronis.icgroup.com>


Personally, I've found postscript much easier to deal with than pdf.

scott

On Wed, Nov 19, 1997 at 09:37:41AM -0500, Guido van Rossum wrote:
| > From: Richard Folwell <richardf@redbox.net>
| 
| > Overall I prefer Postscript, but PDF has the big 
| > practical advantage that we can distribute a viewer with it, whereas many 
| > people are not in a position to view/print Postscript [2].
| 
| Really?  If we distribute HTML for online viewing, does PDF still
| have any advantages over PostScript for printing?  I've rarely had
| complaints about the PostScript I've been distributing so far.  Having
| to download the viewer is actually a deterrent for those people who
| haven't already got one.
| 
| --Guido van Rossum (home page: http://www.python.org/~guido/)


From tismer@appliedbiometrics.com  Wed Nov 19 20:37:54 1997
From: tismer@appliedbiometrics.com (Christian Tismer)
Date: Wed, 19 Nov 1997 21:37:54 +0100
Subject: [DOC-SIG] Postscript vs. PDF
References: <199711191643.LAA02786@eric.CNRI.Reston.Va.US>
Message-ID: <34734E22.C750B646@appliedbiometrics.com>

> BTW I believe that in order to create PDF one normally needs to buy an
> Adobe program.  Is there a free alternative?

Yes :)
Send the .PS files to me and I'll run them through Adobe Distiller for you.
We have the complete Developer kit here.

cheers - pirx

-- 
Christian Tismer             :^)   <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH      :     Have a break! Take a ride on Python's
Kaiserin-Augusta-Alle 101    :    *Starship* http://starship.skyport.net
10553 Berlin                 :     PGP key -> http://pgpkeys.mit.edu
     we're tired of banana software - shipped green, ripens at home


From janssen@parc.xerox.com  Wed Nov 19 22:11:57 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 19 Nov 1997 14:11:57 PST
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
In-Reply-To: <19971119131434.52216@chronis.icgroup.com>
References: <01BCF4CD.107E7FD0.richardf@redbox.net> <199711191437.JAA27608@eric.CNRI.Reston.Va.US>
 <19971119131434.52216@chronis.icgroup.com>
Message-ID: <QoQqEhQB0KGWMpnPNf@holmes.parc.xerox.com>

Excerpts from ext.python: 19-Nov-97 Re: [DOC-SIG] Library refer..
Scott@chronis.icgroup.co (1119*)

> Personally, I've found postscript much easier to deal with than pdf.

Me, too.  But I feel that Ghostscript now supports PDF, and printers are
beginning to -- given the advantage of being able to cut text from it,
and the advantage of being able to put links in it, I'd just as soon
spend the extra effort.

Of course, the tool chain has to preserve those two capabilities.

Bill


From janssen@parc.xerox.com  Wed Nov 19 22:13:16 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 19 Nov 1997 14:13:16 PST
Subject: [DOC-SIG] Postscript vs. PDF
In-Reply-To: <199711191643.LAA02786@eric.CNRI.Reston.Va.US>
References: <199711191643.LAA02786@eric.CNRI.Reston.Va.US>
Message-ID: <koQqFw4B0KGWEpnQ1K@holmes.parc.xerox.com>

Excerpts from ext.python: 19-Nov-97 [DOC-SIG] Postscript vs. PDF Guido
van Rossum@CNRI.Re (1340)

> BTW I believe that in order to create PDF one normally needs to buy an
> Adobe program.  Is there a free alternative?

I think you have to be more ambitious than that.  There's no point in
creating PDF that's the same as Postscript -- no links, no cut&paste of
text.  So the tool chain has to be free, support links, and support text
selection.

Bill


From fdrake@acm.org  Wed Nov 19 23:51:49 1997
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 19 Nov 1997 18:51:49 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <1.5.4.32.19971117151330.0092f458@zip.com.au>
References: <1.5.4.32.19971117151330.0092f458@zip.com.au>
Message-ID: <199711192351.SAA26403@weyr.cnri.reston.va.us>


John,
  Since I haven't seen any responses to your note, or at least missed
any that were responding to it, I'd like to address your points and
explain at least my position.  Of course, it'll just be my opinion,
so no promises it's worth anything.  ;-)

John Skaller writes:
 > My opinion is that fixing a format is the best way to exclude
 > most potential submitters. But if a format has to be
 > picked, it had better be ordinary old HTML, so it can
 > be put up on a website and used immediately by everyone.

  This has two distinct aspects: submission format and dissemination
format.
  I think it's possible to support multiple submission formats, but
only if a clearly repeatable, automated conversion to the canonical
format is possible.  I'm not convinced that this is practical; how
many formats have you seen that are well-aligned?

 > The tree and subtrees should be available compressed.
 > That can be done automatically by some newer ftp servers.
 > Not everyone is online all the time!

  This is a dissemination issue; python.org already has multiple
formats of output, including an HTML package you can download &
install.

 > Where are we going to get programmers who can do this
 > work without the documentation for them to learn Python?

  Who doesn't have access to at least the free tutorial?  It's
available in several formats.

 > WHO is going to convert submitted LaTeX to HTML?
 > So, I write a doc using Guido's latex style.
 > How long until someone converts it and posts it
 > to the website?

  Are you asking that documents get added as they are submitted?  The
published documents all correspond to the version of Python with which 
they were submitted.  Are you asking that there be a documentation
section corresponding to the contrib section of the site?  That may be 
possible with a shared submission format, but only if the submitted
documents can be verified.  So far, SGML is the only format I'm aware
of which allows this.
  Aside from the technical issue, there are other reasons not to
publish documents which have not been checked by a person.

 > To start off, why not accept documents in
 > _several_ formats. HTML, Postscript, dvi, and perhaps
 > a Guido-restricted LaTeX -- assuming Guido is
 > willing to do the conversion. No? Then we can't

  Obviously, Guido isn't too interested in doing conversions, and
shouldn't have to.


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From raf@comdyn.com.au  Thu Nov 20 01:51:34 1997
From: raf@comdyn.com.au (raf)
Date: Thu, 20 Nov 1997 12:51:34 +1100
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <199711200151.MAA07723@mali.cd.comdyn.com.au>

Here are some opinions regarding the choice of doc
submission formats and what will deter contributors from
submitting docs:

>John Skaller writes:
> > My opinion is that fixing a format is the best way to exclude
> > most potential submitters. But if a format has to be
> > picked, it had better be ordinary old HTML, so it can
> > be put up on a website and used immediately by everyone.

Many years ago, I was unfortunate enough to have to use troff.
I detested it for what I believe are the obvious reasons.
I've looked at TeX but wasn't unfortunate enough to have to use it.
Of this, I am glad. I've looked at Lout - far less repulsive to the eye
but they are *all* dreadful. TeX and Lout may produce lovely documents
but troff's and TeX's user interfaces (i.e. syntax) are foul and as far
as I could see, they were all *output* formats :) I could never help
but think:

    "This could never have been intended for humans.
     Surely programs are meant to generate and read this.
     This must be against the Geneva Convention!"

If you've already invested time in learning TeX, I can see that you'd
be willing to use it, but what proportion of actual and *potential* python
contributors are TeX literate? What will that figure be ten years from now?

I can't believe that those who don't already know TeX would be
happy about having to learn it (What would be the point? Suffer pain
just for Python docs? No thanks). SGML/XML is another system I know
bugger all about but the difference is severe genericity and a
much better user interface (i.e. syntax). It doesn't look painful at all.
And learning SGML doesn't feel like a dead end. I think far more people,
given a choice, would be willing to learn SGML rather than TeX.
If they can cope with HTML, they can cope with SGML.

What I'd like to see (to enhance public contribution to the docs):

    1) Accept SGML
    2) Select/Create a DTD
    3) Implement a WYSIWYG-ish editor for that DTD in Python
    4) Distribute the editor along with Python

Then it would be easy for people to contribute docs.

If TeX is the format chosen, then at least:

    1-4) above
    5) Define the process for converting the aforementioned DTD into TeX
    6) Distribute that along with Python (and the editor) as well
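
To give a feel for step 5, here is a minimal sketch of such a conversion
in Python.  It assumes the documents are well-formed XML (standing in for
SGML) and uses a made-up four-element vocabulary (doc, title, para,
code); a real DTD would define its own elements, and the LaTeX each one
maps to is likewise only an illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical element-to-LaTeX templates.  "{}" is replaced by the
# converted content of the element; doubled braces are literal braces.
TEMPLATES = {
    "doc": "{}",
    "title": "\\section{{{}}}\n",
    "para": "{}\n\n",
    "code": "\\texttt{{{}}}",
}

# Characters that need escaping in ordinary LaTeX text.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape(text):
    """Escape LaTeX special characters in plain text."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def to_latex(element):
    """Recursively convert an element of the sketch vocabulary to LaTeX."""
    if element.tag not in TEMPLATES:
        raise ValueError("element not in the sketch DTD: " + element.tag)
    parts = [escape(element.text or "")]
    for child in element:
        parts.append(to_latex(child))
        parts.append(escape(child.tail or ""))
    return TEMPLATES[element.tag].format("".join(parts))
```

Rejecting unknown elements outright is a deliberate choice in this
sketch: a submission that strays from the agreed vocabulary fails loudly
instead of producing half-converted TeX.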


raf



From papresco@technologist.com  Wed Nov 19 15:25:46 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 19 Nov 1997 10:25:46 -0500
Subject: [DOC-SIG] Library reference SGML plan / size of supplied documentation
References: <01BCF4CD.107E7FD0.richardf@redbox.net> <199711191437.JAA27608@eric.CNRI.Reston.Va.US>
Message-ID: <347304FA.BD56D829@technologist.com>

Guido van Rossum wrote:
> Really?  If we distribute HTML for online viewing, does PDF still
> have any advantages over PostScript for printing?  I've rarely had
> complaints about the PostScript I've been distributing so far.  Having
> to download the viewer is actually a deterrent for those people who
> haven't already got one.

I still prefer PDF. I can read it onscreen if that makes sense (e.g. to
delete the HTML files and save disk space, or to do a linear "eyeball
search" through the document) because the viewer is much better. On the
other hand, if I was on a Unix box with a PostScript printer, I might
feel otherwise. Anyhow, it is easy to make both if we can make TeX. You
could ship the HTML for day to day use and allow people to download the
printable version of their choice.

 Paul Prescod



From CYBERBUSINESS@juno.com  Fri Nov 21 06:38:06 1997
From: CYBERBUSINESS@juno.com (A FRIEND)
Date: Fri, 21 Nov 1997 01:38:06 -0500
Subject: No subject
Message-ID: <19943672.886214@relay.comanche.denmark.eu> Thursday, November 20th, 1997

Authenticated sender is <cyberbusiness@juno.com>
Subject:  CHECK THIS OUT
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Hello my name is Charles and i pardon the intrusion , But i just=
 think it is so cool how this program not only made me money ,=
 But the 4 reports taught me how to bulk email.  I procrastinated=
 for a long time thinking it would never work , But after i did=
 it I found out that the informaton in the REports themselves
was worth $20.00 and you can resale them that is great.i learned=
 so much about the internet and bulk emailing that i didnt know,=
 so if you are curious about the internet this is a good program=
 but the decision is up to you i am glad i did it


Subj:=09 >>>  $36,000 IN 14 WEEKS  <<<
Date:=0997-11-15 02:58:57 EST
From:=0908339455@aol.com
To:=09Friend@public.com

I Never Thought I'd Be the One Telling You This:

I Actually Read a Piece of E-Mail & I'm Going to Europe on the=
 Proceeds!

Hello!

My name is Karen Liddell; I'm a 35-year-old mom, wife, and=
 part-time accountant.  As a rule, I delete all unsolicited=
 "junk" e-mail and use my account primarily for business.  I=
 received what I assumed was this same e-mail countless times and=
 deleted it each time.

About two months ago I received it again and, because of the=
 catchy subject line,  I finally read it.  Afterwards, I thought=
 , "OK, I give in, I'm going to try this.  I can certainly afford=
 to invest $20 and, on the other hand, there's nothing wrong with=
 creating a little excess cash."  I promptly mailed four $5 bills=
 and, after receiving the reports, paid a friend of mine a small=
 fee to send out some e-mail advertisements for me.  After=
 reading the reports, I also learned how easy it is to bulk=
 e-mail for free!

I was not prepared for the results.  Everyday for the last six=
 weeks, my P.O. box has been overflowing with $5 bills; many days=
 the excess fills up an extra mail bin and I've had to upgrade to=
 the corporate-size box!  I am stunned by all the money that=
 keeps rolling in!

My husband and I have been saving for several years to make a=
 substantial downpayment on a house.  Now, not only are we=
 purchasing a house with 40% down, we're going to Venice, Italy=
 to celebrate!

I promise you, if you follow the directions in this e-mail and be=
 prepared to eventually set aside about an hour each day to=
 follow up (and count your money!), you will make at least as=
 much money as we did.  You don't need to be a wiz at the=
 computer, but I'll bet you already are.   If you can open an=
 envelope, remove the money, and send an e-mail message, then=
 you're on your way to the bank.  Take the time to read this so=
 you'll understand how easy it is.  If I can do this, so can=
 you!

                       GO FOR IT NOW!!

                       Karen Liddell

The following is a copy of the e-mail I read:

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$=


This is a LEGAL, MONEY-MAKING PHENOMENON.
PRINT this letter, read the directions, THEN READ IT AGAIN !!!

You are about to embark on the most profitable and unique program=
 you may ever see.  Many times over, it has demonstrated and=
 proven its ability to generate large amounts of cash.  This=
 program is showing fantastic appeal with a huge and ever-growing=
 on-line population desirous of additional income.

This is a legitimate, LEGAL, money-making opportunity.  It does=
 not require you to come in contact with people, do any hard=
 work, and best of all, you never have to leave the house, except=
 to get the mail and go to the bank!

This truly is that lucky break you've been waiting for!  Simply=
 follow the easy instructions in this letter, and your financial=
 dreams will come true! When followed correctly, this electronic,=
 multi-level marketing program works perfectly...100% EVERY=
 TIME!

Thousands of people have used this program to:
    -  Raise capital to start their own business
    -  Pay off debts
    -  Buy homes, cars, etc.,
    -  Even retire!

This is your chance, so don't pass it up!

-----------------------------------------------------------------=
-----------------
OVERVIEW OF THIS EXTRAORDINARY
ELECTRONIC MULTI-LEVEL MARKETING PROGRAM
-----------------------------------------------------------------=
-----------------

Basically, this is what we do:

We send thousands of people a product for $5.00 that costs next=
 to nothing to produce and e-mail. As with all multi-level=
 businesses, we build our business by recruiting new partners and=
 selling our products.  Every state in the U.S. allows you to=
 recruit new multi- level business online (via your computer).

The products in this program are a series of four business and=
 financial reports costing $5.00 each.  Each order you receive=
 via "snail mail" will include:

  * $5.00 cash
  * The name and number of the report they are ordering
  * The e-mail address where you will e-mail them the report they=
 ordered.

To fill each order, you simply e-mail the product to the buyer. =
 THAT'S IT!  The $5.00 is yours!  This is the EASIEST electronic=
 multi-level marketing business anywhere!

FOLLOW THE INSTRUCTIONS TO THE LETTER AND
BE PREPARED TO REAP THE STAGGERING BENEFITS!

******* I  N  S  T  R  U  C  T  I  O  N  S *******

This is what you MUST do:

1. Order all 4 reports shown on the list below (you can't sell=
 them if you don't order them).

     *  For each report, send $5.00 CASH, the NAME & NUMBER OF=
 THE
        REPORT YOU ARE ORDERING, YOUR E-MAIL ADDRESS, and YOUR
        RETURN POSTAL ADDRESS (in case of a problem) to the=
 person whose
        name appears on the list next to the report.

     *  When you place your order, make sure you order each of=
 the four
        reports.  You will need all four reports so that you can=
 save them
        on your computer and resell them.

     *  Within a few days you will receive, via e-mail, each of=
 the four reports.
         Save them on your computer so they will be accessible=
 for you to send
         to the 1,000's of people who will order them from you.

2.  IMPORTANT-- DO NOT alter the names of the people who are=
 listed next
     to each report, or their sequence on the list, in any way=
 other than is
     instructed below in steps "a" through "d" or you will lose=
 out on the
     majority of your profits.  Once you  understand the way this=
 works, you'll
     also see how it doesn't work if you change it.  Remember,=
 this method
     has been tested, and if you alter it, it will not work.

    a.  Look below for the listing of available reports.

    b.  After you've ordered the four reports, replace the name=
 and address
         under REPORT #1 with your name and address, moving the=
 one that
         was there down to REPORT #2.

    c.  Move the name and address that was under REPORT #2 down=
 to
         REPORT #3.

    d.  Move the name and address that was under REPORT #3 down=
 to
         REPORT #4.

    e.  The name and address that was under REPORT #4 is removed=
 from
         the list and has NO DOUBT collected their 50 grand.

Please make sure you copy everyone's name and address=
 ACCURATELY!!!

3.  Take this entire letter, including the modified list of=
 names, and save
     it to your computer.  Make NO changes to the instruction=
 portion of this
     letter.

4.  Now you're ready to start an advertising campaign on the
     WORLDWIDE WEB!  Advertising on the WEB is very, very=
 inexpensive,
     and there are HUNDREDS of FREE places to advertise. Another
     avenue which you could use for advertising is e-mail lists.
     You can buy these lists for under $20/2,000 addresses or=
 you
     can pay someone a minimal charge to take care of it for=
 you.
     BE SURE TO START YOUR AD CAMPAIGN IMMEDIATELY!

5.  For every $5.00 you receive, all you must do is e-mail them=
 the report
     they ordered.  THAT'S IT!  ALWAYS PROVIDE SAME-DAY SERVICE
     ON ALL ORDERS!  This will guarantee that the e-mail THEY=
 send out,
     with YOUR name and address on it, will be prompt because=
 they can't
     advertise until they receive the report!

------------------------------------------
AVAILABLE REPORTS
------------------------------------------
***Order Each REPORT by NUMBER and NAME***

Notes:
-  ALWAYS SEND $5 CASH FOR EACH REPORT
-  ALWAYS SEND YOUR ORDER VIA FIRST CLASS  MAIL
-  Make sure the cash is concealed by wrapping it in at least two=
 sheets of paper
-  On one of those sheets of paper, include: (a) the number &=
 name of the report you are ordering, (b) your e-mail address,=
 and (c) your postal address.
_________________________________________________________________=

REPORT #1 "HOW TO MAKE $250,000 THROUGH MULTI-LEVEL SALES"

ORDER REPORT #1 FROM:
           MGL Enterprises
           8100 W. Crestline Ave.
           A-15 Box 120
           Littleton, CO 80123-1200
_________________________________________________________________=

REPORT #2 "MAJOR CORPORATIONS AND MULTI-LEVEL SALES"

ORDER REPORT #2 FROM:
            RD Rhodes
        p.o box 53372
      Indpls,in 46253-3372
_________________________________________________________________=

REPORT #3 "SOURCES FOR THE BEST MAILING LISTS"

ORDER REPORT #3 FROM:
            N. B. Bostrom
            3871 HWY 527
            HAUGHTON, LA 71037
_________________________________________________________________=

REPORT #4 "EVALUATING MULTI-LEVEL SALES PLANS"

ORDER REPORT #4 FROM:
            BUSSINESS SERVICES UNLIMITED
         P.O BOX 241075
         INDPLS,IN 46224-1075
_________________________________________________________________=

-----------------------------------------------------------------=
---------------------------------
HERE'S HOW THIS AMAZING PLAN WILL MAKE YOU $MONEY$
-----------------------------------------------------------------=
---------------------------------

Let's say you decide to start small just to see how well it=
 works. Assume your goal is to get 10 people to participate on=
 your first level. (Placing a lot of FREE ads on the internet=
 will EASILY get a larger response.) Also assume that everyone=
 else in YOUR ORGANIZATION gets ONLY 10 downline members.  Follow=
 this example to achieve the STAGGERING results below.

1st level--your 10 members with $5............................$50
2nd level--10 members from those 10 ($5 x 100)...............$500
3rd level--10 members from those 100 ($5 x 1,000)..........$5,000
4th level--10 members from those 1,000 ($5 x 10,000)......$50,000
                                  THIS TOTALS ----------->$55,550

Remember friends, this assumes that the people who participate=
 only recruit 10 people each.  Think for a moment what would=
 happen if they got 20 people to participate!  Most people get=
 100's of participants! THINK ABOUT IT!

Your cost to participate in this is practically nothing (surely=
 you can afford $20). You obviously already have an internet=
 connection and e-mail is FREE!!! REPORT#3 shows you the most=
 productive methods for bulk e-mailing and purchasing e-mail=
 lists.  Some list & bulk e-mail vendors even work on trade!

About 50,000 new people get online every month!

*******TIPS FOR SUCCESS*******

 *  TREAT THIS AS YOUR BUSINESS!  Be prompt, professional, and follow
    the directions accurately.

 *  Send for the four reports IMMEDIATELY so you will have them when
    the orders start coming in because:

    When you receive a $5 order, you MUST send out the requested
    product/report to comply with the U.S. Postal & Lottery Laws,
    Title 18, Sections 1302 and 1341 or Title 18, Section 3005 in the
    U.S. Code, also Code of Federal Regs. vol. 16, Sections 255 and
    436, which state that "a product or service must be exchanged for
    money received."

 *  ALWAYS PROVIDE SAME-DAY SERVICE ON THE ORDERS YOU RECEIVE.

 *  Be patient and persistent with this program. If you follow the
    instructions exactly, the results WILL undoubtedly be SUCCESSFUL!

 *  ABOVE ALL, HAVE FAITH IN YOURSELF AND KNOW YOU WILL SUCCEED!

*******YOUR SUCCESS GUIDELINE*******

Follow these guidelines to guarantee your success:

If you don't receive 10 to 20 orders for REPORT #1 within two=
 weeks, continue advertising until you do.  Then, a couple of=
 weeks later you should receive at least 100 orders for REPORT=
 #2.  If you don't, continue advertising until you do.  Once you=
 have received 100 or more orders for REPORT #2, YOU CAN RELAX,=
 because the system is already working for you, and the cash will=
 continue to roll in!

THIS IS IMPORTANT TO REMEMBER:

Every time your name is moved down on the list, you are placed in=
 front of a DIFFERENT report.  You can KEEP TRACK of your=
 PROGRESS by watching which report people are ordering from you. =
 If you want to generate more income, send another batch of=
 e-mails and start the whole process again!  There is no limit to=
 the income you will generate from this business!

NOTE:  If you need help with starting a business, registering a
business name, how income tax is handled, etc., contact your local
office of the Small Business Administration (a Federal agency) for
free help and answers to questions. Also, the Internal Revenue
Service offers free help via telephone and free seminars about
business taxes.

*******T  E  S  T  I  M  O  N  I  A  L  S*******

     This program does work, but you must follow it EXACTLY! =
 Especially the rule of not trying to place your name in a=
 different position, it won't work and you'll lose a lot of=
 potential income.  I'm living proof that it works.  It really is=
 a great opportunity to make relatively easy money, with little=
 cost to you.  If you do choose to participate, follow the=
 program exactly, and you'll be on your way to financial=
 security.
          Sean McLaughlin, Jackson, MS

     My name is Frank. My wife, Doris, and I live in Bel-Air, MD.=
 I am a cost accountant with a major U.S. Corporation and I make=
 pretty good money. When I received the program I grumbled to=
 Doris about receiving "junk mail." I made fun of the whole=
 thing, spouting my knowledge of the population and percentages=
 involved.  I "knew" it wouldn't work. Doris totally ignored my=
 supposed intelligence and jumped in with both feet. I made=
 merciless fun of her, and was ready to lay the old "I told you=
 so" on her when the thing didn't work... well, the laugh was on=
 me!  Within two weeks she had received over 50 responses. Within=
 45 days she had received over $147,200 in $5 bills! I was=
 shocked!  I was sure that I had it all figured and that it=
 wouldn't work.  I AM a believer now. I have joined Doris in her=
 "hobby."   I did have seven more years until retirement, but I=
 think of the "rat race" and it's not for me. We owe it all to=
 MLM.
           Frank T., Bel-Air, MD

    I just want to pass along my best wishes and encouragement to=
 you.  Any doubts you have will vanish when your first orders=
 come in. I even checked with the U.S. Post Office to verify that=
 the plan was legal. It definitely is! IT WORKS!!!
           Paul Johnson, Raleigh, NC

    The main reason for this letter is to convince you that this=
 system is honest, lawful, extremely profitable, and is a way to=
 get a large amount of money in a short time. I was approached=
 several times before I checked this out. I joined just to see=
 what one could expect in return for the minimal effort and money=
 required.  To my astonishment, I received $36,470.00 in the=
 first 14 weeks, with money still coming in.
           Sincerely yours, Phillip A. Brown, Esq.

    Not being the gambling type, it took me several weeks to make=
 up my mind to participate in this plan. But conservative that I=
 am, I decided that the initial investment was so little that=
 there was just no way that I wouldn't get enough orders to at=
 least get my money back. Boy, was I surprised when I found my=
 medium-size post office box crammed with orders!  For awhile, it=
 got so overloaded that I had to start picking up my mail at the=
 window. I'll make more money this year than any 10 years of my=
 life before. The nice thing about this deal is that it doesn't=
 matter where in the U.S. the people live. There simply isn't a=
 better investment with a faster return.
         Mary Rockland, Lansing, MI

    I had received this program before. I deleted it, but later I=
 wondered if I shouldn't have given it a try. Of course, I had no=
 idea who to contact to get another copy, so I had to wait until=
 I was e-mailed another program...11 months passed then it=
 came...I didn't delete this one!...I made more than $41,000 on=
 the first try!!
          D. Wilburn, Muncie, IN

     This is my third time to participate in this plan. We have=
 quit our jobs, and will soon buy a home on the beach and live=
 off the interest on our money.  The only way on earth that this=
 plan will work for you is if you do it. For your sake, and for=
 your family's sake don't pass up this golden opportunity.  Good=
 luck and happy spending!
           Charles Fairchild, Spokane, WA

ORDER YOUR REPORTS TODAY AND GET
STARTED ON YOUR ROAD TO
FINANCIAL FREEDOM!!!


_______________
DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________

From jrush@summit-research.com (Jeff Rush)  Fri Nov 21 07:28:27 1997
From: jrush@summit-research.com (Jeff Rush)
Date: Fri, 21 Nov 97 01:28:27 cst
Subject: [DOC-SIG] Postscript vs. PDF
Message-ID: <199710210130.1018568.7@mail.bkbank.com>

On Wed, 19 Nov 1997 11:43:34 -0500 Guido wrote:

>So I propose the following set of distributions:
>
>    - Full source distribution, containing C source, Latex doc source, and
>    the standard library
>
>    - Small source distribution, containing C source and the standard library
>
>    - HTML distribution
>
>    - PostScript distribution
>
>    - PDF distribution
>
>    - Platform specific Unix binary distributions; these don't include
>    HTML nor the standard library, only the python binary and dynamically
>    loaded extensions (if applicable)
>
>    - Addendum for platform specific Unix binary distributions, containing
>    the standard library and the HTML docs

Hmmm, maybe python.org (CNRI?) personnel don't manage the non-Unix stuff due
to lack of PCs in-house but I don't see any Non-Unix distributions in your
list above.  I'm certainly willing to provide a complete OS/2 binaries set at
any point in time.

For the PC world, I'd propose:

    End-User Distribution:
        Platform-specific binaries and dyn extensions, standard library,
        and HTML documentation.

    This package lets a new user pick his platform and get up and running
    as quickly as possible, and gives him a single download.  New users
    often want to hit-and-run a web site and get enough to evaluate whether
    to invest more time.  I would provide docs in HTML only here, to minimize
    the user's investment in effort.


    Developer-Kit Addendum Distribution:
        Include files, link libraries

    This package is for the part-time developer who wants to extend or
    embed Python but not become a full kernel developer.  His work would
    take the form of DLLs that get imported by a Python script.


    Full Developer Distribution:
        C sources, include files, standard library

    This package is for the developer who wants to totally rebuild Python
    and hence includes no binaries.  Such a developer will have diverse
    tastes in docs and will pick his flavor from one of the below:


    HTML Distribution:
        Only documentation

    PDF Distribution:
        Only documentation

    Postscript Distribution:
        Only documentation

And as I said, I'm willing to echo back any release in
platform-specific format.  I'd just like to see the platform
section on the main www.python.org page cleaned up a bit,
listing all platforms supported and giving an easy way to
download.  Right now you have to rummage around a bit in the
FTP area.

Jeff Rush



From fdrake@acm.org  Fri Nov 21 14:56:48 1997
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 21 Nov 1997 09:56:48 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <1.5.4.32.19971120043327.00942adc@zip.com.au>
References: <1.5.4.32.19971120043327.00942adc@zip.com.au>
Message-ID: <199711211456.JAA08762@weyr.cnri.reston.va.us>


(In this post I quote from a message John sent directly to me; he gave 
permission to quote this message to this forum.  I quote more than I
normally would since the original message wasn't sent to the doc-sig
list.)

I wrote:
 > >  I think it's possible to support multiple submission formats, but
 > >only if clearly repeatable conversions to the canonical format are
 > >possible in an automated fashion.

John Skaller wrote:
 >         There doesn't have to be a "canonical" format.
 > (Although it may be useful)

  I disagree.  To allow maintaining the documentation, we need to have 
a common format which provides an interlingua for conversion.  This
does not require that the submission format(s) are the same as the
canonical format.

I wrote:
 > >  Who doesn't have access to at least the free tutorial?  It's
 > >available in several formats.

John replied:
 >         I agree. And it is a good document. 
 > 
 >         But it is not enough. 
 > 
 >         I am continually seeking information -- being
 > a Python newbie -- and I'm having a LOT of trouble finding it.

  This is a problem.  If you can tell us what you're looking for and
how you went about the search (esp. what you tried first), we can see
about improving the situation.  But that's hard to do without the
feedback, especially for those of us who've been at it a few years.

I asked:
 > >  Are you asking that documents get added as they are submitted?  

And John replied:
 >         Yes. Immediately. Naturally, they should be
 > classified "unmoderated" or whatever. 
 > 
 >         If this is NOT done, I will be greatly discouraged
 > from submitting articles. 

  This is an interesting approach.  It looks like what's needed is a
"knowledgebase" in addition to the standard distribution.  I think for 
now we've been looking only at dealing with the Library Reference, but 
your comment introduces a couple of aspects that should be addressed:

1.  Providing revisions to the Library Reference as they become
    available.  I think this is a good idea, though I'm not sure
    whether "immediately" needs to mean immediately or just "within a
    short period of time"; as far as python.org and the Doc-SIG are
    concerned, we're all volunteers.

    I think that as sections are checked and placed/replaced in the
    Library Reference, the online HTML can be updated and the
    distribution archives can be updated on a periodic basis (monthly,
    perhaps?).  This is a good reason to provide a separation between
    the documentation and source/library archives.

    The conversions from canonical format to distribution formats must 
    be completely automated for this to be feasible, or the cost in
    person-hours is too high.  Tarball & Zipball(?) creation must also
    be automated, but that's trivial once the conversions are
    automated.

2.  An online knowledgebase of HOW-TO articles, FAQs, and the like
    needs to be available.  This could be updated in a more continuous 
    fashion, with distribution packages produced in a similar way to
    the primary documentation packages.

    This probably lends itself to a simpler input format, perhaps
    allowing HTML and structured text as inputs, with conversion to an 
    internal format done behind the scenes.
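
The archive step in point 1 can be sketched roughly as follows.  This is
a minimal illustration in present-day Python (which postdates this
thread), not a description of any script used for python.org.  The
function name build_archives, the base name "python-docs" and the
directory layout are all assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path


def build_archives(doc_dir, out_dir, base_name="python-docs"):
    """Pack doc_dir into both a .tar.gz and a .zip; return the two paths.

    base_name is a hypothetical archive name chosen for this sketch.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = str(out_dir / base_name)
    made = []
    for fmt in ("gztar", "zip"):
        # make_archive appends the right extension and returns the file name.
        made.append(Path(shutil.make_archive(target, fmt, root_dir=doc_dir)))
    return made


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        docs = Path(tmp) / "html"
        docs.mkdir()
        (docs / "index.html").write_text("<html><body>Library Reference</body></html>")
        for path in build_archives(docs, Path(tmp) / "dist"):
            print(path.name)
```

Keeping the output directory outside the documentation tree avoids
packing a half-written archive into itself.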

I said:
 > >That may be 
 > >possible with a shared submission format, but only if the submitted
 > >documents can be verified.  So far, SGML is the only format which
 > >allows this.  

John said:
 >         HTML is verifiable, isn't it?

  Yes.  It is an SGML application.  The problem with using HTML as the 
canonical format for the Library Reference is that it is
insufficiently structured.  While HTML 4.0 might allow the structure to 
be imposed using CLASS=<???> attributes all over the place, that would 
require custom verification software to be written; this should be
avoided if at all possible.
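
To give a feel for what such custom verification software would involve,
here is a minimal sketch in present-day Python.  It only checks that
every CLASS value belongs to an agreed vocabulary.  The vocabulary in
ALLOWED_CLASSES is invented for the example and is not taken from any
real DTD or from the Library Reference; a real checker would also have
to verify nesting and ordering, which is where the cost lies.

```python
from html.parser import HTMLParser

# Hypothetical vocabulary of structural roles; an assumption for this sketch.
ALLOWED_CLASSES = {"module", "function", "exception", "datadesc"}


class ClassChecker(HTMLParser):
    """Record (line, tag, class) for every class value outside the vocabulary."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name != "class" or not value:
                continue
            for cls in value.split():
                if cls not in ALLOWED_CLASSES:
                    line, _col = self.getpos()
                    self.problems.append((line, tag, cls))


def check(html_text):
    """Return the list of unknown class values found in html_text."""
    checker = ClassChecker()
    checker.feed(html_text)
    checker.close()
    return checker.problems


if __name__ == "__main__":
    sample = '<div class="module">\n<p class="funtion">open()</p>\n</div>'
    print(check(sample))
```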

 > >  Aside from the technical issue, there are other reasons not to
 > >publish documents which have not been checked by a person.
 > 
 >         Yes, but there are levels of checking.

  I wasn't referring so much to format checking (which should be done
in software whenever possible), or to accuracy checking (which would
be nice, but can be handled by responding to bug reports).  I was
thinking more of the malicious user (not a Python user, certainly!)
posting something obscene or not related to Python.  This sort of
thing is not something that software can effectively check for and
*must* be checked before a document can be made available on
python.org.  No matter what disclaimers are in place, a presentation
including such garbage would do nothing but damage Python's
reputation.  This is something which must be guarded against very
carefully, and this can (at this time) only be done by human
inspection of documents.

  It sounds as if there's a lot of work ahead.  Does anyone know of
existing "knowledgebase" systems that allow document updates and
interdocument linking, at least in the form of See-Also's?  I think it 
would be nice to have something better than dejanews searches.
  My expectation is that the Library Reference project needs to be
done first, primarily because the problem is better understood and we
have more concrete notions about what should be done about it.
Discussion on these other ideas doesn't need to wait, however.  Ideas, 
anyone?  ;-)


  -Fred

--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA    20191-5434


From amk@magnet.com  Fri Nov 21 16:15:46 1997
From: amk@magnet.com (Andrew Kuchling)
Date: Fri, 21 Nov 1997 11:15:46 -0500 (EST)
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711211456.JAA08762@weyr.cnri.reston.va.us>
 (fdrake@CNRI.Reston.Va.US)
Message-ID: <199711211615.LAA25354@lemur.magnet.com>

"Fred L. Drake" <fdrake@CNRI.Reston.Va.US> wrote:
>2.  An online knowledgebase of HOW-TO articles, FAQs, and the like
>    needs to be available.  This could be updated in a more continuous 
>    fashion, with distribution packages produced in a similar way to
>    the primary documentation packages.

	This would be very useful, but why does it need to have any
constraints on input format at all?

	We already have a "Hints and Guides" page at
http://www.python.org/doc/Hints.html, which is a good starting point.
The only problem is that it doesn't aim for completeness, and you
can't search the text of the linked-to articles and guides.

	For the first problem, hopefully before too long we'll have
some sort of automated system for adding links, similar to FAQwizard.
(This would be useful for both the contributed software pages, and the
hints & guides page; they could be made comprehensive without making
Ken Manheimer spend all his time maintaining them.)  I don't know how
to solve the second problem; you can't really have the Ultraseek
server on python.org crawl those pages, because it's not known how far
to crawl.

>  It sounds as if there's a lot of work ahead.  Does anyone know of
>existing "knowledgebase" systems that allow document updates and
>interdocument linking, at least in the form of See-Also's?  I think it 
>would be nice to have something better than dejanews searches.

	BSCW, maybe?  But there's a BSCW server on starship, and I
don't think many people use it.  We have one at Magnet, too, and it
doesn't get much use, either.


	Andrew Kuchling
	amk@magnet.com
	http://starship.skyport.net/crew/amk/


From papresco@technologist.com  Thu Nov 20 19:31:10 1997
From: papresco@technologist.com (Paul Prescod)
Date: Thu, 20 Nov 1997 14:31:10 -0500
Subject: [DOC-SIG] Postscript vs. PDF
References: <199711191643.LAA02786@eric.CNRI.Reston.Va.US> <koQqFw4B0KGWEpnQ1K@holmes.parc.xerox.com>
Message-ID: <34748FFE.58AF48C8@technologist.com>

Bill Janssen wrote:
> I think you have to be more ambitious than that.  There's no point in
> creating PDF that's the same as Postscript -- no links, no cut&paste of
> text.  So the tool chain has to be free, support links, and support text
> selection.

I believe PDFTex meets all requirements. I don't know how easy it is to
install on top of an existing TeX distribution, though. I've never
tried.

 Paul Prescod



From guido@CNRI.Reston.Va.US  Fri Nov 21 18:48:31 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Fri, 21 Nov 1997 13:48:31 -0500
Subject: [DOC-SIG] Postscript vs. PDF
In-Reply-To: Your message of "Fri, 21 Nov 1997 01:28:27 CST."
 <199710210130.1018568.7@mail.bkbank.com>
References: <199710210130.1018568.7@mail.bkbank.com>
Message-ID: <199711211848.NAA09874@eric.CNRI.Reston.Va.US>

> >So I propose the following set of distributions:
> >
> >    - Full source distribution, containing C source, Latex doc source, and
> >    the standard library
> >
> >    - Small source distribution, containing C source and the standard library
> >
> >    - HTML distribution
> >
> >    - PostScript distribution
> >
> >    - PDF distribution
> >
> >    - Platform specific Unix binary distributions; these don't include
> >    HTML nor the standard library, only the python binary and dynamically
> >    loaded extensions (if applicable)
> >
> >    - Addendum for platform specific Unix binary distributions, containing
> >    the standard library and the HTML docs
> 
> Hmmm, maybe python.org (CNRI?) personnel don't manage the non-Unix stuff due
> to lack of PCs in-house but I don't see any Non-Unix distributions in your
> list above.

No, I simply wasn't thinking of that yet, as it is being done somewhat
separately.

> I'm certainly willing to provide a complete OS/2 binaries set at
> any point in time.

Thanks!

> For the PC world, I'd propose:

I can do the PC distributions but not PythonWin.  PythonWin (and Mark
Hammond's other stuff like COM support and Active Scripting and
Debugging) will be distributed separately, as add-ons.  But the core
will be coming from me.

>     End-User Distribution:
>         Platform-specific binaries and dyn extensions, standard library,
>         and HTML documentation.

Platform specific?  The only platforms I can currently support are
Intel running Windows 95 or NT, and these can be one distribution.

>     This package lets a new user pick his platform and get up and running
>     as quickly as possible, and gives him a single download.  New users
>     often want to hit-and-run a web site and get enough to evaluate whether
>     to invest more time.  I would provide docs in HTML only here, to minimize
>     the user's investment in effort.

Agreed completely, with the proviso that the CNRI distribution won't
contain a fancy IDE -- it will require you to use notepad (or whatever
editor you choose for plain text) to edit .py files and run them in a
DOS box.  It *will* support Tkinter, but you have to install Tcl/Tk
separately (I can provide a link to the download though).

>     Developer-Kit Addendum Distribution:
>         Include files, link libraries
> 
>     This package is for the part-time developer who wants to extend or
>     embed Python but not become a full kernel developer.  His work would
>     take the form of DLLs that get imported by a Python script.

This seems nice on the face of it, but I doubt that it is very
useful.  In practice, until we improve the documentation quite a bit,
most such developers end up having to read the source even if they
stay away from recompiling it.  In the future when I get my
documentation act together this would be useful though.

>     Full Developer Distribution:
>         C sources, include files, standard library
> 
>     This package is for the developer who wants to totally rebuild Python
>     and hence includes no binaries.  Such a developer will have diverse
>     tastes in docs and will pick his flavor from one of the below:
> 
>     HTML Distribution:
>         Only documentation
> 
>     PDF Distribution:
>         Only documentation
> 
>     Postscript Distribution:
>         Only documentation

And these can be the same ones as for the Unix distribution, I
presume.  WinZip can handle .tar.gz files just fine, so I don't see a
big reason to distribute everything twice, once as .tar.gz and once as
.zip.  I also don't see a reason to make the documentation for Windows
set different than the set for Unix, yet.

> And as I said, I'm willing to echo back any release in
> platform-specific format.  I'd just like to see the platform
> section on the main www.python.org page cleaned up a bit,
> listing all platforms supported and giving an easy way to
> download.  Right now you have to rummage around a bit in the
> FTP area.

A cleanup of python.org would indeed be most welcome.  Maybe in the
new year, or when the Python Consortium takes off -- there simply
aren't enough hours in the day to do the work (besides all the other
stuff we do at CNRI).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jrush@summit-research.com (Jeff Rush)  Sat Nov 22 04:33:13 1997
From: jrush@summit-research.com (Jeff Rush)
Date: Fri, 21 Nov 97 22:33:13 cst
Subject: [DOC-SIG] Postscript vs. PDF
Message-ID: <199710212303.0418562.8@mail.bkbank.com>

On Fri, 21 Nov 1997 13:48:31 -0500 Guido wrote:

>> For the PC world, I'd propose:
>
>I can do the PC distributions but not PythonWin.  PythonWin (and Mark
>Hammond's other stuff like COM support and Active Scripting and
>Debugging) will be distributed separately, as add-ons.  But the core
>will be coming from me.

I agree that such platform-specific add-ons should be distributed separately.
However, that brings to mind a question -- why are Unix-variant modules (both
C and Python), such as the Irix 'al' module and the 'plat-xxx' dirs, included
in the base distribution when they are only usable by a subset of users?  Are
there any plans to 'streamline' the standard library, only keeping those modules
that are of general use to the majority and moving the others to add-on
distributions, which can also contain more package-specific documentation?

It would reduce the download size (a little), make it easier for someone
to step up and comprehend Python (when reading all of the sources like I did
during porting) and give Python a more focused look.

When doing the OS/2 port, it took time to wade through each C source file
deciding which were relevant/necessary to include in the core binary and which
were platform-specific or optional.  It also means the regression tester must
comprehend more modules.


>>     End-User Distribution:
>>         Platform-specific binaries and dyn extensions, standard library,
>>         and HTML documentation.
>
>Platform specific?  The only platforms I can currently support are
>Intel running Windows 95 or NT, and these can be one distribution.

I understand but what about the Mac version?  And then add one for OS/2 and
(soon) one for AmigaDOS.  And I assume a few Unix binaries as well.


>Agreed completely, with the proviso that the CNRI distribution won't
>contain a fancy IDE -- it will require you to use notepad (or whatever
>editor you choose for plain text) to edit .py files and run them in a
>DOS box.  It *will* support Tkinter, but you have to install Tcl/Tk
>separately (I can provide a link to the download though).

No argument here.  I don't care for fancy IDE's but I respect people who do.
I agree that such an IDE should be add-on, especially since there is no
standard GUI for Python, so an IDE would restrict its portability.


>>     Full Developer Distribution:
>>         C sources, include files, standard library
>> 
>>     This package is for the developer who wants to totally rebuild Python
>>     and hence includes no binaries.  Such a developer will have diverse
>>     tastes in docs and will pick his flavor from one of the below:
>> 
>>     HTML Distribution:
>>         Only documentation
>> 
>>     PDF Distribution:
>>         Only documentation
>> 
>>     Postscript Distribution:
>>         Only documentation
>
>And these can be the same ones as for the Unix distribution, I
>presume.  WinZip can handle .tar.gz files just fine, so I don't see a
>big reason to distribute everything twice, once as .tar.gz and once as
>.zip.

Yes, the same ones as for Unix but remember, WinZip is only for Windows
platforms -- what about the others?  I have tar and gzip for OS/2 and
know how to use them but not everyone does.  Those tools also exist to
some degree for every known OS but we should try to remove as many
prerequisite tools as possible.  Most PC people are going to expect
ZIP files, I think.

However, the person who provides the platform-specific distribution
files can certainly untar the docs and repack them into suitable
forms for their platform.  Just leave a place for that on the web page.
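
The untar-and-repack step can be sketched as below, in present-day
Python rather than anything available when this was written.  The
function name targz_to_zip and the file names in the demonstration are
assumptions for the example.  It copies regular files only and does not
preserve permissions or timestamps.

```python
import io
import tarfile
import tempfile
import zipfile
from pathlib import Path


def targz_to_zip(targz_path, zip_path):
    """Copy every regular file from a .tar.gz into a new .zip; return the count."""
    count = 0
    with tarfile.open(targz_path, "r:gz") as tar, \
            zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links and special files
            data = tar.extractfile(member).read()
            zf.writestr(member.name, data)
            count += 1
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Build a tiny stand-in for a documentation tarball.
        src = Path(tmp) / "docs.tar.gz"
        payload = b"<html>Library Reference</html>"
        with tarfile.open(src, "w:gz") as tar:
            info = tarfile.TarInfo("html/index.html")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))

        dst = Path(tmp) / "docs.zip"
        copied = targz_to_zip(src, dst)
        with zipfile.ZipFile(dst) as zf:
            print(copied, zf.namelist())
```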


>I also don't see a reason to make the documentation for Windows
>set different than the set for Unix, yet.

I'm not sure I ever see a reason for the docs to diverge, given
decent writing skills when preparing/updating them.


>> And as I said, I'm willing to echo back any release in
>> platform-specific format.  I'd just like to see the platform
>> section on the main www.python.org page cleaned up a bit,
>> listing all platforms supported and giving an easy way to
>> download.  Right now you have to rummage around a bit in the
>> FTP area.
>
>A cleanup of python.org would indeed be most welcome.  Maybe in the
>new year, or when the Python Consortium takes off -- there simply
>aren't enough hours in the day to do the work (besides all the other
>stuff we do at CNRI).

Ever thought of outsourcing to some degree i.e. providing the storage
and control over distributions at CNRI as FTPable files *but* letting
the pretty web face be at Starship and volunteers write that?  I'd
like to help but I'm sure I can't get write access to CNRI's servers.
But I could set up pages elsewhere and -point- them to CNRI's files.

As you say, something to discuss in the new year...

Jeff Rush



From skaller@zip.com.au  Sat Nov 22 18:54:28 1997
From: skaller@zip.com.au (John Skaller)
Date: Sun, 23 Nov 1997 05:54:28 +1100
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <1.5.4.32.19971122185428.00934318@zip.com.au>

At 09:56 21/11/97 -0500, Fred wrote:
> >         I am continually seeking information -- being
> > a Python newbie -- and I'm having a LOT of trouble finding it.
>
>  This is a problem.  If you can tell us what you're looking for and
>how you went about the search (esp. what you tried first), we can see
>about improving the situation.  But that's hard to do without the
>feedback, especially for those of us who've been at it a few years.

        Yes. I agree. It is why I  -- a newbie -- am talking a lot
at the moment. I know from experience how beneficial it is
to get feedback from newbies, having been on the other side of
the fence.

        What happens is: I read all the doco and know
what I can do, but forget the details of how to do it.
Then, when I want to do it:

        1) I do not know the names of the things I need,
           like attribute names, module names, class names

        2) I do not remember _where_ I read the info:
                was it in the tutorial?
                was it in the ref manual?
                in the lib manual? Which module? Which section?
                did I get it from Mark Lutz's book?
                the FAQ?
                from reading the Python source code?
                from reading C source code?
                from the USENET news?

I think this situation could be improved by hyperlinking
_all_ the documents together. I'm not sure. 
The Tcl/Tk manual seems slightly easier to use as a manual ---
but then it is hard for me to say, since I've been using
Tcl for around a year (not a newbie anymore).

I really think excellent progress is being made writing
documentation. I tend to think we need more of it, before
organising it better. In academia, this is called
a "literature survey". :-)

I have a tool that scans the whole Python library (as installed)
and parses it into a TixTree. So I can open and close
subtrees of *.py files. If the module is well structured,
I get a beautiful outline of what it does. 
The software works for .c files too.

I am finding that this tool is _telling_ me how comments
have to be structured. It suggests a system wide index
can be generated dynamically. It would be nice to
put _hooks_ into the module doco to the corresponding
html pages. That would enable 

        a) browsing the library and hyperlinking to the doco
        b) generating some doco from the library

I'm hoping this additional way of looking at the system
will help me learn where everything is and how it fits
together.

>1.  Providing revisions to the Library Reference as they become
>    available.  

>2.  An online knowledgebase of HOW-TO articles, FAQs, and the like
>    needs to be available.  This could be updated in a more continuous 
>    fashion, with distribution packages produced in a similar way to
>    the primary documentation packages.

Sounds like a good mix!

>The problem with using HTML as the 
>canonical format for the Library Reference is that it is
>insufficiently structured.  While HTML 4.0 might allow the structure to 
>be imposed using CLASS=<???> attributes all over the place, that would 
>require custom verification software to be written; this should be
>avoided if at all possible.

        This sounds right: I plan to build a documentation
and software database -- I will generate the web site from it,
but I will also cut books and articles from it.
I don't think _any_ format has "enough structure".
That's why I'm programming Python instead.

>  My expectation is that the Library Reference project needs to be
>done first, primarily because the problem is better understood and we
>have more concrete notions about what should be done about it.

        Well, I have a suggestion/question/idea.
One of the things I can do very easily in C++ is use 
indented // comments to present block structure.

        If I understand correctly, this won't work with 
Python # comments because they're not parsed.

        I am wondering if there could be a language token
equivalent to "#" which _was_ parsed:

        doc the rest of this line is a heading
          doc this is a subheading
            class aclass
              doc here are the methods
                doc init method
                  def __init__

I use two space indents (for reasons of print publication).
I have found the tixTree representation of the above
structure very useful in C++. 

But I have found that python modules are uglier because the doco
isn't parsed in a suitable way, so I can't use it to control
the tree view.
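
The indented "doc" structure shown above can be read into a tree
without any change to the language. The following is only a sketch
under stated assumptions: "doc" is the hypothetical token from the
example, not a Python keyword, and parse_outline is a name invented
here. Nesting is taken purely from indentation.

```python
def parse_outline(text):
    """Parse indented lines into a tree of [kind, label, children] nodes.

    The first word of each line ("doc", "class", "def", ...) becomes the
    kind and the rest of the line becomes the label.
    """
    root = ["root", "", []]
    stack = [(-1, root)]  # (indent, node) pairs for the blocks still open
    for raw in text.splitlines():
        if not raw.strip():
            continue  # blank lines carry no structure
        line = raw.expandtabs()
        indent = len(line) - len(line.lstrip())
        kind, _, label = line.strip().partition(" ")
        node = [kind, label, []]
        # Close every block indented at least as deeply as this line.
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][1][2].append(node)
        stack.append((indent, node))
    return root


SAMPLE = """\
doc the rest of this line is a heading
  doc this is a subheading
    class aclass
      doc here are the methods
        doc init method
          def __init__
"""

if __name__ == "__main__":
    def show(node, depth=0):
        print("  " * depth + node[0] + " " + node[1])
        for child in node[2]:
            show(child, depth + 1)

    for top in parse_outline(SAMPLE)[2]:
        show(top)
```

The resulting nested lists are the kind of structure a tree view
needs; whether such a token belongs in the language is a separate
question.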

Comments??
-------------------------------------------------------
John Skaller    email: skaller@zip.com.au
		http://www.zip.com.au/~skaller
		phone: 61-2-6600850
		snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia


_______________
DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________

From skaller@zip.com.au  Sat Nov 22 19:22:55 1997
From: skaller@zip.com.au (John Skaller)
Date: Sun, 23 Nov 1997 06:22:55 +1100
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <1.5.4.32.19971122192255.0093b268@zip.com.au>

At 11:15 21/11/97 -0500, you wrote:

>	We already have a "Hints and Guides" page at
>http://www.python.org/doc/Hints.html, which is a good starting point.
>The only problem is that it doesn't aim for completeness, and you
>can't search the text of the linked-to articles and guides.

        How do I:
                a) grab the lot to read locally
                b) contribute
-------------------------------------------------------
John Skaller    email: skaller@zip.com.au
		http://www.zip.com.au/~skaller
		phone: 61-2-6600850
		snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia



From amk@magnet.com  Sat Nov 22 19:40:51 1997
From: amk@magnet.com (Andrew Kuchling)
Date: Sat, 22 Nov 1997 14:40:51 -0500 (EST)
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <1.5.4.32.19971122192255.0093b268@zip.com.au> (message from John
 Skaller on Sun, 23 Nov 1997 06:22:55 +1100)
Message-ID: <199711221940.OAA26150@lemur.magnet.com>

John Skaller <skaller@zip.com.au> wrote:
>        How do I:
>                b) contribute

	Write something, put it up on the Web somewhere, and send
webmaster@python.org a note about it.

>                a) grab the lot to read locally

	There's no neat way of doing this, because the documents will
be structured differently; some will be one HTML file, some will be in
several files, etc.  Putting all these documents on www.python.org
isn't really a solution, since that will require too much work for the
webmaster.  Find a typo?  Fix it and tell webmaster.  Want to add a
link?  E-mail webmaster.

	We'd need some sort of distributed solution, via CGI scripts
or something like that.  (Or you can just house it on
starship.skyport.net, where everything gets indexed by python.org's
Ultraseek server.)
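
Short of a distributed solution, a first step towards grabbing the
lot is to list what an index page links to. The sketch below is an
illustration only, assuming a modern Python with html.parser and
urllib.parse in the standard library; LinkCollector and
collect_links are invented names. It works on HTML text already in
hand and does no downloading itself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute target of every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def collect_links(html, base_url):
    """Return the links found in html, resolved against base_url."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    page = '<a href="howto.html">A guide</a> <a href="/faq.html">FAQ</a>'
    for link in collect_links(page, "http://www.python.org/doc/Hints.html"):
        print(link)
```

Fetching each listed page, and coping with documents that span
several files, is the part that remains untidy.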


	Andrew Kuchling
	amk@magnet.com
	http://starship.skyport.net/crew/amk/



From richardf@redbox.net  Tue Nov 25 09:09:29 1997
From: richardf@redbox.net (Richard Folwell)
Date: Tue, 25 Nov 1997 09:09:29 -0000
Subject: [DOC-SIG] How to dowload collections of web pages
Message-ID: <01BCF981.D73C4540.richardf@redbox.net>

> John Skaller <skaller@zip.com.au> wrote:
> >        How do I:
> >                a) grab the lot to read locally

There is a shareware program that may do what you want.  It is called Black 
Widow.  Details on:

http://www.softbytelabs.com

Short extract from the publicity material:

" BlackWidow will scan a Web site and present found files in an Explorer-like 
window. You can view various information about each file, such as size and 
date, and select files to download from the site. The site profile can be saved 
to a file for later use and merged with other profiles if needed. Password 
authentication is supported for sites that require a user name and password."

I had a quick look at it, and it seemed OK; it worked as advertised.  There is a 
time- and capacity-limited evaluation version that can be downloaded.

Richard Folwell

RedBox Technologies, The Media Router Company, http://www.redbox.net
Email: richardf@redbox.net

