> Yes, I think the library reference is a separate project from the
> tutorial. I am planning to do the tutorial in FrameMaker because it
> gives me as an author the best user interface for editing and the most
> freedom to create nice layout, and because it is essentially a
> one-author document it's no problem that not everybody can afford
> FrameMaker (as long as I can generate HTML and PostScript, which I can
> -- and there's even a version of Frame that can generate SGML although
> I don't have it). (Now that I've got a PC at home I may switch to MS
> Word too -- that's surely democratic :-)
There isn't a version of Frame that can generate SGML. There is a
version of Frame that can edit SGML. There is a subtle but important
difference. Once you start out in Frame *not* using Frame+SGML, there is
nothing that constrains you to using structures that have meaning in a
particular SGML DTD (including HTML). FrameMaker thus cannot infer
structure from your "nice layout".
I will be very curious to see how good the HTML output is, and how much
"freedom" Frame offers you without totally destroying the consistency of
your HTML output. If you use hot-pink on green to represent important
notes, how is that going to be represented in a document that makes
sense to Lynx? How will you know which FrameMaker features translate
properly into HTML and which do not? Trial and error?
Personally, I think you would be better off using Frame+SGML right off
the bat, because then you will have total control over the output, but I
will be curious to see what you get out of ordinary FrameMaker anyhow --
converting arbitrary MIF to HTML is sort of an AI project and I like to
see what's the state of the art in AI. :)
> Also the
> fact that SGML parsers that support the full syntax are either costly
> in money or in resources (few sites that I know have an SGML parser
> installed already; sgmllib.py doesn't cut it).
I don't see how James Clark's SGML parser is expensive in either money
or resources. On Windows, it takes up about 3.5 MB with the Jade SGML
conversion tool, the OLE automation library, and 3 other related SGML
tools. It is trivial to install and compile. It is actually distributed
fairly widely as an HTML checker.
> TIM, on the other
> hand, was *designed* to be trivial to parse, so you can quickly write
> a small Python script that converts it to any format you like.
Great. But using Jade, I can convert to 4 formats (RTF, MIF, TeX,
PostScript) with a single "small script" (not Python, alas). If I do
want to use Python, my script will be just as simple, but will depend on
nsgmls. And as more formats arise, they will similarly be supported. But
more important -- I shouldn't have to write the small script at all,
because it has already been written.
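For illustration only, here is a minimal sketch of what a Python script sitting behind nsgmls might look like. It assumes the document has already been parsed and that the script is reading the line-oriented ESIS output nsgmls prints. The sample text and its element names (FUNCDESC, ARG) are made up for the example, and only start-tag, end-tag, attribute and data records are handled.

```python
# Sketch: walk nsgmls-style ESIS output (one event per line).
# Handles "(" start tag, ")" end tag, "A" attribute and "-" data records.
# The sample and its element names are hypothetical.

SAMPLE_ESIS = """\
ANAME CDATA copy
(FUNCDESC
(ARG
-self
)ARG
-Return a separate copy.
)FUNCDESC
C
"""


def esis_events(text):
    """Yield (kind, value) tuples from ESIS text."""
    pending_attrs = {}
    for line in text.splitlines():
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            # Attribute records precede the start tag they belong to.
            name, _type, value = (rest.split(" ", 2) + ["", ""])[:3]
            pending_attrs[name] = value
        elif code == "(":
            yield ("start", (rest, pending_attrs))
            pending_attrs = {}
        elif code == ")":
            yield ("end", rest)
        elif code == "-":
            # Simplified unescaping: only the record-end escape is handled.
            yield ("data", rest.replace("\\n", "\n"))


def to_outline(text):
    """Render the element tree as a list of indented outline lines."""
    depth, lines = 0, []
    for kind, value in esis_events(text):
        if kind == "start":
            name, attrs = value
            suffix = " " + repr(attrs) if attrs else ""
            lines.append("  " * depth + name + suffix)
            depth += 1
        elif kind == "end":
            depth -= 1
        else:
            lines.append("  " * depth + repr(value))
    return lines


if __name__ == "__main__":
    print("\n".join(to_outline(SAMPLE_ESIS)))
```

Swapping `to_outline` for a function that emits HTML or any other format is the "small script" part; the parsing and validation would already have been done by nsgmls.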
How does TIM enforce the proper organization of document macros? Will it
complain if I put an @messageDef{} inside of an @argDef{}? Doesn't this
type of enforcement seem useful in a situation where many people around
the world are working on a document?
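To make the question concrete, the kind of enforcement meant here could be sketched roughly as follows. None of this is part of TIM: the `ALLOWED_PARENTS` table, the `check_nesting` function and the container macro `@interfaceDef` are invented for illustration, and the scanner only understands the simple `@name{...}` form (it does not know about escaped braces).

```python
import re

# Hypothetical content model: macro name -> set of parents it may appear in.
# None stands for the top level of the document.
ALLOWED_PARENTS = {
    "interfaceDef": {None},
    "messageDef": {"interfaceDef"},
    "argDef": {"messageDef"},
}

# Either the opening of a macro, "@name{", or a closing brace.
TOKEN = re.compile(r"@(\w+)\{|\}")


def check_nesting(text):
    """Return a list of complaints about misplaced @name{...} macros."""
    stack, problems = [], []
    for match in TOKEN.finditer(text):
        name = match.group(1)
        if name is None:  # a closing brace
            if stack:
                stack.pop()
            else:
                problems.append("unmatched '}' at offset %d" % match.start())
            continue
        parent = stack[-1] if stack else None
        allowed = ALLOWED_PARENTS.get(name)  # unknown macros are not checked
        if allowed is not None and parent not in allowed:
            where = "@" + parent if parent else "top level"
            problems.append("@%s not allowed inside %s" % (name, where))
        stack.append(name)
    for name in stack:
        problems.append("@%s is never closed" % name)
    return problems


if __name__ == "__main__":
    for problem in check_nesting("@interfaceDef{@argDef{@messageDef{Add}}}"):
        print(problem)
```

A DTD expresses the same table declaratively, which is the point of the question: with SGML the check comes with the parser instead of being hand-written per project.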
Paul Prescod
_______________
DOC-SIG - SIG for the Python Documentation Project
send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________
From jim.fulton@digicool.com Wed Nov 12 18:26:25 1997
From: jim.fulton@digicool.com (Jim Fulton)
Date: Wed, 12 Nov 1997 13:26:25 -0500
Subject: [DOC-SIG] Documentation formats
References: <346943C3.91CCF8FC@technologist.com> <199711121441.JAA00616@eric.CNRI.Reston.Va.US>
<3469D7C5.F90F32F9@technologist.com> <199711121722.MAA01255@eric.CNRI.Reston.Va.US> <3469F09F.8DCBB65A@technologist.com>
Message-ID: <3469F4D1.5B5B@digicool.com>
Paul Prescod wrote:
>
> Guido van Rossum wrote:
> > I think that SGML is not fit to be typed by humans
I agree a lot.
I also think TeX and its variants are not fit to be typed
by humans.
> Hundreds of thousands of HTML page authors would be surprised to hear
> you say that!
I wouldn't be surprised. That doesn't make Guido's statement incorrect.
I'm putting my $0.02 in response to this message for no particular
reason. It seemed like as good a place as any. :-)
- With regard to doc strings, I think it is *very* important
that they be very readable in raw form. I think that one can
go a long way with tools like structured text to produce reasonably
rich output while retaining readability of source text.
This was discussed at length in the early days of the DOC sig.
I'm sure the archives contain this discussion.
- With regard to Python manuals and documentation not generated from
docstrings, I have another suggestion. I don't know for sure that
this suggestion is viable, or if someone has suggested this before.
IMO in an ideal world, people would author documentation in a modern
word processor like Frame or Word and people could share
documentation files using some neutral format. I don't know if
such a neutral format exists, although I seem to remember that at
one point, Frame had a tool for working with SGML in FrameMaker.
I don't know what happened with that tool, but if it is still around,
maybe people who hate editing SGML could use Frame or some other
format that supports SGML and other folks could hack SGML or use tools
that convert between their favorite editing environment and SGML.
Jim
--
Jim Fulton jim@digicool.com
Technical Director 540.371.6909 Python Powered!
Digital Creations http://www.digicool.com/ http://www.python.org/
From amk@magnet.com Wed Nov 12 19:06:31 1997
From: amk@magnet.com (Andrew Kuchling)
Date: Wed, 12 Nov 1997 14:06:31 -0500 (EST)
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711121722.MAA01255@eric.CNRI.Reston.Va.US> (message from
Guido van Rossum on Wed, 12 Nov 1997 12:22:11 -0500)
Message-ID: <199711121906.OAA02865@lemur.magnet.com>
Guido van Rossum wrote:
>be used that can be converted to SGML (or XML for all I care). TIM,
>which has only one magic character (@, which isn't used in Python)
{ } are also special, aren't they? (TIM is built on top of
Texinfo, which provides output in the form of GNU Info format and .dvi
files; there are also texi2nroff and texi2html converters.)
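Texinfo writes its three special characters as @@, @{ and @}, so escaping Python source for inclusion is mechanical. A minimal sketch, assuming TIM follows the same convention as Texinfo here (the function name is made up):

```python
def texinfo_escape(text):
    """Escape the three characters Texinfo treats specially: @, { and }."""
    # "@" is handled first so the "@" added in front of braces is not doubled.
    return text.replace("@", "@@").replace("{", "@{").replace("}", "@}")


if __name__ == "__main__":
    print(texinfo_escape("d = {'a': 1}  # mail me@example.org"))
```

Dictionary displays are the main place Python code runs into this, since braces are otherwise rare in Python source.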
>fits the bill -- it did one or two years when I looked into it, and
>it's only because of inertia (and a lot of other things that needed to
>happen sooner) that I haven't started using it.
Aha! What prevented you from moving to TIM? Just the work
required to convert everything, or are there pieces still missing?
For the record, I also really like TIM; it's simple enough to be
easily processed, but you can escape into TeX if required. TIM, via
Texinfo, provides functions for defining class methods and the like:
@defmethod @r{hashing objects} copy ()
Return a separate copy of this hashing object. An @code{update} to this
copy won't affect the original object.
@end defmethod
Andrew Kuchling
amk@magnet.com
http://starship.skyport.net/crew/amk/
From da@skivs.ski.org Wed Nov 12 19:31:16 1997
From: da@skivs.ski.org (David Ascher)
Date: Wed, 12 Nov 1997 11:31:16 -0800 (PST)
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711121906.OAA02865@lemur.magnet.com>
Message-ID:
> Aha! What prevented you from moving to TIM? Just the work
> required to convert everything, or are there pieces still missing?
> For the record, I also really like TIM; it's simple enough to be
> easily processed, but you can escape into TeX if required. TIM, via
> Texinfo, provides functions for defining class methods and the like:
>
> @defmethod @r{hashing objects} copy ()
> Return a separate copy of this hashing object. An @code{update} to this
> copy won't affect the original object.
> @end defmethod
For the record, while we're at it -- TIM is what I used for the Numeric
Tutorial (which will be updated, promise, someday). It worked pretty
well. It's not all that different from LaTeX as far as the user's
experience is concerned, except that it's a better match to Python (e.g.
underscores, etc.).
I didn't try hard to get all the references, indexing, etc., right -- I
certainly didn't try to get the @node system working well, since I don't
think "real" info use was going to happen. I'm not sure how easy it would
be to make it do all we'd need. It'd be good to investigate how to
generate WinHelp files (better than the current solution, which creates a
lousy table of contents).
Overall, once I got it working (I had some configuration problems Bill
helped me with), it worked well. The Numeric tutorial is available in
HTML, tex/dvi/postscript, as well as in a very readable text only form,
which I think is quite pleasant. See
http://starship.skyport.net/~da/Python/Numeric
for the "published versions" and
http://starship.skyport.net/~da/Python/Numeric/array.tim
for one of the TIM source files, if you want a look.
--david
From mclay@smtp.erols.com Wed Nov 12 19:31:50 1997
From: mclay@smtp.erols.com (Michael McLay)
Date: Wed, 12 Nov 1997 14:31:50 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711121713.MAA28481@lemur.magnet.com>
References: <3469D7C5.F90F32F9@technologist.com>
<199711121713.MAA28481@lemur.magnet.com>
Message-ID: <199711121931.OAA24333@fermi.eeel.nist.gov>
This bounced on my first try. Sorry if it is a repeat.
Several additional messages on the subject have arrived since I started
looking at TIM. Seems like we need to define the requirements before
we can pick between LaTeX, TIM, XML, FrameMaker or any other approach to
generating documentation. Is TIM too simple? Is XML too new? Is SGML
too complex? Is a proprietary tool detrimental to contributions of
documentation? Don't all these questions require a set of requirements
against which they can be evaluated?
Andrew Kuchling writes:
> The language reference is already in FrameMaker, but document
> formats shouldn't be multiplied. If GvR rules SGML/XML out, that's
> that, and we have to consider the other options: just add some LaTeX
> macros, or use TIM (which is built on top of Texinfo, a format which I
> quite like), or invent some pod-like format.
Since the benevolent dictator and Bill Janssen suggest TIM, why
don't we take a closer look at it and discuss what switching to TIM
would fail to support? In reviewing the TIM manual page at
ftp://ftp.parc.xerox.com/pub/ilu/2.0a11/manual-html/manual_21.html
several features make TIM look like a good option:
1) Concise syntax that is easy to integrate with Python examples
2) TIM works
3) TIM was written in Python:-) (only about 820 lines of code)
4) It looks like a markup that would be much easier to convert to
XML than LaTeX. (My guess is that XML will eventually become the
standard for WYSIWYG editors so the ugly tagging issue will go away.)
5) Restricted set of tags, which makes it fairly easy to learn to use
Downside:
1) Heavy dependence on external programs which may not be on every platform
MAKEINFO = '/usr/bin/makeinfo'
TEX = '/usr/bin/tex'
TEXINDEX = '/usr/bin/texindex'
DVIPS = '/usr/bin/dvips'
2) May require some work to get the reference manual indexing
working with the new tools.
3) Restricted set of tags, which makes it fairly hard to extend
(except by using macros.)
4) Mixes macro language with markup. Is this really a problem?
The TIM macros seem to primarily be used to declare context names
which are then translatable to generic typographic codes. This
should make it easier to move the tagged text to meaningful XML
tags.
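On downside 1, the hard-coded /usr/bin paths are probably the easiest part to soften. A rough sketch of looking the tools up on the search path instead, using a current Python standard library (`shutil.which`); the function names and the idea of reporting missing tools up front are assumptions for illustration, not something TIM does:

```python
import shutil

# Tool names taken from the settings listed above; finding them on PATH
# instead of hard-coding /usr/bin is the assumption being sketched.
REQUIRED_TOOLS = ("makeinfo", "tex", "texindex", "dvips")


def locate_tools(names=REQUIRED_TOOLS, which=shutil.which):
    """Map each tool name to its full path, or None when it is not found."""
    return {name: which(name) for name in names}


def missing_tools(found):
    """Return the sorted names of tools that could not be located."""
    return sorted(name for name, path in found.items() if path is None)


if __name__ == "__main__":
    found = locate_tools()
    for name in REQUIRED_TOOLS:
        print("%-9s %s" % (name, found[name] or "NOT FOUND"))
```

This does not remove the dependency, but it turns "wrong path" failures into a clear list of what is actually missing on a given platform.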
Defining domain-specific markup commands isn't documented. The
documentation says it is [TBD]. I grepped for usage and found the
following. Looks pretty simple to use.
ilu-macros.tim:@timmacro var code
ilu-macros.tim:@timmacro metavar var
ilu-macros.tim:@timmacro C code
ilu-macros.tim:@timmacro C++ code
ilu-macros.tim:@timmacro command code
ilu-macros.tim:@timmacro constant code
ilu-macros.tim:@timmacro codeexample example
ilu-macros.tim:@timmacro dfn i
ilu-macros.tim:@timmacro cl code
ilu-macros.tim:@timmacro class code
ilu-macros.tim:@timmacro exception code
ilu-macros.tim:@timmacro fn code
ilu-macros.tim:@timmacro interface code
ilu-macros.tim:@timmacro java code
ilu-macros.tim:@timmacro isl code
ilu-macros.tim:@timmacro kwd code
ilu-macros.tim:@timmacro language asis
ilu-macros.tim:@timmacro m3 code
ilu-macros.tim:@timmacro macro code
ilu-macros.tim:@timmacro message code
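Reading each line as "@timmacro NEWNAME EXISTINGCOMMAND" (an assumption drawn only from the grep output, since the manual says [TBD]), a converter would need to follow aliases such as metavar -> var -> code down to a real formatting command. A small sketch of that lookup, with made-up function names and a shortened copy of the grep output as sample data:

```python
# Sketch: read "@timmacro NAME BASE" definitions and follow each alias down
# to the underlying formatting command. The meaning of the two fields is an
# assumption based on the grep output above.

GREP_OUTPUT = """\
ilu-macros.tim:@timmacro var code
ilu-macros.tim:@timmacro metavar var
ilu-macros.tim:@timmacro dfn i
ilu-macros.tim:@timmacro language asis
"""


def parse_timmacros(text):
    """Return {alias: base} for every @timmacro line in text."""
    macros = {}
    for line in text.splitlines():
        _, sep, rest = line.partition("@timmacro ")
        fields = rest.split()
        if sep and len(fields) == 2:
            alias, base = fields
            macros[alias] = base
    return macros


def resolve(name, macros):
    """Follow alias -> base links until reaching a name that is not an alias."""
    seen = set()
    while name in macros and name not in seen:
        seen.add(name)  # guard against accidental cycles
        name = macros[name]
    return name


if __name__ == "__main__":
    table = parse_timmacros(GREP_OUTPUT)
    for alias in sorted(table):
        print("%-8s -> %s" % (alias, resolve(alias, table)))
```

The same table, kept unresolved, is what would let the context names (var, class, exception, ...) be carried over to meaningful XML tags instead of being flattened to typographic codes.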
Would TIM make a good starting point? If so, should it be modernized
to use re instead of regex and then developed into a more
full-featured markup language for Python?
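As a hedged sketch of what the re-based approach might look like (this is not TIM's code, and the tag mapping is hypothetical), inline `@name{...}` commands can be rewritten to HTML by repeatedly substituting the innermost group. HTML escaping of the text and Texinfo's escaped braces are left out to keep it short.

```python
import re

# Hypothetical mapping from TIM-style inline commands to HTML tags.
HTML_TAGS = {"code": "code", "dfn": "em", "emph": "em", "var": "var"}

# Matches an innermost @name{...} group: no braces or "@" inside the body.
INLINE = re.compile(r"@(\w+)\{([^{}@]*)\}")


def _render(match):
    """Turn one matched @name{body} group into its HTML form."""
    name, body = match.group(1), match.group(2)
    tag = HTML_TAGS.get(name)
    if tag is None:
        return body  # unknown command: keep the text, drop the markup
    return "<%s>%s</%s>" % (tag, body, tag)


def inline_to_html(text):
    """Rewrite @name{...} groups, innermost first, until none are left."""
    previous = None
    while previous != text:
        previous = text
        text = INLINE.sub(_render, text)
    return text


if __name__ == "__main__":
    print(inline_to_html("An @code{update} to this @dfn{hashing @var{object}}."))
```

Working from the inside out avoids writing a recursive parser, at the cost of one extra pass per nesting level.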
An example of a TIM file is attached. The example is a snippet from
the ILU Python Tutorial. Looks pretty readable to me.
@setfilename ilu-tutorial.info
@settitle Using ILU with Python: A Tutorial
@finalout
@c $Id: tutpython.tim,v 1.8 1996/03/19 04:11:10 janssen Exp $
@ifclear largerdoc
@titlepage
@title Using ILU with Python: A Tutorial
@author Bill Janssen @code{}
@sp
Formatted @today{}.
@sp
Copyright @copyright{} 1995 Xerox Corporation@*
All Rights Reserved.
@end titlepage
@ifinfo
@node Top, ,(dir),(dir)
@top Using ILU with Python
@end ifinfo
@end ifclear
@syncodeindex pg cp
@section Introduction
This tutorial will show how to use the @system{ILU} system with the programming language @language{Python},
both as a way of developing software libraries, and as a way
of building distributed systems.
In an extended example, we'll build an @system{ILU} module that implements a simple
four-function calculator, capable of addition, subtraction,
multiplication, and division. It will signal an error if
the user attempts to divide by zero. The example demonstrates
how to specify the interface for the module; how to implement the module in @language{Python};
how to use that implementation as a simple library; how to provide the module as a remote service;
how to write a client of that remote service; and how to use subtyping to extend an object type
and provide different versions of a module. We'll also demonstrate how to use @language{OMG IDL}
with @system{ILU}, and discuss the notion of network garbage collection.
Each of the programs and files referenced in this tutorial is available
as a complete program
in a separate appendix to this document; parts of programs are quoted
in the text of the tutorial.
@page
@section Specifying the Interface
Our first task is to specify more exactly what it is we're trying
to provide. A typical four-function calculator lets a user enter
a value, then press an operation key, either +, -, /, or *,
then enter another number, then press = to actually have
the operation happen. There's usually a CLEAR button to press
to reset the state of the calculator. We want to provide something like
that.
We'll recast this a bit more formally as the @dfn{interface}
of our module; that is, the way the module will
appear to clients of its functionality. The interface
typically describes a number of function calls which can be
made into the module, listing their arguments and return types,
and describing their effects. @system{ILU} uses @dfn{object-oriented}
interfaces, in which the functions in the interface are grouped
into sets, each of which applies to an @dfn{object type}. These
functions are called @dfn{methods}.
For example, we can think of the calculator as an object type,
with several methods: Add, Subtract, Multiply, Divide, Clear, etc.
@system{ILU} provides a standard notation to write this down with,
called @dfn{ISL} (which stands for ``Interface Specification Language'').
@language{ISL} is a declarative language which can be processed
by computer programs. It allows you to define object types (with methods),
other non-object types, exceptions, and constants.
The interface for our calculator would be written in ISL as:
@codeexample
INTERFACE Tutorial;
EXCEPTION DivideByZero;
TYPE Calculator = OBJECT
  METHODS
    SetValue (v : REAL),
    GetValue () : REAL,
    Add (v : REAL),
    Subtract (v : REAL),
    Multiply (v : REAL),
    Divide (v : REAL) RAISES DivideByZero END
  END;
@end codeexample
This defines an interface @isl{Tutorial}, an exception @isl{DivideByZero},
and an object type @isl{Calculator}. Let's consider these one by one.
The interface, @isl{Tutorial}, is a way of grouping a number of type
and exception definitions. This is important to prevent collisions
between names defined by one group and names defined by another group.
For example, suppose two different people had defined two different
object types, with different methods, but both called @isl{Calculator}!
It would be impossible to tell which calculator was meant. By
defining the @isl{Calculator} object type within the scope of the
@isl{Tutorial} interface, this confusion can be avoided.
The exception, @isl{DivideByZero}, is a formal name for a particular
kind of error, division by zero. Exceptions in @system{ILU} can specify
an @dfn{exception-value type}, as well, which means that real errors
of that kind have a value of the exception-value type associated with them.
This allows the error to contain useful information about why it might
have come about. However, @isl{DivideByZero} is a simple exception,
and has no exception-value type defined. We should note that the full
name of this exception is @isl{Tutorial.DivideByZero}, but for this
tutorial we'll simply call our exceptions and types by their short name.
The object type, @isl{Calculator} (again, really @isl{Tutorial.Calculator}),
is a set of six methods. Two of those methods, @isl{SetValue} and
@isl{GetValue}, allow us to enter a number into the calculator object,
and ``read'' the number. Note that @isl{SetValue} takes a single
argument, @metavar{v}, of type @type{REAL}. @type{REAL} is a
built-in @language{ISL} type, denoting a 64-bit floating point number.
Built-in @language{ISL} types are things like @type{INTEGER} (32-bit
signed integer), @type{BYTE} (8-bit unsigned byte), and @type{CHARACTER}
(16-bit Unicode character). Other more complicated types are
built up from these simple types using @language{ISL} @dfn{type constructors},
such as @isl{SEQUENCE OF}, @isl{RECORD}, or @isl{ARRAY OF}.
Note also that @isl{SetValue} does not return a value,
and neither do @isl{Add}, @isl{Subtract}, @isl{Multiply},
or @isl{Divide}. Rather,
when you want to see what the current value of the calculator
is, you must call @isl{GetValue}, a method which has no arguments,
but which returns a @type{REAL} value, which is the value of the
calculator object. This is an arbitrary decision on our part;
we could have written the interface differently, say as
@codeexample
TYPE NotOurCalculator = OBJECT
  METHODS
    SetValue () : REAL,
    Add (v : REAL) : REAL,
    Subtract (v : REAL) : REAL,
    Multiply (v : REAL) : REAL,
    Divide (v : REAL) : REAL RAISES DivideByZero END
  END;
@end codeexample
@noindent
-- but we didn't.
Our list of methods on @type{Calculator} is bracketed by the two
keywords @isl{METHODS} and @isl{END}, and the elements are separated
from each other by commas. This is pretty standard in @language{ISL}:
elements of a list are separated by commas; the keyword @isl{END}
is used when an explicit list-end marker is needed (but not when it's
not necessary, as in the list of arguments to a method); the list often
begins with some keyword, like @isl{METHODS}.
The @dfn{raises clause} (the list of exceptions which a method
might raise) of the method @isl{Divide} provides another example
of a list, this time with only one member, introduced by the keyword
@isl{RAISES}.
Another standard
feature of @language{ISL} is separating a name, like @isl{v},
from a type, like @type{REAL}, with a colon character. For example,
constants are defined with syntax like
@codeexample
CONSTANT Zero : INTEGER = 0;
@end codeexample
@noindent
Definitions, of interface, types, constants, and exceptions, are
terminated with a semicolon.
We should expand our interface a bit by adding more documentation
on what our methods actually do. We can do this with the @dfn{docstring}
feature of @language{ISL}, which allows the user to add arbitrary
text to object type definitions and method definitions. Using
this, we can write
@codeexample
INTERFACE Tutorial;
EXCEPTION DivideByZero
"this error is signalled if the client of the Calculator calls
the Divide method with a value of 0";
TYPE Calculator = OBJECT
  COLLECTIBLE
  DOCUMENTATION "4-function calculator"
  METHODS
    SetValue (v : REAL) "Set the value of the calculator to `v'",
    GetValue () : REAL "Return the value of the calculator",
    Add (v : REAL) "Adds `v' to the calculator's value",
    Subtract (v : REAL) "Subtracts `v' from the calculator's value",
    Multiply (v : REAL) "Multiplies the calculator's value by `v'",
    Divide (v : REAL) RAISES DivideByZero END
      "Divides the calculator's value by `v'"
  END;
@end codeexample
@noindent
Note that we can use the @isl{DOCUMENTATION} keyword on object types
to add documentation about the object type, and can simply add documentation
strings to the end of exception and method definitions. These docstrings
are passed on to the @language{Python} docstring system, so that they are available
at runtime from @language{Python}. Documentation
strings cannot currently be used for non-object types.
@system{ILU} provides a program, @program{islscan}, which can be used
to check the syntax of an @language{ISL} specification. @program{islscan}
parses the specification and summarizes it to standard output:
@transcript
% @userinput{islscan Tutorial.isl}
Interface "Tutorial", imports "ilu"
@{defined on line 1
of file /tmp/tutorial/Tutorial.isl (Fri Jan 27 09:41:12 1995)@}
Types:
real @{, referenced on 10 11 12 13 14 15@}
Classes:
Calculator @{defined on line 17@}
methods:
SetValue (v : real); @{defined 10, id 1@}
"Set the value of the calculator to `v'"
GetValue () : real; @{defined 11, id 2@}
"Return the value of the calculator"
Add (v : real); @{defined 12, id 3@}
"Adds `v' to the calculator's value"
Subtract (v : real); @{defined 13, id 4@}
"Subtracts `v' from the calculator's value"
Multiply (v : real); @{defined 14, id 5@}
"Multiplies the calculator's value by `v'"
Divide (v : real) @{DivideByZero@}; @{defined 16, id 6@}
"Divides the calculator's value by `v'"
documentation:
"4-function calculator"
unique id: ilu:cigqcW09P1FF98gYVOhf5XxGf15
Exceptions:
DivideByZero @{defined on line 5, refs 15@}
%
@end transcript
@noindent
@program{islscan} simply lists the types defined in the interface, separating
out object types (which it calls ``classes''), the exceptions, and
the constants. Note that for the @type{Calculator} object type,
it also lists something called its @dfn{unique id}. This is a 160-bit
number (expressed in base 64) that @system{ILU} assigns automatically
to every type, as a way of distinguishing them. While
it might be interesting to know that it exists (:-),
the @system{ILU} user never has to know what it is; @program{islscan}
supplies it for the convenience of the @system{ILU} implementors, who
sometimes do have to know it.
@page
@section Implementing the True Module
After we've defined an interface, we then need to supply an implementation
of our module. Implementations can be done in any language supported by
@system{ILU}. Which language you choose often depends on what sort
of operations have to be performed in implementing the specific functions
of the module. Different languages have specific advantages and disadvantages
in different areas. Another consideration is whether you wish to use the
implementation mainly as a library, in which case it should probably be done
in the same language as the rest of your applications, or mainly as
a remote service, in which case the specific implementation language
is less important.
We'll demonstrate an implementation of the @type{Calculator}
object type in @language{Python}, which is one of the most capable
of all the @system{ILU}-supported languages. This is just a matter
of defining a @language{Python} class, corresponding to the @type{Tutorial.Calculator} type. Before we do that,
though, we'll explain how the names and signatures of the @language{Python} functions
are arrived at.
@subsection What the Interface Looks Like in Python
For every programming language
supported by @system{ILU}, there is a standard @dfn{mapping} defined
from @language{ISL} to that programming language. This mapping defines
what @language{ISL} type names, exception names, method names,
and so on look like
in that programming language.
The mapping for @language{Python} is straightforward. For type names,
such as @isl{Tutorial.Calculator}, the @language{Python} name
of the @language{ISL} type @isl{Interface.Name}
is @Python{Interface.Name}, with any hyphens replaced by underscores. That is, the name of the interface in @language{ISL}
becomes the name of the module in @language{Python}.
So the name of our @type{Calculator} type in @language{Python}
would be @Python{Tutorial.Calculator}, which is really the name of a @language{Python} class.
The @language{Python} mapping for a method name such as @isl{SetValue}
is the method name, with any hyphens replaced by underscores.
The return type of this @language{Python} method is whatever is specified
in the @language{ISL} specification for the method, or @Python{None} if
no type is specified. The arguments for the @language{Python} method are the
same as specified in the @language{ISL}; their types are the
@language{Python} types corresponding to the @language{ISL} types, @emph{except}
that one extra argument is added to the beginning of each @language{Python}
version of an @language{ISL} method; it is an @dfn{instance} of the object type
on which the method is defined. An instance is simply a value of that
type. Thus the @language{Python} method corresponding
to our @language{ISL} @isl{SetValue} would have the prototype signature
@codeexample
def SetValue (self, v):
@end codeexample
@noindent
Similarly, the signatures for the other methods, in @language{Python}, are
@codeexample
def GetValue (self):
def Add (self, v):
def Subtract (self, v):
def Multiply (self, v):
def Divide (self, v):
@end codeexample
@noindent
Note that even though the @isl{Divide} method can raise an exception,
the signature looks like those of the other methods. This is because
the normal @language{Python} exception signalling mechanism is used to
signal exceptions back to the caller.
The mapping of exception names is similar to the mapping used for types.
So the exception @isl{Tutorial.DivideByZero}
would also have the name @Python{Tutorial.DivideByZero}, in @language{Python}.
One way to see what all the @language{Python} names for an interface
look like is to run the program @program{python-stubber}. This program
reads an @language{ISL} file, and generates the necessary @language{Python}
code to support that interface in @language{Python}. One of the files
generated is @file{@metavar{Interface}.py}, which contains the definitions
of all the @language{Python} types for that interface.
@transcript
% @userinput{python-stubber Tutorial.isl}
client stubs for interface "Tutorial" to Tutorial.py ...
server stubs for interface "Tutorial" to Tutorial__skel.py ...
%
@end transcript
@page
@subsection Building the Implementation
To provide an implementation of our interface, we @dfn{subclass} the
generated @language{Python} class for our @class{Calculator} class:
@codeexample
# CalculatorImpl.py
import Tutorial, Tutorial__skel
class Calculator (Tutorial__skel.Calculator):
    def __init__ (self):
        self.the_value = 0.0
    def SetValue (self, v):
        self.the_value = v
    def GetValue (self):
        return self.the_value
    def Add (self, v):
        self.the_value = self.the_value + v
    def Subtract (self, v):
        self.the_value = self.the_value - v
    def Multiply (self, v):
        self.the_value = self.the_value * v
    def Divide (self, v):
        try:
            self.the_value = self.the_value / v
        except ZeroDivisionError:
            raise Tutorial.DivideByZero
@end codeexample
Each instance of a @Python{CalculatorImpl.Calculator} object
inherits from @Python{Tutorial__skel.Calculator}, which in turn
inherits from @Python{Tutorial.Calculator}. Each has an instance
variable called @Python{the_value}, which maintains a running total
of the `accumulator' for that instance. We can create an instance
of a @isl{Tutorial.Calculator} object by simply calling @Python{CalculatorImpl.Calculator()}.
@page
So, a very simple program to use the @isl{Tutorial} module might be
the following:
@codeexample
# simple1.py, a simple program that demonstrates the use of the
# Tutorial true module as a library.
#
# run this with the command "python simple1.py NUMBER [NUMBER...]"
#
import Tutorial, CalculatorImpl, string, sys
# A simple program:
# 1) make an instance of Tutorial.Calculator
# 2) add all the arguments by invoking the Add method
# 3) print the resultant value.
def main (argv):
    c = CalculatorImpl.Calculator()
    if not c:
        error("Couldn't create calculator")
    # clear the calculator before using it
    c.SetValue (0.0)
    # now loop over the arguments, adding each in turn
    for arg in argv[1:]:
        v = string.atof(arg)
        c.Add (v)
    # and print the result
    print "the sum is", c.GetValue()
    sys.exit(0)
main(sys.argv)
@end codeexample
@noindent
This program would be run as follows:
@transcript
% @userinput{python simple1.py 34.9 45.23111 12}
the sum is 92.13111
%
@end transcript
@noindent
This is a completely self-contained use of the @isl{Tutorial}
implementation; when a method is called, it is the true method
that is invoked. The use of @system{ILU} in this program adds
some overhead in terms of included code, but has almost
the same performance as a version of this program that does not
use @system{ILU}.
From papresco@calum.csclub.uwaterloo.ca Wed Nov 12 19:40:28 1997
From: papresco@calum.csclub.uwaterloo.ca (Paul Prescod)
Date: Wed, 12 Nov 1997 14:40:28 -0500 (EST)
Subject: [DOC-SIG] Documentation formats
In-Reply-To: <3469F4D1.5B5B@digicool.com> from "Jim Fulton" at Nov 12, 97 01:26:25 pm
Message-ID: <199711121940.OAA12494@calum.csclub.uwaterloo.ca>
> IMO in an ideal world, people would author documentation in a modern
> word processor like Frame or Word and people could share
> documentation files using some neutral format. I don't know if
> such a neutral format exists, although I seem to remember that at
> one point, Frame had a tool for working with SGML in Framemaker.
> I don't know what happened with that tool, but if it is still around,
> maybe people who hate editing SGML could use Frame or some other
> format
> that supports SGML and other folks could hack SGML or use tools that
> convert between their favorite editing environment and SGML.
That's right.
There are more tools for allowing you to create SGML documents without
typing tags than there are for TeX, LaTeX and TIM. For a while I worked
in the source code bowels of one (extending it, not creating it). And
because SGML is an international standard, there are always more tools being
created that allow you to do so.
Still, in the interest of truth in advertising, I should mention that in
my opinion, the idea that you will one day just create documents in a WYSIWYG
editor without worrying about the structure is a fantasy. Computers cannot
infer structure. The user interface must reflect the structure that you
want in your SGML files. If you want elements like "class", "method",
and "hyperlink", then you must be aware of their availability.
Paul Prescod
From jim.fulton@digicool.com Wed Nov 12 20:02:32 1997
From: jim.fulton@digicool.com (Jim Fulton)
Date: Wed, 12 Nov 1997 15:02:32 -0500
Subject: [DOC-SIG] Documentation formats
References: <199711121940.OAA12494@calum.csclub.uwaterloo.ca>
Message-ID: <346A0B58.320E@digicool.com>
Paul Prescod wrote:
>
> Still, in the interest of truth in advertising, I should mention that in
> my opinion, the idea that you will one day just create documents in a WYSIWYG
> editor without worrying about the structure is a fantasy. Computers cannot
> infer structure. The user interface must reflect the structure that you
> want in your SGML files. If you want elements like "class", "method",
> and "hyperlink", then you must be aware of their availability.
Both Frame and Word let you create documents based on structural
elements.
So you can define and preserve structure while working in a WYSIWYG
environment.
--
Jim Fulton jim@digicool.com
Technical Director 540.371.6909 Python Powered!
Digital Creations http://www.digicool.com/ http://www.python.org/
From papresco@technologist.com Wed Nov 12 22:27:19 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 12 Nov 1997 17:27:19 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
References: <3469D7C5.F90F32F9@technologist.com>
<199711121713.MAA28481@lemur.magnet.com> <199711121931.OAA24333@fermi.eeel.nist.gov>
Message-ID: <346A2D47.718C60DC@technologist.com>
Michael McLay wrote:
> Downside:
>
> 1) Heavy dependence on external programs which may not be on every platform
> MAKEINFO = '/usr/bin/makeinfo'
> TEX = '/usr/bin/tex'
> TEXINDEX = '/usr/bin/texindex'
> DVIPS = '/usr/bin/dvips'
>
> 2) May require some work to get the reference manual indexing
> working with the new tools.
> 3) Restricted set of tags, which makes it fairly hard to extend
> (except by using macros.)
> 4) Mixes macro language with markup. Is this really a problem?
> The TIM macros seem to primarily be used to declare context names
> which are then translatable to generic typographic codes. This
> should make it easier to move the tagged text to meaningful XML
> tags.
5) Mixes formatting ("@page, @noindent") with structure
("@codeexample")
6) Does not seem to allow restrictions on macro roles to be
expressed
7) There are no editors that will help you to create TIM documents
correctly (and will probably never be)
8) We will have to develop new output formats from scratch whereas
with SGML/Jade they are reused across an industry.
9) FrameMaker cannot import or export TIM, so we will have written
off a great WYSIWYG typesetting tool.
More important, to me: using TIM would generally contribute to the
"multiplication of documentation formats". The rest of the software
industry is about to rally around SGML's XML incarnation. Bill Gates
says it's the greatest thing since sliced bread. Marc Andreessen agrees
with him (first time in history!). Adobe is also on board. I know many
Cygnus people are interested in SGML and are working to move Cygnus
tools over.
XML is not exactly what we want, but SGML is at least still in the same
ballpark -- the same parsers and other tools will usually support
either. The same DTDs can support both. Python software that we develop
to support SGML will be used across many different projects, whereas
software to support TIM will probably be used for the library reference,
ILU and nothing else. Great SGML support in Python would actually
attract new users. I know I've turned some people onto Python via SGML
and so have others...there are even books on SGML that discuss Python.
One day there may be a book on SGML processing in Python.
In short, I think that subscribing to standards is the Right Thing
unless they are flawed. IMO, nobody has yet made a serious case against
SGML in this regard. SGML addressed everything that everyone has
complained about ("too verbose", "no tools", "too many delimiter
chars").
If we want, we can also define an SGML subset simple enough to be parsed
with Python tools alone. We will just take XML and add back in the
shortcuts we like from SGML and fix up sgmllib.py to support them. There
is no reason that SGML should be harder to parse than TIM if we restrict
ourselves to a subset. Really the only thing that's very hard about
generic SGML is automatic tag omission. If we forgo that (as TIM does)
then SGML is not really hard to parse.
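A minimal sketch of the subset idea above, assuming a modern Python in
which sgmllib is no longer part of the standard library, so
html.parser.HTMLParser stands in for it. The names StrictSubsetParser and
parse are illustrative and not part of any existing tool. Omitted or
mismatched end tags are rejected rather than inferred, which is the
restriction that keeps the parser small:

```python
from html.parser import HTMLParser


class StrictSubsetParser(HTMLParser):
    """Build a nested (tag, children) tree; every element needs its own end tag."""

    def __init__(self):
        super().__init__()
        self.root = ("#document", [])
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = (tag, [])
        self.stack[-1][1].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        open_tag = self.stack[-1][0]
        if open_tag != tag:
            # No tag omission: the end tag must match the innermost open element.
            raise ValueError(f"expected </{open_tag}>, got </{tag}>")
        self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.stack[-1][1].append(data)


def parse(text):
    """Parse text and return the document tree, or raise ValueError."""
    parser = StrictSubsetParser()
    parser.feed(text)
    parser.close()
    if len(parser.stack) != 1:
        raise ValueError(f"unclosed element <{parser.stack[-1][0]}>")
    return parser.root
```

For example, parse("<para>Use <metavar>v</metavar> here.</para>") yields a
"para" node holding the text "Use ", a "metavar" node, and the text
" here.", while parse("<para>text") raises ValueError.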
Paul Prescod
From janssen@parc.xerox.com Wed Nov 12 22:29:51 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 14:29:51 PST
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711121722.MAA01255@eric.CNRI.Reston.Va.US>
References: <346943C3.91CCF8FC@technologist.com> <199711121441.JAA00616@eric.CNRI.Reston.Va.US>
<3469D7C5.F90F32F9@technologist.com>
<199711121722.MAA01255@eric.CNRI.Reston.Va.US>
Message-ID:
Excerpts from ext.python: 12-Nov-97 Re: [DOC-SIG] Comparing SGM.. Guido
van Rossum@CNRI.Re (6473)
> TIM,
> which has only one magic character (@, which isn't used in Python)
> fits the bill -- it did one or two years ago when I looked into it, and
> it's only because of inertia (and a lot of other things that needed to
> happen sooner) that I haven't started using it.
Since then, I believe, the TIM front-end has been re-written in Python,
as well.
Bill
From janssen@parc.xerox.com Wed Nov 12 22:33:20 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 14:33:20 PST
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <3469F09F.8DCBB65A@technologist.com>
References: <346943C3.91CCF8FC@technologist.com> <199711121441.JAA00616@eric.CNRI.Reston.Va.US>
<3469D7C5.F90F32F9@technologist.com> <199711121722.MAA01255@eric.CNRI.Reston.Va.US>
<3469F09F.8DCBB65A@technologist.com>
Message-ID:
Excerpts from ext.python: 12-Nov-97 Re: [DOC-SIG] Comparing SGM.. Paul
Prescod@technologis (6620*)
> How does TIM enforce the proper organization of document macros? Will it
> complain if I put an @messageDef{} inside of an @argDef{}? Doesn't this
> type of enforcement seem useful in a situation where many people around
> the world are working on a document?
It has no extra sanity checks for this; it uses whatever the Texinfo
checks are -- which isn't great, because Texinfo was originally a
collection of TeX hacks. TIM is clearly not as powerful as some
frameworks could be that were built in XML; however, it seems to be
powerful enough for a broad range of documents.
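The kind of nesting check asked about above can be sketched in a few
lines once the document is in an XML-like form. This is only an
illustration: the element names and the ALLOWED_CHILDREN table are
hypothetical, and a real DTD content model also constrains order and
repetition, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: which child elements each element may contain.
ALLOWED_CHILDREN = {
    "messagedef": {"argdef", "metavar"},
    "argdef": {"metavar"},
    "metavar": set(),
}


def nesting_errors(element):
    """Return a message for every child that its parent does not allow."""
    errors = []
    allowed = ALLOWED_CHILDREN.get(element.tag, set())
    for child in element:
        if child.tag not in allowed:
            errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        errors.extend(nesting_errors(child))
    return errors
```

With this table, a messagedef placed inside an argdef is reported, while
an argdef inside a messagedef passes.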
Bill
From janssen@parc.xerox.com Wed Nov 12 22:35:36 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 14:35:36 PST
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To:
References:
Message-ID:
Excerpts from ext.python: 12-Nov-97 Re: [DOC-SIG] Comparing SGM.. David
Ascher@skivs.ski.o (1841*)
> I certainly didn't try to get the @node system working well, since I don't
> think "real" info use was going to happen.
Yes; this is currently an unaddressed major pain inherited from Texinfo.
I'm planning on (someday) adding automatic node generation based on
@section, etc.
Bill
From janssen@parc.xerox.com Wed Nov 12 22:37:42 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 14:37:42 PST
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To:
References:
Message-ID:
Interested parties might also look at
http://www.parc.xerox.com/http-ng/architectural-model.html, which is
auto-generated from TIM source and illustrates the use of pictures and
URLs in TIM.
Bill
From fredrik@pythonware.com Wed Nov 12 22:40:17 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 12 Nov 1997 23:40:17 +0100
Subject: [DOC-SIG] Comparing SGML DTDs
Message-ID: <9711122240.AA15058@arnold.image.ivab.se>
> There is no reason that SGML should be harder to parse than TIM if
> we restrict ourselves to a subset. Really the only thing that's very hard
> about generic SGML is automatic tag omission. If we forgo that (as
> TIM does) then SGML is not really hard to parse.
I'd say the most important issue here is whether it's hard to write
or not. I don't think so, but I haven't dug into any serious DTD
yet...
Has anyone looked at RTF<->SGML conversion? Guess that could
allow people to use Frame or Word (FWIW, I'm writing my book
in Word, with an RTF template created by Frame, and the resulting
files are converted to SGML by the ORA wizards... don't ask me
how they do it, though).
Or is the Emacs SGML mode good enough?
(On the other hand, I'm sure I'll have to pay for "voting against" the
benevolent dictator... and I've had enough flames in my mailbox
today ;-)
arrogantly-and-simple-minded-ly y'rs /F
From janssen@parc.xerox.com Wed Nov 12 22:44:19 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 14:44:19 PST
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711121931.OAA24333@fermi.eeel.nist.gov>
References: <3469D7C5.F90F32F9@technologist.com>
<199711121713.MAA28481@lemur.magnet.com>
<199711121931.OAA24333@fermi.eeel.nist.gov>
Message-ID:
Just a few notes...
Excerpts from ext.python: 12-Nov-97 Re: [DOC-SIG] Comparing SGM..
Michael McLay@smtp.erols (21835*)
> 3) TIM was written in Python:-) (only about 820 lines of code)
TIM itself is just a macro front end to Texinfo that provides generic
markup, picture support, and URL support. That's what's written in
Python.
> 4) It looks like a markup that would be much easier to convert to
> XML than Latex. (My guess is that XML will eventually become the
> standard for WYSIWYG editors so the ugly tagging issue will go away.)
Yes. The current Perl script timdif2html provides HTML output; a
variant of that, or another Python script, would be used to produce XML.
> 1) Heavy dependence on external programs which may not be on every platform
> MAKEINFO = '/usr/bin/makeinfo'
> TEX = '/usr/bin/tex'
> TEXINDEX = '/usr/bin/texindex'
> DVIPS = '/usr/bin/dvips'
`makeinfo' and (I believe) `texindex' are part of the GNU Texinfo
package. TeX is freely available from Stanford (I think). `dvips' is a
commercial product used to convert TeX DVI to Postscript -- I'm not sure
if there's a freely available version.
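Whether those external programs are present on a given platform can be
checked up front. A minimal sketch, assuming the four program names from
the quoted configuration and using shutil.which from the standard
library; the name missing_tools is illustrative:

```python
import shutil

# Program names taken from the quoted configuration above.
REQUIRED_TOOLS = ["makeinfo", "tex", "texindex", "dvips"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if which(tool) is None]
```

Calling missing_tools() before a build gives a list that can be reported
to the user instead of failing halfway through the toolchain.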
> 3) Restricted set of tags, which makes it fairly hard to extend
> (except by using macros.)
You are restricted to the base tag set supported by the Texinfo tools.
However, arbitrary TIM renamings for these are available.
> 4) Mixes macro language with markup. Is this really a problem?
> The TIM macros seem to primarily be used to declare context names
> which are then translatable to generic typographic codes. This
> should make it easier to move the tagged text to meaningful XML
> tags.
That's correct. At some point a TIM parser should be written which
provides a parse tree that preserves the generic markup.
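A rough sketch of what such a parse tree could look like, assuming only
the "@name{...}" form mentioned in this thread. Real TIM and Texinfo
syntax has more cases (escapes, line-oriented commands) that are not
handled here, and the function names are illustrative:

```python
def parse_tim(text):
    """Parse '@name{...}' markup into strings and (name, children) tuples."""
    nodes, _ = _parse_nodes(text, 0, nested=False)
    return nodes


def _parse_nodes(text, pos, nested):
    nodes = []
    chunk_start = pos
    while pos < len(text):
        if nested and text[pos] == "}":
            break  # the caller consumes the closing brace
        if text[pos] == "@":
            name_end = pos + 1
            while name_end < len(text) and text[name_end].isalnum():
                name_end += 1
            is_macro = (name_end > pos + 1 and name_end < len(text)
                        and text[name_end] == "{")
            if is_macro:
                if pos > chunk_start:
                    nodes.append(text[chunk_start:pos])
                name = text[pos + 1:name_end]
                children, close = _parse_nodes(text, name_end + 1, nested=True)
                if close >= len(text):
                    raise ValueError(f"unclosed @{name}{{")
                nodes.append((name, children))
                pos = close + 1
                chunk_start = pos
                continue
        pos += 1
    if pos > chunk_start:
        nodes.append(text[chunk_start:pos])
    return nodes, pos
```

The generic names survive in the tree, so a later pass can map each one
to an HTML, XML, or typographic construct as needed.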
Bill
From klm@python.org Wed Nov 12 17:27:54 1997
From: klm@python.org (Ken Manheimer)
Date: Wed, 12 Nov 1997 12:27:54 -0500 (EST)
Subject: [DOC-SIG] [XML] Notes on the Tutorial's markup
In-Reply-To: <199711111835.NAA02624@lemur.magnet.com>
Message-ID:
[Sorry - python.org sendmail got wedged, and reissued copies of one of
andrew kuchling's messages, having this message's subject line. It's
possible a few other messages got the same treatment, but it doesn't
look that way. In any case, sorry about the noise...]
Ken Manheimer klm@cnri.reston.va.us 703 620-8990 x268
(orporation for National Research |nitiatives
# Thanks for joining the PSA! #
# http://www.python.org/psa/ #
From janssen@parc.xerox.com Wed Nov 12 21:47:51 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 13:47:51 PST
Subject: [DOC-SIG] comments on Python mapping
Message-ID: <97Nov12.144751pdt."404702"@watson.parc.xerox.com>
Martin, some comments on the Python mapping:
1) How about putting a version # on it so that we can keep track of
which version we're looking at?
2) Python keywords: an underscore suffix is valid OMG IDL, so it shouldn't
be used to discriminate keywords. The current ILU mapping uses an underscore
prefix on Python keywords.
3) Long double should use something like the thing in ILU, *not* be mapped
(with loss of information) to a Python floating point, unless that floating
point can in fact represent an OMG IDL long double.
4) char should be mapped as an integer, to be consistent with wchar.
5) I found the "fixed" example a bit confusing, because of the use of
"a" as a parameter in the first bulleted item. How about saying
"fixed", or some such?
6) Can't we just use "None" for NIL objects?
7) I'd like the "create_request" operation to take the repository ID of the
interface somehow, possibly as a keyword parameter. The CORBA notion of
just passing the method name is inherently broken.
8) The POA inheritance-based impl described seems to break one of the
most cherished parts of ILU, the ability to use true classes directly in
an application. Am I wrong? Also, I'd suggest M__POA, instead of POA_M.
9) Is it necessary to say, ``A class may implement multiple interfaces
only if those interfaces are in a strict inheritance relationship.''
Why do we care, so long as it implements the interfaces it claims to?
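The underscore-prefix rule from item 2 can be sketched as follows. This
is an illustration only, not ILU's actual code; the function name
python_name is hypothetical and the keyword list is whatever the running
Python reports:

```python
import keyword


def python_name(idl_identifier):
    """Prefix an underscore when an IDL identifier collides with a Python keyword."""
    if keyword.iskeyword(idl_identifier):
        return "_" + idl_identifier
    return idl_identifier
```

An IDL operation called "import" would become "_import", while "balance"
is left alone.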
Bill
From janssen@parc.xerox.com Wed Nov 12 23:02:54 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Wed, 12 Nov 1997 15:02:54 PST
Subject: [DOC-SIG] comments on Python mapping
In-Reply-To: <97Nov12.144751pdt."404702"@watson.parc.xerox.com>
References: <97Nov12.144751pdt."404702"@watson.parc.xerox.com>
Message-ID:
Ooops. Serves me right trying to do two conversations at once. I've
resent this to the DO-sig.
Bill
From papresco@technologist.com Wed Nov 12 21:55:15 1997
From: papresco@technologist.com (Paul Prescod)
Date: Wed, 12 Nov 1997 16:55:15 -0500
Subject: [DOC-SIG] Documentation formats
References: <199711121940.OAA12494@calum.csclub.uwaterloo.ca> <346A0B58.320E@digicool.com>
Message-ID: <346A25C3.1106CAEE@technologist.com>
Jim Fulton wrote:
> Both Frame and Word let you create documents based on structural
> elements.
That's true, but those structural elements cannot nest and their
occurrences cannot be restricted (e.g. emph in emph).
> So you can define and preserve structure while working in a WYSIWYG
> environment.
I didn't dispute that. I pointed out that you must still think about
structure. In fact, you must think about it not just as much as you
would in an SGML editor, but more, because the editor gives you no help
in proper usage. This seems like the worst of all possible worlds to me.
More work, more thinking, no more freedom (which is what we usually
expect of a WYSIWYG editor).
The best of all possible worlds (for structured documentation) is a high
quality SGML-specific word processor. Frame+SGML comes close.
Paul Prescod
From papresco@calum.csclub.uwaterloo.ca Thu Nov 13 03:12:15 1997
From: papresco@calum.csclub.uwaterloo.ca (Paul Prescod)
Date: Wed, 12 Nov 1997 22:12:15 -0500 (EST)
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <9711122240.AA15058@arnold.image.ivab.se> from "Fredrik Lundh" at Nov 12, 97 11:40:17 pm
Message-ID: <199711130312.WAA01130@calum.csclub.uwaterloo.ca>
> I'd say the most important issue here is whether it's hard to write
> or not. I don't think so, but I haven't dug into any serious DTD
> yet...
Don't try to pin that 'serious DTD' rap on SGML. :) If TIM is as structured
as anyone wants, then SGML can be as unstructured as TIM. It can also be
as tag minimized as TIM.
> Has anyone looked at RTF<->SGML conversion? Guess that could
> allow people to use Frame or Word (FWIW, I'm writing my book
> in Word, with an RTF template created by Frame, and the resulting
> files are converted to SGML by the ORA wizards... don't ask me
> how they do it, though).
You can convert RTF to SGML if you have no interest in taking advantage
of SGML's greatest feature. :) I've been trying to make this point but have
obviously not been having much success. SGML's greatest feature (which,
admittedly it shares with TIM) is that it allows you to develop new
abstractions and tag them. RTF (and Word) is an abstraction killer. Its most
sophisticated abstraction is the "paragraph". If we are going to start
from RTF then there is very little value in using SGML at any stage in the
process.
SGML's second greatest feature (which it does not share with TIM) is that it
is an international (and soon W3C) standard with hundreds of tools and
tens of thousands of users and sites. I guess it is vaguely possible that
one of those tools will be useful with our "RTF-Demented SGML" but it isn't
likely. SGML is designed to be a source format, not a converted-to format.
But if you don't want to type tags at all, there is Frame+SGML and even
(uck) "SGML Author for Word". You still have to think about *structure* but
you don't have to type tags. In my experience, however, changing styles from
one to the other is no easier than typing tags. You can bind your styles
to a hotkey, but you can do the same with tags in Emacs.
> Or is the Emacs SGML mode good enough?
I think so. My only concern is that I have found it slow with huge DTDs on
slow machines. At home (P100, 32MB) it is quite good. With reasonably sized
DTDs (e.g. DocBook subsets) it is also quite good.
Paul Prescod
From Fred L. Drake, Jr."
References: <3469D7C5.F90F32F9@technologist.com>
<199711121713.MAA28481@lemur.magnet.com>
<199711121931.OAA24333@fermi.eeel.nist.gov>
<346A2D47.718C60DC@technologist.com>
Message-ID: <199711131513.KAA24509@weyr.cnri.reston.va.us>
Paul Prescod writes:
> In short, I think that subscribing to standards is the Right Thing
I concur.
> If we want, we can also define an SGML subset simple enough to be parsed
> with Python tools alone. We will just take XML and add back in the
> shortcuts we like from SGML and fix up sgmllib.py to support them. There
> is no reason that SGML should be harder to parse than TIM if we restrict
> ourselves to a subset. Really the only thing that's very hard about
> generic SGML is automatic tag omission. If we forgo that (as TIM does)
> then SGML is not really hard to parse.
The SGMLParser class from Grail is much better about SGML shortcuts
in "strict" mode (the non-strict mode is intended to support Web-style
HTML, i.e., invalid, and is not interesting for us). It supports
empty end tags (</>), and I think empty start tags (<>)
are tolerably o.k., but I'm less convinced I understand
the correct behavior, and haven't had any time to really validate it
against SP.
I remember reading something that indicated the null end tags should
be discouraged. Can you fill us in on the SGML community's current
attitude on this? Does this only apply in the presence of SGML
editors like FM+SGML or should the avoidance also apply to manually
applied & revised markup?
-Fred
--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191-5434
From Fred L. Drake, Jr."
References: <199711111835.NAA02624@lemur.magnet.com>
<3468D0BE.69C2052A@technologist.com>
<199711112204.RAA22015@weyr.cnri.reston.va.us>
<3469B8F9.32838B0@technologist.com>
Message-ID: <199711131614.LAA24610@weyr.cnri.reston.va.us>
Paul Prescod writes:
> I think we should tackle the tutorial first, as it will require less
> custom markup and programming.
As Guido pointed out, there are non-technical reasons not to mess
with this one. I think the primary advantages of SGML/XML come about
when dealing with something interesting like the Library Reference,
which needs both heavily structured data and substantial
sections of prose.
> > Regarding processing, I'd have no problems using SP to do this; a
> > Python interface to the generic interface would not be difficult to
> > create, if a little tedious. I'm willing to do this, but it would be
> > evenings / weekends, and only if it'll get used.
>
> If you do this, I would strongly encourage you to skip the Generic
> Interface and move to the more powerful Grove Interface. On Windows, this
When I last looked at the interfaces, jade and the grove interface
were new. I'll take a look at the grove interface when I get a
chance; it does sound like it would be more useful.
> But anyhow, cool as the grove interface is, it isn't clear yet that we
> need any interface for this particular project. Hopefully we can depend
> on the existing tools (Jade and existing stylesheets). Once we want to
> go beyond their capabilities, we must decide whether to extend Python to
For static output there should be no need to use anything other than
jade & a collection of stylesheets. I was thinking more along the
lines of run-time access to the library reference, which could be used
to support interactive help systems and the like. It may be that
static generation of a more Python-friendly format would be
appropriate; perhaps a shelf as a repository for small documentation
objects that could then be rendered for display at runtime based on
user preferences or context.
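The shelf idea could look roughly like this. It is a sketch under stated
assumptions: each documentation object is a plain dict with "signature"
and "summary" keys, and the function names are invented for illustration.
Only the standard shelve and textwrap modules are used.

```python
import shelve
import textwrap


def build_doc_shelf(path, entries):
    """Store each documentation object under its dotted name."""
    with shelve.open(path) as shelf:
        for name, entry in entries.items():
            shelf[name] = entry


def render(path, name, width=72):
    """Look up one entry at run time and format it for a given line width."""
    with shelve.open(path) as shelf:
        entry = shelf[name]
    summary = textwrap.fill(entry["summary"], width=width)
    return f"{entry['signature']}\n{summary}"
```

The shelf would be generated once from the SGML source; an interactive
help system then only needs render(), and the width (or any other display
preference) is decided at lookup time rather than at build time.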
-Fred
--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191-5434
From Fred L. Drake, Jr."
References: <9711121628.AA05017@arnold.image.ivab.se>
<3469E3F8.8C4BAD63@technologist.com>
Message-ID: <199711131621.LAA24617@weyr.cnri.reston.va.us>
Paul Prescod writes:
> From (La)TeX? Not really. Parsing LaTeX is not only difficult, but
> relatively undefined. There is no one language called "LaTeX"; it is
> really a family of languages more or less defined by the Lamport book.
Python at one point (fairly recently) included a script that
converted LaTeX from the library reference to texinfo, so it's
actually not too painful as measured in development time, but would
need to be manually fixed up afterwards. I'd expect this to be a
one-time-only conversion, but a tool is probably the right way to do
it. The library reference is mostly pretty well structured.
> And lots of LaTeX documents mix generic structures and formatting
> interchangeably. Finally, there is no easy way to figure out how to
> handle macros. Should they be expanded to their TeX primitives (uck)? If
> not, how do we know how to represent user defined macros in the target
> DTD? If you make a "foobar" macro, what do I do with it in SGML?
In the Python documentation, macro definitions are largely reserved
for the mystyle.sty file, and everything else uses those macros. So
it's much less ad hoc than general LaTeX. I don't expect a general
tool for TeX->SGML can be developed in a finite time. I certainly have
no intention to try!
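Because the documentation sticks to a fixed macro set, a one-time
conversion can be table-driven. A minimal sketch, assuming one-argument
macros of the form \name{text}; the macro names and target element names
in MACRO_TO_ELEMENT are hypothetical, nested macros are not handled, and
anything unrecognized is left in place for manual fix-up:

```python
import re
from xml.sax.saxutils import escape

# Hypothetical mapping from one-argument macros to target element names.
MACRO_TO_ELEMENT = {"code": "literal", "var": "metavar", "module": "modulename"}

# Matches \name{argument} where the argument contains no braces.
MACRO_RE = re.compile(r"\\(\w+)\{([^{}]*)\}")


def convert_macros(text):
    """Rewrite known macros as elements; leave unknown ones untouched."""
    def replace(match):
        name, argument = match.group(1), match.group(2)
        element = MACRO_TO_ELEMENT.get(name)
        if element is None:
            return match.group(0)
        return f"<{element}>{escape(argument)}</{element}>"

    return MACRO_RE.sub(replace, text)
```

A general TeX converter would have to expand arbitrary macros; this only
works because the input is restricted to a known vocabulary.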
-Fred
--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191-5434
From Fred L. Drake, Jr."
References: <346943C3.91CCF8FC@technologist.com>
<199711121441.JAA00616@eric.CNRI.Reston.Va.US>
<3469D7C5.F90F32F9@technologist.com>
<199711121722.MAA01255@eric.CNRI.Reston.Va.US>
Message-ID: <199711131629.LAA24629@weyr.cnri.reston.va.us>
Guido van Rossum writes:
> I just don't like the fact that SGML makes characters that occur
> frequently in Python source code like "<" and "/" special. Also the
As Paul pointed out, this is pretty bogus. The only sort of
conflict I can see which could cause legal Python code to be
interpreted as an SGML or XML construct would be something like this:
ok = ok&flag; print ok
^^^^^^
This is legal Python, but ugly as hell, and I don't think I've ever
seen the "&" operator used without spaces. So I'm not concerned.
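Even that case goes away if code samples are escaped mechanically when
they are placed in the document. A small sketch using escape and unescape
from xml.sax.saxutils; the helper name to_listing is illustrative, and
programlisting is the DocBook element for code listings:

```python
from xml.sax.saxutils import escape, unescape


def to_listing(source):
    """Wrap Python source in a listing element with & and < escaped."""
    return "<programlisting>" + escape(source) + "</programlisting>"


code = "ok = ok&flag; print ok"
escaped = escape(code)            # 'ok = ok&amp;flag; print ok'
assert unescape(escaped) == code  # the round trip loses nothing
```

The author never types the entity references by hand; the tool that
inserts the example does it.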
> fact that SGML parsers that support the full syntax are either costly
> in money or in resources (few sites that I know have an SGML parser
Again, as Paul pointed out, SP and jade are free and substantially
cross platform as long as a solid C++ compiler is available. (gcc
counts.) If you're worried about having to install this stuff at
CNRI, know that jade 1.0 has been installed for a while. ;-) I think
1.1 is out; if so I'll upgrade our installation.
The tools are not unreasonable, they're just not written in Python.
-Fred
--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191-5434
From janssen@parc.xerox.com Thu Nov 13 21:14:41 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Thu, 13 Nov 1997 13:14:41 PST
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711130312.WAA01130@calum.csclub.uwaterloo.ca>
References: <199711130312.WAA01130@calum.csclub.uwaterloo.ca>
Message-ID:
Excerpts from ext.python: 12-Nov-97 Re: [DOC-SIG] Comparing SGM.. Paul
Prescod@calum.csclu (2299*)
> RTF (and Word) is an abstraction killer. Its most
> sophisticated abstraction is the "paragraph". If we are going to start
> from RTF then there is very little value in using SGML at any stage in the
> process.
I completely agree. RTF is the wrong direction. As is HTML or Texinfo,
for the same reason -- too concrete. TIM and XML are attempts at
removing that problem from Texinfo and HTML, respectively.
Bill
From amk@magnet.com Thu Nov 13 21:30:49 1997
From: amk@magnet.com (Andrew Kuchling)
Date: Thu, 13 Nov 1997 16:30:49 -0500 (EST)
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: (message from Bill
Janssen on Thu, 13 Nov 1997 13:14:41 PST)
Message-ID: <199711132130.QAA11204@lemur.magnet.com>
All this discussion of SGML/XML is moot if we can't fix the basic
problem of SGML's special characters conflicting with Python's. Is
that problem solvable? Paul, you mentioned that it's possible to use
characters other than <> in SGML, but it's not commonly done. Why
not? What about XML? Would it be possible to write a DTD that looked
like TIM, instead of looking like HTML?
Andrew Kuchling
amk@magnet.com
http://starship.skyport.net/crew/amk/
From papresco@technologist.com Fri Nov 14 01:08:29 1997
From: papresco@technologist.com (Paul Prescod)
Date: Thu, 13 Nov 1997 20:08:29 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
References: <199711132130.QAA11204@lemur.magnet.com>
Message-ID: <346BA48D.401C61AF@technologist.com>
Andrew Kuchling wrote:
>
> All this discussion of SGML/XML is moot if we can't fix the basic
> problem of SGML's special characters conflicting with Python's. Is
> that problem solvable? Paul, you mentioned that it's possible to use
> characters other than <> in SGML, but it's not commonly done. Why
> not? What about XML? Would it be possible to write a DTD that looked
> like TIM, instead of looking like HTML?
There is no basic problem. SGML/XML has two basic markup-starting
delimiters, "&" and "<". TIM has three, it seems: "@", "{", and "}". As far
as I can tell, the TIM characters appear in Python docs just as often as
the SGML ones. Both languages have techniques for suppressing markup
recognition ("escaping").
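The frequency claim is easy to measure on any body of text. A minimal
sketch, assuming the delimiter sets are exactly the characters named
above; the function name and the sample string are illustrative, not
measurements of the real documentation:

```python
from collections import Counter

SGML_DELIMITERS = "&<"
TIM_DELIMITERS = "@{}"


def delimiter_counts(text, delimiters):
    """Count how often each delimiter character occurs in text."""
    counts = Counter(text)
    return {ch: counts[ch] for ch in delimiters}


sample = 'if a < b:\n    d = {"k": v & 1}\n'
sgml = delimiter_counts(sample, SGML_DELIMITERS)  # {'&': 1, '<': 1}
tim = delimiter_counts(sample, TIM_DELIMITERS)    # {'@': 0, '{': 1, '}': 1}
```

Running the same function over the actual library reference sources would
settle the question for the text that matters.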
Paul Prescod
From papresco@technologist.com Fri Nov 14 02:15:13 1997
From: papresco@technologist.com (Paul Prescod)
Date: Thu, 13 Nov 1997 21:15:13 -0500
Subject: [DOC-SIG] SGML for Python Documentation
Message-ID: <346BB431.52CA5300@technologist.com>
I spent the evening on a miniature version of the process of moving the
Python library docs into SGML. I think this will demonstrate SGML's
suitability to the task. The actual document I moved over is a subset of
the TIM document posted yesterday (using ILU and Python). In particular
I:
* Whipped up a mini-DTD that blended DocBook and the ILU special
abstractions ("metavar", "language", etc.)
* Encoded some of the TIM docs in my new SGML-based language
* Wrote a quickie mapping from the ILU abstractions to built-in DocBook
elements
* Ran the result through Norm Walsh's DocBook DSSSL stylesheets for
print and HTML
* Loaded the resulting RTF file into Word
* Made a PostScript file (warning -- Word PS files are "funny" -- I
wouldn't copy them directly to a PS printer if I were you)
All of this was done with free tools except for making the PostScript
file. Theoretically I could have also done that with Microsoft's free
"RTF Viewer" or with TeX. All of the source and result files are at:
http://itrc.uwaterloo.ca/~papresco/ilusgml.zip. All you need to run the
stylesheets is Jade, from http://www.jclark.com/jade. It compiles easily
on every platform I have tried.
I think that the resulting PostScript and HTML files are beautiful. If
there is some aspect that is not beautiful, we can upgrade the
stylesheets quite easily. The complete source is in the zip file on my
website.
I include the contents of the SGML in this email because people on the
list were curious what SGML/DocBook looks like. I could have actually
made the SGML much smaller, but I stuck to a very simple subset of SGML.
I suspect it is the exact subset that sgmllib.py can parse, so it is
already "Python compatible". Were I to use completely idiomatic SGML, I
could reduce the markup by quite a bit, but then sgmllib.py would not be
able to parse it anymore.
I believe that I have thus refuted the arguments that SGML is verbose,
too hard to parse, too expensive and otherwise not appropriate for the
task.
Paul Prescod
Using ILU with Python: A Tutorial>
Bill Janssen
1995 > Xerox Corporation>>
Introduction>
This tutorial will show how to use the ILU>
Each of the programs and files referenced in this tutorial is
available as a complete program in a separate appendix to this
document; parts of programs are quoted in the text of the tutorial.>
Specifying the Interface>
Our first task is to specify more exactly what it is we're
trying to provide. A typical four-function calculator lets a user
enter a value, then press an operation key, either +, -, /, or *, then
enter another number, then press = to actually have the operation
happen. There's usually a CLEAR button to press to reset the state of
the calculator. We want to provide something like that.>
We'll recast this a bit more formally as the
For example, we can think of the calculator as an object type,
with several methods: Add, Subtract, Multiply, Divide, Clear, etc.
The interface for our calculator would be written in ISL as:>
INTERFACE Tutorial;
EXCEPTION DivideByZero;
TYPE Calculator = OBJECT
METHODS
SetValue (v : REAL),
GetValue () : REAL,
Add (v : REAL),
Subtract (v : REAL),
Multiply (v : REAL),
Divide (v : REAL) RAISES DivideByZero END
END;
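
To make the shape of this interface concrete, the following is a plain Python sketch of an object with the same methods and the same exception. It is only an illustration under that assumption: it is not the code ILU generates from the ISL, and it involves no ILU machinery at all.

```python
class DivideByZero(Exception):
    """Signalled when Divide is called with a value of 0."""


class Calculator:
    """Four-function calculator mirroring the ISL object type above."""

    def __init__(self):
        self._value = 0.0

    def SetValue(self, v):
        self._value = float(v)

    def GetValue(self):
        return self._value

    def Add(self, v):
        self._value += v

    def Subtract(self, v):
        self._value -= v

    def Multiply(self, v):
        self._value *= v

    def Divide(self, v):
        # The only method that declares an exception in the ISL.
        if v == 0:
            raise DivideByZero()
        self._value /= v


calc = Calculator()
calc.SetValue(10.0)
calc.Add(5.0)
calc.Divide(3.0)
result = calc.GetValue()
```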
This defines an interface
The exception,
The object type,
Note also that
TYPE NotOurCalculator = OBJECT
METHODS
SetValue () : REAL,
Add (v : REAL) : REAL,
Subtract (v : REAL) : REAL,
Multiply (v : REAL) : REAL,
Divide (v : REAL) : REAL RAISES DivideByZero END
END;
-- but we didn't.>
Our list of methods on
Another standard feature of
CONSTANT Zero : INTEGER = 0;
>
Definitions, of interface, types, constants, and exceptions, are
terminated with a semicolon.
>
We should expand our interface a bit by adding more documentation
on what our methods actually do. We can do this with the
INTERFACE Tutorial;
EXCEPTION DivideByZero
"this error is signalled if the client of the Calculator calls
the Divide method with a value of 0";
TYPE Calculator = OBJECT
COLLECTIBLE
DOCUMENTATION "4-function calculator"
METHODS
SetValue (v : REAL) "Set the value of the calculator to `v'",
GetValue () : REAL "Return the value of the calculator",
Add (v : REAL) "Adds `v' to the calculator's value",
Subtract (v : REAL) "Subtracts `v' from the calculator's value",
Multiply (v : REAL) "Multiplies the calculator's value by `v'",
Divide (v : REAL) RAISES DivideByZero END
"Divides the calculator's value by `v'"
END;
Note that we can use the
_______________
DOC-SIG - SIG for the Python Documentation Project
send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________
From janssen@parc.xerox.com Fri Nov 14 02:29:41 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Thu, 13 Nov 1997 18:29:41 PST
Subject: [DOC-SIG] SGML for Python Documentation
In-Reply-To: <346BB431.52CA5300@technologist.com>
References: <346BB431.52CA5300@technologist.com>
Message-ID:
Excerpts from ext.python: 13-Nov-97 [DOC-SIG] SGML for Python D.. Paul
Prescod@technologis (10796*)
> * Ran the result through Norm Walsh's DocBook DSSSL stylesheets for
> print and HTML
> * Loaded the resulting RTF file into Word
> * Made a PostScript file (warning -- Word PS files are "funny" -- I
> wouldn't copy them directly to a PS printer if I were you)
Yes, it's this kind of somewhat-defective tool chain that makes me
mistrust most current SGML-based solutions that I've seen.
My requirements:
- must be able to produce good plain text, Postscript or PDF, and
HTML versions of any document encoded in any new documentation
format;
- must be able to produce those automatically from the input, using
a script, not through any tools that require user interaction;
- tool chain must run on both UNIX and Windows
Unless the SGML tool chain satisfies those requirements, I'd keep looking.
Bill
From papresco@technologist.com Fri Nov 14 02:48:01 1997
From: papresco@technologist.com (Paul Prescod)
Date: Thu, 13 Nov 1997 21:48:01 -0500
Subject: [DOC-SIG] Re: [PSA MEMBERS] [XML] Notes on the Tutorial's markup
References: <199711111835.NAA02624@lemur.magnet.com>
<3468D0BE.69C2052A@technologist.com>
<199711112204.RAA22015@weyr.cnri.reston.va.us>
<3469B8F9.32838B0@technologist.com> <199711131614.LAA24610@weyr.cnri.reston.va.us>
Message-ID: <346BBBE1.EC4BEB6B@technologist.com>
Fred L. Drake wrote:
> For static output there should be no need to use anything other than
> jade & a collection of stylesheets. I was thinking more along the
> lines of run-time access to the library reference, which could be used
> to support interactive help systems and the like. It may be that
> static generation of a more Python-friendly format would be
> appropriate; perhaps a shelf as a repository for small documentation
> objects that could then be rendered for display at runtime based on
> user preferences or context.
That would be cool. It would also be pretty trivial on Windows right
now. Getting a grove from SP takes like two lines of code:
import groveoa
groveoa.GroveBuilder().parse( "libref-ch3.sgm" )
I only mention this because one of my memes is that things should be as
easy on Unix as under Windows -- in other words there should be good
tools for making ILU bindings and people should write ILU bindings
instead of language-specific bindings whenever possible.
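
The "shelf as a repository for small documentation objects" idea quoted above can be sketched with the standard shelve module. This is a sketch under assumptions: the entry names, the dictionary layout and the helper functions are invented for illustration, and a real repository would be generated from the library reference rather than typed by hand.

```python
import os
import shelve
import tempfile


def build_doc_shelf(path, entries):
    """Store small documentation objects (plain dicts here) keyed by name."""
    with shelve.open(path) as shelf:
        for name, doc in entries.items():
            shelf[name] = doc


def lookup(path, name):
    """Fetch one documentation object at run time, or None if absent."""
    with shelve.open(path, flag="r") as shelf:
        return shelf.get(name)


# Hypothetical entries; the keys and fields are assumptions.
entries = {
    "string.split": {
        "summary": "Split a string into a list of words.",
        "section": "string",
    },
    "string.join": {
        "summary": "Join a list of words into one string.",
        "section": "string",
    },
}

with tempfile.TemporaryDirectory() as workdir:
    shelf_path = os.path.join(workdir, "libref")
    build_doc_shelf(shelf_path, entries)
    found = lookup(shelf_path, "string.split")
    missing = lookup(shelf_path, "no.such.name")
```

An interactive help system could then render the retrieved object according to user preferences or context without reparsing the SGML source.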
Paul Prescod
From papresco@technologist.com Fri Nov 14 02:55:37 1997
From: papresco@technologist.com (Paul Prescod)
Date: Thu, 13 Nov 1997 21:55:37 -0500
Subject: [DOC-SIG] Comparing SGML DTDs
References: <3469D7C5.F90F32F9@technologist.com>
<199711121713.MAA28481@lemur.magnet.com>
<199711121931.OAA24333@fermi.eeel.nist.gov>
<346A2D47.718C60DC@technologist.com> <199711131513.KAA24509@weyr.cnri.reston.va.us>
Message-ID: <346BBDA9.997D6D62@technologist.com>
Fred L. Drake wrote:
> The SGMLParser class from Grail is much better about SGML shortcuts
> in "strict" mode (the non-strict mode is intended to support Web-style
> HTML, i.e., invalid, and is not interesting for us). It supports
> empty> end tags, and I think <>empty
> start tags> are tolerably o.k., but I'm less convinced I understand
> the correct behavior, and haven't had any time to really validate it
> against SP.
I think it's easy to understand, but it's not a feature I use. We would
probably avoid it for our subset.
> I remember reading something that indicated the null end tags should
> be discouraged. Can you fill us in on the SGML community's current
> attitude on this? Does this only apply in the presence of SGML
> editors like FM+SGML or should the avoidance also apply to manually
> applied & revised markup?
I don't know of a problem with null end tags, but I very rarely use SGML
tools other than nsgmls, jade and emacs. Still, as you have pointed out,
they are very easy to implement. Erik Naggum was always the most picky
about proper markup and I don't remember him saying anything against
NET. I guess you could run into a problem with
> * Made a PostScript file (warning -- Word PS files are "funny" -- I
>wouldn't copy them directly to a PS printer if I were you)
Depends on how you configure your printer, really. Here's a
trick I'm using to get perfectly portable files:
1. install the QMS 810 driver (a good ole PostScript level 1 printer)
2. rename it as "Plain PostScript" or something, and route it
to the FILE device.
The files you get will print on virtually everything, especially
if you avoid non-standard fonts.
> I believe that I have thus refuted the arguments that SGML is verbose,
> too hard to parse, too expensive and otherwise not appropriate for the
> task.
I'm convinced! ;-)
Thanks /F
PS. To please users without access to either Word or Frame, is there
some way to go from RTF -> PostScript out there? Perhaps via groff?
From papresco@technologist.com Fri Nov 14 14:27:42 1997
From: papresco@technologist.com (Paul Prescod)
Date: Fri, 14 Nov 1997 09:27:42 -0500
Subject: [DOC-SIG] SGML for Python Documentation
References: <01bcf0f4$df736760$6fadb4c1@fl-pc.image.ivab.se>
Message-ID: <346C5FDE.2C485F3B@technologist.com>
Fredrik Lundh wrote:
>
> PS. To please users without access to either Word or Frame, is there
> some way to go from RTF -> PostScript out there? Perhaps via groff?
RTF->Postscript is easy on Windows, with or without Word. I don't know
of a way to do it on other platforms.
I do know you can use the same DocBook stylesheet to go
SGML--[Jade]-->TeX--[TeX]-->dvi--[dvips]-->postscript, but you have to
install some TeX macro packages.
Oh yeah, and you can also go SGML--[Jade]-->TeX--[Texpdf]-->PDF
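
For anyone who wants to script the SGML-to-PostScript route, here is a minimal sketch that only builds the command lines and does not run anything. The file names, the jadetex command and the assumption that each tool writes its output next to its input under the same base name are all assumptions to be checked against the locally installed tools.

```python
import os


def plan_postscript_build(sgml_file, stylesheet):
    """Return argv lists for an SGML -> TeX -> DVI -> PostScript run."""
    base, _ = os.path.splitext(sgml_file)
    return [
        # Jade with the TeX back end and a DSSSL stylesheet.
        ["jade", "-t", "tex", "-d", stylesheet, sgml_file],
        # Assumed: a jadetex command that turns the .tex file into .dvi.
        ["jadetex", base + ".tex"],
        # dvips writes PostScript to the file named after -o.
        ["dvips", "-o", base + ".ps", base + ".dvi"],
    ]


# Hypothetical input and stylesheet names.
steps = plan_postscript_build("libref.sgm", "docbook-print.dsl")
```

Each entry could be handed to subprocess.run(step, check=True) in turn, which would keep the whole conversion free of user interaction.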
Paul Prescod
From papresco@technologist.com Fri Nov 14 14:40:15 1997
From: papresco@technologist.com (Paul Prescod)
Date: Fri, 14 Nov 1997 09:40:15 -0500
Subject: [DOC-SIG] SGML for Python Documentation
References: <346BB431.52CA5300@technologist.com>
Message-ID: <346C62CF.1F2E3061@technologist.com>
Bill Janssen wrote:
> Yes, it's this kind of somewhat-defective tool chain that makes me
> mistrust most current SGML-based solutions that I've seen.
>
> My requirements:
>
> - must be able to produce good plain text, Postscript or PDF, and
> HTML versions of any document encoded in any new documentation
> format;
> - must be able to produce those automatically from the input, using
> a script, not through any tools that require user interaction;
> - tool chain must run on both UNIX and Windows
>
> Unless the SGML tool chain satisfies those requirements, I'd keep looking.
The SGML tool chain can satisfy these requirements by going through TeX
instead of RTF. On my machine RTF is simpler because I don't have a
modern TeX installed. If TIM's output formats look especially
interesting for some particular project (texinfo for Emacs?) we could go
through sgmllib.py->TIM as well.
Paul Prescod
From Edward Welbourne Fri Nov 14 16:49:57 1997
From: Edward Welbourne (Edward Welbourne)
Date: Fri, 14 Nov 1997 16:49:57 GMT
Subject: [DOC-SIG] Comparing SGML DTDs
In-Reply-To: <199711132130.QAA11204@lemur.magnet.com>
References:
<199711132130.QAA11204@lemur.magnet.com>
Message-ID: <9711141649.AA29880@lslr6g.lsl.co.uk>
> All this discussion of SGML/XML is moot if we can't fix the basic
> problem of SGML's special characters conflicting with Python's.
In the main pages of documentation, the manuals &c., we don't have to
worry about all this because we can use fancy tools to do our document
generation for us, so the fact that some folk find typing raw SGML
tedious doesn't raise itself as an objection. Use of Frame+SGML,
(modern) emacs SGML mode or one of the better WYSI-more-or-less-WYG SGML
editors will suffice. The only place where there's a problem with SGML
is in the labour of writing and the ugliness of reading doc strings.
In a doc string, the only special characters are \, % and the quote
character used to delimit the string (right ?). Furthermore,
doc-strings are triple-quoted, so quote characters only matter if they
happen in triplicate: which doesn't happen in HTML. I don't see HTML
using \. And % only presents a problem if we want to use doc-strings as
format strings: I don't.
The fact that outside strings python has special readings of <,
>, &, / and ; is not an issue: we only intend to write our
XML/TIM/... inside doc strings or in files which aren't python code.
I see no conflict in need of a fix here.
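
A small sketch of the point, assuming a made-up function and made-up element names: markup characters sit in a triple-quoted doc string untouched, a percent sign is harmless unless the string is used as a format string, and only the backslash needs doubling.

```python
def divide(value, divisor):
    """Divide <var>value</var> by <var>divisor</var> &amp; return the result.

    A 50% discount or a C:\\temp path is just text here, because this
    doc string is never used as a format string.
    """
    return value / divisor


# The markup survives exactly as typed; the doubled backslash becomes one.
doc = divide.__doc__
```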
Enlighten me if I've missed the point.
Eddy.
From guido@CNRI.Reston.Va.US Sat Nov 15 17:50:03 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Sat, 15 Nov 1997 12:50:03 -0500
Subject: [DOC-SIG] Doc strings debate
Message-ID: <199711151750.MAA19469@eric.CNRI.Reston.Va.US>
There seem to be a number of different questions here; I'll try to
discuss them separately. This message pertains to doc strings. A
separate message will discuss the library reference manual.
As I see it, it's up to each project to decide what to do about doc
strings. Some choices are:
- no doc strings
- terse text only doc strings, for quick on-line reminders only
- longer text only doc strings, mostly for reference by the programmer
who is reading the source code (i.e. doc strings are just syntactic
sugar for comments)
- longer doc strings with some markup (e.g. stext or a very limited
HTML subset) that could be used to generate on-line documentation and
printed documentation
- full "literate programming" doc strings, with elaborate markup; a
preprocessor may be needed to extract Python source with smaller doc
strings
The choice depends on the goals of the project as well as on the
availability of tools. There seem to be some tools but they all seem
to have some shortcomings. The debate on what the tools should do is
endless; one of the reasons is that the project goals differ.
In my own style of working, I prefer terse or longer text-only doc
strings, since I am not interested in generating printed documentation
from the doc strings. This means I don't have much use for tools (the
only tool that makes sense would be some kind of class browser that
has good support for displaying doc strings). I don't think that the
availability of other tools would affect my style of working much; but
I realize it's a personal choice and I don't want to impose it on the
Python community as the only way to use doc strings. However, I will
continue to use this style for the standard Python library. Given
that I am doing most of the work here I think I have that prerogative.
I hope tools become available that give the authors of other projects
more choice -- as it stands, if gendoc doesn't do what you want,
you're basically forced to write your own tools. Some of the existing
"hacks" deserve to be refined into more generally useful tools. The
doc sig can contribute here by discussing the requirements for a
number of different tools, for projects with different ambitions.
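
As one example of the smallest useful tool, the class-browser style of access to terse doc strings can be sketched with the standard inspect module. The helper name and the sample class are assumptions for illustration; this is not how gendoc works.

```python
import inspect


def doc_summaries(obj):
    """Map each public routine of obj to the first line of its doc string."""
    summaries = {}
    for name, member in inspect.getmembers(obj, inspect.isroutine):
        if name.startswith("_"):
            continue  # skip private and special methods
        doc = inspect.getdoc(member)
        summaries[name] = doc.splitlines()[0] if doc else "(no doc string)"
    return summaries


class Stack:
    """A tiny example class."""

    def push(self, item):
        """Add item to the top of the stack."""

    def pop(self):
        """Remove and return the top item.

        Raises IndexError if the stack is empty.
        """

    def peek(self):
        pass


summaries = doc_summaries(Stack)
```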
--Guido van Rossum (home page: http://www.python.org/~guido/)
From Daniel.Larsson@vasteras.mail.telia.com Sat Nov 15 18:47:58 1997
From: Daniel.Larsson@vasteras.mail.telia.com (Daniel Larsson)
Date: Sat, 15 Nov 1997 19:47:58 +0100
Subject: [DOC-SIG] Doc strings debate
Message-ID: <01bcf1f7$0e5ca520$25bc43c3@Daniel.telia.com>
-----Original Message-----
From: Guido van Rossum
To: doc-sig@python.org
Date: den 15 november 1997 19:29
Subject: [DOC-SIG] Doc strings debate
>There seem to be a number of different questions here; I'll try to
>discuss them separately. This message pertains to doc strings. A
>separate message will discuss the library reference manual.
>
>As I see it, it's up to each project to decide what to do about doc
>strings. Some choices are:
>
>- no doc strings
>
>- terse text only doc strings, for quick on-line reminders only
>
>- longer text only doc strings, mostly for reference by the programmer
> who is reading the source code (i.e. doc strings are just syntactic
> sugar for comments)
>
>- longer doc strings with some markup (e.g. stext or a very limited
> HTML subset) that could be used to generate on-line documentation and
> printed documentation
>
>- full "literate programming" doc strings, with elaborate markup; a
> preprocessor may be needed to extract Python source with smaller doc
> strings
>
>The choice depends on the goals of the project as well as on the
>availability of tools. There seem to be some tools but they all seem
>to have some shortcomings. The debate on what the tools should do is
>endless; one of the reasons is that the project goals differ.
>
>In my own style of working, I prefer terse or longer text-only doc
>strings, since I am not interested in generating printed documentation
>from the doc strings. This means I don't have much use for tools (the
>only tool that makes sense would be some kind of class browser that
>has good support for displaying doc strings). I don't think that the
>availability of other tools would affect my style of working much; but
>I realize it's a personal choice and I don't want to impose it on the
>Python community as the only way to use doc strings. However, I will
>continue to use this style for the standard Python library. Given
>that I am doing most of the work here I think I have that prerogative.
>
>I hope tools become available that give the authors of other projects
>more choice -- as it stands, if gendoc doesn't do what you want,
>you're basically forced to write your own tools.
There is actually one other action you might want to consider if gendoc
doesn't fit your needs: Propose what it should do and perhaps we can
evolve gendoc towards that.
What we could do is to figure out an API for extracting information out
of docstrings which we can use for different kinds of tools, such as doc
generating tools (I don't like writing reference manuals separately, so
I want tools that generate printed documents), class browsers, etc.
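
One possible shape for such an API, sketched under assumptions: a single extraction step produces plain records, and separate back ends render them. The function names and the (summary, body) record layout are invented here and are not gendoc's interface.

```python
import html
import inspect


def extract_docs(namespace):
    """Return {name: (summary, body)} for each documented callable."""
    records = {}
    for name, obj in namespace.items():
        doc = inspect.getdoc(obj) if callable(obj) else None
        if not doc:
            continue
        # First paragraph is the summary, the rest is the body.
        summary, _, body = doc.partition("\n\n")
        records[name] = (summary.strip(), body.strip())
    return records


def render_text(records):
    """Back end 1: one plain-text line per entry."""
    return "\n".join(
        "%s: %s" % (name, summary)
        for name, (summary, _) in sorted(records.items())
    )


def render_html(records):
    """Back end 2: an HTML definition list."""
    items = [
        "<dt>%s</dt><dd>%s</dd>" % (html.escape(name), html.escape(summary))
        for name, (summary, _) in sorted(records.items())
    ]
    return "<dl>" + "".join(items) + "</dl>"


def area(width, height):
    """Return the area of a width x height rectangle.

    Both arguments must be non-negative numbers.
    """
    return width * height


records = extract_docs({"area": area})
```

A class browser and a printed-manual generator could then share extract_docs and differ only in the rendering function.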
>Some of the existing
>"hacks" deserve to be refined into more generally useful tools. The
>doc sig can contribute here by discussing the requirements for a
>number of different tools, for projects with different ambitions.
>
>--Guido van Rossum (home page: http://www.python.org/~guido/)
From guido@CNRI.Reston.Va.US Sat Nov 15 20:30:49 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Sat, 15 Nov 1997 15:30:49 -0500
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>
Some SGML extremists have started lobbying for SGML or XML, which has
brought up quite a religious debate (maybe started by my remark that
SGML is not fit for humans to type :-). I feel that we're not getting
anywhere unless we face some of the facts, so here's a reality check
followed by some opinions.
I hope I've moved the doc string discussion to a separate thread. I
don't think the library manual should be tied in with doc strings in
any way, so it can be discussed separately.
The first problem is that the library manual is currently done in
LaTeX. I would guess that 99% of the markup is structural -- the only
places where physical markup is used in a significant way is in the
use of 'strong' and 'emphasis' to mean a number of different things
(e.g. warnings, notes, implementation restrictions, etc.). There are
a few places where physical markup is used to overcome some formatting
weirdnesses, but I've always tried to keep these to a minimum.
Any proposed solution that doesn't take into account how to convert
the existing library manual is a trivial reject.
I see a number of problems with the use of LaTeX -- but the fact that
"it's not SGML" is not one of them. Perhaps the biggest problem is
that LaTeX and TeX are losing popularity. TeX may still be the
standard for respectable and somewhat conservative publications like
the Astrophysical Journal, but most publishers nowadays are just as
happy to accept MS Word or other popular wordprocessors.
I would say that the one remaining reason to use TeX or LaTeX for some
groups is that TeX does mathematics better than anything else; however
that's not relevant for the Python community. From experience, I
would say that LaTeX does computer documentation rather poorly
(witness the many hacks in the myformat.sty file), and I haven't even
dealt properly with optional or keyword arguments, let alone classes
and methods and inheritance.
The decreasing popularity of LaTeX is a problem because it means that
potential contributors are discouraged -- many simply don't know
LaTeX, and even those that do know it may not have access to an
implementation any more. Installing LaTeX is a major undertaking, and
one is less and less likely to find installations that already have it
installed, outside central Unix servers at academic institutions. (I
did a web search on LaTeX for Windows 95; one of the pages,
http://www2.eece.maine.edu/~dprice/Latex/latex.htm, which seems to
have a lot of useful info, leaves me with the impression that one
needs to be *very* motivated to bring this to a good end. It ends
with the admonition "Good Luck! You're gonna need it...")
Another problem, caused by this, is that there are few LaTeX hackers
around who can help with the creation of new macros (e.g. for keyword
arguments).
On the plus side, there is truth in the old saying "don't fix it if it
ain't broken." I personally have access to a working LaTeX
installation, the latex2html converter produces adequate HTML (I still
need to work on the translation for a few of the environments
introduced by myformat.sty, but that shouldn't be too hard), and I
haven't heard too many complaints yet from people who would like to
contribute documentation but don't know LaTeX -- they pick it up
pretty easily from the template I provide.
*** The real problem seems how to get people to contribute at all! ***
If using SGML or XML would make more people eager to contribute, I
might be convinced; but somehow I doubt it. At the moment, both the
learning curve and the installation effort for SGML or XML tools
appears to be still steeper than for LaTeX.
There has been some debate on SGML vs. XML. It seems that SGML can be
made easy to type, at the cost of making it much harder to parse
correctly. XML appears to be designed mostly as a transport format
(one page with XML info I found made the explicit point that being
easy to type was *not* a design criterion). Anyway, once a decision
to use either is made, conversion between the two is probably easy,
especially since XML is a true subset of SGML.
Finally, TIM has been brought up. It's a bit easier to type and more
pleasing to my eye than shorthand SGML (e.g. SGML whatever>
vs. TIM @title{whatever}) and it's a lot easier to parse. It uses
structural markup and has a simple macro language to add new
structural elements. This makes it relatively easy to convert to
SGML, as long as the TIM authors adhere to reasonable structuring
constraints (i.e. don't abuse constructs for different purposes).
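
To show why this conversion looks easy, here is a minimal sketch that rewrites the @name{content} form into start and end tags. It assumes only the one TIM construct mentioned above, does not use the real TIM parser, and does not escape characters such as < or & inside the content.

```python
import re

# Matches @name{content} where content holds no nested braces.
TIM_MACRO = re.compile(r"@(\w+)\{([^{}]*)\}")


def tim_to_sgml(text):
    """Rewrite @name{content} as <name>content</name>, innermost first."""
    previous = None
    while previous != text:
        # Repeat until nothing changes so nested macros are handled.
        previous = text
        text = TIM_MACRO.sub(r"<\1>\2</\1>", text)
    return text


converted = tim_to_sgml("@title{Using @code{ILU} with Python}")
```

The mapping stays this mechanical only while each macro is used for a single structural purpose, which is the constraint on TIM authors noted above.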
TIM's primary weakness at the moment seems to be its toolchain, which
starts good (the parser is written in Python) but quickly runs into
problems on non-Unix platforms: for HTML generation it uses a Perl4
script, and for PostScript it goes through texinfo and hence through
tex. For Unix, TIM's toolchain is perfect, however, and I like the
simplicity of its approach -- it should be simple enough to rewrite
the TIM-to-HTML converter in Python (maybe using HTMLgen?).
For Windows, it just *may* be possible that Word 97 will actually
parse the HTML generated by TIM so as to make it possible to generate
Postscript on Windows platforms with commonly available tools; in any
case, a prospective TIM author on a Windows platform would only need
the HTML generating part of the toolchain for on-screen previewing.
I'd love this discussion to come to an end. I think that we would be
in good shape with TIM, *if* we solve two outstanding problems. One
should be easy: rewriting the TIM-to-HTML tool in Python.
The other one is much hairier: conversion of the existing LaTeX source
to TIM! This needs to be a high quality conversion, e.g. ideally it
should maintain comments and other aspects of source formatting
(like line breaks) that don't affect the generated pages but do
affect the human reader of the source, because the output of the
conversion will be edited manually henceforth. On the other hand,
this only needs to be done once, so a small amount of manual tweaking
is acceptable. The old conversion script (partparse.py) which I still
have lying around somewhere is probably able to do this with some
small changes (I sure hope those changes are small, because this is
one horrible piece of code... good for a one-off job though).
Those who want SGML or XML should be able to convert TIM to their
favorite DTD using a different back end for the TIM front end. I
would love specific feedback on the structural capabilities of TIM;
ideally, TIM should map directly onto a real SGML DTD as far as
document structure is concerned. However, I don't want to compromise
TIM to make it possible to parse it with a generic SGML scanner; the
efforts to move HTML towards strict SGML scanner compatibility have
taught me a valuable lesson.
One final note: I looked at Perl's POD (Plain Old Documentation) for a
few seconds. It's more limited than TIM and uses physical markup
(e.g. B<text>), but has one feature that I like: a block of
indented text offset by blank lines (I believe) is automatically
interpreted as a code sample block (verbatim in LaTeX terms,
@codeexample in TIM). This makes POD source remarkably readable. I
presume that it would be trivial to add this to the TIM front-end. (I
particularly like this idea because it's the same convention that I
used in the Python FAQ wizard. :-)
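
The indented-block convention can be sketched in a few lines. This is an illustration of the rule as described above (paragraphs separated by blank lines, fully indented paragraphs treated as code samples), not POD's or TIM's actual implementation; the function name and the sample text are assumptions.

```python
import re


def split_blocks(text):
    """Split text on blank lines; fully indented paragraphs become code."""
    blocks = []
    for chunk in re.split(r"\n[ \t]*\n", text.strip("\n")):
        lines = [line for line in chunk.split("\n") if line.strip()]
        if not lines:
            continue
        # A block counts as code only if every non-blank line is indented.
        indented = all(line[0] in " \t" for line in lines)
        blocks.append(("code" if indented else "para", chunk))
    return blocks


sample = (
    "Open the shelf like this:\n"
    "\n"
    "    import shelve\n"
    "    db = shelve.open('docs')\n"
    "\n"
    "Then look up entries by name.\n"
)
blocks = split_blocks(sample)
```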
--Guido van Rossum (home page: http://www.python.org/~guido/)
From papresco@technologist.com Sun Nov 16 01:06:30 1997
From: papresco@technologist.com (Paul Prescod)
Date: Sat, 15 Nov 1997 20:06:30 -0500
Subject: [DOC-SIG] Library reference manual debate
References: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>
Message-ID: <346E4716.E06699EC@technologist.com>
Guido van Rossum wrote:
>
> Any proposed solution that doesn't take into account how to convert
> the existing library manual is a trivial reject.
Although this is definitely important, I don't see how that would argue
in favour of one solution or another. The output format of such a
process seems easy to handle -- it's the input that will give us a
headache.
> I see a number of problems with the use of LaTeX -- but the fact that
> "it's not SGML" is not one of them. Perhaps the biggest problem is
> that LaTeX and TeX are losing popularity.
? Popularity is a problem for TeX, but TIM, which has no popularity to
speak of and is *built on top of* TeX does not have the same problem?
> TeX may still be the
> standard for respectable and somewhat conservative publications like
> the Astrophysical Journal, but most publishers nowadays are just as
> happy to accept MS Word or other popular wordprocessors.
Right, and SGML can create those trivially. TIM cannot. I am currently
writing a book where I will give them a MIF file which they will
beautify for me.
> (I
> did a web search on LaTeX for Windows 95; one of the pages,
> http://www2.eece.maine.edu/~dprice/Latex/latex.htm, which seems to
> have a lot of useful info, leaves me with the impression that one
> needs to be *very* motivated to bring this to a good end. It ends
> with the admonition "Good Luck! You're gonna need it...")
I'm not advocating that we stay with LaTeX, but I found MiKTeX to be
quite easy to install. It doesn't seem to come with all of the latest
and greatest LaTeX add-ons, but otherwise it is quite good and comes
with a good installation program.
> At the moment, both the
> learning curve and the installation effort for SGML or XML tools
> appears to be still steeper than for LaTeX.
I don't think there is anything confusing, especially if you are using
Windows. Here are the steps:
Jade installation:
1. Download Jade binary or source
2. Unzip
3. Type "make" if you downloaded the source.
4. Copy binaries to some directory in your path
Python-doc package installation:
1. Download pythondoc zip file.
2. Unzip
The python-doc package contains stylesheets, sample chapters, chapter
template and DTD.
Python-doc chapter creation:
    cp chapter-template.sgm my-chapter.sgm
    vi my-chapter.sgm
Jade use:
    jade -t tex -d style/pythondoc2print my-chapter.sgm
    tex &jadetex my-chapter.tex
    dvips my-chapter.dvi
OR
    jade -t rtf -d style/pythondoc2print my-chapter.sgm
    winword my-chapter.rtf
OR
    jade -t mif -d style/pythondoc2print my-chapter.sgm
    frame my-chapter.mif
OR
    jade -t sgml -d style/pythondoc2html my-chapter.sgm
    lynx my-chapter.html
    netscape my-chapter.html
As far as truth in advertising goes, I should point out that if your LaTeX is
out of date, you will need to download a few style and font files here
and there to use the JadeTex package. That's why things are not QUITE as
easy on Unix as on Windows (but isn't that always the case?).
> There has been some debate on SGML vs. XML. It seems that SGML can be
> made easy to type, at the cost of making it much harder to parse
> correctly.
I don't see any evidence of that. If we stick to the conventions
supported by sgmllib.py, then SGML is as easy to parse as TIM, and we
already have the parser implemented. That parser is only 400 lines of
code (including test harness) and seems to handle the file I emailed a
few days ago perfectly.
> Finally, TIM has been brought up. It's a bit easier to type and more
> pleasing to my eye than shorthand SGML (e.g. SGML whatever>
> vs. TIM @title{whatever}) and it's a lot easier to parse.
I dunno, I consider [...]
> TIM's primary weakness at the moment seems to be its toolchain, which
> starts good (the parser is written in Python) but quickly runs into
> problems on non-Unix platforms: for HTML generation it uses a Perl4
> script, and for PostScript it goes through texinfo and hence through
> tex. For Unix, TIM's toolchain is perfect, however, and I like the
> simplicity of its approach -- it should be simple enough to rewrite
> the TIM-to-HTML converter in Python (maybe using HTMLgen?).
So we would rather rewrite this rather than using the existing HTML
converters for DocBook? Despite the fact that the DocBook converters
already handle print properly on Windows (RTF/TeX/Frame) and
Unix (TeX/Frame)?
> For Windows, it just *may* be possible that Word 97 will actually
> parse the HTML generated by TIM so as to make it possible to generate
> Postscript on Windows platforms with commonly available tools; in any
> case, a prospective TIM author on a Windows platform would only need
> the HTML generating part of the toolchain for on-screen previewing.
Printing HTML documents seems like a solution of last resort.
> However, I don't want to compromise
> TIM to make it possible to parse it with a generic SGML scanner; the
> efforts to move HTML towards strict SGML scanner compatibility have
> taught me a valuable lesson.
I'm not sure what you mean by this. HTML has been SGML compatible since
version 2.0. As far as I know, it lost no useful features in the
changeover.
I can't dispute the point that SGML doesn't look nice to your eyes.
Beauty is in the eye of the beholder. That might be enough to tip the
balance in TIM's favour, but I still feel it is my responsibility to
point out that SGML is not hard to type, need not be hard to parse and
does not require difficult or expensive tools to use. If those
complaints are going to be factors in the decision then they should be
substantiated.
Paul Prescod
From cjr@bound.xs4all.nl Sun Nov 16 01:33:39 1997
From: cjr@bound.xs4all.nl (Case Roole)
Date: Sun, 16 Nov 1997 01:33:39 +0000 (WET)
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711152030.PAA19793@eric.CNRI.Reston.Va.US> from "Guido van Rossum" at Nov 15, 97 03:30:49 pm
Message-ID: <199711160133.BAA22071@axiom.bound.xs4all.nl>
Guido wrote:
> One final note: I looked at Perl's POD (Plain Old Documentation) for a
> few seconds. It's more limited than TIM and uses physical markup
> (e.g. B<...>), but has one feature that I like: a block of
> indented text offset by blank lines (I believe) is automatically
> interpreted as a code sample block (verbatim in LaTeX terms,
> @codeexample in TIM). This makes POD source remarkably readable. I
> presume that it would be trivial to add this to the TIM front-end. (I
> particularly like this idea because it's the same convention that I
> used in the Python FAQ wizard. :-)
Just wondering: for HTML generation I use "megatags", non-HTML tags in
documents that are otherwise HTML. An SGML parser (derived from the one
in sgmllib) lets pure HTML pass, but fetches and processes the data
embedded in these megatags (example below). This is decidedly not pure
SGML or pure HTML, but the *code is extremely readable*. Is this what
everybody is using the SGMLParser for, is it irrelevant for the matter
discussed here, or is this a good idea?
cjr
------------------------------------------------------------
Example:
For my curriculum vitae I wanted a list of labels and values. It seemed
best to me to use a table with labels represented as table headers and
values as table descriptions. Both are aligned to the center of the table
which is a default that can be changed by setting the attributes
'left_align' and 'right_align'. Attributes of the table can be set by using,
e.g. 'table_border'. (This approach is entirely derived from Pmw's naming
conventions.)
Here is what I write:
naam = Cornelis Jan Roele
email =
geboortedatum = 9 januari 1967
geboorteplaats = Doetinchem,
straat en nummer = Spitsbergenstraat 67
postcode/woonplaats = 1013 CL AMSTERDAM
telefoon = 020-684.62.95
NB the mailtag is another "megatag".
And this is what the parser generates:
naam:
Cornelis Jan Roele
email:
<cjr@bound.xs4all.nl>
geboortedatum:
9 januari 1967
geboorteplaats:
Doetinchem'),
straat en nummer:
Spitsbergenstraat 67
postcode/woonplaats:
1013 CL AMSTERDAM
telefoon:
020-684.62.95
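The megatag idea above can be sketched roughly as follows. This is not Case's parser: the original megatag names were lost from the archived message, so the tag name labeltable is hypothetical, and the sketch uses html.parser.HTMLParser from current Python as a stand-in for a parser derived from sgmllib. Ordinary HTML is passed through, while the "label = value" lines inside the megatag are turned into table rows.

```python
from html.parser import HTMLParser


class MegatagParser(HTMLParser):
    """Pass ordinary HTML through; expand one hypothetical megatag."""

    MEGATAG = "labeltable"  # hypothetical name, not from the original post

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []          # pieces of the generated document
        self.captured = None   # data chunks collected inside the megatag

    def handle_starttag(self, tag, attrs):
        if tag == self.MEGATAG:
            self.captured = []
        else:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == self.MEGATAG and self.captured is not None:
            self.out.append(self.render("".join(self.captured)))
            self.captured = None
        else:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if self.captured is not None:
            self.captured.append(data)
        else:
            self.out.append(data)

    def handle_entityref(self, name):
        self.handle_data("&%s;" % name)

    def handle_charref(self, name):
        self.handle_data("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    @staticmethod
    def render(body):
        # Each "label = value" line becomes one header/data row.
        rows = []
        for line in body.splitlines():
            if "=" in line:
                label, value = line.split("=", 1)
                rows.append("<tr><th>%s:</th><td>%s</td></tr>"
                            % (label.strip(), value.strip()))
        return "<table>\n%s\n</table>" % "\n".join(rows)


def expand(source):
    parser = MegatagParser()
    parser.feed(source)
    parser.close()
    return "".join(parser.out)


doc = ("<h1>CV</h1>\n<labeltable>\nnaam = Cornelis Jan Roele\n"
       "geboortedatum = 9 januari 1967\n</labeltable>\n"
       "<p>Done &amp; dusted</p>")
print(expand(doc))
```

Alignment and border attributes such as 'left_align' or 'table_border' are left out; they could be read from attrs in handle_starttag.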
--
Case Roole
From: Fred L. Drake, Jr.
References: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>
Message-ID: <199711160451.XAA29095@weyr.cnri.reston.va.us>
Guido van Rossum writes:
> Some SGML extremists have started lobbying for SGML or XML, which has
Ouch!
> The first problem is that the library manual is currently done in
> LaTeX. I would guess that 99% of the markup is structural -- the only
I don't see that this will be too difficult a conversion, actually,
primarily due to the care with which most of the markup was
performed.
> I see a number of problems with the use of LaTeX -- but the fact that
> "it's not SGML" is not one of them. Perhaps the biggest problem is
Agreed; that's not a relevant issue.
> would say that LaTeX does computer documentation rather poorly
> (witness the many hacks in the myformat.sty file), and I haven't even
> dealt properly with optional or keyword arguments, let alone classes
LaTeX isn't designed for it; it's supposed to be much more general
than that. However, as myformat.sty shows, the appropriate markup can
be created. With a bit more work, the remaining semantic constructs
can be created, but they may contain a moderate amount of formatting
within the macros. I think this is the most substantial problem with
a TeX-based solution; myformat.sty has to be completely rewritten to
change the output; the markup is not defined separately from the
processing (formatting) steps.
> The decreasing popularity of LaTeX is a problem because it means that
> potential contributors are discouraged -- many simply don't know
This may have some pertinence, but the relevance is small; it's
still better than most if not all of the more popular systems.
SGML/XML is better due to the separation of semantic relations from
the processing specifications.
> LaTeX, and even those that do know it may not have access to an
> implementation any more. Installing LaTeX is a major undertaking, and
Agreed, outside some Linux distributions, LaTeX is probably a pain
unless you're willing to spring for a commercial version. There are a
few for PCs, but I've not followed them.
> Another problem, caused by this, is that there are few LaTeX hackers
> around who can help with the creation of new macros (e.g. for keyword
> arguments).
If that's the problem and no superior solution can be agreed upon, I
can help with that.
> ain't broken." I personally have access to a working LaTeX
> installation, the latex2html converter produces adequate HTML (I still
It's out of date and should be updated, but does work for the Python
documentation. I have found very reasonable LaTeX2e documents that
can't be formatted correctly using the CNRI installation.
> There has been some debate on SGML vs. XML. It seems that SGML can be
> made easy to type, at the cost of making it much harder to parse
> correctly. XML appears to be designed mostly as a transport format
I've seen nothing to indicate that SGML is more difficult to parse
correctly in any reasonable interpretation, and if the examples Paul
presented on shortcuts are what you're referring to, the work's already
been done in Grail's SGMLParser module.
> structural markup and has a simple macro language to add new
> structural elements. This makes it relatively easy to convert to
The last time I looked at it, the only "structural" elements which
could be added were alternate names for the character-styling controls
(bold, italic, etc.) and not for larger structural components. Has
this changed? It might have.
> Those who want SGML or XGML should be able to convert TIM to their
This is not an interesting issue; if you choose to use TIM for the
authoritative version, there should be no reason to convert to
SGML/XML except to have access to powerful formatting tools (DSSSL and
jade in particular).
It sounds as if you're convinced that there should be either no
change or conversion to TIM. If this is the case and you won't
consider other alternatives seriously, please just say so. I think
all of us advocating other approaches are doing so in good conscience
and not just to waste bandwidth. If we're wasting time, we can find
more enjoyable ways to do so.
If, on the other hand, there's a real possibility of switching at
all, the advocates / experts for each format / technology / whatever
you want to call it should start to develop sample processes and
converted segments of the Python documentation to allow all of us to
see the behavior of each in practice. I don't think many of us have
actually tried to use all of the techniques being discussed. This may
be more productive than just shouting "Use FOOsplatter!" at each
other. But we do need to know that we're not wasting our time at it
before we bother.
-Fred
--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191-5434
From papresco@technologist.com Sun Nov 16 05:11:57 1997
From: papresco@technologist.com (Paul Prescod)
Date: Sun, 16 Nov 1997 00:11:57 -0500
Subject: [DOC-SIG] Library reference manual debate
References: <199711160133.BAA22071@axiom.bound.xs4all.nl>
Message-ID: <346E809D.D48F5E4F@technologist.com>
Case Roole wrote:
> Just wondering: for HTML generation I use "megatags", non-HTML tags in
> documents that are otherwise HTML. An SGML parser (derived from the one
> in sgmllib) lets pure HTML pass, but fetches and processes the data
> embedded in these megatags (example below). This is decidedly not pure
> SGML or pure HTML, but the *code is extremely readable*. Is this what
> everybody is using the SGMLParser for, is it irrelevant for the matter
> discussed here, or is this a good idea?
SGML was explicitly designed to allow this and has features to do this
sort of thing for you. A full SGML parser can interpret your "=" symbol
and even your newlines as tags. This is very convenient for typists. I
think that for novice users it will probably be quite confusing,
however, because people are used to all SGML markup being in clearly
marked tags, not in ordinary-looking characters. Also to parse something
like this in Python we would either have to complicate sgmllib or
introduce another layer of parsing.
Paul Prescod
From cjr@bound.xs4all.nl Sun Nov 16 10:08:13 1997
From: cjr@bound.xs4all.nl (Case Roole)
Date: Sun, 16 Nov 1997 10:08:13 +0000 (WET)
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <346E809D.D48F5E4F@technologist.com> from "Paul Prescod" at Nov 16, 97 00:11:57 am
Message-ID: <199711161008.KAA02434@axiom.bound.xs4all.nl>
Paul Prescod wrote:
>
> Case Roole wrote:
> > Just wondering: for HTML generation I use "megatags", non-HTML tags in
> > documents that are otherwise HTML. An SGML parser (derived from the one
> > in sgmllib) lets pure HTML pass, but fetches and processes the data
> > embedded in these megatags (example below). This is decidedly not pure
> > SGML or pure HTML, but the *code is extremely readable*. Is this what
> > everybody is using the SGMLParser for, is it irrelevant for the matter
> > discussed here, or is this a good idea?
>
> SGML was explicitly designed to allow this and has features to do this
> sort of thing for you. A full SGML parser can interpret your "=" symbol
> and even your newlines as tags. This is very convenient for typists. I
> think that for novice users it will probably be quite confusing,
> however, because people are used to all SGML markup being in clearly
> marked tags, not in ordinary-looking characters. Also to parse something
> like this in Python we would either have to complicate sgmllib or
> introduce another layer of parsing.
In short:
1. What's wrong with introducing another layer of parsing?
2. I have reason to doubt that a mixed format will be confusing.
Extended:
ad 1.) I haven't looked at SGML for years and forgot much of what I once
learned. I take it on your word that a full SGML parser can interpret
all kinds of non '<'..'>' embedded tokens. If we are using a python
WYSIWYG editor based on a DTD for these docs, the proposed mixed format
would require a complication of sgmllib.
I have the impression that the consensus is that we are to use a
non-wysiwyg editor for the time being, so this doesn't apply.
Thus I end up with the other option, which, fortunately, is what I was
thinking of in the first place: introduce another layer of parsing.
I can think of no other penalty for this than that the computer works
a little longer when doing the one-time job of converting the dirty-
but-readable manual format into something standard tools can further
process.
ad 2.) I doubt the validity of your assessment of the degree to which a
mixed format is "confusing".
"This is very convenient for typists." -- Indeed, that's what Guido was
referring to when he started this thread.
"I think that for novice users it will probably be quite confusing,
however, because people are used to all SGML markup being in clearly
marked tags, not in ordinary-looking characters." -- Given that we are
talking about the python documentation here, I don't see who those
"novice users" are, who are "used to all SGML markup being in clearly
marked tags".
We all get along with not closing python statements with ';' and not
enclosing blocks in '{'..'}'. I guess that those who write the
documentation will catch up quickly if the documentation is to be written
in some mixed format that looks good, even if it would take an advanced
SGML parser to interpret it in a single step.
cjr
--
Case Roole
From richardf@redbox.net Sun Nov 16 10:04:28 1997
From: richardf@redbox.net (Richard Folwell)
Date: Sun, 16 Nov 1997 10:04:28 -0000
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <01BCF277.07931140.richardf@redbox.net>
I would like to make a small addition to the Python library documentation [1].
It will not be real soon, say at the start of 1998. What format should I
produce it in?
I am familiar with both SGML and LaTeX, but do not currently have either
installed. I have access to some tools (including FrameMaker + SGML) for SGML,
but would have to get hold of TeX. The working platform is NT.
Access to a Unix box for processing material would be possible, but would be
both a real pain (I would have to set up a machine specially) and would make it
almost impossible for me to interest any of my colleagues in the toolset.
I am interested in having a structured text system for my own use (at work we
use Word, for all the usual wrong reasons). It would be nice to use a system
that was in regular use by other people for similar material.
Richard Folwell
[1] Some extra information for people writing sockets code under NT - the
existing material more or less assumes that you are using Unix.
From papresco@technologist.com Sun Nov 16 13:27:05 1997
From: papresco@technologist.com (Paul Prescod)
Date: Sun, 16 Nov 1997 08:27:05 -0500
Subject: [DOC-SIG] Library reference manual debate
References: <199711161008.KAA02434@axiom.bound.xs4all.nl>
Message-ID: <346EF4A9.57A7EC7A@technologist.com>
Case Roole wrote:
> "I think that for novice users it will probably be quite confusing,
> however, because people are used to all SGML markup being in clearly
> marked tags, not in ordinary-looking characters." -- Given that we are
> talking about the python documentation here, I don't see who those
> "novice users" are, who are "used to all SGML markup being in clearly
> marked tags".
Anybody familiar with HTML (in other words, almost everybody). I'm not
dead-set against the idea. If it will help the SGML solution to be more
palatable, then let's do it. I just usually try to avoid inventing my
own language because inevitably some tools (e.g. emacs psgml,
FrameMaker+SGML) will not support it properly, and I have to add more
transformation layers to my publishing process. I find that this is
usually not worth the few keystrokes saved, but intelligent people can
differ on that issue.
My biggest concern would be that these extra layers would be construed
as "extra SGML complications" whereas TIM, having no real popularity at
all, can be extended in an ad hoc manner and thus could be seen to be
more "flexible" than SGML. By that argument, a language I invent
tomorrow would be more "flexible" than Python because it has no
installed base and thus I can change it to be whatever I want. This
"flexibility" leads to an infinite number of contrived, incompatible
languages. So yes, I would rather byte the bullet and use SGML in this
way than invent Yet Another Markup Language (what are we up to, 30, 40
of them?).
Paul Prescod
From papresco@technologist.com Sun Nov 16 13:36:21 1997
From: papresco@technologist.com (Paul Prescod)
Date: Sun, 16 Nov 1997 08:36:21 -0500
Subject: [DOC-SIG] Library reference manual debate
References: <199711161008.KAA02434@axiom.bound.xs4all.nl>
Message-ID: <346EF6D5.E3552998@technologist.com>
Case Roole wrote:
> "I think that for novice users it will probably be quite confusing,
> however, because people are used to all SGML markup being in clearly
> marked tags, not in ordinary-looking characters." -- Given that we are
> talking about the python documentation here, I don't see who those
> "novice users" are, who are "used to all SGML markup being in clearly
> marked tags".
Anybody familiar with HTML (in other words, almost everybody).
I'm not dead-set against the idea. If it will help the SGML solution to
be more palatable, then let's do it. I just usually try to avoid
inventing my own delimiter language because inevitably some tools (e.g.
emacs psgml, perhaps FrameMaker+SGML) will not support it properly, and
I have to add more transformation layers to my publishing process. I
find that this is usually not worth the few keystrokes saved, but
intelligent people can differ on that issue.
My biggest concern would be that these tool incompatibilities (or
partial compatibilities) would be construed as "extra SGML
complications" whereas TIM, having no real popularity at all, can be
extended in an ad hoc manner and thus could be seen to be more
"flexible" than SGML. By that argument, a language I invent tomorrow
would be more "flexible" than Python because it has no installed base
and thus I can change it to be whatever I want, but lose the support of
a community and a set of existing tools. This "flexibility" leads to an
infinite number of contrived, incompatible languages. So yes, I would
rather byte the bullet and invent our own delimiter conventions within
SGML rather than invent Yet Another Markup Language.
But just be aware that it will probably cost us in tool compatibility at
some point, and force us to do some extra transformations to a simpler
SGML subset.
Paul Prescod
From papresco@technologist.com Sun Nov 16 13:39:57 1997
From: papresco@technologist.com (Paul Prescod)
Date: Sun, 16 Nov 1997 08:39:57 -0500
Subject: [DOC-SIG] Library reference manual debate
References: <01BCF277.07931140.richardf@redbox.net>
Message-ID: <346EF7AD.E165401C@technologist.com>
Richard Folwell wrote:
>
> I would like to make a small addition to the Python library documentation [1].
> It will not be real soon, say at the start of 1998. What format should I
> produce it in?
I don't think anybody knows yet (well, maybe Guido).
> I am familiar with both SGML and LaTeX, but do not currently have either
> installed. I have access to some tools (including FrameMaker + SGML) for SGML,
> but would have to get hold of TeX. The working platform is NT.
>
> Access to a Unix box for processing material would be possible, but would be
> both a real pain (I would have to set up a machine specially) and would make it
> almost impossible for me to interest any of my colleagues in the toolset.
>
> I am interested in having a structured text system for my own use (at work we
> use Word, for all the usual wrong reasons). It would be nice to use a system
> that was in regular use by other people for similar material.
This all sounds like a vote for SGML to me. :)
* Works on NT
* Easy to install (just unzip Jade and our stylesheet package)
* Structured text
* In regular use by other people
Paul Prescod
From skip@calendar.com (Skip Montanaro) Sun Nov 16 13:46:28 1997
From: skip@calendar.com (Skip Montanaro)
Date: Sun, 16 Nov 1997 08:46:28 -0500 (EST)
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <346EF7AD.E165401C@technologist.com>
References: <01BCF277.07931140.richardf@redbox.net>
<346EF7AD.E165401C@technologist.com>
Message-ID: <199711161346.IAA11571@dolphin.automatrix.com>
> I would like to make a small addition to the Python library documentation
> [1]. It will not be real soon, say at the start of 1998. What format
> should I produce it in?
Sorry I missed this before. I zapped an entire chain in this thread without
reading them. (Gotta get through my mail somehow...) I only noticed it as a
quote in a later message.
You should most definitely use LaTeX. In the .../Doc directory is a
template (libtemplate.tex) for documenting an individual module. Its
comments are quite clear, so it's pretty easy to get things right. Even if
you don't have direct access to LaTeX to check your work, I'm sure Guido or
others who do would much rather start with your rough input than a blank
template.
Skip Montanaro | Musi-Cal: http://concerts.calendar.com/
skip@calendar.com | Python: http://www.python.org/
(518)372-5583 | XEmacs: http://www.automatrix.com/~skip/xemacs/tip.html
From guido@CNRI.Reston.Va.US Sun Nov 16 15:54:00 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Sun, 16 Nov 1997 10:54:00 -0500
Subject: [DOC-SIG] What I don't like about SGML
Message-ID: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
Here's the background of my dislike for SGML. To confine this
highly flammable material :-), I'm spawning another thread.
First, while SGML may have been standardized in the swinging '80s, it
definitely has its roots in the '70s -- it takes many years to become
an international standard (look at C++!), and it started its life, as
"GML", long before standardization started. Undoubtedly some of the
worse features in SGML were designed to be backwards compatible
(again, very much like C++...).
I am well aware that HTML is SGML conformant since HTML 2.0, and this
is precisely the reason for my concern.
99.9% of the time, HTML is parsed by relatively simple handwritten
parsers, not by generic SGML scanners. There are lots of programs out
there that have to parse HTML -- preprocessors, web browsers, web
spiders, etc. Why don't these just link to an existing SGML scanner?
Because SGML scanners are *huge*. They need to be big to scan generic
SGML, which is a very complex language. But most of this power isn't
needed to scan HTML, so people roll their own parser.
Before HTML had a version number, I wrote an HTML scanner in Python.
It was very simple. Look for < or </ followed by a letter, then scan
up to a > character, etc. HTML was simple to scan by design: Tim
Berners-Lee wanted HTML and HTTP to be so simple that almost anybody
could write programs that would immediately interoperate with the rest
of the web as it then existed. There is no doubt that this is the
reason that the web took off at all.
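A scanner of the kind described above might look roughly like this in present-day Python. It is a sketch under assumptions, not the original scanner: the names TAG and scan are made up for the example, attributes are skipped rather than parsed, and entity references, comments and declarations are ignored entirely.

```python
import re

# "<" or "</" followed by a letter, then everything up to the next ">".
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^>]*)>")


def scan(text):
    """Yield ('data', str), ('start', name) and ('end', name) tokens."""
    pos = 0
    for match in TAG.finditer(text):
        if match.start() > pos:
            # Anything between two tags is plain character data.
            yield ("data", text[pos:match.start()])
        slash, name, _attributes = match.groups()
        yield ("end" if slash else "start", name.lower())
        pos = match.end()
    if pos < len(text):
        yield ("data", text[pos:])


for token in scan("<P>1 < 2 is <B>true</B>"):
    print(token)
```

A "<" that is not followed by a letter, as in "1 < 2", is simply treated as data, which is the kind of leniency a handwritten scanner can afford.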
But Berners-Lee made one mistake: he made HTML look a bit like SGML
(which he had seen once or twice from a distance :-). Almost
immediately HTML was targeted by the SGML lobby for full compliance.
Here's what was added; all of this made my parser much more
complicated than I think it ought to be (look at how complicated
sgmllib.py is). Note that most of what was added doesn't add
functionality. In one or two cases it even takes away functionality!
It just complicates the scanning process in order to be compatible
with the extremely complicated scanning rules designed for SGML on
punched cards in the 70s.
- A second special character '&' for entity references (original HTML
used to escape "<").
- Character references like &#32; or &#SPACE;.
- Comments in the form of <!-- ... -->, truly the most atrocious
comment convention invented (and I believe it's worse -- officially,
"--" may not occur inside a comment but "-- --" may, or something like
that; but who cares, as almost no handwritten parser seems to get this
right).
- Special stuff to be ignored, starting with , where it is
tricky to determine what the end is (since sometimes "<" or ">" may
occur inside).
- Special stuff to be ignored, starting with ...>.
- Short tags, (?) which switched to literal copying of the
text until was found. This is impossible to do in SGML --
the best you can do is to switch to literal mode until </ followed by
a letter is seen, and you can't turn off &ref; processing either.
Of course, with a handwritten parser it is no problem to switch to a
mode that scans for exclusively...
- Why do I have to put quotes around the URL in ???
- Other restrictions on what you can do with attributes; apparently
there's a semantic rule that says that if two unrelated tags have an
attribute with the same name, it must have the same "type".
- A content model, which nobody asked for, and which few people check
for, but which still allows HTML purists to tell you that your HTML
page is "non-conformant" when you place an heading inside a
list item (okay, so I made that up).
- Probably a few other things that nobody asked for, such as the
DTD declaration and SGML's approach to character sets (which is
probably broken -- I believe there is a way to switch character
sets in mid-stream...).
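To give a feel for the extra work the '&' character adds for a handwritten parser, here is a minimal sketch of reference expansion. The table ENTITIES and the function expand_refs are assumptions made for this example; only a handful of named entities and decimal character references are handled, and SGML function names such as SPACE are not.

```python
import re

# A tiny, assumed entity table; real HTML defines many more names.
ENTITIES = {"lt": "<", "gt": ">", "amp": "&", "quot": '"'}

# "&name;" or "&#digits;".
REF = re.compile(r"&(#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def expand_refs(text):
    """Replace entity and decimal character references in text."""
    def replace(match):
        ref = match.group(1)
        if ref.startswith("#"):
            return chr(int(ref[1:]))
        # Unknown names are left untouched rather than guessed at.
        return ENTITIES.get(ref, match.group(0))
    return REF.sub(replace, text)


print(expand_refs("a &lt; b &amp;&amp; c&#32;d &bogus; e"))
```

The substitution is done in a single pass, so "&amp;lt;" comes out as the literal text "&lt;" rather than being expanded twice.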
Of course, SGML aficionados will claim that all this was necessary so
that HTML could be processed with SGML, the most powerful and flexible
text processing mechanism available. However, 99% of all HTML written
will never be processed by SGML; it is intended for throw-away
content. Serious SGML users have two other recourses available to
them:
(1) Write everything in SGML and generate HTML from that; I believe
Jade can do this.
(2) Write a simple HTML scanner and convert it to SGML, by hook or by
crook. I believe this is being done too.
So my claim remains that the requirement of SGML conformance is for
99% just a nuisance for parser writers. Of course I'm biased, since
I'm a parser writer myself... So see for yourself what you think of
this argument.
--Guido van Rossum (home page: http://www.python.org/~guido/)
_______________
DOC-SIG - SIG for the Python Documentation Project
send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________
From fredrik@pythonware.com Sun Nov 16 16:19:57 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 16 Nov 1997 17:19:57 +0100
Subject: [DOC-SIG] What I don't like about SGML
Message-ID: <9711161627.AA02002@arnold.image.ivab.se>
> So my claim remains that the requirement of SGML conformance is for
> 99% just a nuisance for parser writers.
Isn't this the reason they've developed XML? To come up with
a small and simple subset, so that anyone writing an application
can get things right? (Not that they need to, really. It seems as
if all major environments will include built-in parsers before long.
And if you need your own, there's plenty of free implementations
to choose from...)
> Of course I'm biased, since I'm a parser writer myself... So see for
> yourself what you think of this argument.
FWIW, I've had similar experiences with scripting languages...
I started using scripting languages to glue things together in the
early eighties, and developed about a dozen languages of various
flavours. They all had serious limitations, mainly because there
was a lot of stuff that would have taken a lot of effort to get right,
or would have turned out way too slow (you cannot look up names all
the time, can you?), or bloated. Finally, I stumbled upon Python,
and realized that now I never had to write another scripting language,
since someone else had already created something powerful enough
for all my needs, and provided a great implementation for free...
And for some odd reason, I've just experienced the same thing with
text markup languages... Instead of spending more time on edroff
and all the other pod-like stuff I've invented through the years, I
decided to throw them all out and go for SGML/XML, since someone
else had already created something powerful enough for all my needs,
and provided a great implementation for free... (www.jclark.com)
Cheers /F
From guido@CNRI.Reston.Va.US Sun Nov 16 16:27:43 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Sun, 16 Nov 1997 11:27:43 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: Your message of "Sun, 16 Nov 1997 08:36:21 EST."
<346EF6D5.E3552998@technologist.com>
References: <199711161008.KAA02434@axiom.bound.xs4all.nl>
<346EF6D5.E3552998@technologist.com>
Message-ID: <199711161627.LAA20965@eric.CNRI.Reston.Va.US>
Paul Prescod:
> My biggest concern would be that these tool incompatibilities (or
> partial compatibilities) would be construed as "extra SGML
> complications" whereas TIM, having no real popularity at all, can be
> extended in an ad hoc manner and thus could be seen to be more
> "flexible" than SGML. By that argument, a language I invent tomorrow
> would be more "flexible" than Python because it has no installed base
> and thus I can change it to be whatever I want, but lose the support of
> a community and a set of existing tools. This "flexibility" leads to an
> infinite number of contrived, incompatible languages. So yes, I would
> rather bite the bullet and invent our own delimiter conventions within
> SGML rather than invent Yet Another Markup Language.
>
> But just be aware that it will probably cost us in tool compatibility at
> some point, and force us to do some extra transformations to a simpler
> SGML subset.
Okay, now we're talking. The issue of layering tools is real. I
expect that no matter which way we go, we will have to craft some
tools of our own. I'm using latex now, and the tools I have crafted
so far are in myformat.sty. In a sense, this is equivalent to a DTD
extension in SGML plus a style sheet. When using TIM, the same thing
is done using a macro file.
Let me try to explain once more why I am hesitant to adopt SGML
(apart from my hang-ups about the lexer, which I discuss in a separate
thread -- they aren't particularly relevant).
I believe that part of Python's success lies in the fact that it has
few dependencies on other tools. For example, it's written in C
rather than C++, and in fact until very recently I made sure that it
was compilable with a K&R C compiler as well as with a Standard C
compiler. What's the advantage of C over C++? When I started Python
as a mostly Unix tool, C++ compilers were still under heavy
development. I expected that many prospective users of the language
would not have a compatible C++ compiler already installed on their
system, and I expected that having to find one that was compatible
with their hardware and O/S would be enough of a deterrent that they
would never use Python unless they were *very* motivated. So I used a
lowest-common-denominator language, K&R C, which at the time came
bundled with every Unix version. I suppose that in 1997 the
availability of C++ compilers is no longer a problem (for example on
the Windows and Mac platforms all C compilers are really C++
compilers) -- but my choice for C was definitely the right one until
recently. A second reason was programmer availability -- again, until
recently, if I had been using C++, it would have been harder for Joe
Average to change a few lines in the Python source to fix a bug and
to send me the diffs.
I am worried that SGML tools are still in a state similar to that of
C++ eight years ago: they exist, but they don't come bundled with any
O/S, and it takes time to track down the right tools for your platform
and then to install them, and you may or may not be successful
depending on what other software you have available. I'm kind of
worried too because the only tool that is used as an existence proof
(Jade) seems to be a one-person project. And of course the XML tools
are still almost completely in the vaporware category.
It has been mentioned that TIM is in the same situation: it's not
widely known or used. However, the one big difference is that all of
TIM consists of three scripts, one of which is already written in
Python (and the other ones could easily be rewritten in Python). So
instead of adding a dependency on external tools, as with the
adoption of SGML, I would become *independent* of external tools if
I were to adopt TIM. (This is exactly the same reason why the Perl
people did their own, POD.)
I believe that using an adaptation of TIM, it will be possible to
generate HTML *without downloading any additional tools*. I think
this is a huge win, as HTML is all that's needed to preview one's
changes to the manual. To generate PostScript will still require TeX
(and LaTeX and texinfo), but since the existing solution also requires
that, things don't get worse, and of course those who have a need to
use SGML can contribute a translator from TIM to SGML (or, more
likely, to XML, once the XML vaporware solidifies into software).
(Besides, it seems that to get PostScript out of SGML one generally
*also* has to go through TeX, at least on Unix.)
Note that adoption of SGML doesn't mean that we *don't* have to craft
our own tools -- we'll have to come up with a set of definitions (a
DTD extension, is that the right term?) so we can conveniently format
manual entries for functions, classes and methods with default
argument values, keyword arguments, specify argument types, and so on
(which is what the myformat.sty macros are about -- it defines
convenient ways to enter the information about function and method
prototypes). I expect that this particular effort will be about the
same, whether we're using TIM or SGML.
The difference will be that when using TIM, we're encouraging Python
hackers to extend our tool set, while when using SGML, we're
encouraging SGML hackers to extend our tool set. I won't try to guess
which type of hacker is predominant in the world at large; but in the
Python community, I'd say there's no doubt :-)
--Guido van Rossum (home page: http://www.python.org/~guido/)
From fredrik@pythonware.com Sun Nov 16 16:59:30 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 16 Nov 1997 17:59:30 +0100
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <9711161700.AA31749@arnold.image.ivab.se>
> The difference will be that when using TIM, we're encouraging Python
> hackers to extend our tool set, while when using SGML, we're
> encouraging SGML hackers to extend our tool set. I won't try to guess
> which type of hacker is predominant in the world at large; but in the
> Python community, I'd say there's no doubt :-)
Well, I'd guess that few Python hackers work on text markup languages
and formatting engines, and of those who do, quite a few seem to
be working with SGML:
http://www.w3.org/XML/9705/hacking
http://www.sil.org/sgml/mcgrathParseDesc.html
and so on...
Since I'm working on SGML/XML for our company's projects, at least I
would much rather contribute to an SGML/XML based effort, than to
hack on TIM. Others' mileage may vary, as usual ;-)
Cheers /F
From guido@CNRI.Reston.Va.US Sun Nov 16 18:54:52 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Sun, 16 Nov 1997 13:54:52 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: Your message of "Sun, 16 Nov 1997 17:59:30 +0100."
<9711161700.AA31749@arnold.image.ivab.se>
References: <9711161700.AA31749@arnold.image.ivab.se>
Message-ID: <199711161854.NAA21167@eric.CNRI.Reston.Va.US>
Fredrik Lundh:
> Since I'm working on SGML/XML for our company's projects, at least I
> would much rather contribute to an SGML/XML based effort, than to
> hack on TIM. Others' mileage may vary, as usual ;-)
Fredrik, I'm afraid that you're already overcommitted -- I'd hate to
see the schedule for your book jeopardized. (I think it is your
highest priority from the Python community's point of view.)
Otherwise, I'd challenge you to get started -- I'm sure you'd do a
great job. Here's the challenge anyway -- maybe someone else can pick
it up. I'm tired of hearing what *I* should do. I've already hinted
at what I *would* do if I had to do it. I'm more interested in
hearing from people who have done something that I (and the rest of
the Python community) can use. "Use SGML" is not a productive
approach; "this is what I did using SGML" is.
What would be needed, at least at the proof of concept level, is a
tool that does the one-time conversion of a library manual section or
chapter to SGML, plus entries in Doc/Makefile that automatically
produce PostScript and HTML from the SGML. Since these are the output
formats that are currently supported, it makes sense to require that
they are both supported by any proposed new system before it is
judged. Knowing in the abstract that SGML can be converted to HTML
and PostScript isn't enough -- I want to see the generated HTML and
PostScript so that I (and others) can judge how good it is and what
still needs to be done.
As a concrete test, the Python library manual is full of sections like
this one:
\begin{funcdesc}{sub}{pattern\, repl\, string\optional{, count=0}}
Return the string obtained by replacing the leftmost non-overlapping
occurrences of \var{pattern} in \var{string} by the replacement
\var{repl}, which can be a string or the function that returns a
string. If the pattern isn't found, \var{string} is returned
unchanged. The pattern may be a string or a regexp object; if you need
to specify regular expression flags, you must use a regexp object, or
use embedded modifiers in a pattern string; e.g.
%
\bcode\begin{verbatim}
sub("(?i)b+", "x", "bbbb BBBB") returns 'x x'.
\end{verbatim}\ecode
%
The optional argument \var{count} is the maximum number of pattern
occurrences to be replaced; count must be a non-negative integer, and
the default value of 0 means to replace all occurrences.
Empty matches for the pattern are replaced only when not adjacent to a
previous match, so \code{sub('x*', '-', 'abc')} returns '-a-b-c-'.
\end{funcdesc}
How should this be translated to SGML? Which DTD should be used? I'm
not particularly happy with the way the argument list has to be
formatted in LaTeX, especially when optional or keyword arguments are
present -- can SGML do better?
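As a first step toward the one-time conversion asked for above, the
header line of a {funcdesc} environment can be pulled apart
mechanically. The following is a minimal sketch, not an existing tool:
the helper names (read_group, split_args, parse_funcdesc_header) are
hypothetical, it assumes the header sits on a single line, and it does
not handle escaped braces. It deliberately stops at a plain
(name, required, optional) tuple, because which DTD and which element
names to emit is exactly the open question.

```python
import re


def read_group(text, pos):
    """Read one balanced {...} group starting at text[pos].

    Returns (body_without_outer_braces, position_after_group).
    """
    if pos >= len(text) or text[pos] != "{":
        raise ValueError("expected '{' at position %d" % pos)
    depth = 0
    for i in range(pos, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[pos + 1:i], i + 1
    raise ValueError("unbalanced braces")


def split_args(raw):
    """Split an argument list written with LaTeX '\\,' or plain ','."""
    return [a.strip() for a in raw.replace("\\,", ",").split(",") if a.strip()]


def parse_funcdesc_header(line):
    """Return (name, required_args, optional_args) for a funcdesc header."""
    prefix = "\\begin{funcdesc}"
    if not line.startswith(prefix):
        raise ValueError("not a funcdesc header")
    name, pos = read_group(line, len(prefix))
    raw_args, _ = read_group(line, pos)
    # \optional{...} wraps the optional tail of the argument list.
    match = re.search(r"\\optional\{(.*)\}", raw_args)
    if match is None:
        return name, split_args(raw_args), []
    return name, split_args(raw_args[:match.start()]), split_args(match.group(1))


if __name__ == "__main__":
    header = r"\begin{funcdesc}{sub}{pattern\, repl\, string\optional{, count=0}}"
    print(parse_funcdesc_header(header))
```

Keeping required and optional arguments apart at this stage means a
later step can render them however the chosen markup prefers, without
re-parsing the LaTeX.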
For comparison, here's an example of a complex function description in
TIM:
@deffn Function ORB_init (@metavar{argv}=(), @metavar{orb_id}='ilu')
@ftindex CORBA.ORB_init (Python LSR function)
Returns an instance of @class{@Python{CORBA.ORB}} with the specified
@metavar{orb_id} (currently only the ORB ID @Python{'ilu'} is
supported). The arguments which may be passed in via @metavar{argv}
are ignored.
@end deffn
Note that one feature I like is that the LaTeX {funcdesc} environment
automatically creates the index entry for the function; it combines it
with some information provided earlier in the file:
\renewcommand{\indexsubitem}{(in module re)}
I see that this is done manually in TIM (although I'm not sure why).
--Guido van Rossum (home page: http://www.python.org/~guido/)
From fredrik@pythonware.com Sun Nov 16 20:08:11 1997
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 16 Nov 1997 21:08:11 +0100
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <9711162009.AA20858@arnold.image.ivab.se>
> Otherwise, I'd challenge you to get started -- I'm sure you'd do a
> great job.
Well, the thing is that I have to do this anyway (not that bad,
since I get paid to do it); if I can get you on the "right track",
I might be able to contribute without having to work extra
shifts ;-)
> I'm more interested in hearing from people who have done some-
> thing that I (and the rest of the Python community) can use. "Use
> SGML" is not a productive approach; "this is what I did using SGML"
> is.
Okay, folks. Time to:
1. Settle on a DTD. Can we use DocBook as is? What extensions
are needed? (Paul? Fred?)
2. Write a small "howto" document; maybe just a sample page
showing how to format a typical libref chapter. maybe also
a "howto" on how to efficiently use emacs' SGML mode.
3. Hack a customized TeX to SGML converter (does anyone have any
code for this?)
4. (initially) use Jade for the initial conversions to RTF/PS and
HTML, using Norm Walsh's DSSSL stylesheets (Paul?)
Now, since learning scheme (DSSSL) is more than I have time
for, I'll also propose the following projects:
5. write an SGML to XML converter using Grail's SGMLparser
(in the meantime, we can use James Clark's "sx" tool)
6. write an XML parser (at least a tokenizer) that some day
could be included in the standard Python distribution (almost
done!)
7. write an XML to HTML tool based on (6) and a "Python style
sheet" (almost done!)
8. write an XML to PostScript tool based on (6), the printer
formatter from edroff, and PIL's PSDraw (or maybe we could
use html2ps?)
9. write an XSL stylesheet for XML-aware browsers.
10. etc.
11. etc.
12. etc.
Or maybe fuse 5 and 6. But dealing with XML is much easier;
an XML parser written in C could be added to Python without
anyone noticing...
And given modules 5-7, we'll end up with the 100% pure python
solution that I guess we all would prefer...
So, what do you think?
Cheers /F
From scott@chronis.icgroup.com Mon Nov 17 02:56:36 1997
From: scott@chronis.icgroup.com (Scott)
Date: Sun, 16 Nov 1997 21:56:36 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <9711162009.AA20858@arnold.image.ivab.se>; from Fredrik Lundh on Sun, Nov 16, 1997 at 09:08:11PM +0100
References: <9711162009.AA20858@arnold.image.ivab.se>
Message-ID: <19971116215636.23680@chronis.icgroup.com>
Seems like there's a lot of great work getting underway here. I'm certain
it would/will add a lot to what python can do.
As far as my own opinion on what format the library reference should
take, I have two major concerns. First is that the first beta release of
python 1.5 is not significantly delayed for whatever is decided should be done
to change the current docs. There's a lot of work in 1.4 out there that would
benefit greatly from a public 1.5 release (and I'm tired of wanting to write
code that will only work in 1.5 when in production it will have to run in 1.4).
Second is that html be really easy to access. For the library reference in
particular, I find that I'd much rather click search a couple of times than
page through sheet after sheet of hardcopy.
Finally, though I'm not all that familiar with the particulars of parsing
tim/sgml/xml/latex, I would like to point out that I've made some progress
towards easily producing generally efficient parsers in python (Lex/Yacc/Bison
style). Should anyone working on any python documentation projects want such a
tool before it's ready for its first public release, there is prerelease info
on the string sig and under
http://starship.skyport.net/crew/scott/projects.html. Using the prerelease code
toward this end, or offering suggestions as to how it could better accommodate
this end is most welcome.
scott
From guido@CNRI.Reston.Va.US Mon Nov 17 02:57:38 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Sun, 16 Nov 1997 21:57:38 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: Your message of "Sun, 16 Nov 1997 21:08:11 +0100."
<9711162009.AA20858@arnold.image.ivab.se>
References: <9711162009.AA20858@arnold.image.ivab.se>
Message-ID: <199711170257.VAA21457@eric.CNRI.Reston.Va.US>
> Well, the thing is that I have to do this anyway (not that bad,
> since I get paid to do it); if I can get you on the "right track",
> I might be able to contribute without having to work extra
> shifts ;-)
OK, you have the green light! (While you're at it, could you design a
set of macros for api.tex too? That's my next big project at the
moment.)
> 6. write an XML parser (at least a tokenizer) that some day
> could be included in the standard Python distribution (almost
> done!)
Sjoerd Mullender has written one already. Sjoerd, would you mind
announcing your xml parser somewhere?
--Guido van Rossum (home page: http://www.python.org/~guido/)
From Fred L. Drake, Jr."
References: <9711162009.AA20858@arnold.image.ivab.se>
Message-ID: <199711170449.XAA29856@weyr.cnri.reston.va.us>
Fredrik Lundh writes:
> 1. Settle on a DTD. Can we use DocBook as is? What extensions
> are needed? (Paul? Fred?)
Unless Paul has some specific objections, I think we should at least
start with the standard docbook DTD. We can adjust if we find
problems with toolsets or complexity.
> 3. Hack a customized TeX to SGML converter (does anyone have any
> code for this?)
I can work on this, as I've whacked around in the old partparse.py
somewhat. I'll look for other alternatives before I start whacking on
it again.
> 5. write an SGML to XML converter using Grail's SGMLparser
> (in the meantime, we can use James Clark's "sx" tool)
This shouldn't be too onerous.
> 8. write an XML to PostScript tool based on (6), the printer
> formatter from edroff, and PIL's PSDraw (or maybe we could
> use html2ps?)
html2ps.py would probably be a good approach if we want to use
Python-only tools, though I suspect a jade->TeX->dvips conversion
would look better. There's still a lot of things html2ps doesn't
support, and it's already quite slow. As much as I think I can
improve it, it's just not there for this kind of thing. (And I do
have ways to convert serious multi-page HTML documents using html2ps,
with a lot of the frills. Trust me, it's just not in there to replace
TeX for formatting these things.)
-Fred
--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191-5434
From Fred L. Drake, Jr."
References: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
Message-ID: <199711170457.XAA29863@weyr.cnri.reston.va.us>
Guido van Rossum writes:
> First, while SGML may have been standardized in the swinging '80s, it
> definitely has its roots in the '70s -- it takes many years to become
> an international standard (look at C++!), and it started its life, as
> "GML", long before standardization started. Undoubtedly some of the
> worse features in SGML were designed to be backwards compatible
Have you used GML? I have. It was probably nice when it was new,
but certainly was showing problems by the time I used it. Script/VS
(the processor I used) also allowed "control words" which looked a lot
like troff dot-commands. I ended up using a lot of these because the
mechanisms for defining new logical markup were very poorly documented
as far as I could tell. I had to define macros on top of the
Script/VS control words.
The SGML I see now has definitely evolved a long way from those
roots, though the better aspects of GML are still there (structure).
I don't think the GML background of SGML can be meaningfully held up
as a problem with SGML; I think Goldfarb learned a lot from GML's
failures by the time SGML was defined.
-Fred
--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191-5434
From papresco@technologist.com Mon Nov 17 08:59:49 1997
From: papresco@technologist.com (Paul Prescod)
Date: Mon, 17 Nov 1997 03:59:49 -0500
Subject: [DOC-SIG] Library reference manual debate
References: <9711162009.AA20858@arnold.image.ivab.se> <199711170449.XAA29856@weyr.cnri.reston.va.us>
Message-ID: <34700785.E8472EE6@technologist.com>
Fred L. Drake wrote:
>
> Fredrik Lundh writes:
> > 1. Settle on a DTD. Can we use DocBook as is? What extensions
> > are needed? (Paul? Fred?)
>
> Unless Paul has some specific objections, I think we should at least
> start with the standard docbook DTD. We can adjust if we find
> problems with toolsets or complexity.
My only concern is with trying to do too much at once. If we can just
get the book into some SGML variant, no matter how bizarre, then our
life becomes easier because we can use either Python or Jade for further
transformations. Once we are there, we can aim for DocBook.
> I can work on this, as I've whacked around in the old partparse.py
> somewhat. I'll look for other alternatives before I start whacking on
> it again.
Right, this again seems to argue in favour of doing the TeX->SGML step
separate from the SGML->DocBook step. You can do TeX->SGML with no help
and without consulting the DocBook DTD. Your only constraint is "don't
lose information." I can do the SGML->DocBook as part of a collaborative
project with discussion on the features and markup we need.
Paul Prescod
From Sjoerd.Mullender@cwi.nl Mon Nov 17 09:58:20 1997
From: Sjoerd.Mullender@cwi.nl (Sjoerd Mullender)
Date: Mon, 17 Nov 1997 10:58:20 +0100
Subject: [DOC-SIG] ANNOUNCE: XML parser library
Message-ID:
I have written an XML (eXtensible Markup Language) parser module for
Python. This module is derived from the SGML parser (sgmllib) and has
a similar flavor.
xmllib is available from the following sites:
ftp://ftp.cwi.nl/pub/sjoerd/xmllib.tar.gz
http://www.cwi.nl/ftp/sjoerd/xmllib.tar.gz
ftp://ftp.starship.skyport.net/pub/crew/sjoerd/xmllib.tar.gz
In all places there is also a file xmllib.README (which is also part
of the distribution).
Since the module uses the new re module, it will only work if you have
that already. The re module is standard in Python 1.5alpha.
For information on XML see .
-- Sjoerd Mullender
From skaller@zip.com.au Mon Nov 17 15:13:30 1997
From: skaller@zip.com.au (John Skaller)
Date: Tue, 18 Nov 1997 02:13:30 +1100
Subject: [DOC-SIG] Library reference manual debate
Message-ID: <1.5.4.32.19971117151330.0092f458@zip.com.au>
It would seem to me the first step to improve documentation
would be to create a mechanism for people to submit and
retrieve it.
My opinion is that fixing a format is the best way to exclude
most potential submitters. But if a format has to be
picked, it had better be ordinary old HTML, so it can
be put up on a website and used immediately by everyone.
The tree and subtrees should be available compressed.
That can be done automatically by some newer ftp servers.
Not everyone is online all the time!
I want to click on a link, download the whole
thing, unpack it into my web server, add a link to my
home page, and I get a mirror.
HTML is little use for typesetting books, but individual
authors are NOT going to agree on any standard for print
media. They're going to use whatever method they have that
works and their publisher is happy with.
I need to convert my "text-for-printmedia"
into something people can browse. So I'm trying to get
LaTeX2HTML running. It complains about my fancy packages.
It can somehow take "snapshots" -- by magic it seems to me --
of bits it can't understand, but this will only
work on MY system. So the only person who can convert
my source to HTML is ME.
I really _have_ to get that working. I can't write HTML
at all.
---------------------------------------------------------------
I can envisage a much more sophisticated system, which
accepts all kinds of documents and converts them
to other formats as required.
Where are we going to get programmers who can do this
work without the documentation for them to learn Python?
WHO is going to convert submitted LaTeX to HTML?
So, I write a doc using Guido's latex style.
How long until someone converts it and posts it
to the website?
To start off, why not accept documents in
_several_ formats. HTML, Postscript, dvi, and perhaps
a Guido-restricted LaTeX -- assuming Guido is
willing to do the conversion. No? Then we can't
accept that format.
-------------------------------------------------------
John Skaller email: skaller@zip.com.au
http://www.zip.com.au/~skaller
phone: 61-2-6600850
snail: 10/1 Toxteth Rd, Glebe NSW 2037, Australia
From Edward Welbourne Mon Nov 17 14:09:13 1997
From: Edward Welbourne (Edward Welbourne)
Date: Mon, 17 Nov 1997 14:09:13 GMT
Subject: [DOC-SIG] doc strings
Message-ID: <9711171409.AA27032@lslr6g.lsl.co.uk>
OK, I accept that it's better to stick with the gendoc `structured text'
approach rather than using *ML, TIM or anything else in python doc
folds, even if that does mean I've now got a fair slice of HTML to
retro-convert in existing code. I'll come back (if I remember) to how
we might be able to improve on this ...
First off, I want the doc string extractor to be able to decipher `type'
and `default' information from the centre-piece of every doc string I've
written - the argument list. Example (in gendoc's form):
Arguments:
file -- file-name string. The name of the file to open.
[mode] -- ('r') I/O mode string. ...
[bufsize] -- (-1) integer. Buffer size to use for I/O, if bufsize >
1: a value of 1 requests line buffering, 0 requests unbuffered I/O and
any negative value requests the implementation's default.
It might be worth recognising the word `argument' or `arguments', along
similar lines to `example', if we can think of a common format for
`default and type' information, possibly exploiting the fact that this
will take the form of the list item's first `sentence'. If default is
there, it's the first thing after -- and is enclosed in (). Next comes
type information, up to `end of sentence'. The rest of the paragraph
might be worth typesetting as a separate paragraph in the , as if
(in the HTML output to be generated from the above)
Arguments:
- file
- file-name string.
The name of the file to
open.
...
Note that if -- is followed by (...) and ... happens to contain a match
to the pattern being used to detect `end of sentence', it shouldn't be
counted as such because it's inside the default spec.
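The rule described above (optional default in parentheses right after
the --, then type information up to the end of the first sentence,
then the rest as description) can be sketched with two regular
expressions. This is a minimal sketch, not gendoc's actual code: the
names ENTRY, END_OF_SENTENCE and parse_argument are hypothetical, "end
of sentence" is assumed to mean a full stop followed by whitespace or
end of text, and a default containing ")" is not handled.

```python
import re

# One argument entry: "name -- (default) type. description", with the
# name written as "[name]" when the argument is optional.
ENTRY = re.compile(
    r"""^\s*
        (?P<opt>\[)?(?P<name>\w+)(?(opt)\])   # name, bracketed if optional
        \s+--\s+
        (?:\((?P<default>[^)]*)\)\s*)?        # default spec, if present
        (?P<rest>.*)$                         # type sentence + description
    """,
    re.VERBOSE | re.DOTALL,
)

# Assumed definition of "end of sentence": a full stop followed by
# whitespace or by the end of the text.
END_OF_SENTENCE = re.compile(r"\.(?:\s+|$)")


def parse_argument(entry):
    """Split one argument entry into name, default, type and description."""
    match = ENTRY.match(entry)
    if match is None:
        raise ValueError("not an argument entry: %r" % entry)
    rest = match.group("rest")
    # The default spec was consumed before `rest`, so a full stop inside
    # it (as in "(0.5)") is never mistaken for the end of the sentence.
    end = END_OF_SENTENCE.search(rest)
    if end is None:
        type_info, description = rest.strip(), ""
    else:
        type_info, description = rest[:end.start()], rest[end.end():].strip()
    return {
        "name": match.group("name"),
        "optional": match.group("opt") is not None,
        "default": match.group("default"),
        "type": type_info,
        "description": description,
    }


if __name__ == "__main__":
    print(parse_argument("file -- file-name string. The name of the file to open."))
    print(parse_argument("[mode] -- ('r') I/O mode string. ..."))
```

Matching the default spec first, and only then searching for the end of
the sentence, is one way to honour the note that sentence-ending
patterns inside (...) must not count.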
Here's another sample of a docstring (gde.process.execute.__doc__,
wrapping os.execv and os.execve up) to illustrate use of gendoc's
`structured text' for the uninitiated. Note that the method's name and
the fact that it's a method (of a class called gde._Process, as it
happens, but it should be documented as a method of the value
gde.process) can be extracted by the tools which crawl over the
namespace digging out the doc-strings and gluing them together. Note
that the method also has a `self' argument which doesn't appear here.
"""Replaces the current process.
Required argument, file, is the pathname of a file, which must be
executable, to be executed. The resulting process will replace the
current process. Optional arguments:
args -- ([]) list of strings. The arguments to be passed to file.
env -- (None) dictionary, mapping strings to strings. If omitted (or
given as None), the new process will run in the same environment as the
old one; otherwise, env gives the environment in which the new process
is to run.
*Does not return.*"""
Eddy.
_______________
DOC-SIG - SIG for the Python Documentation Project
send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________
From guido@CNRI.Reston.Va.US Mon Nov 17 14:36:02 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Mon, 17 Nov 1997 09:36:02 -0500
Subject: [DOC-SIG] What I don't like about SGML
In-Reply-To: Your message of "Sun, 16 Nov 1997 23:57:19 EST."
<199711170457.XAA29863@weyr.cnri.reston.va.us>
References: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
<199711170457.XAA29863@weyr.cnri.reston.va.us>
Message-ID: <199711171436.JAA22253@eric.CNRI.Reston.Va.US>
Fred Drake:
> Have you used GML? I have. It was probably nice when it was new,
> but certainly was showing problems by the time I used it. Script/VS
> (the processor I used) also allowed "control words" which looked a lot
> like troff dot-commands. I ended up using a lot of these because the
> mechanisms for defining new logical markup were very poorly documented
> as far as I could tell. I had to define macros on top of the
> Script/VS control words.
> The SGML I see now has definitely evolved a long way from those
> roots, though the better aspects of GML are still there (structure).
> I don't think the GML background of SGML can be meaningfully held up
> as a problem with SGML; I think Goldfarb learned a lot from GML's
> failures by the time SGML was defined.
Hm, I'm not sure if we're talking about the same GML then. According
to Goldfarb's home page (http://www.sgmlsource.com/):
- For history buffs, some reliable papers on the early history of SGML
and its precursor, GML. I invented SGML in 1974, and led the technical
efforts of several hundred people for a dozen years that developed it
into its present form as an International Standard. You can read some
of that story in the SGML History Niche.
Anyway, this was just in response to Paul Prescod. I claimed (and
still claim) that SGML's input methods have their roots in punched
cards. Paul responded that it was standardized in 1986, when PCs were
common. Goldfarb's remark indicates that SGML is much older than
that...
--Guido van Rossum (home page: http://www.python.org/~guido/)
From Edward Welbourne Mon Nov 17 14:45:46 1997
From: Edward Welbourne (Edward Welbourne)
Date: Mon, 17 Nov 1997 14:45:46 GMT
Subject: [DOC-SIG] doc strings could be in a variety of formats
Message-ID: <9711171445.AA14694@lslr6g.lsl.co.uk>
I said earlier that ...
> I'll come back (if I remember) to how we might be able to improve on
> this ...
Doc strings are principally important to the author of the python code
in which they appear. Their secondary importance is that they can be
extracted by a uniform toolset (in python) into, at least, HTML.
Imagine a standard set of classes which describe a common form into
which our doc strings are to be parsed by the common toolset. We can
write simple parsers from a few variants on the doc-string format into
this internal form. If a module or class sets its __docform__ tag to an
object with a .parse(string) method, the tools crawling the namespace to
extract docs can notice this and use that __docform__ as the parser for
the doc strings in the module or class. The onus of supporting a new
doc-string format falls on those who depart from the fold, but they get
the option if they're prepared to take that effort.
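A minimal sketch of how such a lookup might work, assuming the protocol is exactly what the paragraph above says: an object with a .parse(string) method, found through a __docform__ attribute. The class PlainDocForm, the function parse_doc and the dictionary it returns are hypothetical stand-ins for the "standard set of classes"; nothing here is taken from gendoc.

```python
class PlainDocForm:
    """Hypothetical default: first line is the synopsis, the rest is body."""

    def parse(self, string):
        lines = [line.strip() for line in string.strip().splitlines()]
        return {"synopsis": lines[0] if lines else "", "body": lines[1:]}


DEFAULT_DOCFORM = PlainDocForm()


def parse_doc(obj, inherited=DEFAULT_DOCFORM):
    """Parse obj.__doc__ with obj's own __docform__, else the inherited one."""
    docform = getattr(obj, "__docform__", inherited)
    doc = getattr(obj, "__doc__", None)
    return docform.parse(doc) if doc else None


class UpperDocForm:
    """A toy alternative format, standing in for, say, an HTML parser."""

    def parse(self, string):
        return {"synopsis": string.strip().upper(), "body": []}


class Shouty:
    """hello there"""
    __docform__ = UpperDocForm()


class Plain:
    """First line.
    Second line."""


print(parse_doc(Shouty)["synopsis"])
print(parse_doc(Plain)["synopsis"])
```

A tool crawling a module could pass the module's __docform__ as the inherited argument when it descends into that module's functions, so that a single setting covers everything defined there.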
Does gendoc contain classes which represent the parsed strings ? Does
it provide such a __docform__ object which might serve as the default ?
With this sort of setup, those of us who like HTML can write our doc
strings in HTML (provided we're willing to write ourselves a parser for
it) for use in place of the gendoc one: as for HTML, so for any of the
myriad of doc forms out there in the world. Furthermore, it shouldn't
be too hard for someone fed up (in interactive sessions) with
decoding some other contributor's doc strings to write something which
turns the parsed doc-strings back into their own preferred format of
doc-strings for display (choose your own __str__ method for the object
produced by a __docform__).
With a scheme like this, we can have our cake and eat it.
Please, someone, tell me what I've missed ;^>
Eddy.
From guido@CNRI.Reston.Va.US Mon Nov 17 14:52:45 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Mon, 17 Nov 1997 09:52:45 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: Your message of "Mon, 17 Nov 1997 03:59:49 EST."
<34700785.E8472EE6@technologist.com>
References: <9711162009.AA20858@arnold.image.ivab.se> <199711170449.XAA29856@weyr.cnri.reston.va.us>
<34700785.E8472EE6@technologist.com>
Message-ID: <199711171452.JAA22293@eric.CNRI.Reston.Va.US>
Paul Prescod:
> Right, this again seems to argue in favour of doing the TeX->SGML step
> separate from the SGML->DocBook step. You can do TeX->SGML with no help
> and without consulting the DocBook DTD. Your only constraint is "don't
> lose information." I can do the SGML->DocBook as part of a collaborative
> project with discussion on the features and markup we need.
What is the use of SGML without a DTD? Can one still create HTML and
Postscript from it?
--Guido van Rossum (home page: http://www.python.org/~guido/)
From: fdrake@cnri.reston.va.us (Fred L. Drake, Jr.)
References: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
<199711170457.XAA29863@weyr.cnri.reston.va.us>
<199711171436.JAA22253@eric.CNRI.Reston.Va.US>
Message-ID: <199711171455.JAA00324@weyr.cnri.reston.va.us>
Guido van Rossum writes:
> Hm, I'm not sure if we're talking about the same GML then. According
> to Goldfarb's home page (http://www.sgmlsource.com/):
This is the same one. The machinery for defining processing
separately from the abstract markup was there, but you pretty much had
to be an IBM insider to get enough information about how to use it.
That's why the Script/VS control words got used as much as they did.
I agree; the original format of the markup would have been better left
on the punched cards. But it was sufficient for me to write about 150
pages of a software manual (user info. and configuration). I had more
problems dealing with XEdit than the markup itself, but more of
the markup ended up being process-oriented than it should have been.
Very tedious stuff, indeed.
A lot like LaTeX in many ways, but it's much easier to extend the
LaTeX markup than the GML markup.
-Fred
--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191-5434
From: fdrake@cnri.reston.va.us (Fred L. Drake, Jr.)
References: <9711162009.AA20858@arnold.image.ivab.se>
<199711170449.XAA29856@weyr.cnri.reston.va.us>
<34700785.E8472EE6@technologist.com>
<199711171452.JAA22293@eric.CNRI.Reston.Va.US>
Message-ID: <199711171502.KAA00350@weyr.cnri.reston.va.us>
Guido van Rossum writes:
> What is the use of SGML without a DTD? Can one still create HTML and
> Postscript from it?
There must be a DTD, even if it doesn't get written down. (That's
often referred to as "well-formed" XML. ;)
Paul is referring to a long-standing convention of converting
between document types in incremental steps, to allow each step to be
simple to implement and check. It should not be too hard to do; my
main concern is that the DTDs for intermediate stages must be
sufficiently well understood that the conversions don't lose
information by accident. This will probably mean, if the DTDs aren't
written and at least partially documented, that one person will handle
the entire process until the target DTD is reached. Perhaps this is
O.K., perhaps not. Others should be able to repeat the process by
running the same sequence of scripts over the same input data.
-Fred
--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191-5434
From Edward Welbourne Mon Nov 17 16:09:22 1997
From: Edward Welbourne (Edward Welbourne)
Date: Mon, 17 Nov 1997 16:09:22 GMT
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711160451.XAA29095@weyr.cnri.reston.va.us>
References: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>
<199711160451.XAA29095@weyr.cnri.reston.va.us>
Message-ID: <9711171609.AA35590@lslr6g.lsl.co.uk>
Guido:
>> ain't broken." I personally have access to a working LaTeX
>> installation, the latex2html converter produces adequate HTML (I still
Fred:
> It's out of date and should be updated, but does work for the Python
> documentation. I have found very reasonable LaTeX2e documents that
> can't be formatted correctly using the CNRI installation.
Yup, I used to be a TeXnician but it's so long since I've had a
non-fragile installation that I've given up on it. I endure LaTeX for
legacy reasons, but I don't write in it except where I have to.
This is genuinely a major problem with LaTeX: once a site has got a few
hacked .sty files, you can forget about the portability of LaTeX. The
entire TeX system is too brittle; for this it will die.
Pity: I still have a lot of fondness for it.
It was a programmer's documentation form.
Eddy.
From: fdrake@cnri.reston.va.us (Fred L. Drake, Jr.)
References: <9711161700.AA31749@arnold.image.ivab.se>
<199711161854.NAA21167@eric.CNRI.Reston.Va.US>
Message-ID: <199711171626.LAA00457@weyr.cnri.reston.va.us>
Guido van Rossum writes:
> hearing from people who have done something that I (and the rest of
> the Python community) can use. "Use SGML" is not a productive
> approach; "this is what I did using SGML" is.
Guido,
I did want to comment on this. So far, I haven't seen anyone write
that you should do the work to switch over to SGML or anything else.
I think that Paul and I, perhaps with additional collaborators if
anyone is interested and can squeeze out the time, can muster the
technical expertise to do the work.
The issue is: Are you willing to consider using an SGML/XML based
solution as the canonical form for the documentation if handed to you
on a silver platter and it meets the requirements? Are you willing to
help us review the requirements to be sure we aren't leaving anything
out that's in there now, or that really needs to be in there? This is
a question that needs to be answered. From another message you wrote,
this may be the case, but I'm not certain I didn't misinterpret.
-Fred
--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191-5434
From: fdrake@cnri.reston.va.us (Fred L. Drake, Jr.)
References: <199711161008.KAA02434@axiom.bound.xs4all.nl>
<346EF6D5.E3552998@technologist.com>
<199711161627.LAA20965@eric.CNRI.Reston.Va.US>
Message-ID: <199711171633.LAA00494@weyr.cnri.reston.va.us>
Guido van Rossum writes:
> worried too because the only tool that is used as an existence proof
> (Jade) seems to be a one-person project. And of course the XML tools
> are still almost completely in the vaporware category.
SP, the parser underlying Jade, is James Clark's second complete
SGML parser. From the discussions in comp.text.sgml, I'd say it's
well regarded as a world-class piece of software which is being used
in commercial software as well as all sorts of ad-hoc applications.
Clark is also the author of groff, the GNU roff/troff tool. I don't
think there's any reason to be concerned about the source of this
software.
-Fred
--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191-5434
From guido@CNRI.Reston.Va.US Mon Nov 17 16:37:04 1997
From: guido@CNRI.Reston.Va.US (Guido van Rossum)
Date: Mon, 17 Nov 1997 11:37:04 -0500
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: Your message of "Mon, 17 Nov 1997 11:26:19 EST."
<199711171626.LAA00457@weyr.cnri.reston.va.us>
References: <9711161700.AA31749@arnold.image.ivab.se> <199711161854.NAA21167@eric.CNRI.Reston.Va.US>
<199711171626.LAA00457@weyr.cnri.reston.va.us>
Message-ID: <199711171637.LAA22663@eric.CNRI.Reston.Va.US>
Fred Drake:
> Guido,
> I did want to comment on this. So far, I haven't seen anyone write
> that you should do the work to switch over to SGML or anything else.
> I think that Paul and I, perhaps with additional collaborators if
> anyone is interested and can squeeze out the time, can muster the
> technical expertise to do the work.
> The issue is: Are you willing to consider using an SGML/XML based
> solution as the canonical form for the documentation if handed to you
> on a silver platter and it meets the requirements? Are you willing to
> help us review the requirements to be sure we aren't leaving anything
> out that's in there now, or that really needs to be in there? This is
> a question that needs to be answered. From another message you wrote,
> this may be the case, but I'm not certain I didn't misinterpret.
I am not rejecting SGML unseen. If it gets handed to me on a silver
platter I will review it. Until very recently I hadn't heard anyone
volunteer anything, just a lot of arguing (including my own :-).
This has changed. I am still skeptical about how easy it will be for
Joe Random Contributor to contribute documentation (this means both
the input format and the tools to produce at least one of HTML or
PostScript so they can review what they are contributing) so I think
that's where the pudding's proof will have to be.
--Guido van Rossum (home page: http://www.python.org/~guido/)
From papresco@technologist.com Mon Nov 17 16:58:46 1997
From: papresco@technologist.com (Paul Prescod)
Date: Mon, 17 Nov 1997 11:58:46 -0500
Subject: [DOC-SIG] docstrings for args
Message-ID: <347077C6.46C3F735@technologist.com>
Would it be useful to be able to do this:
    def open( file "The filename",
              mode="r" "The mode" ):
        "Open the file"
Paul Prescod
From: fdrake@cnri.reston.va.us (Fred L. Drake, Jr.)
References: <347077C6.46C3F735@technologist.com>
Message-ID: <199711171708.MAA00794@weyr.cnri.reston.va.us>
Paul Prescod writes:
> Would it be useful to be able to do this:
>
> def open( file "The filename",
>           mode="r" "The mode" ):
            ^^^^^^^^^^^^^^^^^^^
This is equivalent to mode="rThe mode" due to the string catenation
rule. Some other form of separation would be necessary.
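The catenation rule mentioned here is easy to check in an interpreter. The sketch below uses a hypothetical function, open_stub, that keeps only the part of the proposed signature that is already legal Python; the bare string after `file` is left out because it would be a syntax error.

```python
def open_stub(file, mode="r" "The mode"):
    """Return the default that Python actually sees for mode."""
    return mode


# Adjacent string literals are joined into one literal when the code is
# compiled, so the default is a single string.
print(open_stub("spam.txt"))  # prints: rThe mode
```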
A few sets of conventions exist for formatting the docstring such
that it can be picked apart. I think Guido suggested most recently:
    def open(file, mode="r"):
        """A short synopsis first.

        file -- The filename
        mode -- The mode

        More descriptive text here...."""
This is fairly readable and can be dealt with fairly easily by an
automatic extractor. I have a class that parses docstrings like this;
I'll try and clean up a few rough edges and document it over the next
day or two.
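As a rough idea of how an extractor might pick such a docstring apart, here is a minimal sketch. It is not the class mentioned above; split_docstring and sample_open are hypothetical names, and the only convention assumed is the one shown: a synopsis on the first line and "name -- description" lines for the arguments.

```python
import inspect


def split_docstring(func):
    """Return (synopsis, {argument: description}, other_lines)."""
    doc = inspect.getdoc(func) or ""  # getdoc() also strips the indentation
    lines = doc.splitlines()
    synopsis = lines[0] if lines else ""
    args, rest = {}, []
    for line in lines[1:]:
        name, sep, desc = line.partition(" -- ")
        if sep and name.strip().isidentifier():
            args[name.strip()] = desc.strip()
        elif line.strip():
            rest.append(line.strip())
    return synopsis, args, rest


def sample_open(file, mode="r"):
    """A short synopsis first.

    file -- The filename
    mode -- The mode

    More descriptive text here...."""


synopsis, args, rest = split_docstring(sample_open)
print(synopsis)
print(args)
```

A line is treated as an argument entry only when the text before the double dash is a valid identifier, which keeps ordinary prose containing " -- " out of the argument table.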
-Fred
--
Fred L. Drake, Jr.
fdrake@cnri.reston.va.us
Corporation for National Research Initiatives
1895 Preston White Drive
Reston, VA 20191-5434
From Robin.K.Friedrich@USAHQ.UnitedSpaceAlliance.com Mon Nov 17 22:40:15 1997
From: Robin.K.Friedrich@USAHQ.UnitedSpaceAlliance.com (Friedrich, Robin K)
Date: Mon, 17 Nov 1997 16:40:15 -0600
Subject: [DOC-SIG] doc strings
Message-ID:
To follow up to Ed's doc string comments let me restate the latest
structured text proposal. I will not comment on an API for this meta-
information; that's a subject for another thread once we have agreed
to the contents of doc strings. This discussion follows from the
need for doc strings substantive enough to allow a reasonable user's
guide to be automatically generated with tools such as gendoc.
That places it roughly in Guido's 4th category of doc string use.
For those who have not read the DOC SIG web page recently this posting
will try to encapsulate all current structured text capabilities as
working in gendoc 0.6 as well as some proposed enhancements.
[but please do read http://www.python.org/sigs/doc-sig/]
"""This is a one line description of the functionality.
A more detailed discussion of the object's function and purpose
may follow. Otherwise the author may choose to jump right into
the object's prototype information. I will identify optional key
text used for parsing in [bracketed] notation. As it stands now,
brackets have no special meaning. Subordinate paragraphs are
simply indented. Structured text currently has some implied rules
for determining what paragraphs are tagged as headers. (I for one
would like to see some more attention paid to that detail.) Any
special characters like '<' for HTML need not be escaped in any
way as the renderer will be responsible for that.
*Note* that "hyperlinks" are delimited by double quotes. In effect
what this does is cause the string in the quotes to be saved off for
further comparison to the URL lines at the end of the doc string. For
an HTML rendering the quotes themselves would be removed and any
quoted text which doesn't match exactly will be left alone.
Text on a single line surrounded by asterisks is tagged as
'emphasis' (italic in HTML), while text wanting to be **loud** is
surrounded by double asterisks and is tagged 'strong' (bold in
"HTML" viewers).
Class objects should only document the class interface and leave the
method doc to those individual doc strings. (This is not a hard rule
as I've seen many coders insist on placing everything in the class
doc.)
The following is new structure to support identification of function
prototype information.
[Required] Argument[s]:
arg1 -- String containing a source file name.
arg2 -- String containing a target file name.
Optional Argument[s]:
arg3 -- (-1) Integer defaulted to -1 as shown by the parentheses.
Other text not having a double dash will be appended to the
preceding item's string; otherwise it would signify a new def list item. Ed
marked optional arguments with brackets. Since any argument
with a default value is optional this may not be necessary.
arg4 -- ('') String. This is identified as new list item without
having to have a blank line separating them. For long lists
this is important. Note also that single quotes indicate
literal (code) text and great care must be taken in the
parser to get this right. We might discuss alternative
notation for literal strings.
Keyword Argument[s]:
opt1 -- (1) Defaults for Python keyword arguments are implemented in
code, so they cannot be extracted from the function declaration.
opt2 -- (None) These lists can get mighty long.
Return Value:
Tuple pair (perigee, apogee).
Example Usage:
Any line ending in a colon containing the string 'example' will flag
the following indented paragraphs as preformatted code until
indentation returns to the next leftward level. This is not
structured text protocol currently but is just an idea. This differs
from the single quote usage which is just intended for short literal
text not spanning lines.
* Bulleted lists can appear at any level of indentation and can be
identified by either a '*', 'o', or a '-' as the first nonwhite
character on the line.
* Indented bullet paragraphs are rendered accordingly.
* Bulleted list items need not be separated by blank lines.
* This is another item that's not easy to parse as a paragraph may
start with *emphasized text*.
.. "hyperlinks" http://www.python.org/sigs/doc-sig/
.. "HTML" http://www.w3c.org/
"""
From janssen@parc.xerox.com Mon Nov 17 21:43:07 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Mon, 17 Nov 1997 13:43:07 PST
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>
References: <199711152030.PAA19793@eric.CNRI.Reston.Va.US>
Message-ID:
Excerpts from ext.python: 15-Nov-97 [DOC-SIG] Library reference.. Guido
van Rossum@CNRI.Re (7617)
> it should be simple enough to rewrite
> the TIM-to-HTML converter in Python (maybe using HTMLgen?).
Probably make more sense to do a TIM-to-XML script in Python...
Excerpts from ext.python: 15-Nov-97 [DOC-SIG] Library reference.. Guido
van Rossum@CNRI.Re (7617)
> The other one is much hairier: conversion of the existing LaTeX source
> to TIM!
The first thing we'd need is a definition of the various markup terms we
wanted to use. For example, is it better to say "\code" and assume
Python code, or "\python" so that we can also say "\C"? Should we say
(as the lib sources currently do) "\code" for everything, or do we want
to distinguish the names of functions, say, from the names of modules by
using "\module" and "\function"? Next, we could start by converting all
the strings of the form "\foo{" to "@foo{", which accomplishes a
remarkable amount of the work. Then we'd need to replace various common
phrases like
@renewcommand{@indexsubitem}{( in module )}
@begin{}
...
@end{}
with an appropriate TIM construct:
@def{tp,fn,exc...}
@{tt,et,vt...}index . ( in module )
...
@end def{tp,fn,exc...}
Most of this can be accomplished with a few Emacs macros.
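The first mechanical step, turning every "\foo{" into "@foo{", could equally be done with a few lines of Python. The sketch below covers only that one rewrite; latex_to_tim is a hypothetical name, and nothing is assumed about TIM beyond the "@foo{" form quoted above.

```python
import re


def latex_to_tim(text):
    r"""Rewrite each LaTeX-style \name{ as the TIM-style @name{."""
    return re.sub(r"\\([A-Za-z]+)\{", r"@\1{", text)


print(latex_to_tim(r"Use \code{open} from \module{os}."))
# prints: Use @code{open} from @module{os}.
```

Commands that are not directly followed by an opening brace are left untouched, so the phrase-level replacements described above would still need separate handling.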
> One final note: I looked at Perl's POD (Plain Old Documentation) for a
> few seconds. It's more limited than TIM and uses physical markup
> (e.g. B), but has one feature that I like: a block of
> indented text offset by blank lines (I believe) is automatically
> interpreted as a code sample block (verbatim in LaTeX terms,
> @codeexample in TIM). This makes POD source remarkably readable. I
> presume that it would be trivial to add this to the TIM front-end. (I
> particularly like this idea because it's the same convention that I
> used in the Python FAQ wizard. :-)
I'll put it on my list.
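For reference, the convention Guido describes can be sketched in a few lines of Python. This is a guess at the rule as stated (a block set off by blank lines whose every line is indented counts as a code sample); split_blocks is a hypothetical name and has nothing to do with the actual TIM front-end or with POD's own parser.

```python
import re


def split_blocks(text):
    """Return a list of (kind, block) pairs; kind is 'code' or 'para'."""
    blocks = []
    for chunk in re.split(r"\n[ \t]*\n", text):
        lines = [line for line in chunk.splitlines() if line.strip()]
        if not lines:
            continue
        # A block is verbatim only if every one of its lines is indented.
        indented = all(line[:1] in (" ", "\t") for line in lines)
        blocks.append(("code" if indented else "para", "\n".join(lines)))
    return blocks


source = "Call it like this:\n\n    x = f(1)\n    print(x)\n\nThat is all.\n"
for kind, block in split_blocks(source):
    print(kind, repr(block))
```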
From janssen@parc.xerox.com Mon Nov 17 21:56:53 1997
From: janssen@parc.xerox.com (Bill Janssen)
Date: Mon, 17 Nov 1997 13:56:53 PST
Subject: [DOC-SIG] Library reference manual debate
In-Reply-To: <199711161854.NAA21167@eric.CNRI.Reston.Va.US>
References: <9711161700.AA31749@arnold.image.ivab.se>
<199711161854.NAA21167@eric.CNRI.Reston.Va.US>
Message-ID:
Excerpts from ext.python: 16-Nov-97 Re: [DOC-SIG] Library refer.. Guido
van Rossum@CNRI.Re (3835)
> Note that one feature I like is that the LaTeX {funcdesc} environment
> automatically creates the index entry for the function; it combines it
> with some information provided earlier in the file:
> \renewcommand{\indexsubitem}{(in module re)}
> I see that this is done manually in TIM (although I'm not sure why).
I didn't like the automatically-generated index terms (just "ORB_init"
-- I wanted "CORBA.ORB_init (Python LSR function)") that the default
rules for "deffn" provided, so I decided to explicitly specify what I
wanted in the index. Perhaps the right way to fix this would have been
to extend the TIM front-end to understand an "index context string",
which would have been added to index entries automagically, but to give
us complete control over the entries I decided to use a separate line.
Bill
From papresco@technologist.com Tue Nov 18 11:46:24 1997
From: papresco@technologist.com (Paul Prescod)
Date: Tue, 18 Nov 1997 06:46:24 -0500
Subject: [DOC-SIG] Library reference SGML plan
References: <9711162009.AA20858@arnold.image.ivab.se>
Message-ID: <34718010.FCDB7B5F@technologist.com>
Before I start -- how is all of this going to play out with 1.5 and
updates and so forth? Is now the right time to do a documentation
changeover?
If so, let me propose a reorganization of Fredrik's steps:
1. Hack the conversion from TeX to PyLibRef-SGML. Don't worry about
DocBook yet. We'll see how far PyLibRefSGML is from DocBook once we've
got an SGML document. (I can do this, but not for a few days)
2. Write the DTD (or at least a first draft) based on our observations
from 1. Work towards DocBook compatibility if possible because of the
benefits of standardization and code reuse. (I can do this after I
finish step 1).
3. Do the conversions to Print and HTML for our "demo".
4. Write a "howto" document on the entire system.
As for these:
> 5. write an SGML to XML converter using Grail's SGMLparser
> (in the meantime, we can use James Clark's "sx" tool)
> 6. write an XML parser (at least a tokenizer) that some day
> could be included in the standard Python distribution (almost
> done!)
I don't know that we need these two steps. I like XML, but it doesn't
seem relevant to the task at hand. Also, why don't we have a single
parser for sgml/xml? Then we're agnostic. (I'm defining SGML here as XML
plus a few minimizations).
> 7. write an XML to HTML tool based on (6) and a "Python style
> sheet" (almost done!)
This step makes a lot of sense, but I would say that we could be
SGML/XML agnostic here too.
> 8. write an XML to PostScript tool based on (6), the printer
> formatter from edroff, and PIL's PSDraw (or maybe we could
> use html2ps?)
This is where I get worried. You'll have to tell me more about edroff to
convince me that we're not taking on a humungous job here. I can't find
anything about it on the Web! Even if edroff is the easiest tool in the
world to connect to, we should note that PostScript is not necessarily
the best print delivery format in the world. I much prefer to receive a
PDF or RTF file. With Jade, it seems we can deliver any of these -- PS,
PDF, RTF or TeX (not to mention MIF and whatever else someone adds
next). I'm reluctant to throw away the amazing job James has done
unifying those output formats so that support for any of them gives you
support for all of them. We would also be wasting the effort the
typesetting/wordprocessing tool vendors have put into line breaking
algorithms, etc.
I'm not convinced that we should get hung up about the entire printing
process being in Python. If anyone doesn't want to install Jade, then
they can just view the HTML output for proofing purposes. Why should
every author's desktop be able to serve as a publishing hub?
Also, I would rather spend effort integrating Jade and Python to make a
more powerful, flexible publishing solution than currently exists,
rather than replacing Jade with a less powerful copycat written in a
hurry.
> Or maybe fuse 5 and 6. But dealing with XML is much easier;
> an XML parser written in C could be added to Python without
> anyone noticing...
So could an SGML parser that does XML plus a few minimization
conventions.
Paul Prescod
From papresco@technologist.com Tue Nov 18 11:45:27 1997
From: papresco@technologist.com (Paul Prescod)
Date: Tue, 18 Nov 1997 06:45:27 -0500
Subject: [DOC-SIG] What I don't like about SGML
References: <199711161554.KAA20930@eric.CNRI.Reston.Va.US>
Message-ID: <34717FD7.386B724E@technologist.com>
Guido van Rossum wrote:
>
> First, while SGML may have been standardized in the swinging '80s, it
> definitely has its roots in the '70s -- it takes many years to become
> an international standard (look at C++!), and it started its life, as
> "GML", long before standardization started. Undoubtedly some of the
> worse features in SGML were designed to be backwards compatible
> (again, very much like C++...).
I don't doubt that SGML has some backwards compatible features, but it
is *not* backwards compatible with GML. The backwards compatibility
features mostly exist for people who think that something like TIM is
the greatest thing in the world and want to remake SGML in its image.
Anyhow TeX, and thus TeXInfo and thus TIM also have their "roots in the
70s." Big deal. As far as I'm concerned, Python has its roots in the 70s
too.
> 99.9% of the time, HTML is parsed by relatively simple handwritten
> parsers, not by generic SGML scanners. There are lots of programs out
> there that have to parse HTML -- preprocessors, web browsers, web
> spiders, etc. Why don't these just link to an existing SGML scanner?
> Because SGML scanners are *huge*. They need to be big to scan generic
> SGML, which is a very complex language. But most of this power isn't
> needed to scan HTML, so people roll their own parser.
That's true. That's why we should stick to an SGML subset. I propose XML
+ minimizations.
> But Berners-Lee made one mistake: he made HTML look a bit like SGML
> (which he had seen once or twice from a distance :-).
Berners-Lee's only mistake is that he didn't research SGML enough before
making HTML so that he had a lot of trouble bringing it back into the
SGML fold later.
> Almost
> immediately HTML was targeted by the SGML lobby for full compliance.
This is not true. Dan Connolly was the first person to propose an SGML
DTD for HTML. He is hardly in the "SGML Lobby" (talk to him about it
sometime, he has plenty of complaints about SGML) and the SGMLization of
HTML happened long before the SGML lobby really even understood the web.
Tim *hired* Dan to work with W3C and complete the work. In other words,
SGML was always Tim's idea. It goes back at least as far as 1993.
http://www.w3.org/MarkUp/draft-ietf-iiir-html-01.txt
I don't know about you, but I don't recall there being much of a web to
"lobby" in 1993. Face it, Tim and Dan thought SGML was neat and they
implemented it. They have had a love/hate relationship ever since (as do
many people) but they have been moving towards SGML at every step (cf.
XML).
> Here's what was added; all of this made my parser much more
> complicated than I think it ought to be (look at how complicated
> sgmllib.py is). Note that most of what was added doesn't add
> functionality. In one or two cases it even takes away functionality!
I feel that there is an important point you are missing. SGML offers
lots of extra functionality beyond what HTML takes advantage of. If the
browser vendors (esp. Netscape) had not been explicitly SGML-hostile
(sound familiar?), the web would be much further ahead. But they have
fought tooth and nail to keep the useful features out.
> It just complicates the scanning process in order to be compatible
> with the extremely complicated scanning rules designed for SGML on
> punched cards in the 70s.
I don't know where you get this "punch card" stuff. GML was invented at
about the same time as C and UNIX, and after Simula 67. Goldfarb
invented it to be part of an *interactive document database system*.
Anyhow, this 70s/90s thing is only interesting if we've learned a lot
about markup in the intervening 20 years. This doesn't seem to be the
case. TeXInfo, HTML and TIM really didn't introduce anything special
that SGML lacks. It seems the only thing we have learned since the
standardization of SGML is that some of its features are not as
important as we thought they would be. Fair enough -- let's not use them.
> - A second special character '&' for entity references (original HTML
> used to escape "<").
Big deal. Different markup for different things. Entity references can
go in attribute values and element content. They are NOT structural
sub-elements and should not be confused with them.
> - Character references like or SPACE;.
How else are you going to include a Unicode character by number or name?
Are you going to claim that this isn't an "increase in functionality?"
If you need to input a Greek character you might disagree.
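As a small illustration of the point, here is how a numeric character reference and a named entity reference resolve in present-day Python. This uses the standard library's `html.unescape`, which follows HTML's rules and is not a full SGML parser; the choice of the Greek letter alpha is only an example.

```python
import html

# Numeric character reference: the character is named by its code point.
by_number = html.unescape("&#945;")

# Named entity reference: the character is named by an entity name.
by_name = html.unescape("&alpha;")

# Both spell the same Greek small letter alpha (U+03B1).
print(by_number == by_name == chr(945))
```

Neither form needs the author's keyboard or editor to support the character directly, which is the extra functionality in question.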
> - Comments in the form of <!-- ... -->, truly the most atrocious
> comment convention invented (and I believe it's worse -- officially,
> "--" may not occur inside a comment but "-- --" may, or something like
> that; but who cares, as almost no handwritten parser seems to get this
> right).
Comments could be simpler and smaller, but it really doesn't seem like a
big deal to me.
> - Special stuff to be ignored, starting with , where it is
> tricky to determine what the end is (since sometimes "<" or ">" may
> occur inside.
"<" or ">" can only occur inside *in quotes*. This is like complaining
that the following Python statement is confusing because of the two
colons:
if a=="j:b":
Big deal -- string literal context is different from program context (or
markup context, in SGML).
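A minimal sketch of that idea in Python: finding the ">" that closes a markup declaration only requires remembering whether you are inside a quoted literal. The function name and the sample declaration are made up for illustration, and the sketch deliberately ignores SGML's other complications (such as comments inside declarations).

```python
def find_declaration_end(text, start=0):
    """Return the index of the '>' that closes a declaration.

    A '>' inside a quoted literal does not count.
    Returns -1 if the declaration is never closed.
    """
    quote = None  # the quote character we are currently inside, if any
    for i in range(start, len(text)):
        ch = text[i]
        if quote is not None:
            if ch == quote:
                quote = None   # literal ends; back to markup context
        elif ch in "\"'":
            quote = ch         # literal begins
        elif ch == ">":
            return i           # a '>' outside any literal closes it
    return -1


decl = '<!ENTITY arrow "-->"> rest of document'
end = find_declaration_end(decl)
print(decl[:end + 1])
```

The ">" inside the quotes is skipped for the same reason the colon inside "j:b" does not end the Python `if` line.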
> - Special stuff to be ignored, starting with ...>.
What's so hard or complicated about that?
> - Short tags, compatibility reasons with older HTML processors, but which have to be
> recognized if you want to claim the elusive "full compliance".
Obviously sgmllib.py will never have full SGML compliance. Presumably
the reason you implemented those short cuts is actually because they are
useful and convenient.
I feel that your negative feelings about a particular process have
spilled over onto SGML. If the browser vendors had done their job
correctly in the first place, these short cuts would be allowed, would
always have been allowed, and would be usable today. You can hardly
blame their SGML-noncompliance on SGML! I might as well blame a
particular Unix's POSIX incompatibilities on Unix!
> - It is not possible to turn off processing completely. There used to
> be an HTML tag <LISTING> (?) which switched to literal copying of the
> text until </LISTING> was found. This is impossible to do in SGML --
> the best you can do is to switch to literal mode until </ followed by
That is not true. The *DTD* cannot turn off processing completely as
with the LISTING tag. The *author* can turn off processing completely
with a marked section:
<![CDATA[
>>>><<<<<>>>>>&&&&&&
]]>
The end of the marked section is indicated by "]]>". But this is going
to be VERY rarely required in Python documentation. The only Python code
that has a in it is code talking explicitly about SGML. So once in
every 30 listings, you'll have to use the syntax above. Note that this
syntax is one of the things that the HTML browsers have neglected to
implement, although it is VERY important as you point out. Don't blame
SGML, blame them.
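For a documentation tool, wrapping a listing in a CDATA marked section is mechanical. The following Python sketch shows one way to do it; the helper names `wrap_cdata` and `unwrap_cdata` are hypothetical, and the only thing the sketch relies on is that the section opens with "<![CDATA[" and ends at the first "]]>".

```python
CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"


def wrap_cdata(listing):
    """Wrap literal text in a CDATA marked section."""
    if CDATA_CLOSE in listing:
        # The one sequence that cannot appear literally inside the section.
        raise ValueError("listing contains the marked-section terminator")
    return CDATA_OPEN + listing + CDATA_CLOSE


def unwrap_cdata(marked):
    """Recover the literal text from a CDATA marked section."""
    if not marked.startswith(CDATA_OPEN):
        raise ValueError("not a CDATA marked section")
    end = marked.find(CDATA_CLOSE, len(CDATA_OPEN))
    if end == -1:
        raise ValueError("unterminated marked section")
    return marked[len(CDATA_OPEN):end]


code = 'if a < b and c > d: print("&amp; </p>")'
print(unwrap_cdata(wrap_cdata(code)) == code)
```

Everything between the delimiters, including "<", ">", "&" and "</", comes back untouched.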
> a letter is seen, and you can't turn off &ref; processing either.
That isn't true. You can turn that type of processing off using either a
CDATA content element or a CDATA marked section.
> - Why do I have to put quotes around the URL in <A HREF="http://www.python.org"> ???
Attribute values are string literals, just like in Python. You put them
in quotes to differentiate them from the surrounding whitespace, markup
delimiters, etc.
> - Other restrictions on what you can do with attributes; apparently
> there's a semantic rule that says that if two unrelated tags have an
> attribute with the same name, it must have the same "type".
That isn't true.
> - A content model, which nobody asked for, and which few people check
> for, but which still allows HTML purists to tell you that your HTML
> page is "non-conformant" when you place an <H4> heading inside a
> <LI> list item (okay, so I made that up).
I must admit, I'm shocked to hear you say that. It was exactly *for* the
content model that Tim Berners-Lee and Dan Connolly moved HTML to be an
SGML document type. Please tell me what Grail should do with this
document:
Here's a rather STRANGE HTMLish DOCUMENT
This is a title
This is another title
This is a third
Strange to have so many!
But without a content model
This is perfectly legal
Here's a rather odd table
Curiouser and Curiouser
Without the concept of a content model, this is a perfectly legal
document, and Grail would have to handle it and do something reasonable
with it (what's the title of this document? what does the table
structure look like?) Without DTDs and content models, you have no basis
for an information system. The fact that HTML authors ignore SGML rules
is a sad commentary on the Web, not on HTML. Those who are building the
web today -- browser vendors and standardizers alike -- have asked that
XML be extra strict because they recognize that the current HTML
situation is a mess *in spite of* SGML's strictures (and *because of*
widespread SGML ignorance).
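To show what a content model buys a processor, here is a toy checker in Python. The table below is an invented fragment, not the HTML DTD, and it only records which children an element may contain; real content models also constrain order and the number of occurrences (for example, one TITLE per document).

```python
# Toy content model: element name -> set of element names allowed inside it.
# This is an illustrative assumption, not the real HTML DTD.
CONTENT_MODEL = {
    "HTML": {"HEAD", "BODY"},
    "HEAD": {"TITLE"},
    "TITLE": set(),
    "BODY": {"H4", "UL", "P"},
    "UL": {"LI"},
    "LI": {"P"},   # assumption: headings are not allowed in list items
    "H4": set(),
    "P": set(),
}


def find_violations(element, model=CONTENT_MODEL):
    """Return a list of messages for children the model does not allow.

    An element is a (name, children) pair; children is a list of elements.
    """
    name, children = element
    allowed = model.get(name, set())
    problems = []
    for child in children:
        if child[0] not in allowed:
            problems.append(f"{child[0]} not allowed in {name}")
        problems.extend(find_violations(child, model))
    return problems


doc = ("HTML", [
    ("HEAD", [("TITLE", [])]),
    ("BODY", [("UL", [("LI", [("H4", [])])])]),
])
print(find_violations(doc))
```

With a table like this, a browser or converter knows in advance which structures it has to handle; without one, every nesting is legal and each tool has to guess.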
If you think it is reasonable to put H4s in LIs, then talk to Dan
Connolly. He can make it possible (in consultation with W3C members). If
you want to make it possible to put ANY element in ANY other element, he
could make that possible too. SGML can allow anything anywhere just like
TIM or LaTeX. But he wouldn't -- he knows that constraints on element
occurrences are crucial. Removing them would be akin to asking Python
parsers to handle any random combination of operators and delimiters:
if ( def a(): class b(): pass )
> - Probably a few other things that nobody asked for, such as the
> DTD declaration and SGML's approach to character sets (which is
> probably broken -- I believe there is a way to switch character
> sets in mid-stream...).
The DTD is an important part of the documentation for HTML and also an
important implementation tool for many vendors. I don't know what your
problem is with it.
I don't know that SGML's approach to character sets is broken. Could you
be more specific? And perhaps you could describe how TIM's "approach to
character sets" is superior.
> So my claim remains that the requirement of SGML conformance is for
> 99% just a nuisance for parser writers. Of course I'm biased, since
> I'm a parser writer myself... So see for yourself what you think of
> this argument.
Of course compliance with any standard is a nuisance. It is always
easier to hack up what you need as you go along. Because of powerful
anti-SGML politics, HTML never took advantage of much of SGML's power.
For instance one of SGML's most basic facilities is the ability to reuse
content in the same document or across documents. But HTML can't do it.
Blame the browser vendors.
Most of the points in your flame seem to me to be more of an indictment
of anti-SGML bias than of SGML itself. It is as if someone tried out the
famed POSIX compatibility mode in NT and then claimed that Unix was
broken based on it. Obviously that environment is not a true reflection
of Unix itself, because its creators were not trying to allow access to
the power of Unix. HTML was supposed to allow access to the power of
SGML, but then Marc took over the web and forward progress ground to a
halt in favour of