[DOC-SIG] Library reference manual debate

Guido van Rossum guido@CNRI.Reston.Va.US
Sun, 16 Nov 1997 11:27:43 -0500


Paul Prescod:
> My biggest concern would be that these tool incompatibilities (or
> partial compatibilities) would be construed as "extra SGML
> complications" whereas TIM, having no real popularity at all, can be
> extended in an ad hoc manner and thus could be seen to be more
> "flexible" than SGML. By that argument, a language I invent tomorrow
> would be more "flexible" than Python because it has no installed base
> and thus I can change it to be whatever I want, but lose the support of
> a community and a set of existing tools. This "flexibility" leads to an
> infinite number of contrived, incompatible languages. So yes, I would
> rather bite the bullet and invent our own delimiter conventions within
> SGML rather than invent Yet Another Markup Language. 
> 
> But just be aware that it will probably cost us in tool compatibility at
> some point, and force us to do some extra transformations to a simpler
> SGML subset.

Okay, now we're talking.  The issue of layering tools is real.  I
expect that no matter which way we go, we will have to craft some
tools of our own.  I'm using LaTeX now, and the tools I have crafted
so far are in myformat.sty.  In a sense, this is equivalent to a DTD
extension in SGML plus a style sheet.  When using TIM, the same thing
is done using a macro file.

Let me try to explain once more why I am hesitant to adopt SGML
(apart from my hang-ups about the lexer, which I discuss in a separate
thread -- they aren't particularly relevant).

I believe that part of Python's success lies in the fact that it has
few dependencies on other tools.  For example, it's written in C
rather than C++, and in fact until very recently I made sure that it
was compilable with a K&R C compiler as well as with a Standard C
compiler.  What's the advantage of C over C++?  When I started Python
as a mostly Unix tool, C++ compilers were still under heavy
development.  I expected that many prospective users of the language
would not have a compatible C++ compiler already installed on their
system, and I expected that having to find one that was compatible
with their hardware and O/S would be enough of a deterrent that they
would never use Python unless they were *very* motivated.  So I used a
lowest-common-denominator language, K&R C, which at the time came
bundled with every Unix version.  I suppose that in 1997 the
availability of C++ compilers is no longer a problem (for example on
the Windows and Mac platforms all C compilers are really C++
compilers) -- but my choice of C was definitely the right one until
recently.  A second reason was programmer availability -- again, until
recently, if I had been using C++, it would have been harder for Joe
Average to change a few lines in the Python source to fix a bug and
to send me the diffs.

I am worried that SGML tools are still in a state similar to that of
C++ eight years ago: they exist, but they don't come bundled with any
O/S, and it takes time to track down the right tools for your platform
and then to install them, and you may or may not be successful
depending on what other software you have available.  I'm kind of
worried too because the only tool that is used as an existence proof
(Jade) seems to be a one-person project.  And of course the XML tools
are still almost completely in the vaporware category.

It has been mentioned that TIM is in the same situation: it's not
widely known or used.  However, the one big difference is that all of
TIM consists of three scripts, one of which is already written in
Python (and the other ones could easily be rewritten in Python).  So
instead of adding a dependency on external tools, as with the
adoption of SGML, I would become *independent* of external tools if
I were to adopt TIM.  (This is exactly the same reason why the Perl
people created their own format, POD.)

I believe that using an adaptation of TIM, it will be possible to
generate HTML *without downloading any additional tools*.  I think
this is a huge win, as HTML is all that's needed to preview one's
changes to the manual.  Generating PostScript will still require TeX
(and LaTeX and texinfo), but since the existing solution also requires
that, things don't get worse, and of course those who have a need to
use SGML can contribute a translator from TIM to SGML (or, more
likely, to XML, once the XML vaporware solidifies into software).
(Besides, it seems that to get PostScript out of SGML one generally
*also* has to go through TeX, at least on Unix.)

Note that adoption of SGML doesn't mean that we *don't* have to craft
our own tools -- we'll have to come up with a set of definitions (a
DTD extension, is that the right term?) so we can conveniently format
manual entries for functions, classes and methods with default
argument values, keyword arguments, argument types, and so on
(which is what the myformat.sty macros are about -- they define
convenient ways to enter the information about function and method
prototypes).  I expect that this particular effort will be about the
same, whether we're using TIM or SGML.

The difference will be that when using TIM, we're encouraging Python
hackers to extend our tool set, while when using SGML, we're
encouraging SGML hackers to extend our tool set.  I won't try to guess
which type of hacker is predominant in the world at large; but in the
Python community, I'd say there's no doubt :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)

_______________
DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________