[DOC-SIG] Comparing SGML DTDs

Paul Prescod papresco@technologist.com
Wed, 12 Nov 1997 00:51:00 -0500


If we do decide to move the tutorial from LaTeX to SGML, then we must
choose a DTD.

TEILite
=======
TEILite is a subset of the Text Encoding Initiative's TEI DTD. The full
DTD is designed for academic work (scholarly text analysis), but TEILite
strips most of that junk out and leaves something at the semantic level
of LaTeX but with real, software-enforced structure. It still has things
we don't need (of course), so we could still make a subset, but it is in
the ballpark:

http://www-tei.uic.edu/orgs/tei/intros/teiu5.html

It wasn't designed specifically for software manuals, but it has enough
to handle what I have seen of the tutorial (keyword, example, etc.)

http://www-tei.uic.edu/orgs/tei/intros/teiu5.html#TECHDOC

I have a lot of experience with TEILite and am writing a computer book
in a variant of it. The most interesting free software for dealing with
TEI is the set of tools for creating print and web documents with the
Jade SGML processing engine. There are also Perl-based translators, in
case those happen to do something better.

TEILite was painstakingly designed by smart people and I think it is
quite good.

DocBook
=======
DocBook is large and powerful. I don't know if a smaller subset exists,
but I'm looking into it. The most interesting free software for dealing
with DocBook is the set of tools for creating print and web documents
with Jade.

http://www.berkshire.net/~norm/dsssl/

These are maintained by Norm Walsh, formerly of ORA, now at SGML vendor
ArborText (SGML people shuffle around a lot). He is also writing
"DocBook in a Nutshell". Maybe DocBook is too complicated for a first
cut at SGML documentation (or even a subsequent cut). We don't need
things like callouts, procedure lists, sidebars, and so forth.

Anyhow, the DocBook tutorial is at http://www.oreilly.com/davenport/

LinuxDoc/SGML-Tools
===================
This DTD was called "LinuxDoc" and has been renamed "SGML Tools". My
concern about it is that the people who maintain the "SGML Tools"
software package are a bunch of Perl/C/Awk hackers, and if that wasn't
enough to make you worry, they ignore what I consider to be the coolest
tool for SGML processing ever invented -- Jade -- which is the fastest
way to turn SGML documents into beautiful print pages and web pages.
Jade isn't "manly enough" because it uses Scheme as an expression
language and everyone knows that people only use those kinds of
languages in research labs. Since I think a Jade-based approach is 10
times easier to maintain, I would probably not use any of their software
unless it did exactly what we wanted right out of the box. They seem to
spend most of their time chasing down problems that Jade would solve for
them.

ANYHOW, the SGML Tools DTD is much like LaTeX redone in SGML. You can
decide for yourself if that is a good or bad thing. I see you guys have
considered SGML Tools before, in a thread going in exactly the same
direction this February. :) Also, SGML Tools has a WYSIWYG editor in LyX
-- again you can decide for yourself if that is good or bad.

One other problem: the SGML Tools guys also had a VERY Unix-focused
outlook last I checked. I don't think our documentation system should
depend on anything more than Python and SP, both of which run on
Windows, Unix and OS/2 (don't know about Mac).

http://www.sil.org/sgml/publicSW.html#linuxdoc

Conclusion
==========
Python wasn't built in a day. I think that TEILite is a nice, manageable
DTD that has all of the features we need. DocBook seems like it is
designed for what we want to do, but it looks like it is overkill for
today. Maybe it will be appropriate for the LibRef. Maybe it will still
be overkill.

From TEILite we can immediately get:

 * TeX
 * FrameMaker MIF
 * Windows RTF
 * PostScript (from any of the above)
 * as new Jade back-ends are written, we get them "for free"

Those all come "for free" from a single Jade stylesheet. We can also get
HTML through a TEILite->HTML stylesheet that I am 70% of the way done
writing. (You can't get HTML for free because it is so different from
printed formats.) I also have a Python parser for nsgmls's output format
for when we want to do complex things with the docs.
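
For anyone who has not looked at it, the nsgmls output format is
line-oriented: the first character of each line says what the rest of
the line is ("(" and ")" for element start and end, "-" for data, "A"
for an attribute of the next element to start, "C" for a conforming
document). The following is only a minimal sketch of a reader for it,
not the parser mentioned above. The names parse_esis and unescape are
made up for illustration, and only the handful of record types listed
here are handled; everything else is silently skipped.

```python
import re


def unescape(text):
    """Decode the escapes nsgmls uses in data and attribute values."""
    def repl(match):
        token = match.group(1)
        if token == "n":
            return "\n"            # record end
        if token == "\\":
            return "\\"            # literal backslash
        if token == "|":
            return ""              # SDATA bracket, dropped in this sketch
        return chr(int(token, 8))  # three-digit octal character code
    return re.sub(r"\\(n|\\|\||[0-7]{3})", repl, text)


def parse_esis(lines):
    """Turn nsgmls output lines into a flat list of event tuples."""
    events = []
    pending = {}  # attributes seen since the last element start
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            parts = rest.split(" ", 2)
            if len(parts) < 2 or parts[1] == "IMPLIED":
                continue  # no value was specified for this attribute
            value = parts[2] if len(parts) > 2 else ""
            pending[parts[0]] = unescape(value)
        elif code == "(":
            events.append(("start", rest, pending))
            pending = {}
        elif code == ")":
            events.append(("end", rest))
        elif code == "-":
            events.append(("data", unescape(rest)))
        # Other record types (entities, notations, "C", ...) are ignored.
    return events
```

Feeding it the output of "nsgmls tutorial.sgml" (file name hypothetical)
would give a list that is easy to walk from Python; building a tree or
calling handler methods per element is a small step from there.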

Nobody has yet done the task of SWIGging (or ILUing) SP, which would
allow us higher-performance access to its internal data. On Windows, you
can do all of that magic through OLE, but we obviously can't depend on
that.

I think we are further ahead than we were last February. There are at
least two hard-core SGML users in the group who can help to customize
the DTD and improve converters should we need them, there is more Python
software for dealing with SGML, and Jade and nsgmls have improved too.

Anyhow, I think the next step is to gather consensus on an SGML-based
plan. After that, we would all install Jade (which includes "nsgmls")
and perhaps the SGML extension for Emacs and start to convert the
tutorial to TEILite.
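
To give a rough idea of what the day-to-day workflow might look like
once Jade is installed, here is a small sketch that builds the command
lines for validating a document with nsgmls and formatting it with Jade.
The file names tut.sgml and tut.dsl and the function names are
assumptions for illustration only, and a real setup may also need
catalog or SGML declaration arguments that are not shown.

```python
# Back-ends named in the list of output formats above; "-t" selects one.
JADE_BACKENDS = ("tex", "rtf", "mif")


def validate_command(source):
    """Command line that parses a document and reports only errors."""
    return ["nsgmls", "-s", source]


def jade_command(source, stylesheet, backend):
    """Command line that formats a document with a DSSSL stylesheet."""
    if backend not in JADE_BACKENDS:
        raise ValueError("unknown Jade back-end: %r" % (backend,))
    return ["jade", "-t", backend, "-d", stylesheet, source]


def all_commands(source, stylesheet):
    """Validation first, then one Jade run per back-end."""
    commands = [validate_command(source)]
    for backend in JADE_BACKENDS:
        commands.append(jade_command(source, stylesheet, backend))
    return commands


if __name__ == "__main__":
    # Print the commands rather than running them, so this works
    # even on a machine without Jade installed.
    for command in all_commands("tut.sgml", "tut.dsl"):
        print(" ".join(command))
```

The point of the sketch is that the same source and the same stylesheet
are used for every print format; only the back-end name changes.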

 Paul Prescod

_______________
DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________