[Doc-SIG] reStructuredText-to-HTML XSLT

Tony J Ibbs (Tibs) tony@lsl.co.uk
Tue, 16 Oct 2001 10:07:39 +0100


> Here's some XSLT to translate reStructuredText's XML output to HTML.

I'm afraid I'll have to leave it to someone else to evaluate that (but
such stuff is intrinsically neat).

> I'm not sure about the desired behavior for text which
> contains characters which have to be escaped in the XML.
> Leave them escaped, or unescape them so the user can
> intersperse HTML markup with rST markup?  I chose the worst
> of both worlds by co-opting the "interpreted" tag to mean
> "contains HTML markup, should not be escaped when output
> to HTML".  Ideas welcome.

One side
--------
I vote very strongly against doing anything other than quoting any
"HTML" (subject to convincing argument otherwise, of course - and see
below_).
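
To make "quoting" concrete, here is a minimal sketch. It assumes a writer implemented in Python and uses the standard library's ``html.escape``; the function name ``quote_for_html`` is hypothetical, and none of this is taken from the XSLT under discussion.

```python
from html import escape


def quote_for_html(text):
    """Escape &, < and > so that user-typed tags are displayed, not interpreted.

    Hypothetical helper: illustrates "quoting", not any existing reST writer.
    """
    return escape(text, quote=False)


print(quote_for_html("Use <hr> & friends"))
```

With quoting, whatever the user typed appears literally in the browser, and nothing in the text can change the structure of the generated page.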

One of the serious problems with ST (and STNG) was that they allowed the
user to intersperse HTML (actually, SGML) style tags within a document,
on the assumption that these would be "passed through" to the final
result, and have some predictable effect. This *only* works [1]_ if there
is a browser that understands the relevant tags at the far end (so, in
general, ties one to HTML as an output format), and *also* potentially
allows the user to "subvert" the workings of the documentation language.

.. [1] Yes, I know in theory someone might be interpreting HTML,
   but that's fairly unlikely, and I choose to term such an
   animal a "browser" for this purpose.

The former is a problem if, for instance, one is wanting to represent
the document as plain text, TeX, PDF, etc., etc. One does not want to
have to write an HTML/XML/SGML understander to cope with random tags.

The second is a well known problem in the LaTeX world (for instance) -
using a package that is based on a powerful language (TeX) and still
allows access to said language means that some (awkward) people will
(attempt to) do "clever" things using said language - and this can cause
things to go disastrously and obscurely wrong.

The *sole* case I might give in on would be if the user had access to an
appropriate role - for instance::

    :html:`<hr>`
    :xml:`<myfavouritetag andits="data" />`

since there we are making them be *terribly* specific. But I don't think
the definition of roles has been covered yet?
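
One way such roles might behave can be sketched as follows. Everything here is an assumption made for illustration: the role syntax is simply the one in the example above, the names ``ROLE`` and ``render`` are hypothetical, and silently dropping role content aimed at a different output format is only one possible policy.

```python
import re
from html import escape

# Assumed syntax: :rolename:`content`, as in the example above.
ROLE = re.compile(r":(\w+):`([^`]*)`")


def render(text, output_format, quote=lambda s: s):
    """Quote ordinary text; pass role content through only for its own format.

    `quote` is the escaping function of the (hypothetical) writer in use.
    """
    parts = []
    pos = 0
    for match in ROLE.finditer(text):
        # Ordinary text before the role is always quoted.
        parts.append(quote(text[pos:match.start()]))
        role, content = match.groups()
        if role == output_format:
            # Explicit, format-specific passthrough.
            parts.append(content)
        # Otherwise the content is dropped, so other writers never
        # have to understand HTML/XML/SGML tags.
        pos = match.end()
    parts.append(quote(text[pos:]))
    return "".join(parts)


def html_quote(s):
    return escape(s, quote=False)


print(render("a < b :html:`<hr>` c", "html", quote=html_quote))
print(render("a < b :html:`<hr>` c", "text"))
```

The point of the sketch is that the passthrough is opt-in and names its target: stray tags elsewhere in the text are still quoted, and a plain-text or TeX writer can ignore the role altogether.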

.. _below:

And the other
-------------
However. We have discussed *modes* for DPS/reST (was that the term?),
where obviously "Python" is one such, and "book" might be another. I
have suggested in the past that maybe "HTML" would be useful, so that we
can allow preparation of HTML pages in a simple manner. In that case, it
*might* be argued that one *is* only aiming at HTML as output, and thus
might want to allow "subversion" of reST to do particular things (in
particular, horizontal rules are very useful). So maybe in that one case
one might want to relax the rules to allow "interpreted" text to work as
you do. However, I think one would probably want a directive in the
document to state this (David - is that right?) so that people would
know that this document was not a "general" document, but targeted at a
specific output form.

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Give a pedant an inch and they'll take 25.4mm
(once they've established you're talking a post-1959 inch, of course)
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)