[XML-SIG] How to get 4DOM to output empty <elements />

Jeremy J. Sydik jsydik@virtualparadigm.com
Mon, 08 May 2000 20:42:59 -0500


The entire Reference is:

XHTML 1.0: The Extensible HyperText Markup Language
            A Reformulation of HTML 4 in XML 1.0
            W3C Recommendation 26 January 2000
   .
   .
   Appendix C. HTML Compatibility Guidelines
      .
      .
      C.2 Empty Elements
         Include a space before the trailing / and > of empty elements,
e.g. <br />, <hr /> 
         and <img src="karen.jpg" alt="Karen" />. Also, use the
minimized tag syntax for 
         empty elements, e.g. <br />, as the alternative syntax
<br></br> allowed by XML
         gives uncertain results in many existing user agents.

As I'm reading this, the point is EXACTLY that we're working with
non-aware agents, in
particular, those browsers not capable of handling XML (So, really most
of the current
market last I knew).  As far as the original question, output of
<br></br> IS valid XHMTL,
but not compatible with the current HTML browser base for the most part,
hence <br />. 

That aside, I think this might work as a workaround for you until an
answer from the FT
crew shows up:

Make the Following Changes to Ft/Dom/Ext/PrettyPrintVisitor:

Change __init__ to be: 
   def __init__(self, indent, width, plainElements,singleElements=[]):
       self.__indent = indent
       self.__depth = 0
       self.__width = width
       self.__plainElements = plainElements
       self.__singleElements = singleElements
       self.__printPlain = 0
       self.__plainPrinter = PrintVisitor()
       self.__prevNodeIsText = 0
       self.__emptyReturn = 0
       self.__namespaces = [{}]

In visitElement:
   Replace:
      st = string.rstrip(st) + '>'
   With:
      if node.tagName in self.__singleElements:
          st=string.rstrip(st) + ' />'
      else:
          st = string.rstrip(st) + '>'

   Replace:
      if node.ownerDocument.isXml() or node.hasChildNodes() or
node.tagName not in HTML_SINGLE_TAGS:
   With:
      if node.ownerDocument.isXml() or node.hasChildNodes() or
node.tagName not in HTML_SINGLE_TAGS or node.tagName not in
self.__singleElements:

At which time your code example would look like:
   p = PrettyPrintVisitor("  ",80,[""],["IMG"])
   open("books.html","w").write(p.visit(newdoc))


Not seeing your full code example, I don't know if this will actually
work or not.

Have Fun,
   Jeremy



Norman Walsh wrote:
> 
> / Michael Hudson <mwh21@cam.ac.uk> was heard to say:
> | But it also says (in the "informative" appendix C):
> |
> |     Also, use the minimized tag syntax for empty elements, e.g. <br
> |     />, as the alternative syntax <br></br> allowed by XML gives
> |     uncertain results in many existing user agents.
> 
> Broken (or not properly XML-aware) user agents.
> 
>                                         Be seeing you,
>                                           norm
> 
> --
> Norman Walsh <ndw@nwalsh.com> | Blessed is he who expects nothing, for
> http://nwalsh.com/            | he shall never be disappointed.--Pope
> 
> _______________________________________________
> XML-SIG maillist  -  XML-SIG@python.org
> http://www.python.org/mailman/listinfo/xml-sig