[XML-SIG] Provide your own SAX parser to the DOM?

Chris Herborth chrish at cryptocard.com
Mon Dec 1 13:51:07 EST 2003


I've got PyXML 0.8.3 installed here, and I'm generating the DOM for some 
documents thusly:

import xml.dom.ext.reader.Sax2

reader = xml.dom.ext.reader.Sax2.Reader()

# snipped: setting up an external entity resolver and error handler

dom = reader.fromStream( file( an_xml_filename ) )

Is it possible to use a different SAX parser and still get the advantages of 
using the PyXML DOM goodness?  I'm thinking ahead to when I want to use a 
validating parser, although the xml.dom.ext.reader.Sax2.Reader() appears to 
already dig through my DTD...
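
For comparison only: the standard library's xml.dom.minidom (not the PyXML
Sax2.Reader) accepts a caller-supplied SAX2 parser through the parser
argument of parse() and parseString(). The sketch below assumes a
present-day Python 3 rather than the PyXML 0.8.3 setup above; build_dom is
a made-up helper name, and whether Sax2.Reader offers an equivalent hook is
not established here.

```python
import xml.sax
from xml.sax.handler import feature_external_ges
from xml.dom import minidom


def build_dom(text, parser=None):
    """Build a minidom Document from a str, using the given SAX2 parser."""
    if parser is None:
        # Assumption: the default expat-based reader stands in for
        # whichever SAX2 driver you actually want to plug in.
        parser = xml.sax.make_parser()
        # Do not fetch external general entities while parsing.
        parser.setFeature(feature_external_ges, False)
    # minidom hands the parser to xml.dom.pulldom, which drives it and
    # builds the tree from the SAX events it emits.
    return minidom.parseString(text, parser=parser)


doc = build_dom("<doc><p>hello</p></doc>")
print(doc.documentElement.tagName)
```

Any object implementing the SAX2 XMLReader interface can be passed as
parser, so swapping in a validating driver would only change that one
argument.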

I'm asking because I'm using the resulting DOM to generate HTML 3.2 for
JavaHelp.  My DTD references the XHTML 1.0 entities and, for the most part,
I'd like the Sax2.Reader() _not_ to translate those entities into their
Unicode characters...

I want to be able to leave the entities in place and/or translate them into 
something myself.  For example, JavaHelp 2.0 implements (most of) the 
Latin-1 accented character entities, but almost none of the others, so I'll 
have to handle &trade; (for example) "by hand".
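
If the reader cannot be told to leave the entities alone, one fallback is
to re-encode the text on the way out. The sketch below is only that, under
stated assumptions: it uses the Python 3 standard library
(html.entities.codepoint2name; the Python 2 equivalent lives in
htmlentitydefs), encode_for_javahelp and HAND_MAPPED are made-up names, and
the "(TM)" replacement is a placeholder rather than anything JavaHelp
prescribes.

```python
from html.entities import codepoint2name

# Assumption: characters to translate "by hand", with illustrative
# replacements. Extend this table for whatever JavaHelp cannot render.
HAND_MAPPED = {
    0x2122: "(TM)",  # TRADE MARK SIGN
}


def encode_for_javahelp(text):
    """Re-encode non-ASCII characters in text that is about to be written.

    ASCII passes through unchanged (escaping of &, < and > is left to the
    serializer), hand-mapped characters use HAND_MAPPED, Latin-1 characters
    become named entities, and everything else becomes a numeric
    character reference.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)
        elif cp in HAND_MAPPED:
            out.append(HAND_MAPPED[cp])
        elif cp <= 0xFF and cp in codepoint2name:
            out.append("&%s;" % codepoint2name[cp])
        else:
            out.append("&#%d;" % cp)
    return "".join(out)


print(encode_for_javahelp("caf\u00e9 \u2122"))
```

Applied to each text node's data while serializing, this keeps the Latin-1
accented characters as named entities and leaves one table to edit for
everything else.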

-- 
Chris Herborth                                     chrish at cryptocard.com
Documentation Overlord, CRYPTOCard Corp.      http://www.cryptocard.com/
Never send a monster to do the work of an evil scientist.




