nonstandard XML character entities?
Chuck Rhode
CRhode at LacusVeris.com
Sat Apr 14 10:04:45 EDT 2007
Martin v. Löwis wrote this on Sat, 14 Apr 2007 09:10:44 +0200. My
reply is below.
> Paul Rubin:
>> I'm new to xml mongering so forgive me if there's an obvious
>> well-known answer to this. It's not real obvious from the library
>> documentation I've looked at so far. Basically I have to munch on
>> a bunch of xml files which contain character entities like &uacute;
>> which are apparently nonstandard.
-snip-
> In ElementTree, the XMLTreeBuilder has an attribute entity which is
> a dictionary used to map entity names in entity references to their
> definitions. Whether you can make the parser download the DTD
> itself, I don't know.
What he said....
Try this on your piano:
: import xml.etree.ElementTree  # or elementtree.ElementTree prior to 2.5
: ElementTree = xml.etree.ElementTree
: import htmlentitydefs
:
: class XmlFile(ElementTree.ElementTree):
:
:     def __init__(self, file=None, tag='global', **extra):
:         ElementTree.ElementTree.__init__(self)
:         # Build the parser by hand so its entity table can be replaced.
:         parser = ElementTree.XMLTreeBuilder(
:             target=ElementTree.TreeBuilder(ElementTree.Element))
:         # Map HTML entity names (uacute, nbsp, ...) to their definitions.
:         parser.entity = htmlentitydefs.entitydefs
:         self.parse(source=file, parser=parser)
:         return
It looks goofy as can be, but it works for me.
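
For anyone trying the same trick on Python 3, here is a minimal sketch under a few assumptions: htmlentitydefs is spelled html.entities there, the parser class is XMLParser rather than XMLTreeBuilder, and the entity dictionary is updated in place instead of being reassigned. The make_parser helper and the SAMPLE document (including its "ents.dtd" name) are made up for illustration. As far as I can tell, expat only consults this dictionary when the document carries a DOCTYPE that points at an external subset or parameter entity, so the sample includes one; a file with no DOCTYPE at all will probably still fail with "undefined entity".

```python
import xml.etree.ElementTree as ET
from html.entities import name2codepoint


def make_parser():
    """Return an XMLParser that knows the HTML named entities (hypothetical helper)."""
    parser = ET.XMLParser()
    # Fill the existing dictionary rather than replacing the attribute.
    parser.entity.update(
        (name, chr(codepoint)) for name, codepoint in name2codepoint.items())
    return parser


# Illustrative document; "ents.dtd" is never fetched, it only marks the
# DTD as having an external part so undefined entities reach parser.entity.
SAMPLE = """<!DOCTYPE doc [
<!ENTITY % ents SYSTEM "ents.dtd">
%ents;
]>
<doc>caf&eacute;</doc>"""

parser = make_parser()
parser.feed(SAMPLE)
root = parser.close()
print(root.text)
```

To read from a file instead, the same parser can be handed to ET.parse(filename, parser=make_parser()).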
--
.. Chuck Rhode, Sheboygan, WI, USA
.. Weather: http://LacusVeris.com/WX
.. 32° — Wind Calm