xml.minidom and user defined entities

Nick Craig-Wood nick at craig-wood.com
Thu Nov 10 01:29:55 EST 2005


Fredrik Lundh <fredrik at pythonware.com> wrote:
>  Nick Craig-Wood wrote:
> 
> > I'm using xml.dom.minidom to parse some of our XML files.  Some of
> > these have entities like "&deg;" in them, which aren't understood by
> > xml.dom.minidom.
> 
>  &deg; is not a standard entity in XML (see below).

No, probably not...

> > These give this error.
> >
> >  xml.parsers.expat.ExpatError: undefined entity: line 12, column 1
> >
> > Does anyone know how to add entities when using xml.dom.minidom?
> 
>  the document is supposed to contain the necessary entity declarations
>  as an inline DTD, or contain a reference to an external DTD.  (iirc,
>  minidom only supports inline DTDs, but that may have been fixed in recent
>  versions).

The document doesn't define the entities either internally or
externally.  I don't fancy adding an inline definition either as there
are 100s of documents I need to process!
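
For what it's worth, an inline definition need not be written into the
files themselves; it can be prepended in memory just before parsing.  A
minimal sketch, assuming the documents carry no XML declaration and no
DOCTYPE of their own (the DOCTYPE template, the function name
parse_with_inline_dtd and the single "deg" entity are made up for
illustration):

```python
from xml.dom import minidom

# Hypothetical internal DTD subset; only "deg" is declared here, a real
# one would list every entity the documents use.
DOCTYPE = '<!DOCTYPE %s [\n  <!ENTITY deg "&#176;">\n]>\n'


def parse_with_inline_dtd(body, root_name):
    """Parse XML text after prepending an internal DTD subset.

    Assumes `body` has no XML declaration and no DOCTYPE of its own,
    since the DOCTYPE must come after any declaration and may appear
    only once.
    """
    return minidom.parseString(DOCTYPE % root_name + body)


doc = parse_with_inline_dtd("<reading>20&deg;C</reading>", "reading")
# Join the child nodes rather than rely on how the parser splits text.
text = "".join(node.data for node in doc.documentElement.childNodes)
```

expat expands entities declared in the internal subset, so the text
comes back with the real degree character in it.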

>  if you don't have a DTD, your document is broken (if so, and the set of
>  entities is known, you can use re.sub to replace unknown entities with
>  the corresponding characters before parsing.  let me know if you want
>  sample code).

I was kind of hoping I could poke my extra entities into some dict or
other in the guts of xml.dom.minidom...

However the job demands quick and nasty rather than elegant so I'll go
for the regexp solution I think, as the list of entities is well
defined.
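
Something along these lines is what I have in mind.  It is only a
sketch: the ENTITIES table, the function names and the choice of
numeric character references (rather than the raw characters, to stay
clear of encoding trouble) are my own assumptions, and it will also
rewrite anything that looks like an entity inside a CDATA section.

```python
import re
from xml.dom import minidom

# Hypothetical table of the extra entities; the real list would come
# from whatever the documents actually use.
ENTITIES = {"deg": "\u00b0"}

# The five entities XML predefines must be left for the parser.
PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def fix_entities(text):
    """Replace known named entities with numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in PREDEFINED or name not in ENTITIES:
            # Leave it alone; an unknown entity still fails at parse time.
            return match.group(0)
        return "&#%d;" % ord(ENTITIES[name])
    return ENTITY_RE.sub(replace, text)


def parse_with_entities(text):
    """Parse XML text after rewriting the extra entities."""
    return minidom.parseString(fix_entities(text))


doc = parse_with_entities("<reading>20&deg;C &amp; rising</reading>")
text = "".join(node.data for node in doc.documentElement.childNodes)
```

Entities that aren't in the table are passed through untouched, so a
document using one I've forgotten still raises the ExpatError instead
of being silently mangled.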

Thanks for your help

Nick
-- 
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick


