Can xml.sax NOT process the DTD?

marek jedlinski marekjed at pobox.INVALID.com
Mon Jan 28 06:48:57 EST 2008


I'm using xml.sax to extract certain content from XML files. (Background:
my job is software localization; these are bilingual XML files, from which
I need to extract the translated text, e.g. for spellchecking.)

It works fine unless a particular file has a DOCTYPE declaration that
specifies a DTD. The parser then bails out because the DTD is not
available (IOError, file not found). Since I don't have the DTDs, I need to
tell the SAX parser to ignore the DOCTYPE declaration. Is this possible,
please?
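
For illustration, here is a minimal sketch of one way this might be done
with the standard library: turning off the parser's
feature_external_ges feature so that the external DTD is never fetched.
The names TextCollector and extract_text are hypothetical helpers made up
for this example, and the sketch assumes the default expat-based reader.
On recent Python versions (3.7.1 and later) this feature is reportedly off
by default, so the explicit call may only matter on older versions.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def extract_text(xml_bytes):
    """Parse XML bytes without loading any external DTD."""
    parser = xml.sax.make_parser()
    # Ask the parser not to read external general entities,
    # which for the expat reader includes the external DTD subset.
    parser.setFeature(feature_external_ges, False)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)
```

With this in place, a document whose DOCTYPE points at a missing DTD
should parse without the file-not-found error, though entities that are
declared only in that DTD would still be unavailable.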

I've noticed that I can eliminate the error if I create 0-byte DTD files
and put them where the parser expects to find them, but this is a little
tedious, since there are plenty of different DTDs expected at different
locations.
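
The empty-file workaround can probably be mimicked in code, without
touching the disk, by installing an entity resolver that hands the parser
an empty stream for every external reference. The sketch below is only an
assumption about how that might look; EmptyDtdResolver, SegmentText and
parse_with_empty_dtds are hypothetical names. External entity loading is
switched on explicitly here, because the resolver is only consulted when
that feature is enabled.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    EntityResolver,
    feature_external_ges,
)
from xml.sax.xmlreader import InputSource


class EmptyDtdResolver(EntityResolver):
    """Hypothetical resolver: every external entity is an empty document."""

    def resolveEntity(self, publicId, systemId):
        source = InputSource(systemId)
        # In-memory equivalent of a 0-byte DTD file.
        source.setByteStream(io.BytesIO(b""))
        return source


class SegmentText(ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_with_empty_dtds(xml_bytes):
    parser = xml.sax.make_parser()
    # Enable external entity loading so the resolver is actually used.
    parser.setFeature(feature_external_ges, True)
    parser.setEntityResolver(EmptyDtdResolver())
    handler = SegmentText()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)
```

Because the resolver answers every request the same way, it does not
matter how many different DTDs the files refer to or where they are
expected to live.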

Or is there another SAX parser for Python I could use instead? 

Kind thanks for any suggestions,
.marek



--
No ads, no nags freeware: http://www.tranglos.com


