XML file parsing with SAX

Willem Ligtenberg WLigtenberg at gmail.com
Sat Apr 23 09:20:01 EDT 2005


I decided to use SAX to parse my XML file, but the parser fails with
this traceback:
  File "/usr/lib/python2.3/site-packages/_xmlplus/sax/handler.py", line 38, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: NCBI_Entrezgene.dtd:8:0: error in processing external entity reference

The error is triggered by this DOCTYPE declaration in the file:
<!DOCTYPE Entrezgene-Set PUBLIC "-//NCBI//NCBI Entrezgene/EN"
"NCBI_Entrezgene.dtd">

If I remove the declaration, the file parses normally.
I create my parser like this:
from xml.sax import make_parser
from handler import EntrezGeneHandler

# Open the Entrez Gene XML file and feed it to the default SAX parser.
fopen = open("mouse2.xml", "r")
ch = EntrezGeneHandler()
saxparser = make_parser()
saxparser.setContentHandler(ch)
saxparser.parse(fopen)
fopen.close()

And the handler is:
from xml.sax import ContentHandler

class EntrezGeneHandler(ContentHandler):
	"""
	A handler to deal with EntrezGene in XML
	"""
	
	def startElement(self, name, attrs):
		print "Start element:", name

So the handler doesn't do much yet, and the parser still crashes.
How can I tell the parser to ignore the DOCTYPE declaration?
This article:
http://www.devarticles.com/c/a/XML/Parsing-XML-with-SAX-and-Python/1/
says that the SAX parsers are non-validating, so I would not expect this
error to occur at all.
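
One thing that might be worth trying, assuming the failure comes from the
parser attempting to load the external DTD named in the DOCTYPE: a
non-validating parser is still allowed to read external entities, and the
SAX feature feature_external_ges (from xml.sax.handler) can be switched off
with setFeature(). As far as I understand, the expat driver then skips the
external DTD as well. The sketch below is only an illustration: SAMPLE is a
made-up stand-in for mouse2.xml, and NameCollector and
parse_without_external_dtd are hypothetical names, not part of my real code.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, feature_external_ges

# Hypothetical stand-in for mouse2.xml: same DOCTYPE, minimal body.
SAMPLE = b"""<?xml version="1.0"?>
<!DOCTYPE Entrezgene-Set PUBLIC "-//NCBI//NCBI Entrezgene/EN"
"NCBI_Entrezgene.dtd">
<Entrezgene-Set><Entrezgene/></Entrezgene-Set>
"""


class NameCollector(ContentHandler):
    """Illustrative handler: records element names instead of printing."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_without_external_dtd(stream):
    """Parse stream with external general entities switched off."""
    handler = NameCollector()
    parser = make_parser()
    # Ask the parser not to fetch external general entities. Assumption:
    # with the expat driver this also leaves the external DTD unread.
    parser.setFeature(feature_external_ges, False)
    parser.setContentHandler(handler)
    parser.parse(stream)
    return handler.names


names = parse_without_external_dtd(io.BytesIO(SAMPLE))
```

In the real script this would amount to adding the single setFeature() call
right after make_parser(). The DTD file is never opened in this sketch, so
it does not need to exist next to the XML file.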

Cheers,

Willem


