[XML-SIG] problems using XML-sig code to read large XML files.
Anthony Baxter
anthony@interlink.com.au
Mon, 15 May 2000 13:04:08 +1000
I'm using the XML-sig code to read in a largish (2.5 MB) XML document.
The document has a very simple structure, like this:
<locations>
  <country name='Aruba' ccode='ABW'>
    <Name name='Aruba' canon='1' />
    <place geokey='1721834' name='Barcadera'>
      <Name name='Barcadera' canon='1' />
    </place>
    <place geokey='16838' name='Druif'>
      <Name name='Druif' canon='1' />
    </place>
    <place geokey='7761' name='Oranjestad'>
      <Name name='Oranjestad' canon='1' />
    </place>
    <place geokey='77050' name='Sint Nicolaas'>
      <Name name='Sint Nicolaas' canon='1' />
    </place>
  </country>
  [..more countries..]
</locations>
This was generated using the XML-sig code. However, when I try to read
it back in using something like:
from xml.dom import utils
reader = utils.FileReader('out.xml')
doc = reader.document
I get an error:
  File "read.py", line 2, in ?
    reader = utils.FileReader('out.xml')
  File "/opt/python/lib/python1.5/site-packages/xml/dom/utils.py", line 131, in __init__
    self.document = self.readFile(filename)
  File "/opt/python/lib/python1.5/site-packages/xml/dom/utils.py", line 140, in readFile
    document = self.readStream(file,type)
  File "/opt/python/lib/python1.5/site-packages/xml/dom/utils.py", line 148, in readStream
    document = self.readXml(stream)
  File "/opt/python/lib/python1.5/site-packages/xml/dom/utils.py", line 165, in readXml
    p.feed(stream.read())
  File "/opt/python/lib/python1.5/site-packages/xml/sax/drivers/drv_pyexpat.py", line 123, in feed
    if not self.parser.Parse(data):
pyexpat.error: not well-formed: line 37162, column 19
Using the other example from http://www.python.org/doc/howto/xml/node12.html,
I get something like:
Traceback (innermost last):
  File "read.py", line 16, in ?
    p.close()
  File "/opt/python/lib/python1.5/site-packages/xml/sax/drivers/drv_pyexpat.py", line 127, in close
    if not self.parser.Parse("",1):
pyexpat.error: no element found: line 16148, column 16
Running either script repeatedly reports a different position in the
file each time, and none of the lines reported actually has a problem.
Zope, with either the Ft ZDOM or the normal Zope DOM code, has no
problems with the file, and neither does nsgmls.
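For anyone who wants one more independent check that the file itself is fine, here is a rough sketch that feeds it to expat in fixed-size chunks and reports the first error, if any. It assumes a current Python 3 standard library (`xml.parsers.expat`), not the python1.5 `drv_pyexpat` driver in the tracebacks above, so it only tests the file and does not reproduce or explain the failure. The function name `check_well_formed` and the chunk size are illustrative.

```python
import xml.parsers.expat


def check_well_formed(path, chunk_size=64 * 1024):
    """Parse the file at `path` in chunks.

    Returns None if expat accepts the whole document, otherwise a
    (message, line, column) tuple describing the first error.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        with open(path, "rb") as stream:
            while True:
                chunk = stream.read(chunk_size)
                if not chunk:
                    break
                # Second argument False: more data will follow.
                parser.Parse(chunk, False)
        # Empty final call tells expat the document is complete.
        parser.Parse(b"", True)
    except xml.parsers.expat.ExpatError as err:
        message = xml.parsers.expat.ErrorString(err.code)
        return (message, err.lineno, err.offset)
    return None
```

Calling `check_well_formed('out.xml')` several times in a row should give the same answer every time; a position that moves between runs would point at the parser setup rather than the data.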
I've tried both the 0.5.4 and current CVS versions, to no avail.
The dom_from_xml_file.py demo in Ft.Dom.demo also breaks.
I can make the file available if anyone wants it, although just
taking the example above and putting 10,000 copies of the country
element into a file will do the trick.
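Something along these lines would build such a file. This is only a sketch, written for a current Python 3 rather than the python1.5 setup above; the helper name `write_repro` is made up for illustration, and the output path and copy count are up to the caller.

```python
# The sample <country> element from the example above, repeated verbatim.
COUNTRY = """<country name='Aruba' ccode='ABW'>
<Name name='Aruba' canon='1' />
<place geokey='1721834' name='Barcadera'>
<Name name='Barcadera' canon='1' />
</place>
<place geokey='16838' name='Druif'>
<Name name='Druif' canon='1' />
</place>
<place geokey='7761' name='Oranjestad'>
<Name name='Oranjestad' canon='1' />
</place>
<place geokey='77050' name='Sint Nicolaas'>
<Name name='Sint Nicolaas' canon='1' />
</place>
</country>
"""


def write_repro(path, copies=10000):
    """Write a <locations> document holding `copies` copies of COUNTRY."""
    with open(path, "w", encoding="ascii") as out:
        out.write("<locations>\n")
        for _ in range(copies):
            out.write(COUNTRY)
        out.write("</locations>\n")
```

For example, `write_repro('out.xml')` produces a file with the same shape as the one I am having trouble with, though not the same data.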
Anyone?
Thanks,
Anthony