[XML-SIG] lxml iterparse and comments
Stuart McGraw
smcg4191 at frii.com
Mon Mar 24 04:56:59 CET 2008
Hello,
I am probably missing something elementary (I am new
to both xml and lxml), but I am having problems figuring
out how to get comments when using lxml's iterparse().
When I parse xml with parse() and iterate through the
result, I get the comments. But when I try to do the
same thing (approximately I think) with iterparse,
I don't see any comments. See example code below.
(lxml-2.02, Python-2.5.1)
(I was using the standard Python ElementTree but my
understanding is that it doesn't save comments at all.
If that's wrong I would go back to using it).
The real file is ~50MB and has about 1M nodes under the
root so I have to use iterparse and I also have to process
comments, so I would really appreciate a clue about how
to do it. Thanks.
Example code:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
import lxml.etree as ET
from cStringIO import StringIO
# XML data...
#=============================================
xmltxt = \
'''<?xml version="1.0" encoding="UTF-8"?>
<!-- Rev 1.06
-->
<!DOCTYPE Test [
<!ELEMENT Test (entry*)>
<!-- -->
<!ELEMENT entry ANY>
<!-- Description of <entry> element.
-->
]>
<!-- File created: 2008-02-27 -->
<Test>
<!-- Chronosynclastic Infindibulum Listing -->
<entry>text 1</entry>
<!-- Deleted: A1500477 -->
<entry>text 2</entry>
</Test>'''
#=============================================
print 'Parse:\n------'
et = ET.parse( StringIO (xmltxt))
for elem in et.iter():
    print elem
print '\nIterparse:\n----------'
xx = ET.iterparse( StringIO (xmltxt), ("start","end"))
for event, elem in iter(xx):
    print event, elem
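
One possible direction, sketched under assumptions rather than verified against lxml 2.0.2: ask iterparse for a "comment" event in addition to the element events. The sketch uses the standard library's xml.etree.ElementTree on Python 3.8 or later, where iterparse documents a "comment" event. Current lxml releases document the same event name for lxml.etree.iterparse, so swapping the import is expected to behave similarly, but that is not checked here. The helper name collect_comments_and_entries and the trimmed sample document (no DOCTYPE) are illustrative only.

```python
import io
import xml.etree.ElementTree as ET  # assumption: Python 3.8+, not the 2.5 setup above

# Trimmed version of the sample document, without the DOCTYPE section.
XML_BYTES = b"""<?xml version="1.0" encoding="UTF-8"?>
<Test>
<!-- Chronosynclastic Infindibulum Listing -->
<entry>text 1</entry>
<!-- Deleted: A1500477 -->
<entry>text 2</entry>
</Test>"""


def collect_comments_and_entries(source):
    """Stream through `source`, gathering comment text and <entry> text."""
    comments = []
    entries = []
    # "comment" is requested explicitly; the default is "end" events only.
    for event, elem in ET.iterparse(source, events=("end", "comment")):
        if event == "comment":
            comments.append(elem.text.strip())
        elif elem.tag == "entry":
            entries.append(elem.text)
            elem.clear()  # drop the entry's content once it has been handled
    return comments, entries


comments, entries = collect_comments_and_entries(io.BytesIO(XML_BYTES))
print(comments)
print(entries)
```

Clearing each entry after it has been handled is the usual way to keep memory use down on a large file, which is the reason for using iterparse in the first place.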