XML parsing with python

Stefan Behnel stefan_ml at behnel.de
Tue Aug 18 02:24:24 EDT 2009


inder wrote:
> On Aug 17, 8:31 pm, John Posner <jjpos... at optimum.net> wrote:
>>> Use the iterparse() function of the xml.etree.ElementTree package.
>>> http://effbot.org/zone/element-iterparse.htm
>>> http://codespeak.net/lxml/parsing.html#iterparse-and-iterwalk
>>> Stefan
>> iterparse() is too big a hammer for this purpose, IMO. How about this:
>>
>>   from xml.etree.ElementTree import ElementTree
>>   tree = ElementTree(None, "myfile.xml")
>>   # ElementTree paths are relative; './/' searches all descendants
>>   for elem in tree.findall('.//book/title'):
>>       print(elem.text)
>>
>> -John
> 
> Thanks for the prompt reply.
> 
> Let me try using iterparse. Will it be slower than SAX parsing?
> Ultimately I will have a huge XML file to parse.

If you use the cElementTree module, iterparse() may even be faster than SAX.
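
As a rough sketch of the streaming approach, assuming the <book>/<title>
layout from John's example: the sample document and the iter_titles() helper
below are made up for illustration, and a real run would pass a filename
instead of the in-memory buffer. On current Python versions the plain
xml.etree.ElementTree import already uses the C implementation when it is
available.

```python
import io
from xml.etree.ElementTree import iterparse

# Hypothetical sample data; a real run would pass a filename such as "myfile.xml".
SAMPLE = b"""<library>
  <book><title>First</title></book>
  <book><title>Second</title></book>
</library>"""


def iter_titles(source):
    """Yield the title text of each <book> while streaming through source."""
    # "end" events fire once an element and all of its children have been read.
    for event, elem in iterparse(source, events=("end",)):
        if elem.tag == "book":
            yield elem.findtext("title")
            # Drop the finished book's children and text to limit memory use.
            elem.clear()


titles = list(iter_titles(io.BytesIO(SAMPLE)))
print(titles)
```

Clearing each element once it has been handled is what keeps memory use
manageable on a huge file; without it, iterparse() still builds the full tree.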


> Another question: I will also need to validate my XML against an XSD. I
> would like to do this validation through the parsing tool itself.

In that case, you can use lxml instead of ElementTree. It can validate
against an XML Schema while it parses.

http://codespeak.net/lxml/
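
A minimal sketch of validating at parse time with lxml, under stated
assumptions: the schema, the two sample documents and the parse_validated()
helper are invented for illustration and are not taken from your files. It
relies on lxml's XMLSchema class and the schema argument of XMLParser.

```python
import io
from lxml import etree

# Hypothetical schema and documents, made up for illustration.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="library">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

GOOD = b"<library><book><title>First</title></book></library>"
BAD = b"<library><book><author>Nobody</author></book></library>"

schema = etree.XMLSchema(etree.parse(io.BytesIO(XSD)))
parser = etree.XMLParser(schema=schema)  # validate while parsing


def parse_validated(data):
    """Return the root element, or None if the document fails validation."""
    try:
        return etree.fromstring(data, parser)
    except etree.XMLSyntaxError as exc:
        # A validating parser reports schema violations as XMLSyntaxError.
        print("rejected:", exc)
        return None


good_root = parse_validated(GOOD)
bad_root = parse_validated(BAD)
```

For a file on disk, etree.parse(filename, parser) should behave the same way
with the validating parser.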

Stefan


