splitting an XML file on the basis of XML tags

Diez B. Roggisch deets at nospam.web.de
Thu Apr 3 14:28:33 EDT 2008


> I abuse it because I can (and because I don't generally work with XML
> files larger than 20-30meg) :)
> And the OP never said the XML file was 1TB in size, which makes things
> different.

Even with small XML files your advice was not very sound. Yes, it's
tempting to use regexes to process XML. But usually one falls flat on
one's face soon, because of whitespace, or attribute order, or
<foo></foo> versus <foo/>, and so on.
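
To make those pitfalls concrete, here is a minimal sketch. The element name "item", its attributes, and the naive pattern are all made up for illustration; they are not from the original poster's data. The three strings are equivalent XML, yet a regex written against the first spelling misses the other two, while a parser sees the same attributes in each.

```python
import re
import xml.etree.ElementTree as ET

# Three spellings of the same (hypothetical) element; a conforming
# XML parser treats them as equivalent.
VARIANTS = [
    '<item id="1" kind="a"></item>',       # explicit open/close pair
    '<item kind="a" id="1"/>',             # empty-element form, attributes swapped
    '<item  id = "1"\n      kind="a" />',  # extra whitespace and a line break
]

# A naive pattern written against the first spelling only.
NAIVE = re.compile(r'<item id="(\d+)" kind="(\w+)"></item>')

def regex_hits(docs):
    """Return how many documents the naive pattern matches."""
    return sum(1 for d in docs if NAIVE.search(d))

def parsed_attrs(docs):
    """Parse each document and return its attributes as a dict."""
    return [dict(ET.fromstring(d).attrib) for d in docs]

if __name__ == "__main__":
    print(regex_hits(VARIANTS))    # number of variants the regex catches
    print(parsed_attrs(VARIANTS))  # attributes as seen by the parser
```

One can of course keep patching the pattern for each new variant, but that amounts to reimplementing a parser badly.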

Use an XML parser. That's what they are for. And especially with the
Pythonic ones like ElementTree (and the compatible lxml), it's even more
straightforward than using regexes.
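
As a rough sketch of what that could look like for the splitting task, assuming the pieces to split out are direct children of the root element: the sample document, the "record" tag, and the helper name split_by_tag are hypothetical, since the original poster's file layout is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; the real file's structure is an assumption.
SAMPLE = """<root>
  <record id="1"><name>alpha</name></record>
  <record id="2"><name>beta</name></record>
  <other/>
</root>"""

def split_by_tag(xml_text, tag):
    """Return each direct child of the root named `tag` as its own XML string."""
    root = ET.fromstring(xml_text)
    pieces = []
    for child in root.findall(tag):
        # Drop the whitespace that followed the element inside its parent,
        # so it is not serialized along with the element.
        child.tail = None
        pieces.append(ET.tostring(child, encoding="unicode"))
    return pieces

if __name__ == "__main__":
    for piece in split_by_tag(SAMPLE, "record"):
        print(piece)
```

Each returned string could then be written to its own file. For inputs too large to hold in memory, ElementTree's iterparse is the usual alternative to fromstring, though the loop would need restructuring.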


Diez


