xml processing : too slow...
Fredrik Lundh
fredrik at pythonware.com
Wed Jul 24 14:11:27 EDT 2002
"Shagshag13"
> * check well formedness
> * process each line which are like this:
> <tag0><tag1> 1 2 </tag1><tag2 attr="value">3</tag2></tag0>
> -> to have:
> ['<tag0>', '<tag1>', '1', '2', '</tag1>', '<tag2 attr="value">', '3', '</tag2>', '</tag0>']
from xml.parsers import expat
parser = expat.ParserCreate(None, None)
import re
p = re.compile("<[^>]*>|\d+")
for line in file:
# check wellformedness
parser.Parse(line, 0)
# split into parts
print p.findall()
# check for trailing junk
parser.Parse("", 1)
</F>
More information about the Python-list
mailing list