Iterparse and ElementTree confusion
Fredrik Lundh
fredrik at pythonware.com
Wed Aug 17 08:06:41 EDT 2005
paul.sherwood at gmail.com wrote:
> def parse_for_products(filename):
>
>     for event, elem in iterparse(filename):
>         if elem.tag == "Products":
>             root = ElementTree(elem)
>             print_all(root)
>         else:
>             elem.clear()
>
> My problem is that if i pass the 'elem' found by iterparse then try to
> print all attributes, children and tail text i only get
> elem.tag....elem.keys returns nothing as do all of the other previously
> useful elementtree methods.
>
> Am i right in thinking that you can pass an element into ElementTree?
> How might i manually iterate through <product>...</product> grabbing
> everything?
by default, iterparse only returns "end" events, which means that the
iterator will visit the Products children before you see the Products
element itself. with the code above, every child falls into the else
branch and gets cleared (attributes, text, and subelements wiped) before
you get around to processing the parent.
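to see the ordering for yourself, here is a minimal sketch. the sample document, the function names, and the xml.etree.ElementTree import (the standard library home of ElementTree in current Pythons) are assumptions made for illustration, not part of the original code.

```python
import io
from xml.etree.ElementTree import iterparse

# hypothetical sample document, for illustration only
XML = b"<Catalog><Products><Product id='1'/><Product id='2'/></Products></Catalog>"

def end_order(source):
    # with no events argument, iterparse reports "end" events only,
    # so children are delivered before their parent
    return [elem.tag for event, elem in iterparse(source)]

def child_attribs_after_clearing(source):
    # same structure as the code in the question: everything that is
    # not Products gets cleared, including the Product children
    for event, elem in iterparse(source):
        if elem.tag == "Products":
            return [dict(child.attrib) for child in elem]
        elem.clear()

order = end_order(io.BytesIO(XML))
attribs = child_attribs_after_clearing(io.BytesIO(XML))
print(order)
print(attribs)
```

the children are still attached to Products when its "end" event arrives, but they have already been emptied, which is why only the tags survive.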
depending on how much rubbish you have in the file, you can do
    for event, elem in iterparse(filename):
        if elem.tag == "Products":
            process(elem)
            elem.clear()
or
    for event, elem in iterparse(filename):
        if elem.tag == "Products":
            process(elem)
            elem.clear()
        elif elem.tag in ("Rubbish1", "Rubbish2"):
            elem.clear()
or
    inside = False
    for event, elem in iterparse(filename, events=("start", "end")):
        if event == "start":
            # we've seen the start tag for this element, but not
            # necessarily the end tag
            if elem.tag == "Products":
                inside = True
        else:
            # we've seen the end tag
            if elem.tag == "Products":
                process(elem)
                elem.clear()
                inside = False
            elif not inside:
                elem.clear()
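as a runnable sketch of the start/end approach, the following collects the id and text of each Product. the sample document, the collect_products name, the "id" attribute, and the xml.etree.ElementTree import are assumptions for illustration; substitute your own process() step.

```python
import io
from xml.etree.ElementTree import iterparse

# hypothetical sample document, for illustration only
XML = b"""<Catalog>
  <Rubbish1><Junk n="0"/></Rubbish1>
  <Products>
    <Product id="1">widget</Product>
    <Product id="2">gadget</Product>
  </Products>
</Catalog>"""

def collect_products(source):
    found = []
    inside = False
    for event, elem in iterparse(source, events=("start", "end")):
        if event == "start":
            # start tag seen; the element's content is not complete yet
            if elem.tag == "Products":
                inside = True
        else:
            # end tag seen; the element is fully built
            if elem.tag == "Products":
                # stand-in for process(elem): children are still intact
                found.extend((child.get("id"), child.text) for child in elem)
                elem.clear()
                inside = False
            elif not inside:
                # anything outside Products can be discarded right away
                elem.clear()
    return found

products = collect_products(io.BytesIO(XML))
print(products)
```

the inside flag is what protects the Product children: their "end" events arrive while it is set, so they are left alone until the Products element itself has been processed.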
for more info, see
http://effbot.org/zone/element-iterparse.htm
</F>