Iterparse and ElementTree confusion

Fredrik Lundh fredrik at pythonware.com
Wed Aug 17 08:06:41 EDT 2005


paul.sherwood at gmail.com wrote:

> def parse_for_products(filename):
>
>    for event, elem in iterparse(filename):
>        if elem.tag == "Products":
>            root = ElementTree(elem)
>            print_all(root)
>        else:
>            elem.clear()
>
> My problem is that if i pass the 'elem' found by iterparse then try to
> print all attributes, children and tail text i only get
> elem.tag....elem.keys returns nothing as do all of the other previously
> useful elementtree methods.
>
> Am i right in thinking that you can pass an element into ElementTree?
> How might i manually iterate through <product>...</product> grabbing
> everything?

by default, iterparse only returns "end" events, which means that the
iterator will visit the children of Products before you see the Products
element itself.  with the code above, this means that the children have
already been cleared (attributes, text, and subelements gone) by the time
you get around to processing the parent.
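
a small sketch of that ordering, in case it helps.  it assumes the
standard-library xml.etree.ElementTree module (the standalone package
exposes the same function as elementtree.ElementTree.iterparse), and the
SAMPLE document and the helper name are made up for illustration:

```python
import io
from xml.etree.ElementTree import iterparse

# hypothetical sample document, not taken from the original post
SAMPLE = b'<Catalog><Products><Product id="1">foo</Product></Products></Catalog>'

# default events: "end" only, so each child is reported before its parent
order = [elem.tag for event, elem in iterparse(io.BytesIO(SAMPLE))]

def children_seen_by_quoted_loop(data):
    """Mimic the quoted loop and report what is left of the children."""
    for event, elem in iterparse(io.BytesIO(data)):
        if elem.tag == "Products":
            # the children ended earlier, so the else branch already cleared them
            return [(child.tag, dict(child.attrib), child.text) for child in elem]
        else:
            elem.clear()
    return []

leftovers = children_seen_by_quoted_loop(SAMPLE)
print(order)
print(leftovers)
```

with this sample, the Product element is still attached to Products when
the parent arrives, but its attributes and text are gone, which matches
the "only elem.tag works" symptom described in the question.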

depending on how much rubbish you have in the file, you can do

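    from xml.etree.ElementTree import iterparse  # elementtree.ElementTree in the standalone package

    for event, elem in iterparse(filename):
        if elem.tag == "Products":
            process(elem)  # your own handler, e.g. print_all
            elem.clear()   # release the subtree once it has been processed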

or

    for event, elem in iterparse(filename):
        if elem.tag == "Products":
            process(elem)
            elem.clear()
        elif elem.tag in ("Rubbish1", "Rubbish2"):
            elem.clear()

or

    inside = False
    for event, elem in iterparse(filename, events=("start", "end")):
        if event == "start":
            # we've seen the start tag for this element, but not
            # necessarily the end tag
            if elem.tag == "Products":
                inside = True
        else:
            # we've seen the end tag
            if elem.tag == "Products":
                process(elem)
                elem.clear()
                inside = False
            elif not inside:
                elem.clear()
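
the last variant can be tried out as a self-contained sketch.  the sample
document, the collect_products name, and the choice to gather (id, text)
pairs in place of process() are all assumptions for illustration; the
import again assumes the standard-library xml.etree.ElementTree:

```python
import io
from xml.etree.ElementTree import iterparse

# hypothetical sample document with rubbish before and after Products
SAMPLE = b"""<Catalog>
  <Rubbish1><Junk a="1">noise</Junk></Rubbish1>
  <Products>
    <Product id="1">foo</Product>
    <Product id="2">bar</Product>
  </Products>
  <Rubbish2>more noise</Rubbish2>
</Catalog>"""

def collect_products(source):
    """Return (id, text) pairs for the children of every Products element."""
    found = []
    inside = False
    for event, elem in iterparse(source, events=("start", "end")):
        if event == "start":
            # start tag seen; the element's content is not complete yet
            if elem.tag == "Products":
                inside = True
        else:
            # end tag seen; the element and its children are complete
            if elem.tag == "Products":
                # stand-in for process(elem): read the children before clearing
                found.extend((child.get("id"), child.text) for child in elem)
                elem.clear()
                inside = False
            elif not inside:
                # outside Products: discard as soon as the element ends
                elem.clear()
    return found

products = collect_products(io.BytesIO(SAMPLE))
print(products)
```

the inside flag is what keeps the Product children intact until their
parent ends; everything that ends outside Products is cleared right away.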

for more info, see

    http://effbot.org/zone/element-iterparse.htm

</F> 





