cElementTree clear semantics
Igor V. Rafienko
igorr at ifi.uio.no
Sun Sep 25 15:57:45 EDT 2005
[ Fredrik Lundh ]
[ ... ]
> the iterparse/clear approach works best if your XML file has a
> record-like structure. if you have toplevel records with lots of
> schnappi records in them, iterate over the records and use find
> (etc) to locate the subrecords you're interested in: (...)
The problem is that the file looks like this:
<data>
<schnappi>
<color>green</color>
<friends>
<friend>
<id>Lama</id>
<color>white</color>
</friend>
<friend>
<id>mother schnappi</id>
<color>green</color>
</friend>
</friends>
<food>
<id>human</id>
<id>rabbit</id>
</food>
</schnappi>
<schnappi>
<!-- something interesting -->
</schnappi>
<!-- 60,000 more schnappis -->
</data>
... and there is really nothing above <schnappi>. The "something
interesting" part consists of a variety of elements, and calling
findall for each of them, although possible, would probably be
impractical (say, distinguishing a <friend>'s color from a <schnappi>'s).
Conceptually I need an "XML subtree iterator", rather than an XML
element iterator. <schnappi>-elements are the ones having a complex
internal structure, and I'd like to be able to speak of my XML as a
sequence of Python objects representing <schnappi>s and their internal
structure.
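
To make the "subtree iterator" idea concrete, here is a minimal sketch, not a definitive implementation. It assumes the standard-library xml.etree.ElementTree module (the successor to cElementTree), and the names iter_subtrees and schnappi_record, as well as the second sample record, are made up for illustration. It yields each complete top-level <schnappi> element and clears the root afterwards so finished records do not accumulate.

```python
import io
import xml.etree.ElementTree as ET  # assumption: stdlib successor to cElementTree


def iter_subtrees(source, tag):
    """Yield each complete top-level <tag> element, then release it."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    depth = 0  # nesting depth below the root
    for event, elem in context:
        if event == "start":
            depth += 1
        else:
            depth -= 1
            if depth == 0 and elem.tag == tag:
                yield elem
                # Runs when the caller asks for the next subtree:
                # drop the root's references to finished children.
                root.clear()


def schnappi_record(elem):
    """Hypothetical conversion of one <schnappi> subtree to a plain dict."""
    return {
        # "color" matches direct children only, so friends' colors are skipped
        "color": elem.findtext("color"),
        "friends": [
            (f.findtext("id"), f.findtext("color"))
            for f in elem.findall("friends/friend")
        ],
        "food": [i.text for i in elem.findall("food/id")],
    }


# Shortened sample; the second record's content is invented for the demo.
SAMPLE = b"""<data>
<schnappi>
<color>green</color>
<friends>
<friend><id>Lama</id><color>white</color></friend>
<friend><id>mother schnappi</id><color>green</color></friend>
</friends>
<food><id>human</id><id>rabbit</id></food>
</schnappi>
<schnappi><color>blue</color></schnappi>
</data>"""

records = [
    schnappi_record(elem)
    for elem in iter_subtrees(io.BytesIO(SAMPLE), "schnappi")
]
```

Each subtree has to be consumed before the next one is requested, because the generator clears the root as soon as it resumes. Relative paths such as "color" and "friends/friend" are what keep a <schnappi>'s own color apart from its friends' colors.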
[ ... ]
> (I've reorganized the code a bit to cut down on the operations. also
> note the "is" trick; iterparse returns the event strings you pass
> in, so comparing on object identities is safe)
Neat trick.
Thank you for your input,
ivr
--
"...but it's HDTV -- it's got a better resolution than the real world."
-- Fry, "When aliens attack"