Problem inserting an element where I want it using lxml

Alan Meyer ameyer2 at yahoo.com
Wed Jan 5 00:57:37 EST 2011


I'm having some trouble inserting elements where I want them
using the lxml ElementTree (Python 2.6).  I presume I'm making
some wrong assumptions about how lxml works and I'm hoping
someone can clue me in.

I want to process an xml document as follows:

For every occurrence of a particular element, no matter where it
appears in the tree, I want to add a sibling to that element with
the same name and a different value.

Here's the smallest artificial example I've found so far that
demonstrates the problem:

     <foo>
       <whatever>
        <something/>
       </whatever>
       <bingo>Add another bingo after this</bingo>
       <bar/>
     </foo>

What I'd like to produce is this:

     <foo>
       <whatever>
        <something/>
       </whatever>
       <bingo>Add another bingo after this</bingo>
       <bingo>New bingo 1</bingo>
       <bar/>
     </foo>

Here's my program:

-------- cut here -----
from lxml import etree

xml = """<?xml version="1.0" ?>
<foo>
   <whatever>
    <something/>
   </whatever>
   <bingo>Add another bingo after this</bingo>
   <bar/>
</foo>
"""

tree = etree.fromstring(xml)

# A list of all "bingo" element objects in the unmodified original xml
# There's only one in this example
elems = tree.xpath("//bingo")

# For each one, insert a sibling after it
bingoCounter = 0
for elem in elems:
     parent = elem.getparent()
     subIter = parent.iter()
     pos = 0
     for subElem in subIter:
         # Is it one we want to create a sibling for?
         if subElem == elem:
             newElem = etree.Element("bingo")
             bingoCounter += 1
             newElem.text = "New bingo %d" % bingoCounter
             newElem.tail = "\n"
             parent.insert(pos, newElem)
             break
         pos += 1

newXml = etree.tostring(tree)
print("")
print(newXml)
-------- cut here -----

The output follows:

-------- output -----
<foo>
   <whatever>
    <something/>
   </whatever>
   <bingo>Add another bingo after this</bingo>
   <bar/>
<bingo>New bingo 1</bingo>
</foo>
-------- output -----

Setting aside the whitespace issues, the bug in the program shows
up in the positioning of the insertion.  I wanted and expected it
to appear immediately after the original "bingo" element,
and before the "bar" element, but it appeared after the "bar"
instead of before it.

Everything works if I take the "something" element out of the
original input document.  The new "bingo" appears before the
"bar".  But when I put it back in, the inserted bingo is out of
order.  Why should that be?  What am I misunderstanding?
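
One way to narrow this down is to compare what parent.iter() yields
with the parent's direct children.  The sketch below uses only the
sample document from this post; the names iter_tags, child_tags,
counted_pos and child_pos are made up for illustration.  It appears
that iter() walks the parent itself plus every descendant in document
order, while insert() takes an index among the direct children only,
which would explain why removing "something" changes the result.

```python
from lxml import etree

xml = """<foo>
   <whatever>
    <something/>
   </whatever>
   <bingo>Add another bingo after this</bingo>
   <bar/>
</foo>
"""

root = etree.fromstring(xml)
bingo = root.find("bingo")
parent = bingo.getparent()

# iter() yields the parent itself, then all descendants in document order.
iter_tags = [e.tag for e in parent.iter()]

# Iterating over the element directly yields only its direct children,
# which is the sequence that insert() indexes into.
child_tags = [child.tag for child in parent]

# Position reached by a counter that walks parent.iter()
counted_pos = iter_tags.index("bingo")

# Position of the same element among the parent's direct children
child_pos = parent.index(bingo)

print(iter_tags)    # ['foo', 'whatever', 'something', 'bingo', 'bar']
print(child_tags)   # ['whatever', 'bingo', 'bar']
print("counter=%d child index=%d" % (counted_pos, child_pos))
```

With "something" present the counter reaches 3, which is past the end
of the three direct children, so the new element lands after "bar".
Without "something" the counter reaches 2, which happens to be the
slot right after the original "bingo".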

Is there a more intelligent way to do what I'm trying to do?
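
A possibly simpler approach, offered as a sketch under assumptions
rather than a confirmed fix: ask the parent for the element's index
among its direct children with parent.index(elem) and insert at the
next position.  Copying the original element's tail onto the new
element is only a guess at keeping the indentation roughly intact, and
bingo_counter and new_elem are illustrative names.

```python
from lxml import etree

xml = """<foo>
   <whatever>
    <something/>
   </whatever>
   <bingo>Add another bingo after this</bingo>
   <bar/>
</foo>
"""

root = etree.fromstring(xml)

bingo_counter = 0
# xpath() returns a plain list, so inserting while looping does not
# disturb the iteration.
for elem in root.xpath("//bingo"):
    parent = elem.getparent()
    bingo_counter += 1

    new_elem = etree.Element("bingo")
    new_elem.text = "New bingo %d" % bingo_counter
    # Assumption: reuse the original's trailing whitespace for layout.
    new_elem.tail = elem.tail

    # index() is the position among direct children, so +1 is the slot
    # immediately after the original element.
    parent.insert(parent.index(elem) + 1, new_elem)

print(etree.tostring(root))
```

lxml elements also have an addnext() method that adds a following
sibling in one call; it may be a better fit, though how it treats tail
text would need checking.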

Thanks.

     Alan


