elementtree question

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Mon Sep 24 03:43:15 EDT 2007


On Sat, 22 Sep 2007 00:19:59 -0300, Mark T <nospam at nospam.com> wrote:

> "Gabriel Genellina" <gagsl-py2 at yahoo.com.ar> wrote in message
> news:mailman.898.1190412966.2658.python-list at python.org...
>> On Fri, 21 Sep 2007 11:49:53 -0300, Tim Arnold <tim.arnold at sas.com>
>> wrote:
>>
>>> Hi, I'm using elementtree and elementtidy to work with some HTML files.
>>> For some of these files I need to enclose the body content in a new div
>>> tag, like this:
>>> <body>
>>>   <div class="remapped">
>>>    original contents...
>>>   </div>
>>> </body>

[wrong code]

> The above wraps the body element, not the contents of the body element.
> I'm no ElementTree expert, but this seems to work:

[better code]

Almost right. clear() also removes all attributes, so any attribute the
body element had is lost. I would instead remove each child from body at
the same time it is moved into newdiv.
(This whole thing turns out to be harder than one would expect at first.)

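import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9

source = """<html><head><title>Test</title></head><body lang="en">
   original contents... 2&amp;3 <a href="hello/world">some text</a>
   <p>Another paragraph</p>
</body></html>"""
tree = ET.XML(source)
body = tree.find("body")
newdiv = ET.Element('div', {'class': 'remapped'})
# Iterate over a copy, since body is modified inside the loop
# (getchildren() is gone in Python 3.9+; list(body) does the same job).
for e in list(body):
   newdiv.append(e)
   body.remove(e)
# The leading text of body becomes the leading text of newdiv.
newdiv.text, body.text = body.text, ''
# body.tail (whatever follows </body>) is left alone: it is not body content.
body.append(newdiv)
ET.dump(tree)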
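
To make the clear() point concrete, here is a minimal sketch that contrasts the two approaches on the same input. The helper name wrap_children and the SAMPLE markup are made up for illustration and are not part of ElementTree. It assumes a Python 3 standard library where xml.etree.ElementTree is available.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input, chosen so that body has an attribute,
# leading text, and children with tails.
SAMPLE = '<body lang="en">intro <a href="x">link</a> outro<p>para</p></body>'

def wrap_children(parent, tag, attrib):
    """Move parent's leading text and all its children into a new child element.

    Illustrative helper, not an ElementTree API.
    """
    wrapper = ET.Element(tag, attrib)
    wrapper.text, parent.text = parent.text, None
    for child in list(parent):  # copy, because parent is mutated in the loop
        parent.remove(child)
        wrapper.append(child)   # each child keeps its own tail text
    parent.append(wrapper)
    return wrapper

# clear() resets the element completely, attributes included.
cleared = ET.XML(SAMPLE)
cleared.clear()
cleared_attrs = dict(cleared.attrib)

# Moving the children one by one leaves the parent's attributes alone.
wrapped = ET.XML(SAMPLE)
wrap_children(wrapped, 'div', {'class': 'remapped'})
wrapped_attrs = dict(wrapped.attrib)
result = ET.tostring(wrapped, encoding='unicode')

print(cleared_attrs)
print(wrapped_attrs)
print(result)
```

The first line printed shows the attributes left after clear(), and the second shows that lang="en" survives when the children are moved instead.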

-- 
Gabriel Genellina



