[Tutor] Injecting Data into XML Files

William O'Higgins Witteman hmm at woolgathering.cx
Mon Sep 11 20:18:17 CEST 2006


On Mon, Sep 11, 2006 at 09:57:28AM -0700, Dave Kuhlman wrote:
>On Mon, Sep 11, 2006 at 12:11:37PM -0400, William O'Higgins Witteman wrote:
>> I have a large number of XML documents to add data to.  They are
>> currently skeletal documents, looking like this:
>> 
>> <?xml version="1.0" ?>
>> <!DOCTYPE rdf:RDF SYSTEM "local.dtd">
>> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
>>    <rdf:Description rdf:about="local_file">
>>       <tagname></tagname>
>>       <anothertagname></anothertagname>
>>       ...
>> 
>> What I want is to open each document and inject some data between
>> specific sets of tags.  I've been able to parse these documents, but I am
>> not seeing how to inject data between tags so I can write it back to the
>> file.  Any pointers are appreciated.  Thanks.

>*How* did you parse your XML document?  If you parsed it and
>produced a minidom tree or, better yet, an ElementTree tree,
>you can modify the DOM tree, and then you can write that tree out
>to disk.

I have tried the common XML modules - minidom, sax and ElementTree.
There are clear, easy-to-follow examples of parsing for each one.

>Here is a bit of code to give you the idea with ElementTree (or
>lxml, which uses the same API as ElementTree):
>
>    from elementtree import ElementTree as etree
>    doc = etree.parse('content.xml')
>    root = doc.getroot()
>    # Do something with the DOM tree here.
>        o

This is the bit I'm missing - I can't seem to find an existing element
and change its value.  When I try, I just get an additional element.
Here's the code I'm using:

main = etree.SubElement(root,"rdf:Description")
title = etree.SubElement(main,"title")
title.text = "Example Title"
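
A likely reason for the extra element: SubElement always creates a new
child, and the string "rdf:Description" is taken as a literal tag name
rather than a namespace-qualified one.  Below is a minimal sketch of
looking up the existing element instead.  It assumes the standard-library
module xml.etree.ElementTree (not the standalone elementtree package
quoted above), a skeleton with the DOCTYPE left out and the closing tags
filled in by guesswork, and a helper name (fill_tag) that is purely
illustrative.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumed skeleton: modelled on the document quoted earlier, with the
# DOCTYPE omitted and the closing tags guessed.
SKELETON = """<?xml version="1.0" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
   <rdf:Description rdf:about="local_file">
      <tagname></tagname>
      <anothertagname></anothertagname>
   </rdf:Description>
</rdf:RDF>
"""


def fill_tag(root, tag, value):
    """Hypothetical helper: set the text of an existing child of rdf:Description."""
    # ElementTree addresses namespaced elements as "{namespace-uri}localname",
    # not by the "rdf:" prefix used in the file.
    description = root.find("{%s}Description" % RDF_NS)
    if description is None:
        raise ValueError("no rdf:Description element found")
    # find() returns the existing element; nothing new is created.
    target = description.find(tag)
    if target is None:
        raise ValueError("no <%s> element found" % tag)
    target.text = value
    return target


root = ET.fromstring(SKELETON)
fill_tag(root, "tagname", "Example Title")
```

The difference from the SubElement calls is that find() hands back the
element already in the tree, so assigning to its .text changes the
document in place.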

>        o
>    # Now write the tree back to disk.
>    f = open('tmp.xml', 'w')
>    doc.write(f)
>    f.close()
>
>Here is info on ElementTree -- Scroll down and look at the example
>in the section titled "Usage", which seems to do something very
>similar to what you ask about:
>
>    http://effbot.org/zone/element-index.htm

This is, I suspect, a fine module, but the documentation you mention is
not helpful to me.  Specifically, in the above-mentioned section, it
reads like this:

# if you need the root element, use getroot
root = tree.getroot()

# ...manipulate tree...

What I need is an example or a clear description of what they mean when
they write "manipulate tree".
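
For illustration, one reading of "manipulate tree" is a full
parse/modify/write cycle on a file.  This is only a sketch under
assumptions: it uses the standard-library xml.etree.ElementTree, the
function name inject and the demo file are made up for the example, and
the child tags are assumed to carry no namespace, as in the skeleton
quoted earlier.  One caveat I am aware of: ElementTree does not write the
DOCTYPE line back out, so that would need separate handling.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Keep the "rdf" prefix when the tree is serialized.
ET.register_namespace("rdf", RDF_NS)


def inject(path, values):
    """Hypothetical helper: fill existing children of rdf:Description in a file.

    `values` maps a tag name to the text it should contain.
    """
    tree = ET.parse(path)
    description = tree.getroot().find("{%s}Description" % RDF_NS)
    if description is None:
        raise ValueError("no rdf:Description element in %s" % path)
    # Walk the children that are already there and update matching ones.
    for child in description:
        if child.tag in values:
            child.text = values[child.tag]
    # Overwrite the file with the modified tree.
    tree.write(path, encoding="utf-8", xml_declaration=True)


# Demo on a throwaway file (an assumed skeleton, DOCTYPE omitted).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "skeleton.xml")
with open(demo_path, "w", encoding="utf-8") as handle:
    handle.write(
        '<?xml version="1.0" ?>\n'
        '<rdf:RDF xmlns:rdf="%s">\n'
        '   <rdf:Description rdf:about="local_file">\n'
        '      <tagname></tagname>\n'
        '      <anothertagname></anothertagname>\n'
        '   </rdf:Description>\n'
        '</rdf:RDF>\n' % RDF_NS
    )

inject(demo_path, {"tagname": "Example Title", "anothertagname": "More data"})
result = ET.parse(demo_path).getroot()
```

For a large batch, the same function could be called in a loop over the
file names, each with its own dictionary of values.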

My problem is not "which tool to use?" but "how does it work?".  Thanks
for the help thus far - one last push would be greatly appreciated.
Thanks again.
-- 

yours,

William

