XML using standard Python modules
Alexey Marinichev
lyosha at lyosha.2y.net
Thu Sep 13 17:53:54 EDT 2001
In article <k2nuptg6i18n0l48sg2llnl5kho3hqp4g3 at 4ax.com>,
Dale Strickland-Clark <dale at riverhall.NOSPAM.co.uk> wrote:
>I'm trying to get to grips with XML using Python.
>
>A simple app to start with, it will read a plain text file containing
>some data, convert it to XML and write an XML file.
>
>Later that file will be used as a random-access data source.
>
>Where do I start?
>
>I'm reading this: http://py-howto.sourceforge.net/xml-howto/SAX.html
>at the moment. Is it up-to-date?
>
>There seems to be half a dozen XML modules. Which is the right one for
>this type of application? XML.SAX?
>
>Thanks for any pointers.
Minidom is pretty straightforward:
>>> from xml.dom.minidom import Document
>>> d = Document()
>>> e1 = d.createElement("foo")
>>> e1.attributes["attr"] = "value"
>>> print e1.toxml()
<foo attr="value"/>
>>> e2 = d.createElement("bar")
>>> e21 = d.createTextNode("Hello, world!")
>>> e2.appendChild(e21)
<DOM Text node "Hello, wor...">
>>> print e2.toxml()
<bar>Hello, world!</bar>
>>> e = d.createElement("main")
>>> d.appendChild(e)
<DOM Element: main at 135810908>
>>> print d.toxml()
<?xml version="1.0" ?>
<main/>
>>> e.appendChild(e1)
<DOM Element: foo at 135830820>
>>> e.appendChild(e2)
<DOM Element: bar at 135838812>
>>> print d.toxml()
<?xml version="1.0" ?>
<main><foo attr="value"/><bar>Hello, world!</bar></main>
>>> e.appendChild(e1.cloneNode(1))
<DOM Element: foo at 135607220>
>>> e.appendChild(e1.cloneNode(1))
<DOM Element: foo at 135837932>
>>> e.appendChild(e1.cloneNode(1))
<DOM Element: foo at 135841092>
>>> e.appendChild(e2.cloneNode(1))
<DOM Element: bar at 135841796>
>>> e.appendChild(e1.cloneNode(1))
<DOM Element: foo at 135913204>
>>> e.appendChild(e1.cloneNode(1))
<DOM Element: foo at 135914692>
>>> e.appendChild(e1.cloneNode(1))
<DOM Element: foo at 135606788>
>>> print e.toxml()
<main><foo attr="value"/><bar>Hello, world!</bar><foo attr="value"/><foo attr="value"/><foo attr="value"/><bar>Hello, world!</bar><foo attr="value"/><foo attr="value"/><foo attr="value"/></main>
>>> print e.toprettyxml(indent=" ")
<main>
 <foo attr="value"/>
 <bar>
  Hello, world!
 </bar>
 <foo attr="value"/>
 <foo attr="value"/>
 <foo attr="value"/>
 <bar>
  Hello, world!
 </bar>
 <foo attr="value"/>
 <foo attr="value"/>
 <foo attr="value"/>
</main>
>>>
I meant to call d.toprettyxml in the last command (that would have printed
the XML declaration as well)... Anyhow, you get the idea.
cloneNode(1) makes a "deep" clone, that is, a copy of the node together with
all of its child nodes.
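To make the deep/shallow difference concrete, here is a small sketch. It is written as a standalone script in Python 3 syntax rather than the interactive session above, and it uses setAttribute instead of the attributes mapping; both choices are my assumptions for illustration, not part of the original session.

```python
from xml.dom.minidom import Document

doc = Document()
bar = doc.createElement("bar")
bar.setAttribute("attr", "value")
bar.appendChild(doc.createTextNode("Hello, world!"))

# Shallow clone: copies the element and its attributes, but no children.
shallow = bar.cloneNode(False)
# Deep clone: copies the element, its attributes, and all descendants.
deep = bar.cloneNode(True)

shallow_xml = shallow.toxml()
deep_xml = deep.toxml()
print(shallow_xml)  # <bar attr="value"/>
print(deep_xml)     # <bar attr="value">Hello, world!</bar>
```

Either clone is a new node with no parent, so it still has to be attached with appendChild before it shows up in the document.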
Here's some more:
>>> bars = e.getElementsByTagName("bar")
>>> bars
[<DOM Element: bar at 135838812>, <DOM Element: bar at 135841796>]
>>> bar = bars[1]
>>> text = bar.firstChild
>>> text.data
'Hello, world!'
>>> text.data = "Good bye."
>>> print d.toprettyxml(indent=" ")
<?xml version="1.0" ?>
<main>
 <foo attr="value"/>
 <bar>
  Hello, world!
 </bar>
 <foo attr="value"/>
 <foo attr="value"/>
 <foo attr="value"/>
 <bar>
  Good bye.
 </bar>
 <foo attr="value"/>
 <foo attr="value"/>
 <foo attr="value"/>
</main>
>>>
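For the original question (plain text in, XML out), the same calls can be strung together roughly like this. This is only a sketch: the "name=value" line format, the element names "records" and "record", and the file name in the comment are all made up for illustration, and the code uses Python 3 syntax.

```python
from xml.dom.minidom import Document

# Hypothetical input: one "name=value" record per line of the text file.
lines = ["alpha=1", "beta=2"]

doc = Document()
root = doc.createElement("records")  # element names are assumptions
doc.appendChild(root)

for line in lines:
    name, value = line.split("=", 1)
    record = doc.createElement("record")
    record.setAttribute("name", name)
    record.appendChild(doc.createTextNode(value))
    root.appendChild(record)

root_xml = root.toxml()
print(root_xml)

# To write the whole document to a file instead (hypothetical file name):
# with open("out.xml", "w", encoding="utf-8") as f:
#     doc.writexml(f)
```

In a real program the list of lines would come from reading the text file, and the split would need to match whatever format that file actually uses.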
To build a document from an XML string, use "parseString"; to parse a file,
use "parse". Both live in xml.dom.minidom.
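A quick sketch of both, in Python 3 syntax. The XML string is the one built earlier; an in-memory BytesIO object stands in for a real file here (my choice, so the example needs nothing on disk). parse also takes a file name.

```python
import io
from xml.dom.minidom import parse, parseString

xml_text = '<main><foo attr="value"/><bar>Hello, world!</bar></main>'

# parseString builds a Document from XML held in a string.
doc = parseString(xml_text)
bar = doc.getElementsByTagName("bar")[0]
bar_text = bar.firstChild.data
print(bar_text)

# parse takes a file name or an open file-like object.
doc2 = parse(io.BytesIO(xml_text.encode("utf-8")))
foo = doc2.getElementsByTagName("foo")[0]
foo_attr = foo.getAttribute("attr")
print(foo_attr)
```

Once parsed, the document can be navigated and modified with the same calls shown in the sessions above.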
I learned this stuff from the Xerces API javadoc. The Python API is slightly
different, but still very close.
--Lyosha