DOM - some pointers
Andrew Dalke
dalke at dalkescientific.com
Tue Dec 18 18:10:39 EST 2001
infotechsys.wayne at verizon.net:
>Could someone point me to some documentation that shows how to use
>HTML, DOM and Python together. I did a Google search, but the only
>thing I find is DOM, XML and Python.
I confess to being confused as well. Here is an example
of what I want to do and how I tried to do it.
I have an /etc/passwd-like XML format like this:
<passwd>
  <entry>
    <account>dalke</account>
    <password>*</password>
    < .... >
    <shell>/bin/tcsh</shell>
  </entry>
  <entry>
    <account>root</account>
    ....
</passwd>
I want to change my shell entry to /bin/bash. I tried
the following (with Python 2.0, but I doubt 2.2 has changed
things):
>>> from xml.dom import minidom
>>> doc = minidom.parseString("<passwd><entry>"
...                           "<account>dalke</account>"
...                           "<shell>/bin/tcsh</shell>"
...                           "</entry></passwd>")
>>> doc.normalize()
>>> for entry in doc.getElementsByTagName("entry"):
...     account = entry.getElementsByTagName("account")[0]
...     if account.firstChild.nodeValue == "dalke":
...         shell = entry.getElementsByTagName("shell")[0]
...         shell.firstChild.nodeValue = u"/bin/bash"
...         break
... else:
...     print "dalke not found"
...
>>> doc.toxml()
u'<passwd><entry><account>dalke</account><shell>/bin/tcsh</shell></entry></passwd>'
>>> shell
<DOM Element: shell at 4836174280>
>>> shell.firstChild.nodeValue
u'/bin/bash'
>>> shell.firstChild.data = u"/bin/bash"
>>> doc.toxml()
u'<passwd><entry><account>dalke</account><shell>/bin/bash</shell></entry></passwd>'
My questions are:
1) Why does it take so much work to do this?
2) Why doesn't the XML output contain the new shell name when
   I change "nodeValue"?
3) Why does the XML output change when I change 'data' -- and
   is that the right way to change the value?
4) Is there any way to dump just the raw characters as text
   (not in XML)? How?
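To make question 4 concrete, one possible approach is to walk the tree and join the text nodes by hand. The following is only a sketch of that idea, not a claim about the intended way to do it: `collect_text` is a made-up helper name, not part of minidom, and it assumes that only text and CDATA nodes should contribute to the result.

```python
from xml.dom import minidom


def collect_text(node):
    """Hypothetical helper: concatenate all character data beneath `node`."""
    pieces = []
    for child in node.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            # Text and CDATA nodes carry their characters in .data
            pieces.append(child.data)
        else:
            # Recurse into elements; comments and the like have no children
            pieces.append(collect_text(child))
    return "".join(pieces)


doc = minidom.parseString("<passwd><entry>"
                          "<account>dalke</account>"
                          "<shell>/bin/tcsh</shell>"
                          "</entry></passwd>")
shell = doc.getElementsByTagName("shell")[0]
shell_text = collect_text(shell)   # text of one element
all_text = collect_text(doc)       # text of the whole document, run together
```

Because it joins every text child, this does not depend on calling normalize() first. Note that the whole-document result has no separators between fields, so it is mostly useful one element at a time.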
I would prefer an API which is more like
    for entry in doc["entry"]:
        if entry["account"][0].text == "dalke":
            entry["shell"][0].text = "/bin/bash"
            break
and not have to worry about the normalization and explicit
use of firstChild.
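The API wished for above can be approximated with a thin wrapper over minidom. This is a rough sketch under stated assumptions, not an existing library: the class name `Wrapped` and its `text` property are invented for illustration, `doc["entry"]` is assumed to mean "all descendants with that tag name", and assigning to `text` is assumed to replace all of an element's children with a single text node.

```python
from xml.dom import minidom


class Wrapped(object):
    """Hypothetical thin wrapper around a minidom Document or Element."""

    def __init__(self, node):
        self.node = node

    def __getitem__(self, tag):
        # wrapped["entry"] -> list of wrapped descendants with that tag name
        return [Wrapped(n) for n in self.node.getElementsByTagName(tag)]

    def _get_text(self):
        # Join the direct text children, so normalize() is not needed
        return "".join([c.data for c in self.node.childNodes
                        if c.nodeType == c.TEXT_NODE])

    def _set_text(self, value):
        # Drop the existing children and add one new text node
        for child in list(self.node.childNodes):
            self.node.removeChild(child)
        self.node.appendChild(self.node.ownerDocument.createTextNode(value))

    text = property(_get_text, _set_text)


doc = minidom.parseString("<passwd><entry>"
                          "<account>dalke</account>"
                          "<shell>/bin/tcsh</shell>"
                          "</entry></passwd>")
wrapped = Wrapped(doc)
for entry in wrapped["entry"]:
    if entry["account"][0].text == "dalke":
        entry["shell"][0].text = "/bin/bash"
        break
```

The setter goes through createTextNode rather than assigning to nodeValue, which sidesteps the nodeValue/data discrepancy shown in the session earlier. It only makes sense on elements, since a Document node has no ownerDocument.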
And I haven't seen any documentation which introduces Python
programmers to using DOM (like AMK had for SAX parsing) -
only docs for people who already know DOM from Java or other
fields.
So I too am looking for pointers.
Andrew
dalke at dalkescientific.com