[XML-SIG] Useful Python 2.2 tools for the DOM hacker
Uche Ogbuji
uche.ogbuji@fourthought.com
22 Jun 2002 15:29:59 -0600
I've kicked up a few useful routines today for DOM processing with
Python 2.2. (BTW, generators kick ass).
I've attached a module of these routines just in case they come in handy
to others. I must say that the combo of generators/iterators and list
comps makes this code far cleaner and faster than it would have been in,
say, Python 1.5.
BTW, if someone more graphically inclined than I wants to write the
stubbed out function dom_trace and post it back here, I'd be happy to go
into debt of a beer. What would really be cool is a routine that emits
an SVG diagram of a DOM's contents. Based on stuff I've seen done with
SVG, this should be quite feasible.
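Until someone takes up the offer, a plain-text dom_trace could look something like the sketch below. It is only an illustration, not the stubbed function from the attachment: it assumes a current Python 3 and the standard library's xml.dom.minidom rather than 4Suite's Domlette, and the label format is invented for the example.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def dom_trace(node, depth=0):
    """Yield one indented line per node, parents before children."""
    if node.nodeType == Node.ELEMENT_NODE:
        label = "<%s>" % node.nodeName
    elif node.nodeType == Node.TEXT_NODE:
        label = "text: %r" % node.data
    else:
        # Other node types (document, comment, ...) are shown by nodeName.
        label = node.nodeName
    yield "  " * depth + label
    for child in node.childNodes:
        # Python 3 syntax; on 2.2 this would be a nested for/yield loop.
        yield from dom_trace(child, depth + 1)


if __name__ == "__main__":
    doc = parseString("<spam>eggs<monty>python</monty></spam>")
    print("\n".join(dom_trace(doc)))
```

Each level of nesting adds two spaces of indentation, so the tree shape is readable at a glance. An SVG version would presumably walk the DOM the same way and emit shapes instead of text lines.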
--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://4Suite.org http://fourthought.com
Track chair, XML/Web Services One (San Jose, Boston):
http://www.xmlconference.com/
DAML Reference - http://www.xml.com/pub/a/2002/05/01/damlref.html
The Languages of the Semantic Web -
http://www.newarchitectmag.com/documents/s=2453/new1020218556549/index.html
XML, The Model Driven Architecture, and RDF @ XML Europe -
http://www.xmleurope.com/2002/kttrack.asp#themodel
[Attachment: DomTools.py]
########################################################################
#
# File Name: DomARama.py
#
"""
DOM Processing Utilities: Python 2.2 only
Copyright 2002 Uche Ogbuji http://uche.ogbuji.net
http://4suite.org/
"""
from __future__ import generators
from xml.dom import Node

def in_order_iterator(node):
    # Yields node, then its descendants, parents before children
    # (i.e. document order).
    yield node
    for child in node.childNodes:
        for cn in in_order_iterator(child):
            yield cn


def in_order_iterator_filter(node, filter_func):
    # Same traversal, yielding only nodes for which filter_func is true.
    if filter_func(node):
        yield node
    for child in node.childNodes:
        # The recursive call has already applied filter_func.
        for cn in in_order_iterator_filter(child, filter_func):
            yield cn


def get_elements_by_tag_name_ns(node, ns, local):
    return in_order_iterator_filter(
        node,
        lambda n: (n.nodeType == Node.ELEMENT_NODE
                   and n.namespaceURI == ns
                   and n.localName == local))


def string_value(node):
    # Concatenated data of all text nodes at or below node.
    text_nodes = in_order_iterator_filter(
        node, lambda n: n.nodeType == Node.TEXT_NODE)
    return u''.join([n.data for n in text_nodes])

#def DomTrace(node):
# """
# Display a rudimentary diagram of a DOM's contents
# """

if __name__ == "__main__":
    DOC = """<spam xmlns:x='http://spam.com'>eggs<monty>python</monty></spam>
"""
    from Ft.Xml.Domlette import NonvalidatingReader
    doc = NonvalidatingReader.parseString(DOC, "http://spam.com/base")
    print "All nodes:"
    for node in in_order_iterator(doc):
        print node
    print "Elements only:"
    for node in in_order_iterator_filter(
            doc, lambda x: x.nodeType == Node.ELEMENT_NODE):
        print node
    print "Get elements by tag name:"
    for node in get_elements_by_tag_name_ns(doc, None, 'monty'):
        print node
    print "String value:"
    print string_value(doc)
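For anyone trying the routines on a current interpreter, here is a rough Python 3 port. It is a sketch under assumptions, not the attached module: it swaps 4Suite's NonvalidatingReader for the standard library's xml.dom.minidom, uses `yield from` and generator expressions (neither existed in 2.2), and builds the filter on top of the plain iterator instead of recursing separately.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def in_order_iterator(node):
    """Yield node, then each descendant, parents before children."""
    yield node
    for child in node.childNodes:
        yield from in_order_iterator(child)


def in_order_iterator_filter(node, filter_func):
    """Yield only the nodes for which filter_func returns true."""
    return (n for n in in_order_iterator(node) if filter_func(n))


def get_elements_by_tag_name_ns(node, ns, local):
    return in_order_iterator_filter(
        node,
        lambda n: (n.nodeType == Node.ELEMENT_NODE
                   and n.namespaceURI == ns
                   and n.localName == local))


def string_value(node):
    """Concatenate the data of every text node at or below node."""
    return "".join(
        n.data for n in in_order_iterator(node)
        if n.nodeType == Node.TEXT_NODE)


if __name__ == "__main__":
    DOC = "<spam xmlns:x='http://spam.com'>eggs<monty>python</monty></spam>"
    doc = parseString(DOC)
    print("All nodes:", [n.nodeName for n in in_order_iterator(doc)])
    print("monty elements:",
          [n.nodeName for n in get_elements_by_tag_name_ns(doc, None, "monty")])
    print("String value:", string_value(doc))
```

The function names and the test document are taken from the attachment; only the parser and the newer syntax differ.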