[Tutor] pyXML DOM 2.0 Traversal and filters
Levy Lazarre
llazarre@yahoo.com
Wed Apr 30 11:07:02 2003
--- Danny Yoo <dyoo@hkn.eecs.berkeley.edu> wrote:
> So you may find that this will work:
>
> ###
> class FilterBroken(NodeFilter):
>     def acceptNode(self, thisNode):
>         if (thisNode.nodeType == thisNode.ELEMENT_NODE and
>             thisNode.getAttribute("status") == "broken"):
>             return NodeFilter.FILTER_REJECT
>         return NodeFilter.FILTER_ACCEPT
>
> reader = Sax2.Reader()
> input_file = file("appliances.xml")
> doc = reader.fromStream(input_file)
> walker = doc.createTreeWalker(doc.documentElement,
>                               NodeFilter.SHOW_ALL,
>                               FilterBroken(), 0)
> ###
>
>
> If this does do the trick, let's send a holler to
> the pyxml documentation
> maintainers and get them to fix their documentation.
> *grin*
>
>
> Hope this helps!
>
Thanks Danny. Great insight. It did the trick! As you
said, the documentation is wrong and
createTreeWalker() expects a class instance, not a
function. This is the same in Java.
Interestingly, I did some more research after I sent
my message to the List. I found an article that showed
me how to do the same thing without using a
TreeWalker, but with generators. The resulting code is
shorter and faster, and actually uses a function as
filter. Here it is:

from __future__ import generators
from xml.dom import Node
from xml.dom import minidom

# doc_order_iter_filter iterates over non-attribute nodes,
# returning those that pass a filter
def doc_order_iter_filter(node, filter_func):
    if filter_func(node):
        yield node
    for child in node.childNodes:
        for cn in doc_order_iter_filter(child, filter_func):
            yield cn
    return

# This filter function rejects element nodes with a
# "status" attribute of "broken":
elem_filter = lambda n: (n.nodeType == Node.ELEMENT_NODE and
                         n.getAttribute("status") != "broken")

##### main program
doc = minidom.parse('appliances.xml')
for node in doc_order_iter_filter(doc, elem_filter):
    print node
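
One difference from the TreeWalker version is worth a note: FILTER_REJECT skips a rejected node together with its whole subtree, while the generator above only withholds the rejected node and still descends into its children. Below is a rough sketch of both behaviours, written for current Python 3 (so `yield from` and `print()` rather than the 2003 syntax). The inline sample document and the helper names `doc_order_iter_prune`, `is_element`, `not_broken` and `is_broken` are my own assumptions for illustration; the real appliances.xml is not shown in this thread.

```python
from xml.dom import Node, minidom

# Hypothetical sample data; the real appliances.xml is not shown in the thread.
SAMPLE = """<appliances>
  <appliance status="working"><name>toaster</name></appliance>
  <appliance status="broken"><name>blender</name></appliance>
</appliances>"""


def doc_order_iter_filter(node, filter_func):
    """Yield nodes in document order that pass filter_func (no pruning)."""
    if filter_func(node):
        yield node
    for child in node.childNodes:
        yield from doc_order_iter_filter(child, filter_func)


def doc_order_iter_prune(node, reject_func):
    """Yield nodes in document order, skipping rejected nodes and their subtrees."""
    if reject_func(node):
        return  # prune: neither this node nor its descendants are visited
    yield node
    for child in node.childNodes:
        yield from doc_order_iter_prune(child, reject_func)


def is_element(n):
    return n.nodeType == Node.ELEMENT_NODE


def not_broken(n):
    # Same test as elem_filter in the message above.
    return is_element(n) and n.getAttribute("status") != "broken"


def is_broken(n):
    return is_element(n) and n.getAttribute("status") == "broken"


doc = minidom.parseString(SAMPLE)

# Original behaviour: children of a broken element are still yielded.
flat = [n.tagName for n in doc_order_iter_filter(doc, not_broken)]

# Pruning behaviour, closer to FILTER_REJECT; non-element nodes are
# dropped afterwards so the two lists are comparable.
pruned = [n.tagName for n in doc_order_iter_prune(doc, is_broken)
          if is_element(n)]

print(flat)
print(pruned)
```

With this sample, the first list contains `name` twice, because the `name` child of the broken appliance is still visited, while the pruned list contains it only once. Which behaviour is wanted depends on whether the children of a broken appliance should count as broken too.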