[Tutor] pyXML DOM 2.0 Traversal and filters

Levy Lazarre llazarre@yahoo.com
Wed Apr 30 11:07:02 2003


--- Danny Yoo <dyoo@hkn.eecs.berkeley.edu> wrote:

> So you may find that this will work:
> 
> ###
> class FilterBroken(NodeFilter):
>     def acceptNode(self, thisNode):
>         if (thisNode.nodeType == thisNode.ELEMENT_NODE and
>                 thisNode.getAttribute("status") == "broken"):
>             return NodeFilter.FILTER_REJECT
>         return NodeFilter.FILTER_ACCEPT
> 
> reader = Sax2.Reader()
> input_file = file("appliances.xml")
> doc = reader.fromStream(input_file)
> walker = doc.createTreeWalker(doc.documentElement,
>                               NodeFilter.SHOW_ALL,
>                               FilterBroken(), 0)
> ###
> 
> 
> If this does do the trick, let's send a holler to the pyxml
> documentation maintainers and get them to fix their
> documentation.  *grin*
> 
> 
> Hope this helps!
> 

Thanks, Danny. Great insight. It did the trick! As you
said, the documentation is wrong: createTreeWalker()
expects an instance of a class with an acceptNode()
method, not a plain function. This is the same in Java.

Interestingly, I did some more research after I sent
my message to the list. I found an article that showed
me how to do the same thing without a TreeWalker, using
generators instead. The resulting code is shorter and
faster, and it actually uses a function as the filter.
Here it is:

from __future__ import generators
from xml.dom import Node
from xml.dom import minidom

# doc_order_iter_filter iterates over non-attribute nodes,
# returning those that pass a filter

def doc_order_iter_filter(node, filter_func):
    if filter_func(node):
        yield node
    for child in node.childNodes:
        for cn in doc_order_iter_filter(child, filter_func):
            yield cn
    return

# This filter function accepts only element nodes whose
# "status" attribute is not "broken":

elem_filter = lambda n: (n.nodeType == Node.ELEMENT_NODE and
                         n.getAttribute("status") != "broken")

##### main program

doc = minidom.parse('appliances.xml')
for node in doc_order_iter_filter(doc, elem_filter):
    print node
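
For anyone reading this later on Python 3, here is a minimal sketch of
the same generator idea using only the standard library's minidom. The
inline sample XML is invented for illustration (the real
appliances.xml is not shown in this thread), and the prune parameter
is a hypothetical addition of mine: it roughly imitates what
FILTER_REJECT does under a TreeWalker, where a rejected element's
whole subtree is skipped. Without it, the generator still descends
into the children of a "broken" element.

```python
from xml.dom import Node, minidom

# Hypothetical sample data standing in for appliances.xml.
SAMPLE_XML = """<appliances>
  <appliance name="toaster" status="working"/>
  <appliance name="blender" status="broken">
    <part name="blade" status="working"/>
  </appliance>
  <appliance name="kettle"/>
</appliances>"""


def doc_order_iter_filter(node, filter_func, prune=False):
    """Yield nodes in document order that pass filter_func.

    With prune=True (an assumption, not part of the original code),
    the children of an element that fails the filter are skipped too.
    """
    if filter_func(node):
        yield node
    elif prune and node.nodeType == Node.ELEMENT_NODE:
        return  # skip the rejected element's entire subtree
    for child in node.childNodes:
        yield from doc_order_iter_filter(child, filter_func, prune)


def elem_filter(n):
    """Accept only element nodes whose status attribute is not 'broken'."""
    return (n.nodeType == Node.ELEMENT_NODE
            and n.getAttribute("status") != "broken")


def label(n):
    """Use the name attribute if present, otherwise the tag name."""
    return n.getAttribute("name") or n.tagName


doc = minidom.parseString(SAMPLE_XML)
names = [label(n) for n in doc_order_iter_filter(doc, elem_filter)]
pruned = [label(n) for n in doc_order_iter_filter(doc, elem_filter, prune=True)]
print(names)
print(pruned)
```

The two lists differ only in the "blade" part, which sits inside the
broken blender: the plain version still reports it, the pruned one
does not. Which behaviour you want depends on whether a broken
appliance should hide its parts.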

