[XML-SIG] get the absolute path for a node

Uche Ogbuji uche.ogbuji at fourthought.com
Tue Aug 10 02:30:06 CEST 2004


On Thu, 2004-08-05 at 09:29, xmlsig at codeweld.com wrote:
> The line that translates '#text' to 'text()' has the advantage that it turns
> the path into a valid XPath. The other line, which eliminates [1], still
> preserves this valid XPath, and I thought it's nicer to look at :).
> I found the source and the cure of the problem. The source is (as you can
> easily verify with http://www.codeweld.com/files/dom_view.pyw, just use
> 'file://yourfile.xml')

Niiiiiice.  I'll have to highlight this code in one of my columns, if
that's OK with you.  Of course I think

import xml.dom.ext.reader.Sax2 as Sax2

is probably a bad idea, though I'm not sure what the best alternatives
are to

import xml.dom.ext.reader.HtmlLib as HtmlLib

Do you have any discussion or docs on this code?

> that the Sax2 reader for some reason puts in a second node
> with the same nodeName. The cure is to compare on localName instead, as
> this name seems to be different for those nodes. Additionally, it also differs
> for some other nodes which might otherwise cause trouble in borderline cases.
> This is the new function. (I also gave one variable a more reasonable name; it
> was confusing otherwise.)
> 
> def abs_path( node ):
>     successors = 1
>     previous = node.previousSibling
>     while previous:
>         if previous.localName == node.localName: successors += 1
>         previous = previous.previousSibling
>     path = '/%s[%s]' % (node.nodeName, successors)
>     if node.parentNode.nodeName != '#document':
>         return abs_path( node.parentNode )+path
>     return path
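
One plausible reading of the "second node" described above, offered as an assumption rather than something this thread confirms, is the DOCTYPE node: in the DOM its nodeName equals the root element's name, while its localName is None. The sketch below uses the standard library's minidom instead of the PyXML Sax2 reader, and count_by is a hypothetical helper written only for this illustration.

```python
from xml.dom import minidom


def count_by(node, attr):
    """Count the node plus every previous sibling sharing the named attribute.

    Hypothetical helper: attr is 'nodeName' or 'localName', so the two
    comparison strategies can be run side by side on the same node.
    """
    position = 1
    previous = node.previousSibling
    while previous:
        if getattr(previous, attr) == getattr(node, attr):
            position += 1
        previous = previous.previousSibling
    return position


# A document whose DOCTYPE carries the same name as the root element.
doc = minidom.parseString('<!DOCTYPE root><root><child/></root>')
root = doc.documentElement

by_node_name = count_by(root, 'nodeName')    # the DOCTYPE node is counted too
by_local_name = count_by(root, 'localName')  # the DOCTYPE node has no localName

print('nodeName comparison:  /root[%d]' % by_node_name)
print('localName comparison: /root[%d]' % by_local_name)
```

If the assumption holds, the nodeName comparison yields a position that is off by one for the root element, which is the symptom the localName comparison avoids.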

Cool.  I took this as a starting point to add such a function to my
domtools.py

http://cvs.4suite.org/cgi-bin/viewcvs.cgi/Anobind/domtools.py

For convenience, here's my version:

from xml.dom import Node

#The abs_path is based on code developed by "Florian" on XML-SIG
#http://mail.python.org/pipermail/xml-sig/2004-August/010423.html
def abs_path( node ):
    """
    Return an XPath expression that provides a unique path to
    the given node (only supports elements, attributes and
    root nodes) within a document
    """
    #is_domlette = hasattr(node, 'rootNode')
    if node.nodeType == Node.ELEMENT_NODE:
        successors = 1
        #Determine how many previous siblings there are with the same node name
        previous = node.previousSibling
        while previous:
            if previous.localName == node.localName: successors += 1
            previous = previous.previousSibling
        step = u'%s[%i]' % (node.nodeName, successors)
        ancestor = node.parentNode
    elif node.nodeType == Node.ATTRIBUTE_NODE:
        step = u'@%s' % (node.nodeName)
        ancestor = node.ownerElement
    elif not node.parentNode:
        step = u''
        ancestor = node
    else:
        raise TypeError('Unsupported node type for abs_path')
    if ancestor.parentNode:
        return abs_path(ancestor) + u'/' + step
    else:
        return u'/' + step
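
As a rough way to sanity-check such paths without an XPath engine, a path can be walked back to its node. For the document <doc><item/><item id="x"/></doc>, the function above should yield /doc[1]/item[2]/@id for the id attribute. The sketch below resolves that string with minidom; resolve_path and the STEP pattern are hypothetical names introduced here, and only the step forms that abs_path emits (name[n], @name and the bare /) are handled.

```python
import re
from xml.dom import Node, minidom

# Matches an element step of the form name[n], as produced by abs_path.
STEP = re.compile(r'^(?P<name>[^\[\]@]+)\[(?P<index>\d+)\]$')


def resolve_path(document, path):
    """Walk a path such as '/doc[1]/item[2]/@id' back to a DOM node."""
    node = document
    for step in path.strip('/').split('/'):
        if not step:
            continue  # the bare path '/' names the document itself
        if step.startswith('@'):
            attr = node.getAttributeNode(step[1:])
            if attr is None:
                raise LookupError('No attribute for step %r' % step)
            node = attr
            continue
        match = STEP.match(step)
        if match is None:
            raise ValueError('Unsupported step: %r' % step)
        name = match.group('name')
        index = int(match.group('index'))
        # Only element children with a matching nodeName are candidates.
        matches = [child for child in node.childNodes
                   if child.nodeType == Node.ELEMENT_NODE
                   and child.nodeName == name]
        if not 1 <= index <= len(matches):
            raise LookupError('No element for step %r' % step)
        node = matches[index - 1]
    return node


doc = minidom.parseString('<doc><item/><item id="x"/></doc>')
found = resolve_path(doc, '/doc[1]/item[2]/@id')
print(found.nodeName, found.value)
```

This resolver counts only same-named element children, so it agrees with abs_path for documents without namespace prefixes; prefixed names would need a closer look.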


-- 
Uche Ogbuji                                    Fourthought, Inc.
http://uche.ogbuji.net    http://4Suite.org    http://fourthought.com


