[XML-SIG] XPath's reliance on id()

Martijn Faassen faassen@vet.uu.nl
Wed, 13 Mar 2002 19:24:02 +0100


Martin v. Loewis wrote:
> Martijn Faassen <faassen@vet.uu.nl> writes:
> 
> > Now, it's hard to see how to make this code work without some hashing,
> > so we seem to have to rely on *something*. But can't we make it 
> > use __hash__ instead of using id? I think I can make this work for
> > ParsedXML (in fact I tried and at least it's not giving any errors).
> 
> I think this approach (for determining document order) really does
> need the notion of identity, so it's hard to believe that this could
> actually work.

Doesn't __hash__ imply there is a notion of identity?

> >   * all DOMs that expect to work with PyXML's xpath need to define 
> >     __hash__(). In the simplest case this simply does an id() on the
> >     node.
> 
> Those DOMs that use Python classes would not need to do anything at
> all, right?

True, if you use hash() that should work with Python classes, unless of
course your objects are really only proxying an underlying database and
multiple objects may proxy the same data (in that case they should
compare equal).
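
To make the proxy case concrete, here is a minimal sketch in current Python,
where __eq__ takes the role that __cmp__ has in this discussion. ProxyNode
and storage_key are hypothetical names for illustration, not anything from
ParsedXML or PyXML:

```python
class ProxyNode:
    """Hypothetical proxy for a node that lives in some backing store."""

    def __init__(self, storage_key):
        # storage_key is an assumed stable identifier for the underlying data.
        self.storage_key = storage_key

    def __eq__(self, other):
        if not isinstance(other, ProxyNode):
            return NotImplemented
        return self.storage_key == other.storage_key

    def __hash__(self):
        # Proxies for the same data must hash alike, since they compare equal.
        return hash(self.storage_key)


a = ProxyNode("doc/0/3")
b = ProxyNode("doc/0/3")   # a second proxy object for the same stored node

same_id = id(a) == id(b)          # distinct live objects, so the ids differ
same_hash = hash(a) == hash(b)    # both proxies hash by storage_key

lookup = {a: "position 7"}
found = lookup[b]                 # b finds the entry that was stored under a
```

An id()-keyed dictionary would treat a and b as unrelated entries, while a
dictionary keyed on the objects themselves treats them as the same node.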

Reading the language reference on hash I do find this phrase on __hash__
that is disturbing, though:

  If a class does not define a __cmp__() method it should not define a 
  __hash__() operation either;

I don't understand why this must be so.
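
One possible reading, sketched below in current Python (where the default
equality is identity and __eq__ stands in for __cmp__): a dictionary uses the
hash only to locate a candidate entry and then compares keys for equality, so
a custom hash without a matching comparison does not make two objects
interchangeable as keys. HashOnly is a hypothetical class for illustration:

```python
class HashOnly:
    """Hypothetical class that customizes __hash__ but not comparison."""

    def __init__(self, key):
        self.key = key

    def __hash__(self):
        return hash(self.key)


first = HashOnly("node-1")
second = HashOnly("node-1")

table = {first: "first"}

hashes_match = hash(first) == hash(second)
# The hashes agree, but the default equality is identity, so the dictionary
# still treats `second` as a different key.
second_found = second in table
```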

The point is simply that id() is not customizable, and hash() is.

> >   * we need to alter the code in xpath/Util.py so as not to rely on
> >     id() anymore but to use the node objects as hash keys directly. 
> 
> Then doing what? What happens if you have two nodes that have the same
> hash value, but which are different?

I reconsidered later; I simply want to use hash() instead of id(). Under
that scheme hash() plays the role id() plays now, so you don't have a problem
with two objects that have the same hash value but are different: if they
have the same hash value, they're treated as identical.
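
As a rough sketch of the idea (not the actual code in xpath/Util.py, which is
not reproduced here), document order can be recorded in a dictionary keyed on
the node objects rather than on id(node), so that a DOM is free to customize
how its nodes hash. Node, iter_document_order and sort_by_document_order are
hypothetical names:

```python
class Node:
    """Hypothetical minimal tree node; it relies on Python's default hash()."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def iter_document_order(node):
    """Yield the node and its descendants in pre-order (document order)."""
    yield node
    for child in node.children:
        yield from iter_document_order(child)


def sort_by_document_order(root, nodes):
    """Sort `nodes` by their position in the tree rooted at `root`."""
    # Keyed on the node objects themselves, not on id(node).
    position = {n: i for i, n in enumerate(iter_document_order(root))}
    return sorted(nodes, key=position.__getitem__)


title = Node("title")
para = Node("para")
body = Node("body", [para])
root = Node("doc", [title, body])

ordered = sort_by_document_order(root, [para, root, title])
names = [n.name for n in ordered]
```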

> Do you think you could make this work by giving each ParsedXML Node a
> docIndex attribute? With that, SortByDocOrder would not even attempt
> to put node ids into a dictionary.

I hadn't considered that, but at first glance it seems to be fairly icky to
support this. These ParsedXML documents are usually persistent in the ZODB
and it'd be adding yet another field to it all. Plus, DOM manipulations could
mean large-scale changes of docIndex all over the tree.
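
A small sketch of that concern, assuming docIndex is a plain pre-order counter
stored on each node. IndexedNode, doc_index and renumber are hypothetical
names, not the ParsedXML or PyXML API:

```python
class IndexedNode:
    """Hypothetical node that stores its own document-order index."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.doc_index = None


def renumber(root):
    """Assign doc_index in pre-order; return how many nodes were written."""
    count = 0
    stack = [root]
    while stack:
        node = stack.pop()
        node.doc_index = count
        count += 1
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(node.children))
    return count


a = IndexedNode("a")
c = IndexedNode("c")
root = IndexedNode("doc", [a, c])
renumber(root)

# Inserting one node shifts the index of every node that follows it, so this
# naive scheme walks and rewrites the whole tree again.
b = IndexedNode("b")
root.children.insert(1, b)
touched = renumber(root)

ordered = sorted([c, b, a], key=lambda n: n.doc_index)
names = [n.name for n in ordered]
```

With persistent nodes, each of those rewritten fields would presumably count
as a changed object, which is the cost I'm worried about.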

Regards,

Martijn