Mutable objects which define __hash__ (was Re: Why are tuples immutable?)

Nick Coghlan ncoghlan at iinet.net.au
Thu Dec 30 02:36:57 EST 2004


Bengt Richter wrote:
> Essentially syntactic sugar to avoid writing id(obj) ? (and to get a little performance
> improvement if they're written in C). I can't believe this thread came from the
> lack of such sugar ;-)

The downside of doing it that way is you have no means of getting from the id() 
stored as a key back to the associated object. Meaningful iteration (including 
listing of contents) becomes impossible. Doing the id() call at the Python level 
instead of internally to the interpreter is also relatively expensive.
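
A minimal sketch of the first problem, assuming two throwaway lists (the names 
passed, failed and classification are made up for illustration):

```python
# Hypothetical example: classifying lists by id() in a plain dict.
passed = [1, 2, 3]
failed = [4, 5, 6]

classification = {id(passed): "ok", id(failed): "error"}

# Lookup works as long as we still hold a reference to the object.
label = classification[id(passed)]

# Iteration only yields integers; there is no supported way to turn
# them back into the lists they were computed from.
keys_are_ints = all(isinstance(k, int) for k in classification)
```

Note that the dict also does nothing to keep the lists alive, so an id() may be 
reused by a different object once the original is garbage collected.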

> Or, for that matter, (if you are the designer) giving the objects an
> obj.my_classification attribute (or indeed, property, if dynamic) as part
> of their initialization/design?

The main mutable objects we're talking about here are Python lists. Selecting an 
alternate classification scheme using a subclass is the currently recommended 
approach - this thread is about alternatives to that.

I generally work with small enough data sets that I just use lists for 
classification (sorting test input data into inputs which worked properly, and 
those which failed for various reasons). However, I can understand wanting to 
use a better data structure when doing frequent membership testing, *without* 
having to make fundamental changes to an application's object model.

> Or subclass your graph node so you can do something readable like
>     if node.is_leaf: ...
> instead of
>     if my_obj_classification[id(node)] == 'leaf': ...

I'd prefer:
   if node in leaf_nodes:
     ...

Separation of concerns suggests that a class shouldn't need to know about all 
the different ways it may be classified. And mutability shouldn't be a barrier 
to classification of an object according to its current state.
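
As a rough sketch of how that membership test could work for mutable objects, 
here is an identity-based set. The IdentitySet class and the node/leaf_nodes 
names are hypothetical, not an existing API:

```python
class IdentitySet:
    """Sketch of a set that compares members by identity, not equality."""

    def __init__(self, items=()):
        # id(obj) -> obj; storing the object keeps it alive, so its id
        # cannot be reused while it is a member.
        self._items = {id(obj): obj for obj in items}

    def add(self, obj):
        self._items[id(obj)] = obj

    def discard(self, obj):
        self._items.pop(id(obj), None)

    def __contains__(self, obj):
        return id(obj) in self._items

    def __iter__(self):
        return iter(self._items.values())

    def __len__(self):
        return len(self._items)


node = ["payload"]      # a list standing in for a mutable graph node
other = ["payload"]     # equal to node, but a different object
leaf_nodes = IdentitySet([node])

node.append("more")     # mutating the member does not break the lookup
is_leaf = node in leaf_nodes
other_is_leaf = other in leaf_nodes
```

The class being classified needs no changes, and membership follows the object 
rather than its current value.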

>>Hence why I suggested Antoon should consider pursuing collections.identity_dict 
>>and collections.identity_set if identity-based lookup would actually address his 
>>requirements. Providing these two data types seemed like a nice way to do an end 
>>run around the bulk of the 'potentially variable hash' key problem.
> 
> I googled for those ;-) I guess pursuing meant implementing ;-)

Yup. After all, the collections module is about high-performance datatypes for 
more specific purposes than the standard builtins. identity_dict and 
identity_set seem like natural fits for dealing with annotation and 
classification problems where you don't want to modify the class definitions for 
the objects being annotated or classified.
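
To make the idea concrete, here is a pure-Python sketch of what an identity_dict 
might look like. This is only an assumption about the interface (no such type 
exists in the collections module), and a real version would presumably be 
written in C:

```python
class IdentityDict:
    """Sketch of a mapping keyed on object identity, ignoring __hash__/__eq__."""

    def __init__(self):
        # id(key) -> (key, value); keeping the key alongside the value lets
        # iteration return the original objects rather than bare integers.
        self._data = {}

    def __setitem__(self, key, value):
        self._data[id(key)] = (key, value)

    def __getitem__(self, key):
        return self._data[id(key)][1]

    def __delitem__(self, key):
        del self._data[id(key)]

    def __contains__(self, key):
        return id(key) in self._data

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        # Yield the key objects themselves.
        return (key for key, _ in self._data.values())


a = [1, 2]
b = [1, 2]              # equal to a, but a distinct object
notes = IdentityDict()
notes[a] = "worked"
notes[b] = "failed"
a.append(3)             # the annotation survives mutation of the key
```

Unlike the plain id()-keyed dict, iterating over notes gives back the annotated 
lists, so listing the contents stays meaningful.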

I don't want the capability enough to pursue it, but Antoon seems reasonably 
motivated :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net


