hash issues [WAS] Re: [Tutor] hash()ing a list

Brian van den Broek bvande at po-box.mcgill.ca
Tue Mar 29 11:19:47 CEST 2005


Danny Yoo said unto the world upon 2005-03-29 03:37:
>>*Almost* all ints are fixed points for the hashing function in the
>>sense that hash(some_int) == some_int. Almost all as:
>>
>> >>> hash(-1)
>>-2
>>
>>Any idea why -1 is the sole exception?
> 
> 
> [warning: beginners, skip this.  Completely inconsequential CPython detail
> ahead.]
> 
> Hi Brian,
> 
> Yeah, I remember this from studying the Python C implementation.  -1 is
> used internally in the CPython implementation to represent that the hash
> function failed in some bad way.  It is a pure CPython implementation
> detail, and is completely not important. *grin*


Thanks Danny,

I did get that it wasn't important as clearly with finitely many ints 
available to be hash values, uniqueness of hash is out the window 
independently. But the oddity here did pique my curiosity. The 
explanation is obvious, once someone else thought of it for me :-) Thanks.

<SNIP quote from C sources which sooner or later I am going to have to 
learn how to read :-) >

> 
>>I had thought lookup was by hash value, and thus expected the access to
>>some_dict to cause troubles. Yet it worked. Is it that lookup is by hash
>>value, and then equality if need be so as to settle ambiguity,
> 
> 
> Yes, exactly!  That's why equality and hashing codes are tied together in
> this kind of "if a==b then hash(a)==hash(b)" contract: there's bound to be
> "collisions" in any hashing scheme.  So we have to do something about
> collisions.  Equality is meant to settle things when two keys have the
> same hash code.

That's nicely put :-)

> I just did a quick Google search to see if someone else has written about
> hash table theory, and trusty Wikipedia came up with a very nice article:
> 
>     http://en.wikipedia.org/wiki/Hash_table

Thanks for the link.

Best to all,

Brian vdB



More information about the Tutor mailing list