[Python-Dev] Algoritmic Complexity Attack on Python

Tim Peters tim.one@comcast.net
Sun, 01 Jun 2003 22:12:11 -0400


[Guido]
> I hope I can dissuade you from using Python's default hash codes;

It's easy to dissuade me; it's Jim we have to worry about <wink>.

> they differ across platforms with different word lengths, too.  I don't
> know enough about the requirements for this proposed persistent set
> type; perhaps it would be acceptable to require that the set elements
> define a method that returns a unique identifier which is an immutable
> object using only Python built-in types?

I hope so, but don't know the use cases well enough to guess whether that
would be considered good enough.

> ...
> I doubt this is a coincidence.  When designing Python's run time, I
> was very well aware of process boundaries and lifetime, and used them
> to limit the scope of universal truths.  I wasn't really thinking of
> persistence.

Neither was I <wink>.

> ...
> For comparisons, I've long considered the current "everything can be
> ordered with respect to everything else" paradigm broken, and for 3.0
> I plan to break this, allowing only == and != comparisons between
> objects of different types that don't specifically cater to cross-type
> comparisons (the default being unequal if the types differ).  Ordering
> comparisons will default to an exception.

That would go a long way toward keeping people out of trouble with
persistent BTree-based structures.

> But this won't affect hashing; I don't think I'd like to see the hash
> function fixed by the language standard, so the hash function may
> still vary between releases (or between runs, for Python
> implementations that follow Scott's recommendation).

For Python 3 I hope we (you) can consider another line of flexibility too:
sometimes when I build a hash table, I want an __eq__ that isn't "the
natural" __eq__ for the key objects.  For example, using a dict to partition
objects by equivalence class wants to use the equivalence relation of
interest at the moment.  This implies a specific dict may want to use a
"non-standard" __hash__ too.  Hiding the real objects in wrapper objects
works for this purpose, of course, but sometimes it would be a lot more
convenient to pass hash and eq functions to a dict's constructor.