[Python-Dev] Locked-in defect? 32-bit hash values on 64-bit builds

Reid Kleckner reid.kleckner at gmail.com
Fri Oct 15 23:35:25 CEST 2010


On Fri, Oct 15, 2010 at 4:10 PM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
>
> On Oct 15, 2010, at 10:40 AM, Benjamin Peterson wrote:
>
>> I think the panic is a bit of an overreaction. PEP 384 has still not
>> been accepted, and I haven't seen a final decision about freezing the
>> ABI in 3.2.
>
> I'm not sure where the "panic" is coming from.
> I just want to make sure the ABI doesn't get frozen
> before hash functions are converted to Py_ssize_t.
>
> Even if the ABI is not frozen at 3.2, as Martin has proposed,
> it would still be great to get this change in for 3.2.
>
> Fortunately, this doesn't affect everyday users; it only
> arises for very large datasets.  When it does kick in, though
> (around 2**32 entries), the degradation is not small, it
> is close to catastrophic, making dicts/sets unusable
> as O(1) lookups become O(n) with a *very* large n.
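
For context, the size mismatch only bites on LLP64 platforms such as
64-bit Windows, where long is 32 bits while pointers and Py_ssize_t are
64 bits.  A minimal standalone C sketch (not CPython code; size_t stands
in for Py_ssize_t) of what a 32-bit hash slot throws away:

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    int main(void)
    {
        /* LLP64 (64-bit Windows): long is 4 bytes, size_t is 8.
         * LP64 Unix: both are 8 bytes, so the problem does not arise there. */
        printf("sizeof(long)   = %zu\n", sizeof(long));
        printf("sizeof(size_t) = %zu\n", sizeof(size_t));

        /* A hash stored in a 32-bit long keeps only the low 32 bits,
         * so at most 2**32 distinct hash values exist; with on the
         * order of 2**32 entries, collisions become unavoidable no
         * matter how good the hash function is. */
        uint64_t full = UINT64_C(0x123456789ABCDEF0);
        uint32_t kept = (uint32_t)full;   /* what a 32-bit hash keeps */
        printf("64-bit value: %016" PRIX64 "\n", full);
        printf("low 32 bits:  %08" PRIX32 "\n", kept);
        return 0;
    }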

Just to be clear: hashing currently uses the C long type.  The
only major platform where sizeof(long) < sizeof(Py_ssize_t) is 64-bit
Windows, right?  And the change being proposed is to make tp_hash
return a Py_ssize_t instead of a long, and then make all the clients
of tp_hash compute with Py_ssize_t instead of long?
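
For concreteness, the shape of that change at the slot level would be
roughly the following (a sketch with CPython's types stubbed out so it
compiles standalone; the real slot type is named hashfunc, and the
_old/_new names below are only there to contrast the two signatures):

    #include <stddef.h>

    typedef struct _object PyObject;   /* stub for CPython's PyObject */
    typedef ptrdiff_t Py_ssize_t;      /* stub; CPython defines its own */

    /* today: tp_hash returns a C long (only 32 bits on 64-bit Windows) */
    typedef long (*hashfunc_old)(PyObject *);

    /* proposed: tp_hash returns Py_ssize_t, and the clients of tp_hash
     * (the dict and set implementations) compute with Py_ssize_t too */
    typedef Py_ssize_t (*hashfunc_new)(PyObject *);

    int main(void) { return 0; }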

Reid
