[Python-Dev] Py_ssize_t

"Martin v. Löwis" martin at v.loewis.de
Tue Feb 20 12:16:23 CET 2007


Raymond Hettinger wrote:
> If I'm understanding what was done for dictionaries, the hash table can grow 
> larger than the range of hash values.  Accordingly, I would expect large 
> dictionaries to have an unacceptably large number of collisions.  OTOH, we 
> haven't heard a single complaint, so perhaps my understanding is off.

I think this would happen, but users don't have enough memory to notice
it. For a dictionary with more than 4G entries, you need 72 GiB of memory
(8 bytes each for the key, value, and cached hash). So you start
to see massive collisions only when you have that much memory; on top of
the dictionary itself, you also need space for the key and value objects.
Very few people have machines with 128+ GiB of main memory, so no
complaints yet.
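
The arithmetic behind that estimate can be sketched as follows. This is
only a rough illustration: it assumes three 8-byte fields per slot (key
pointer, value pointer, cached hash) as described above, and the names
SLOT_BYTES and table_gib are made up for this example. It counts the slot
table only, not the key and value objects.

```python
# Rough size of a dict's slot table on a 64-bit build.
# Assumption: each slot holds three 8-byte fields.
SLOT_BYTES = 3 * 8  # key pointer + value pointer + cached hash


def table_gib(slots):
    """Return the size in GiB of a table with `slots` slots."""
    return slots * SLOT_BYTES / 2**30


print(table_gib(3 * 2**30))  # 72.0
print(table_gib(2**32))      # 96.0
```

By this count, 72 GiB corresponds to about 3 * 2**30 slots, and a table
with a full 2**32 slots would take 96 GiB. Either way, the table alone is
well beyond what typical machines have.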

But you are right: extending the hash value to be a 64-bit quantity
was "forgotten", mainly because it isn't a count of something - and
being "count of something" was the primary criterion for the 2.5 changes.
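
The collision effect Raymond describes can be shown at toy scale. The
sketch below is not how the dict is actually implemented; it only assumes
that the first slot probed is derived from the hash by masking, and it
shrinks the numbers (an 8-bit hash, a 1024-slot table) so that it runs
instantly. HASH_BITS, TABLE_SLOTS and home_slot are hypothetical names.

```python
# Toy model: hashes limited to HASH_BITS bits, stored in a table that
# has more slots than there are distinct hash values.
HASH_BITS = 8
TABLE_SLOTS = 1024  # power of two, larger than 2**HASH_BITS


def home_slot(key):
    """First slot probed for `key`, using a truncated hash."""
    h = hash(key) & (2**HASH_BITS - 1)  # keep only HASH_BITS bits
    return h & (TABLE_SLOTS - 1)        # mask down to the table size


keys = range(100_000)
distinct = len({home_slot(k) for k in keys})
print(distinct)  # 256
```

However many keys are inserted, they can start out in at most
2**HASH_BITS different slots, so a table larger than the hash range
cannot spread them any further.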

> The other area where I expected to hear wailing and gnashing of teeth is users 
> compiling with third-party extensions that haven't been updated to a Py_ssize_t 
> API and still use longs.  I would have expected some instability due to the size 
> mismatches in function signatures -- the difference would only show up with 
> giant-sized data structures -- the bigger they are, the harder they fall.  OTOH, 
> there have not been any complaints either -- I would have expected someone to 
> submit a patch to pyport.h that allowed a #define to force Py_ssize_t back to a 
> long so that the poster could make a reliable build that included non-updated 
> third-party extensions.

On most 64-bit systems, there is also an option to run 32-bit programs
(at least on AMD64, SPARC64, and PPC64). So people are more likely to
do that when they run into problems, rather than recompiling the 64-bit
Python.
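
For reference, the kind of size mismatch Raymond is worried about can be
illustrated without a C compiler. The sketch below uses ctypes to show
what happens to a length when it is stored in a 32-bit signed integer, as
an extension that still keeps sizes in a 32-bit type would do. The helper
name as_c_int32 is made up for this illustration.

```python
import ctypes


def as_c_int32(n):
    """Value of n after being stored in a 32-bit signed C integer."""
    return ctypes.c_int32(n).value


print(as_c_int32(1000))       # 1000
print(as_c_int32(2**31 + 5))  # -2147483643
```

Small sizes survive unchanged, which is why such a mismatch would only
show up with very large data structures.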

> In the absence of a bug report, it's hard to know whether there is a real 
> problem.  Have all major third-party extensions adopted Py_ssize_t or is some 
> divine force helping unconverted extensions work with converted Python code? 

I know Matthias Klose has fixed all extension modules in the entire
Debian source to compile without warnings on 64-bit machines. They may
not all work yet, but yes, for all modules in Debian, it has been fixed.

Not sure whether Matthias is a divine force, but working for Canonical
comes fairly close :-)

> Maybe the datasets just haven't gotten big enough yet.

Primarily that. We still have a few years to find all the bugs
before people start complaining that Python is unstable on
64-bit systems. By the time people actually see problems,
hopefully they will all have been resolved.

Regards,
Martin


