[Python-Dev] Py_ssize_t
"Martin v. Löwis"
martin at v.loewis.de
Tue Feb 20 12:16:23 CET 2007
Raymond Hettinger schrieb:
> If I'm understanding what was done for dictionaries, the hash table can grow
> larger than the range of hash values. Accordingly, I would expect large
> dictionaries to have an unacceptably large number of collisions. OTOH, we
> haven't heard a single complaint, so perhaps my understanding is off.
I think this would happen, but users don't have enough memory to notice
it. For a dictionary with more than 4G entries, you need 72GiB of memory
(8 bytes each for the key, value, and cached hash). So you start
to see massive collisions only when you have that much memory - and
on top of the dictionary itself, you would also need space for the key
and value objects.
Very few people have machines with 128+GiB main memory, so no complaints
yet.
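As a rough check on that arithmetic, here is a minimal Python sketch. It
assumes 24 bytes per slot (three 8-byte fields, as above) and counts only
the table itself, not the key and value objects; the helper name table_gib
is made up for illustration. Under these assumptions, 3Gi slots come to
72GiB and 2**32 slots to 96GiB.

```python
# Back-of-the-envelope estimate of dict table memory on a 64-bit build.
# Assumption: each slot holds three 8-byte fields (key pointer, value
# pointer, cached hash). Key and value objects are not counted.
BYTES_PER_SLOT = 3 * 8
GIB = 2 ** 30


def table_gib(slots):
    """Return the size in GiB of a table with the given number of slots."""
    return slots * BYTES_PER_SLOT / GIB


for slots in (3 * 2 ** 30, 2 ** 32):
    print(f"{slots:>13,} slots -> {table_gib(slots):.0f} GiB")
```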
But you are right: extending the hash value to be a 64-bit quantity
was "forgotten", mainly because it isn't a count of something - and
being "count of something" was the primary criterion for the 2.5 changes.
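The collision effect can be illustrated at a much smaller scale. The
sketch below is not how the dict implementation works internally; it only
shows the pigeonhole argument, with an assumed 8-bit hash standing in for
the 32-bit one and a hypothetical helper narrow_hash. Once there are more
keys than distinct hash values, the excess keys must share a hash with
some earlier key.

```python
# Scaled-down illustration: 8-bit hash values stand in for 32-bit ones.
HASH_BITS = 8    # assumed width, chosen small so this runs instantly
N_KEYS = 1024    # more keys than there are distinct hash values


def narrow_hash(key):
    """Truncate Python's hash to HASH_BITS bits (hypothetical helper)."""
    return hash(key) & ((1 << HASH_BITS) - 1)


hashes = [narrow_hash(k) for k in range(N_KEYS)]
distinct = len(set(hashes))

# Every key beyond the number of distinct hashes shares a hash value
# with some earlier key, no matter how large the table is.
forced_collisions = N_KEYS - distinct
print(distinct, forced_collisions)
```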
> The other area where I expected to hear wailing and gnashing of teeth is users
> compiling with third-party extensions that haven't been updated to a Py_ssize_t
> API and still use longs. I would have expected some instability due to the size
> mismatches in function signatures -- the difference would only show up with
> giant sized data structures -- the bigger they are, the harder they fall. OTOH,
> there have not been any complaints either -- I would have expected someone to
> submit a patch to pyport.h that allowed a #define to force Py_ssize_t back to a
> long so that the poster could make a reliable build that included non-updated
> third-party extensions.
On most 64-bit systems, there is also an option to run 32-bit programs
(at least on AMD64, Sparc-64, and PPC64 there is). So people are more
likely to do that when they run into problems, rather than recompiling
the 64-bit Python.
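Why such a size mismatch would only show up with giant data structures
can be sketched in Python with ctypes. This assumes, purely for
illustration, an un-updated extension that keeps a length in a 32-bit
signed integer; as_int32 is a hypothetical helper, not part of any API.
Small lengths survive the narrowing unchanged, while a length of 2**31 or
more wraps around to a negative number.

```python
import ctypes


def as_int32(n):
    """Reinterpret n as a 32-bit signed integer, as a narrowing store would.

    ctypes integer types do no overflow checking, so the value wraps.
    """
    return ctypes.c_int32(n).value


small = 1000          # a typical container length
huge = 2 ** 31 + 5    # a length that only fits in a 64-bit Py_ssize_t

print(as_int32(small))   # unchanged
print(as_int32(huge))    # wrapped to a negative value

# Width of the platform's ssize_t in bytes (8 on typical 64-bit builds).
print(ctypes.sizeof(ctypes.c_ssize_t))
```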
> In the absence of a bug report, it's hard to know whether there is a real
> problem. Have all major third-party extensions adopted Py_ssize_t or is some
> divine force helping unconverted extensions work with converted Python code?
I know Matthias Klose has fixed all extension modules in the entire
Debian source to compile without warnings on 64-bit machines. They may
not all work yet, but yes, for all modules in Debian, it has been fixed.
Not sure whether Matthias is a divine force, but working for Canonical
comes fairly close :-)
> Maybe the datasets just haven't gotten big enough yet.
Primarily that. We still have a few years ahead to find all the bugs
before people start complaining that Python is unstable on
64-bit systems. By the time people would actually see problems,
hopefully they will all have been resolved.
Regards,
Martin