[Python-checkins] r46544 - in python/trunk: Include/dictobject.h Objects/dictobject.c

Tim Peters tim.peters at gmail.com
Wed May 31 02:00:31 CEST 2006


[Jim Jewett]
> Why is this being done now?

Because yesterday was the first day I could make sufficient time for
it.  The C API dict functions were converted to Py_ssize_t a few
months ago, but the dict internals weren't changed to match at the
time.  Now they have been.  It was a correctness issue to leave them
mismatched (although I don't have a box on which failure modes can be
provoked, ways to do so were obvious).

> I see the advantage of being able to use larger dictionaries if you do
> have the RAM, and want a few huge in-memory databases instead of a
> large number of small objects, *and* aren't using something else for
> that database.
>
> I also see a disadvantage in making all dictionary instances a little
> bit larger.

There's no size difference on 32-bit boxes.  On 64-bit boxes there's
at worst a small relative increase.  Don't overlook that PyDictObject
was a large object, and especially on 64-bit boxes.  Changing 3
members from int to Py_ssize_t in the PyDictObject struct makes an
empty dict on a 64-bit box perhaps 4% larger (I don't have a 64-bit
box to check that on -- if you do, print sizeof(PyDictObject) before
and after to find out).
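
For anyone who wants to try that without rebuilding Python, here is a rough
sketch of the comparison. It is not the real header: the struct layouts are
modeled on PyDictObject from memory, ptrdiff_t stands in for Py_ssize_t,
plain pointers stand in for the PyObject pointers, and the names
ssize_standin, MINSIZE, dict_before, dict_after and report_dict_sizes are
made up for illustration. The same entry struct is used in both variants so
that only the three counters differ.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: ptrdiff_t stands in for Py_ssize_t. */
typedef ptrdiff_t ssize_standin;

/* Assumption: the inline small table holds 8 entries. */
#define MINSIZE 8

/* Models PyDictEntry: a hash plus key and value pointers. */
struct entry {
    ssize_standin me_hash;
    void *me_key;
    void *me_value;
};

/* Models the dict struct with the three counters declared as int. */
struct dict_before {
    ssize_standin ob_refcnt;      /* object header: refcount */
    void *ob_type;                /* object header: type pointer */
    int ma_fill;
    int ma_used;
    int ma_mask;
    struct entry *ma_table;
    void (*ma_lookup)(void);      /* stand-in for the lookup function pointer */
    struct entry ma_smalltable[MINSIZE];
};

/* Same struct with the three counters widened to the Py_ssize_t stand-in. */
struct dict_after {
    ssize_standin ob_refcnt;
    void *ob_type;
    ssize_standin ma_fill;
    ssize_standin ma_used;
    ssize_standin ma_mask;
    struct entry *ma_table;
    void (*ma_lookup)(void);
    struct entry ma_smalltable[MINSIZE];
};

/* Print both sizes and the relative growth on the compiling platform. */
void report_dict_sizes(void)
{
    size_t before = sizeof(struct dict_before);
    size_t after = sizeof(struct dict_after);

    printf("before: %zu bytes\n", before);
    printf("after:  %zu bytes\n", after);
    printf("growth: %.1f%%\n", 100.0 * (double)(after - before) / (double)before);
}
```

Calling report_dict_sizes() from main() on a given box shows whatever that
box's compiler and ABI produce; the figures for the real PyDictObject may
differ if its layout does not match this model.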

> I think even for the huge RAM case, "lots of little objects" might be
> a more common case.
>
> Even for the single-honkin-dict, would it make sense to have an extra
> search finger member instead of increasing the size of every bucket?

You're talking about the effect of changing the PyDictEntry struct's
me_hash member from long to Py_ssize_t?  If so, I believe that makes
no size difference on any box.  The only known platform on which
sizeof(Py_ssize_t) > sizeof(long) is Win64, so the only known platform
on which that change could possibly increase the PyDictEntry struct's
size is Win64.  But that struct is going to be 8-byte aligned on Win64
because it also contains 8-byte pointers on Win64, so PyDictEntry
previously had 4 unused pad bytes on Win64.  The net effect of the
change on Win64 is to make use of those 4 unused bytes.
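
The padding argument can be sketched with fixed-width types, so it can be
checked on a box that isn't Win64. The assumptions here: int32_t models the
4-byte long of Win64, int64_t models the 8-byte Py_ssize_t, uint64_t models
the 8-byte pointers, and 64-bit integers are 8-byte aligned inside structs
(true on the common 64-bit ABIs, but not guaranteed everywhere). The names
entry_before, entry_after and report_entry_sizes are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Models PyDictEntry on Win64 with me_hash as a 4-byte long. */
struct entry_before {
    int32_t me_hash;      /* 4 bytes, followed by padding if me_key is 8-byte aligned */
    uint64_t me_key;      /* stand-in for an 8-byte pointer */
    uint64_t me_value;    /* stand-in for an 8-byte pointer */
};

/* Models PyDictEntry on Win64 with me_hash as an 8-byte Py_ssize_t. */
struct entry_after {
    int64_t me_hash;      /* 8 bytes, no padding needed before me_key */
    uint64_t me_key;
    uint64_t me_value;
};

/* Print the sizes and the offset of me_key in each variant. */
void report_entry_sizes(void)
{
    printf("before: size %zu, me_key at offset %zu\n",
           sizeof(struct entry_before), offsetof(struct entry_before, me_key));
    printf("after:  size %zu, me_key at offset %zu\n",
           sizeof(struct entry_after), offsetof(struct entry_after, me_key));
}
```

Where the alignment assumption holds, me_key sits at the same offset in
both structs and the two sizes are equal, which is the "unused pad bytes"
point in struct form.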

If it did increase the size of PyDictEntry, then yes, crafting a
different search-finger hack would be attractive.  I'm not sure I'd
bother just for Win64, though ;-)

