[Python-Dev] Darwin's realloc(...) implementation never shrinks allocations

Tim Peters tim.peters at gmail.com
Tue Jan 4 02:38:08 CET 2005


[Bob Ippolito]
> ...
> What about for list objects that are big at some point, then
> progressively shrink, but happen to stick around for a while?  An
> "event queue" that got clogged for some reason and then became stable?

It's less plausible that we''re going to see a lot of these
simultaneously alive.  It's possible, of course.  Note that if we do,
fiddling PyObject_Realloc() won't help:  list resizing goes thru the
PyMem_RESIZE() macro, which calls the platform realloc() directly in a
release build (BTW, I suspect that when you were looking for realloc()
calls, you were looking for the string "realloc(" -- but that's not
the only spelling; we don't even have alphabetical choke points
<wink>).

The list object itself goes thru Python's small-object allocator,
which makes sense because a list object has a small fixed size
independent of list length.  Space for list elements is allocated
seperately from the list object, and talks to the platform
malloc/free/realloc directly (in release builds, via how the PyMem_XYZ
macros resolve in release builds).

> Dictionaries?

They're not a potential problem here -- dict resizing (whether growing
or shrinking) always proceeds by allocating new space for the dict
guts, copying over elements from the original space, then freeing the
original space.  This is because the hash slot assigned to a key can
change when the table size changes, and keeping collision chains
straight is a real bitch if you try to do it in-place.  IOW, there are
implementation reasons for why CPython dicts will probably never use
realloc().

> Of course these potential problems are a lot less likely to happen.

I think so.

Guido's suggestion to look at PyString_Resize (etc) instead could be a
good one, since those methods know both the number of thingies (bytes,
list elements, tuple elements, ...) currently allocated and the number
of thingies being asked for.  That could be exploited by a portable
heuristic (like malloc+memcpy+free if the new number of thingies is at
least a quarter less than the old number of thingies, else let realloc
(however spelled) exercise its own judgment).  Since list_resize()
doesn't go thru pymalloc, that's the only clear way to worm around
realloc() quirks for lists.


More information about the Python-Dev mailing list