[Python-Dev] Darwin's realloc(...) implementation never shrinks allocations

Bob Ippolito bob at redivi.com
Mon Jan 3 21:55:19 CET 2005


On Jan 3, 2005, at 3:23 PM, bacchusrx wrote:

> On Thu, Jan 01, 1970 at 12:00:00AM +0000, Tim Peters wrote:
>> Is there any known case where Python performs poorly on this OS, for
>> this reason, other than the "pass giant numbers to recv() and then
>> shrink the string because we didn't get anywhere near that many bytes"
>> case?
>>
>> [...]
>>
>> I agree the socket-abuse case should be fiddled, and for more reasons
>> than just Darwin's realloc() quirks. [...] Yes, in the socket-abuse
>> case, where the program routinely malloc()s strings millions of bytes
>> larger than the socket can deliver, it would obviously help.  That's
>> not typically program behavior (however typical it may be of that
>> specific app).
>
> Note that, with respect to http://python.org/sf/1092502, the author of
> the (original) program was using the documented interface to a file
> object.  It's _fileobject.read() that decides to ask for huge numbers
> of bytes from recv() (specifically, in the max(self._rbufsize, left)
> condition). Patched to use a fixed recv_size, you of course sidestep
> the realloc() nastiness in this particular case.

While using a reasonably sized recv_size is a good idea, using a
smaller request size simply means that it's less likely that the
strings will be significantly resized.  It is still highly likely they
*will* be resized, and a smaller request size doesn't solve the problem
that over-allocated strings will persist until the entire request is
fulfilled.

For example, receiving 1-byte chunks (if that's even possible) would
exacerbate the issue even for a small request size.  If you asked for 8
MB with a request size of 1024 bytes and received it in 1-byte chunks,
each chunk would still occupy the 1024-byte allocation that realloc()
never shrinks, so you would need an impossible ~16 GB at minimum to
satisfy that request (~8 GB to collect the strings, ~8 GB to
concatenate them), as opposed to the Python-optimal case of ~16 MB when
always using compact representations.
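
As a rough sketch of the collection-phase arithmetic only (the function
name and its parameters are made up for illustration, per-object
overhead is ignored, and it assumes every received chunk keeps its full
request-sized allocation when realloc() does not shrink):

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def collection_bytes(total, request_size, chunk_size, shrinks):
    """Approximate payload bytes held by the list of received strings.

    Hypothetical helper: assumes chunk_size <= request_size and that
    every recv() call returns exactly chunk_size bytes.
    """
    n_chunks = -(-total // chunk_size)  # ceiling division
    # Without shrinking, each string keeps the full request-sized buffer.
    per_chunk = chunk_size if shrinks else request_size
    return n_chunks * per_chunk


total = 8 * MIB
no_shrink = collection_bytes(total, 1024, 1, shrinks=False)
compact = collection_bytes(total, 1024, 1, shrinks=True)

print(no_shrink / GIB)  # GiB held while collecting, no shrinking
print(compact / MIB)    # MiB held while collecting, compact strings
```

This models only the list of collected strings, not the final
concatenation step.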

Using cStringIO instead of a list of potentially over-allocated strings 
would actually have such Python-optimal memory usage characteristics on 
all platforms.
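
A minimal sketch of that buffer-based approach, not the actual socket.py
code: it uses io.BytesIO as a modern stand-in for cStringIO, and
read_exactly, trickle_recv, and the recv_size default are hypothetical
names chosen for illustration.

```python
import io


def read_exactly(recv, size, recv_size=1024):
    """Collect up to `size` bytes from `recv` into one growing buffer."""
    buf = io.BytesIO()
    left = size
    while left > 0:
        data = recv(min(recv_size, left))
        if not data:  # peer closed the connection
            break
        # The bytes are copied into the buffer, so the (possibly
        # over-allocated) `data` string can be released immediately.
        buf.write(data)
        left -= len(data)
    return buf.getvalue()


def trickle_recv(source):
    """Hypothetical stand-in for socket.recv: returns 1 byte per call."""
    stream = io.BytesIO(source)
    return lambda bufsize: stream.read(1)


payload = bytes(range(256)) * 16
result = read_exactly(trickle_recv(payload), len(payload))
print(len(result))
```

Because each chunk is released right after it is written, at most one
over-allocated string is alive at a time, however small the chunks are.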

-bob


