[Python-Dev] Darwin's realloc(...) implementation never shrinks allocations

Tim Peters tim.peters at gmail.com
Mon Jan 3 22:49:29 CET 2005


[Tim Peters]
>> Ya, I understood that.  My conclusion was that Darwin's realloc()
>> implementation isn't production-quality.  So it goes.

[Bob Ippolito]
> Whatever that means.

Well, it means what it said.  The C standard says nothing about
performance metrics of any kind, and a production-quality
implementation of C requires very much more than just meeting what the
standard requires.  The phrase "quality of implementation" is used in
the C Rationale (but not in the standard proper) to cover all such
issues.  realloc() pragmatics are quality-of-implementation issues;
the accuracy of fp arithmetic is another (e.g., if you get back -666.0
from the C expression 1.0 + 2.0, there's nothing in the standard to
justify a complaint).

>>>  free() can be called either explicitly, or implicitly by calling
>>> realloc() with a size larger than the size of the allocation.

From later comments feigning outrage <wink>, I take it that "the size
of the allocation" here does not mean the specific number the user
passed to the previous malloc/realloc call, but means whatever amount
of address space the implementation decided to use internally.  Sorry,
but I assumed it meant the former at first.

...

>>> Was this a good decision?  Probably not!

>> Sounds more like a bug (or two) to me than "a decision", but I don't
>> know.

> You said yourself that it is standards compliant ;)  I have filed it as
> a bug, but it is probably unlikely to be backported to current versions
> of Mac OS X unless a case can be made that it is indeed a security
> flaw.

That's plausible.  If you showed me a case where Python's list.sort()
took cubic time, I'd certainly consider that to be "a bug", despite
that nothing promises better behavior.  If I wrote a malloc subsystem
and somebody pointed out "did you know that when I malloc 1024**2+1
bytes, and then realloc(1), I lose the other megabyte forever?", I'd
consider that to be "a bug" too (because, docs be damned, I wouldn't
intentionally design a malloc subsystem with such behavior; and
pymalloc does in fact copy bytes on a shrinking realloc in blocks it
controls, whenever at least a quarter of the space is given back --
and it didn't at the start, and I considered that to be "a bug" when
it was pointed out).
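
A minimal sketch of the "at least a quarter of the space is given back"
test mentioned above, in C.  The name gives_back_quarter is made up for
illustration, and this is not pymalloc's actual code; it only shows the
kind of check a shrinking realloc could make before deciding that copying
to a smaller block is worth the cycles.

```c
#include <stddef.h>

/* Hypothetical helper (not a CPython function): return 1 when shrinking a
 * block from old_size to new_size bytes would give back at least a quarter
 * of old_size, and 0 otherwise (including when the block is not shrinking).
 */
static int gives_back_quarter(size_t old_size, size_t new_size)
{
    size_t quarter;

    if (new_size >= old_size)
        return 0;                       /* same size or growing */

    /* ceil(old_size / 4), written so the arithmetic cannot overflow */
    quarter = old_size / 4 + (old_size % 4 != 0);
    return old_size - new_size >= quarter;
}
```

For the example in the text, gives_back_quarter(1024*1024 + 1, 1) is true,
so an allocator following this rule would copy the one surviving byte to a
small block instead of holding on to the megabyte.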

> ...
> Known case?  No.  Do I want to search Python application-space to find
> one?  No.

Serious problems on a platform are usually well-known to users on that
platform.  For example, it was well-known that Python's list-growing
strategy as of a few years ago fragmented address space horribly on
Win9X.  This was a C quality-of-implementation issue specific to that
platform.  It was eventually resolved by improving the list-growing
strategy on all platforms -- although it's still the case that Win9X
does worse on list-growing than other platforms, it's no longer a
disaster for most list-growing apps on Win9X.

If there's a problem with "overallocate then realloc() to cut back" on
Darwin that affects many apps, then I'd expect Darwin users to know
about that already -- lots of people have used Python on Macs since
Python's beginning, "mysterious slowdowns" and "mysterious bloat" get
noticed, and Darwin has been around for a while.

...

>> There is no "choke point" for allocations in Python -- some places
>> call the system realloc() directly.  Maybe the latter matter on Darwin
>> too, but maybe they don't.  The scope of this hack spreads if they do.

...

> In the case of Python, "some places" means "nowhere relevant".  Four
> standard library extension modules relevant to the platform use realloc
> directly:
> 
> _sre
>     Uses realloc only to grow buffers.
> cPickle
>     Uses realloc only to grow buffers.
> cStringIO
>     Uses realloc only to grow buffers.
> regexpr:
>     Uses realloc only to grow buffers.

Good!

> If Zope doesn't use the allocator that Python gives it, then it can
> deal with its own problems.  I would expect most extensions to use
> Python's allocator.

I don't know.

...
 
> They're [#ifdef's] also the only good way to deal with platform-specific
> inconsistencies.  In this specific case, it's not even possible to
> determine if a particular allocator implementation is stupid or not
> without at least using a platform-allocator-specific function to query
> the size reserved by a given allocation.

We've had bad experience on several platforms when passing large
numbers to recv().  If that were addressed, it's unclear that Darwin
realloc() behavior would remain a real issue.  OTOH, it is clear that
*just* worming around Darwin realloc() behavior won't help other
platforms with problems in the same *immediate* area of bug 1092502. 
Gross over-allocation followed by a shrinking realloc() just isn't
common in Python.  sock_recv() is an exceptionally bad case.  More
typical is, e.g., fileobject.c's get_line(), where if "a line" exceeds
100 characters the buffer keeps growing by 25% until there's enough
room, then it's cut back once at the end.  That typical use for
shrinking realloc() just isn't going to be implicated in a real
problem -- the over-allocation is always minor.
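
To make the "grow by 25%, then cut back once" pattern concrete, here is a
rough C sketch.  It is not the code in fileobject.c: copy_line is a
made-up name, it reads from an in-memory string instead of a file to stay
self-contained, and the starting size of 100 is taken from the description
above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration: copy src up to (not including) the first
 * '\n' or the end of the string.  The buffer starts at 100 bytes and grows
 * by 25% whenever it fills; it is trimmed once at the end.  If
 * peak_capacity is not NULL, it receives the largest capacity reached
 * before the trim.  The caller frees the result.
 */
static char *copy_line(const char *src, size_t *peak_capacity)
{
    size_t cap = 100, len = 0;
    char *buf = malloc(cap);
    char *tmp;

    if (buf == NULL)
        return NULL;
    while (src[len] != '\0' && src[len] != '\n') {
        if (len + 1 >= cap) {           /* keep room for the terminator */
            size_t new_cap = cap + cap / 4;
            tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        buf[len] = src[len];
        len++;
    }
    buf[len] = '\0';
    if (peak_capacity != NULL)
        *peak_capacity = cap;

    /* The single shrinking realloc(): at most about 25% is given back. */
    tmp = realloc(buf, len + 1);
    return tmp != NULL ? tmp : buf;
}
```

With a 250-character line the capacity goes 100, 125, 156, 195, 243, 303,
so the final realloc() trims 303 bytes down to 251.  Even if the platform
realloc() never shrinks, the waste is bounded by a fraction of the line
length.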


> ...
> There's obviously a tradeoff between copying lots of bytes and having
> lots of memory go to waste.  That should be taken into consideration
> when considering how many pages could be returned to the allocator.
> Note that we can ask the allocator how much memory an allocation has
> actually reserved (which is usually somewhat larger than the amount you
> asked it for) and how much memory an allocation will reserve for a
> given size.  An allocation resize wouldn't even show up as smaller
> unless at least one page would be freed (for sufficiently large
> allocations anyway, the minimum granularity is 16 bytes because it
> guarantees that alignment).  Obviously if you have a lot of pages
> anyway, one page isn't a big deal, so we would probably only resort to
> free()/memcpy() if some fair percentage of the total pages used by the
> allocation could be rescued.
> 
> If it does end up causing some real performance problems anyway,
> there's always deeper hacks like using vm_copy(), a Darwin specific
> function which will do copy-on-write instead (which only makes sense if
> the allocation is big enough for this to actually be a performance
> improvement).

As above, I'm skeptical that there's a general problem worth
addressing here, and am still under the possible illusion that the Mac
developers will eventually change their realloc()'s behavior anyway. 
If you're convinced it's worth the bother, go for it.  If you do, I
strongly hope that it keys off a new platform-neutral symbol (say,
Py_SHRINKING_REALLOC_COPIES) and avoids Darwin-specific implementation
code.  Then if it turns out that it is a broad problem (across apps or
across platforms), everyone can benefit.  PyObject_Realloc() seems the
best place to put it.  Unfortunately, for blocks obtained from the
system malloc(), there is no portable way to find out how much excess
was allocated in a release-build Python, so "avoids Darwin-specific
implementation code" may be impossible to achieve.  The more it
*can't* be used on any platform other than this flavor of Darwin, the
more inclined I am to advise just fixing the immediate problem
(sock_recv's potentially unbounded over-allocation).
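
For what it's worth, a sketch of what keying off such a symbol might look
like, in C.  Everything here is an assumption: Py_SHRINKING_REALLOC_COPIES
exists only as the name suggested above, sketch_realloc is a made-up
stand-in rather than PyObject_Realloc(), the "more than half" threshold is
arbitrary, and the caller passes in the old size precisely because there
is no portable way to ask the system malloc() for it.

```c
#include <stdlib.h>
#include <string.h>

/* The symbol name comes from the message above; it is defined here only so
 * the sketch compiles.  A real build would set it per platform. */
#ifndef Py_SHRINKING_REALLOC_COPIES
#define Py_SHRINKING_REALLOC_COPIES 1
#endif

/* Hypothetical wrapper.  old_size must be supplied by the caller, since
 * the size of a system malloc() block cannot be queried portably.
 * new_size == 0 is passed straight through to realloc(). */
static void *sketch_realloc(void *p, size_t old_size, size_t new_size)
{
#if Py_SHRINKING_REALLOC_COPIES
    /* Arbitrary illustrative policy: copy to a fresh block only when more
     * than half of the old block would be returned. */
    if (p != NULL && new_size > 0 && new_size < old_size / 2) {
        void *q = malloc(new_size);
        if (q != NULL) {
            memcpy(q, p, new_size);
            free(p);
            return q;
        }
        /* malloc() failed: fall through and let realloc() handle it */
    }
#endif
    return realloc(p, new_size);
}
```

On a platform where the symbol is 0 this compiles down to a plain
realloc() call, which is the point of making the switch platform-neutral
rather than testing for Darwin by name.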

