[Python-Dev] A "new" kind of leak

Tim Peters tim.one@comcast.net
Sat, 13 Apr 2002 19:02:22 -0400


[James Althoff]
> Remember the exchange below from a while back on python-list (you
> submitted a nice sorting algorithm in response)?

Yup!

> Isn't this the kind of thing that could create lots of
> generators?

Sure -- it creates about as many generators as there are elements in the
list being sorted.  They're all alive simultaneously, though, so the size of
the frame free list doesn't come into play except across distinct top-level
sorts.  Code like that continues to work fine -- it's not a limit on the
number of frames (or generators) you can have, it's only a limit on the
amount of memory *permanently* devoted to holding frames (and nothing but
frames, and even when no frames are actually in use anymore -- once a blob
of memory gets on a type-specific free list, it stays there until Python
shuts down even if it's never used again, and it can't be reused for
anything except an object of the same type; pymalloc at least allows for
reuse based on the number of bytes needed independent of type)).

> I'm not saying one should write this kind of code -- mine was just
> a "for fun" rewrite of David's original example (he seemed to want
> to be able to use generators for quicksort) -- and it was relatively
> slow, in any case.  And of course there are lots of other, better
> ways to get the end result.

No need to make excuses <wink>.  It's a very slick way to get the k smallest
elements in worst-case O(n + k log n) time, and instructive for that reason
alone.  For large enough n and small enough k, it would beat the pants off
peeling the first k elements off the result of doing list.sort(), but the
overhead is so high it might take a value of n so large that you run out of
RAM to hold frames.  Many of the frames in this case stay alive a long time,
and the malloc/free overhead becomes less significant the more work a frame
does over its lifetime.  The frame free list is aimed more at optimizing

def plus1(n):
    return n+1

and other trivial functions and methods.  There a malloc likely costs more
than executing the entire body of the function.