[Python-Dev] RE: [Python-iterators] Death by Leakage

Tim Peters tim.one@home.com
Mon, 25 Jun 2001 05:05:23 -0400


Here's a simpler leaker, amounting to an insanely convoluted way to generate
the ints 1, 2, 3, ...:

DO_NOT_LEAK = 1

class LazyList:
    def __init__(self, g):
        self.sofar = []
        self.fetch = g.next

    def __getitem__(self, i):
        sofar, fetch = self.sofar, self.fetch
        while i >= len(sofar):
            sofar.append(fetch())
        return sofar[i]

    def clear(self):
        self.__dict__.clear()

def plus1(g):
    for i in g:
        yield i + 1

def genm23():
    yield 1
    for i in plus1(m23):
        yield i

for i in range(10000):
    m23 = LazyList(genm23())
    [m23[i] for i in range(50)]
    if DO_NOT_LEAK:
        m23.clear()

Neil, it would help if genobjects had a memberlist so that the struct
members were discoverable from Python code; that would also let me add
appropriate methods to Cyclops.py to find cycles automatically.

Anyway, m23 is a LazyList instance, where m23.fetch is genm23().next, i.e.
m23.fetch is a bound method of the genm23() generator-iterator.  So the
frame for genm23 is reachable from m23.__dict__.  That frame contains an
anonymous (it's living in the frame's valuestack) generator-iterator thingie
corresponding to the plus1(m23) call.  *That* generator's frame in turn has
m23 in its locals (m23 was an argument to plus1), and another iterator
method referencing m23 in its valuestack (due to the "for i in g").  But m23
is the LazyList instance we started with, so there's a cycle, and clearing
m23.__dict__ breaks it.  gc doesn't chase generators or frames, so it can't
clean this stuff up if we don't clear the dict.
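
For anyone who wants to poke at the cycle directly, here is a rough sketch of
a probe.  It is not part of the original test: it assumes a later CPython 3
interpreter (so g.next is spelled g.__next__), one whose gc does traverse
generators and frames, and it uses the standard gc and weakref modules.  The
helper name probe_cycle is made up for illustration.

```python
import gc
import weakref

class LazyList:
    def __init__(self, g):
        self.sofar = []
        self.fetch = g.__next__  # assumption: Python 3 spelling of g.next

    def __getitem__(self, i):
        sofar, fetch = self.sofar, self.fetch
        while i >= len(sofar):
            sofar.append(fetch())
        return sofar[i]

    def clear(self):
        self.__dict__.clear()

def plus1(g):
    for i in g:
        yield i + 1

def genm23():
    yield 1
    for i in plus1(m23):  # m23 is looked up as a global, as in the original
        yield i

def probe_cycle(break_cycle):
    """Hypothetical helper: build one m23, drop it, and report
    (first five values, alive after dropping, alive after gc.collect())."""
    global m23
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collection out of the picture
    try:
        m23 = LazyList(genm23())
        first = [m23[i] for i in range(5)]
        ref = weakref.ref(m23)   # lets us ask later whether the instance died
        if break_cycle:
            m23.clear()          # same trick as DO_NOT_LEAK above
        m23 = None               # drop the only outside reference
        alive_after_drop = ref() is not None
        gc.collect()             # explicit cycle collection
        alive_after_collect = ref() is not None
        return first, alive_after_drop, alive_after_collect
    finally:
        if was_enabled:
            gc.enable()

leaky = probe_cycle(break_cycle=False)
cleared = probe_cycle(break_cycle=True)
print(leaky)
print(cleared)
```

Under those assumptions, the instance should outlive the last outside
reference only when the dict is not cleared, which is the cycle described
above; the explicit gc.collect() is what reclaims it there.  Disabling gc
around the probe is just so that refcounting alone decides the first answer.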

So this appears hopeless unless gc adds both generators and frames to its
repertoire.  OTOH, it's got to be rare -- maybe <wink>.  Worth it?