script uses up all memory

Chris Angelico rosuav at gmail.com
Thu Mar 6 18:31:46 EST 2014


On Fri, Mar 7, 2014 at 10:12 AM, Marko Rauhamaa <marko at pacujo.net> wrote:
> Chris Angelico <rosuav at gmail.com>:
>
>> I think this thread is proof that they are to be avoided. The GC
>> wasn't doing its job unless explicitly called on. The true solution is
>> to break the refloop; the quick fix is to call gc.collect(). I stand
>> by the recommendation to put an explanatory comment against the
>> collect call.
>
> What I'm saying is that under most circumstances you shouldn't care if
> the memory consumption goes up and down. The true solution is to not do
> anything about temporary memory consumption. Also, you shouldn't worry
> about breaking circular references. That is also often almost impossible
> to accomplish as so much modern code builds on closures, which generate
> all kinds of circular references under the hood, for your benefit, of
> course.

This isn't a temporary issue, though - see the initial post. After two
hours of five-minutely checks, the computer was wedged. That's a
problem to be solved.
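
As a rough sketch of the quick fix (I haven't seen the script itself, so
run_periodic_check and check are made-up names for illustration), the
explicit collect with its explanatory comment might look like this:

```python
import gc


def run_periodic_check(check):
    """Run one iteration of a periodic job, then force a full collection.

    `check` is a hypothetical stand-in for whatever the real job does.
    """
    result = check()
    # The objects built by check() end up in reference cycles, so
    # reference counting alone never frees them. Collect explicitly
    # after each run so the garbage cannot pile up between checks.
    # Remove this call once the cycle itself has been broken.
    unreachable = gc.collect()
    return result, unreachable
```

The return value of gc.collect() is the number of unreachable objects it
found, which is handy to log while hunting for the loop.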

Most of what I do with closures can't create refloops, because the
function isn't referenced from inside itself. You'd need something
like this:

>>> import gc
>>> def foo():
    x = 1
    y = lambda: (x, y)  # y's closure refers to y itself: a refloop
    return y
>>> len([foo() for _ in range(1000)])
1000
>>> gc.collect()
4000
>>> len([foo() for _ in range(1000)])
1000
>>> gc.collect()
4000
>>> len([foo() for _ in range(1000)])
1000
>>> gc.collect()
4000

That's repeatably creating garbage. But change the lambda so that it no
longer refers to itself, and there's no loop:

>>> def foo():
    x = 1
    y = lambda: x  # captures only x, so nothing points back at y
    return y
>>> gc.collect()
0
>>> len([foo() for _ in range(1000)])
1000
>>> gc.collect()
0
>>> len([foo() for _ in range(1000)])
1000
>>> gc.collect()
0
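
If you'd rather run the comparison as a script than paste it into the
interactive prompt, something like this should do. It's only a sketch:
make_cyclic, make_acyclic and garbage_after are names I've picked for
illustration, and the exact counts can differ between Python versions.

```python
import gc


def make_cyclic():
    x = 1
    y = lambda: (x, y)  # closure cell for y points back at the lambda
    return y


def make_acyclic():
    x = 1
    y = lambda: x  # only x is captured; no self-reference
    return y


def garbage_after(factory, n=1000):
    """Create and drop n closures, then report what gc.collect() finds."""
    gc.collect()  # start from a clean slate
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections from skewing the count
    try:
        for _ in range(n):
            factory()  # result is dropped immediately
        return gc.collect()
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print("cyclic: ", garbage_after(make_cyclic))
    print("acyclic:", garbage_after(make_acyclic))
```

The cyclic version should report at least one unreachable object per
closure created; the acyclic one is cleaned up by reference counting
before the collector ever looks.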

The only even reasonably common case that I can think of is a
recursive nested function:

>>> def foo(x):
    def y(f, x=x):
        f()
        # y looks itself up through foo's closure cell: a refloop
        for _ in range(x):
            y(f, x - 1)
    return y

It's a function that returns a function that calls its argument some
number of times, where the number is derived in a stupid way from the
argument to the first function. The whole function is garbage, so it's
not surprising that the GC has to collect it.

>>> len([foo(5) for _ in range(1000)])
1000
>>> gc.collect()
3135
>>> len([foo(5) for _ in range(1000)])
1000
>>> gc.collect()
3135
>>> len([foo(5) for _ in range(1000)])
1000
>>> gc.collect()
3135
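
For what it's worth, here is one way the loop in that example could be
broken. This is a sketch of one option, not the only fix: the recursion
moves into a module-level helper (the name _call_repeatedly is mine), so
the returned callable no longer has to reach itself through a closure.

```python
import functools
import gc


def _call_repeatedly(f, x):
    # Same behaviour as the nested y above, but the recursive call goes
    # through the module namespace instead of a closure cell.
    f()
    for _ in range(x):
        _call_repeatedly(f, x - 1)


def foo(x):
    # partial holds references to the helper and to x only; nothing in
    # it points back at the partial object, so there is no refloop.
    return functools.partial(_call_repeatedly, x=x)


if __name__ == "__main__":
    gc.collect()
    print(len([foo(5) for _ in range(1000)]))
    print(gc.collect())  # number of unreachable objects found
```

The returned object is still called as foo(5)(some_function), and x can
still be overridden by keyword, as with the default argument in the
nested version.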

Can you give a useful example of a closure that does create a refloop?

ChrisA


