Garbage collection problem with generators

Haochuan Guo guohaochuan at gmail.com
Fri Dec 23 07:39:17 EST 2016


Hi, everyone

I'm building an HTTP long-polling client for our company's discovery service,
and something weird happens in the following code:

```python
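import time

import requests

while True:
    try:
        r = requests.get("url", stream=True, timeout=3)
        for data in r.iter_lines():
            pass  # processing_data...
    # A timeout while streaming may surface as ConnectionError, not Timeout.
    except (requests.exceptions.Timeout, requests.exceptions.ConnectionError):
        time.sleep(10)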
```

When I deliberately time out the request and then check the connections
with `lsof -p process`, I see *two active connections* (ESTABLISHED)
instead of one. After digging around, it turns out the problem might not be
in `requests` at all, but in the garbage collection of generators.

So I wrote this script to demonstrate the problem:

https://gist.github.com/wooparadog/766f8007d4ef1227f283f1b040f102ef

Method `A.a` returns a generator that raises an exception. In function `b`,
I build a new instance of `A` and iterate over the exception-raising
generator. In the exception handler, I close the generator, delete it,
delete the `A` instance, call `gc.collect()`, and then repeat the whole
process.

Another greenlet checks for live `A` instances using `gc.get_objects()`.
It turns out there are always two `A` instances.
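
In case the gist is unavailable, here is a minimal sketch of the setup described above. It is not the gist itself: the exception type (`RuntimeError`), the helper `count_instances`, and the `rounds` parameter are assumptions made for illustration, and the instance count is taken inline after cleanup rather than from a separate greenlet or thread.

```python
import gc


class A(object):
    def a(self):
        # Generator that fails on its first iteration.
        raise RuntimeError("boom")
        yield  # unreachable, but makes this function a generator


def count_instances(cls):
    """Count live objects of exactly type cls that the collector tracks."""
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)


def b(rounds=3):
    """Run the create/iterate/clean-up cycle and record live A counts."""
    counts = []
    for _ in range(rounds):
        instance = A()
        gen = instance.a()
        try:
            for _item in gen:
                pass
        except RuntimeError:
            # Clean up everything this frame knows about.
            gen.close()
            del gen
            del instance
            gc.collect()
        # Checked after the handler has finished.
        counts.append(count_instances(A))
    return counts


if __name__ == "__main__":
    print(b())
```

A count above zero after cleanup means something other than the local names `gen` and `instance` still holds a reference to an `A` instance.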

This is reproducible with Python 2.7, but not with Python 3.5. I've also
tried `thread` instead of `gevent`, and it still happens. I'm guessing it's
related to garbage collection of generators.

Did I bump into a Python 2 bug? Or am I simply wrong about the way to close
generators?

Thanks


