[issue42536] Iterating on a zip keeps objects alive longer than expected (test_itertools leaks sometimes references)

Pablo Galindo Salgado report at bugs.python.org
Thu Dec 3 18:32:26 EST 2020


Pablo Galindo Salgado <pablogsal at gmail.com> added the comment:

> Or, if z's refcount drops to zero and it's cleaned up, its traverse function may *never* be called, which leaves the untracked r -> o -> r cycle.

This is a real problem, indeed. We would need to add the tracking to the tp_dealloc of the zip object as well.

> I have no idea how problematic tracking and untracking objects *during* collections can be.

It is indeed tricky: the most problematic part of the "surprise tracking" is the validity of the pointers, but the key here is that the traverse function will be called BEFORE (update_refs()) the pointers become invalid (because of the bit reuse).

> Or, if this group is determined to be unreachable, untrack_tuples untracks r *before* the cycle is cleaned up. That seems like it could be a problem.

It is not: untrack_tuples is called over the reachable set after the gc knows what part of the linked list is isolated garbage. If the group is in the unreachable set, then untrack_tuples won't touch it. 
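
As a rough illustration of untracking applied to reachable tuples, the sketch below uses the public gc.is_tracked() function to inspect two tuples that are still referenced when a full collection runs. The helper name tracked_after_collect is made up for this example, and the behaviour shown is a CPython implementation detail that other implementations or versions may not share.

```python
import gc


def tracked_after_collect(obj):
    """Run a full collection while obj is still reachable, then
    report whether the collector is tracking it.

    Hypothetical helper for this example; not part of CPython.
    """
    gc.collect()
    return gc.is_tracked(obj)


# Built at runtime so the tuples are not compile-time constants.
atomic_tuple = tuple([1, 2, 3])      # holds only ints, cannot be in a cycle
container_tuple = tuple([[], 2, 3])  # holds a list, could be part of a cycle

atomic_tracked = tracked_after_collect(atomic_tuple)
container_tracked = tracked_after_collect(container_tuple)

print("atomic tuple tracked after collection:   ", atomic_tracked)
print("container tuple tracked after collection:", container_tracked)
```

Both tuples are reachable the whole time, so neither is ever a candidate for deallocation; the only thing that can change is whether the collector keeps tracking them.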

> I'm worried that the GC may not be able to detect the cycle if it visits o, then z (which *then* starts tracking r), then r.

This is not a problem IMO: if the gc does not see the cycle in that collection, it will in a later one. What will happen is that at the beginning the gc does not see r, so it iterates over the linked list of the generation being collected, decrementing the gc refcounts. When it reaches z, it calls the traverse function of z, which tracks r AND decrements its gc refcount, but also adds "r" to the end of the linked list of the young generation. This means that (assuming we are collecting the middle generation) the cycle will not be seen in this collection, but in a future collection where the young generation is collected, "r" will be visited normally (notice that the gc can see "r" without needing to visit "z" first). Also, notice that untrack_tuples() is called over the reachable set, but "r" is in the young generation, so it won't be untracked. And if we are collecting the young generation, then the cycle will be found and cleaned up nicely.
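
For reference, the ordinary case (a tracked r -> o -> r cycle being found and cleaned) can be observed from Python with a weak reference. This is only a sketch of the well-behaved situation, not a reproduction of the zip issue: the Node class and the cycle_is_collected function are hypothetical names for this example, and the tuple here is tracked from the start because it is freshly created around a tracked object.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for "o": a plain object that can hold a reference."""


def cycle_is_collected():
    """Build an r -> o -> r cycle, drop all external references, and
    report whether o is alive before and after a full collection."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections from running mid-example
    try:
        o = Node()
        r = (o,)      # r -> o
        o.ref = r     # o -> r, closing the cycle
        probe = weakref.ref(o)  # tuples cannot be weakly referenced, so watch o

        del o, r      # only the cycle itself keeps the objects alive now
        alive_before = probe() is not None

        gc.collect()  # full collection: every generation is examined
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


alive_before, alive_after = cycle_is_collected()
print("alive before collect:", alive_before)
print("alive after collect: ", alive_after)
```

Reference counting alone cannot free the pair, which is why o is still alive before the explicit collection and gone after it.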

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue42536>
_______________________________________

