CPython 2.7: Weakset data changing size during internal iteration

Temia Eszteri lamialily at cleverpun.com
Fri Jun 1 23:24:30 EDT 2012


On 02 Jun 2012 03:05:01 GMT, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:

>I doubt that very much. If you are using threads, it is more likely your 
>code has a race condition where you are modifying a weak set at the same 
>time another thread is trying to iterate over it (in this case, to 
>determine its length), and because it's a race condition, it only 
>happens when conditions are *just right*. Since race conditions hitting 
>are usually rare, you only notice it when there's a lot of data.

Except that the few threads I use don't modify that data at all. The
functions that touch the reference set also rely on OpenGL contexts,
which are thread-bound, so they can't be called from another thread
without stopping the code in its tracks, unless the context is
explicitly shifted (which it very much isn't).

And I've done some looking through the weak set's code in the
intervening time. It can easily cause this kind of problem, because
each weak reference it creates is given a callback that removes it
from the data set when its referent is garbage collected. See for
yourself:

Lines 81-84, _weakrefset.py:

    def add(self, item):
        if self._pending_removals:
            self._commit_removals()
        self.data.add(ref(item, self._remove))  # <-- _remove registered as the callback

Lines 38-44, likewise (for some reason defined in __init__ rather than
at the class level, but likely to deal with a memory management issue):

        def _remove(item, selfref=ref(self)):
            self = selfref()
            if self is not None:
                if self._iterating:  # <-- removal is deferred only while this flag is set
                    self._pending_removals.append(item)
                else:
                    self.data.discard(item)  # <-- otherwise the set is mutated right away
        self._remove = _remove
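
For anyone who wants to watch the callback mechanism in isolation,
here's a minimal sketch. It is not the WeakSet code itself: Thing,
data, fired and on_collect are names I made up for the illustration,
and it calls gc.collect() so it doesn't lean on reference counting
alone.

```python
import gc
import weakref


class Thing(object):
    """Hypothetical placeholder; any weak-referenceable object would do."""


data = set()
fired = []


def on_collect(wr):
    # The callback is handed the (now dead) weak reference object.
    fired.append(wr)
    data.discard(wr)


obj = Thing()
data.add(weakref.ref(obj, on_collect))
size_before = len(data)

# Drop the last strong reference; the callback runs during collection.
del obj
gc.collect()
size_after = len(data)
```

The set shrinks without any code explicitly asking it to, which is
the behavior the rest of this post hinges on.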

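The thing is, as Terry pointed out, the set's truth value is tested via
__len__(), which as shown does NOT set the _iterating protection, so a
callback that fires while the generator is walking self.data discards
from the set mid-iteration: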

    def __len__(self):
        return sum(x() is not None for x in self.data)
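
To make the failure mode concrete, here's a toy model of it, under
the assumption that a removal callback fires partway through the
iteration. ToyWeakSet, unguarded_len, guarded_len and fire_once are
hypothetical names, and the on_step hook stands in for the garbage
collector; none of this is the real _weakrefset.py.

```python
class ToyWeakSet(object):
    """Toy model of the quoted logic; NOT the real WeakSet."""

    def __init__(self, items):
        self.data = set(items)
        self._iterating = False
        self._pending_removals = []

    def _remove(self, item):
        # Same branching as the quoted callback.
        if self._iterating:
            self._pending_removals.append(item)
        else:
            self.data.discard(item)

    def unguarded_len(self, on_step):
        # Like the quoted __len__: walks self.data without setting the flag.
        count = 0
        for item in self.data:
            on_step(item)  # hypothetical hook: a collection happening mid-loop
            count += 1
        return count

    def guarded_len(self, on_step):
        # One possible guard (an assumption, not the library's code):
        # defer removals until the walk is over.
        self._iterating = True
        try:
            count = 0
            for item in self.data:
                on_step(item)
                count += 1
        finally:
            self._iterating = False
            for item in self._pending_removals:
                self.data.discard(item)
            self._pending_removals = []
        return count


def fire_once(ws, victim):
    """Build a hook that 'collects' victim on the first step only."""
    state = {"fired": False}

    def on_step(_item):
        if not state["fired"]:
            state["fired"] = True
            ws._remove(victim)

    return on_step


bad = ToyWeakSet([1, 2, 3])
try:
    bad.unguarded_len(fire_once(bad, 2))
    error = None
except RuntimeError as exc:
    error = str(exc)

good = ToyWeakSet([1, 2, 3])
counted = good.guarded_len(fire_once(good, 2))
```

The unguarded walk dies with a RuntimeError about the set changing
size, with no second thread anywhere in sight, while the guarded walk
finishes and drops the item afterwards. The guarded count still
includes the item removed mid-walk, so take it as a sketch of the
deferral idea rather than a proposed patch.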

Don't be so fast to dismiss things when the situation would not have
made a race condition possible to begin with.

~Temia
--
When on earth, do as the earthlings do.


