[New-bugs-announce] [issue22084] Mutating while iterating

Aaron Brady report at bugs.python.org
Sat Jul 26 17:52:21 CEST 2014


New submission from Aaron Brady:

Hi, I asked about the inconsistency of the "RuntimeError" being raised when mutating a container while iterating over it here [1], "set and dict iteration" on Aug 16, 2012.

[1] http://www.gossamer-threads.com/lists/python/python/1004659

I posted a patch on the ML but never submitted it.  People's reaction seemed ambivalent.  Now I have an idea for a different implementation.  I'd like to take another shot at it.

It's one of the worst silent errors, since there's an error in the *iterator* when we call a *set* method.  We're going to add something to make it safer, at least in the sense of getting a clear failure, if the programmer does something that's always been ill-advised.

We have a number of options for the implementation.  We still have the option to introduce "IterationError", possibly a subclass of "RuntimeError".  These options are still applicable:

1) Collection of iterators
. Invalidate all "open" iterators on mutating
. a) Linked list
.. i) Uncounted references
.. ii) Counted references
.. iii) Weak references
. b) Weak set
2) Version index / timestamp / "memo"
. Iterators check whether the container has been mutated
    since they were created
. a) No overflow - Python longs
.. i) Reset index if no iterators left
. b) Overflow - C ints / shorts (silent error)
3) Iterator count
. Raise exception on mutation, not iteration

The new option is:

2d) Use a dedicated empty *object* for a timestamp or "memo".  A new memo is created on every mutation.  Before advancing, the iterator checks whether the current memo is a different object than it was when it was created.

Costs: The existing silent error is fairly rare.  The container gains a pointer to its current memo.  The iterator loses the cached length but gains a pointer to a memo.  The memos are blank objects: a "Py ssize t" and a pointer with certain flags at time of writing.  Speed is the same: comparing the lengths is replaced with comparing the memos.

Some caveats: The memory manager is used to obtain perpetually unique IDs.  A unique algorithm could be used instead of the memory manager, though the memo needs to contain a reference count more or less regardless.  There can at most be one memo per iterator.  The approach is outlined in pseudocode here [2]. Implementation could be optimized slightly by only creating new memos if iterators have been opened, shown here [3].

[2] http://home.comcast.net/~castironpi-misc/irc-0168%20mutating%20while%20iterating%20markup.html

[3] http://home.comcast.net/~castironpi-misc/irc-0168%20mutating%20while%20iterating%202%20markup.html

----------
components: Library (Lib)
messages: 224071
nosy: castironpi
priority: normal
severity: normal
status: open
title: Mutating while iterating
type: behavior
versions: Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue22084>
_______________________________________


More information about the New-bugs-announce mailing list