mutable list iterators - a proposal

Terry Reedy tjreedy at udel.edu
Mon Mar 15 11:08:53 EST 2004


"Jess Austin" <jaustin at post.harvard.edu> wrote in message
news:e4e0340c.0403150334.4faeb3a3 at posting.google.com...
> hi,
>
> I like the way that Python does lists, and I love the way it does
> iterators.  But I've decided I don't like what it does with iterators
> of lists.  Lists are supposed to be mutable sequences,

and they are.

> but try to use an iterator of a list that you're mutating
> and watch it all fall to pieces.

Replacing items, which is by far the most common need, works fine.

> That is, if you change the length of a section of the list
> through which the iterator has already passed, it will lose track of
> where it is.

Same is true of dicts.  The limitation is documented.  The need is
relatively rare.  There are several alternatives, usually better.

1. make a copy and iterate through that.
2. create a new object as you iterate (for deletes only, use filter to do
this)
3. make a list of desired edits (something like your _mutations list) and
apply that *after* the iteration to do all the edits at once.

Note that filter is O(n) timewise.  Simply deleting items in place and
closing up the gap, one at a time, as your approach would appear to do, is
O(n**2).  Applying an edit list in a batch after the initial scan may also
be O(n) for typical cases.

4. Actually, for deletion (filtration) in place, there *is* an O(n) single
scan method.  The secret is to *not* change the length of the list while
iterating but to merely move kept objects to the end of a 'kept' section of
the list (using two indexes) and then truncate the list (deleting the now
unneeded tail) *after* the scan.

5. For insertion while scanning, use an index var in a while loop and bump
up the index after doing insertions.  But notice again that this can make
for an O(n**2) algorithm and it might really be better to make a list of
slices and then paste them together in a separate post-scan batch process.

6.  Or, insert a large empty slice (hopefully large enough to accomodate
the net expansion) at the front of the list before starting and then copy
and insert into the empty space as you scan.  (This is a generalization of
the deletion algorithm, 4.)  If you run out of room due to an initial
underestimate, make another large insertion.  Truncate after if there is
leftover space at the end.

Bottom line: because of the O(n**2) trap, insertion and deletion are
trickier than mere replacement.

>  I think this should be fixed for 2.4.  I'm not sure if
> this is a bug or a PEP, so I'd like to hear from others on this
> newsgroup.

The listobject.c code is being reworked to make common list operations
quite noticeably faster.  A slowdown to enable doing something that is
better done elsewise will not be accepted.  In most cases, the limitation
protects people from making a bad algorithm choice.

Terry J. Reedy







More information about the Python-list mailing list