[Python-Dev] a feature i'd like to see in python #1: better iteration control

Ben Wing ben at 666.com
Mon Dec 4 04:27:51 CET 2006



Brian Harring wrote:
> On Sun, Dec 03, 2006 at 08:35:58PM -0600, Ben Wing wrote:
>   
>> but i still don't see why supporting iter.delete() is so wrong.  clearly 
>> it doesn't need to work on files or other such things where it doesn't 
>> make sense.
>>
>> before you diss this completely, note that java supports exactly the 
>> same thing:
>>
>> http://java.sun.com/j2se/1.4.2/docs/api/java/util/Iterator.html
>>     
>
> Not all iterators would support remove; that right there is a bit of 
> an issue since right now, the only exception you need to expect for 
> iterator protocol is StopIteration being thrown when the iterator has 
> nothing more to yield.
>
> So, it's no longer simpler, which is a bit of a con in my opinion.
>   
well, it's not like adding a new exception to any existing iterator 
method.  it would only occur if you call iter.delete() in the wrong context.
> For dict; it actually *cannot* work.  You can't remove keys from a 
> dict as you're iterating over it (can change the val of a key, but not 
> remove the key).  So iter.delete would require fair bit of changes 
> internally to dict, either tracking what it's yielded already, or 
> forcing iterkeys to actually be iter(keys()) (creating an intermediate 
> list), which is worse for memory usage and general performance.
>
> Set's suffer the same thing; can't change what it contains while 
> iterating, have to restart the iteration after a removal/addition.
>   

> Leaves lists... which personally, I view as a mostly bad thing to be 
> doing anyways.  Trying to pop an item out of the middle of a list 
> results in shifting everything right of it one spot to the left; this 
> sucks from a performance standpoint, again, worst case, quad.
>
> Now... occasionally, have to do it admittedly.  But it's not something 
> you actaully want to be doing in your code all that much- admittedly 
> generating a new list to avoid that hit also sucks somewhat, but the 
> worst case there is far more behaved, a temp trade of space vs 
> runtime.
>
> What I'm trying to get at is that what iter(list).next().delete() 
> would do isn't a good thing to paper over, it makes the op look like 
> it costs nothing when it can cost a _lot_.
>   
i do see your point.  i was trying to remember what i ended up doing 
when i ran into this issue before, and in fact i ended up just making a 
new list and appending all the elements i didn't want to delete.  i see 
you'd get N^2 behavior if you deleted lots of elements from a list, but 
you get the same thing currently if you call `del list[i]' a lot; it's 
not obvious that iter.delete() actually "papers over" the cost any more 
than `del list[i]' does.
> Unless I'm missing something, the only real usage of this is a list 
> (could do it on files also, although that would suffer the same 
> issue as a list, just worse via truncate calls).  Would work better on 
> collections.deque, but deque is a linked list and used rather 
> selectively.
>
>   
i actually think now this would be best for hash tables and such.  
copying a large hash table is a real drag, and using the iterator 
protocol is the *only* way to iterate over the elements of a hash 
table.  in fact, i imagine this is exactly why java added this 
functionality.  certainly in lisp, the `maphash' function explicitly 
allows you to delete the current element being iterated over, for the 
same reason.

however, for hash tables there's no reason to use iter.delete().  you 
should just be able to write `del hash[x]'.  is this disallowed 
currently?  if so, it seems like something that should be fixed.

> So... why add it, if it's basically for one major type, and it's 
> not a good idea to blindly do such an action on that type in the first 
> place? 
>   
ok, i see your point and i retract the suggestion.

ben



More information about the Python-Dev mailing list