For-each behavior while modifying a collection

Valentin Zahnd v.zahnd at gmail.com
Thu Nov 28 17:14:23 EST 2013


2013/11/28 Ned Batchelder <ned at nedbatchelder.com>:
> On 11/28/13 10:49 AM, Valentin Zahnd wrote:
>>
>> Hello
>>
>> For-each does not iterate over all entries of a collection if one
>> removes elements during the iteration.
>>
>> Example code:
>>
>> def keepByValue(self, key=None, value=[]):
>>      for row in self.flows:
>>          if not row[key] in value:
>>              self.flows.remove(row)
>>
>> It is clear why it behaves that way. Every time one removes an
>> element, the length of the collection decreases by one while the
>> counter of the for-each statement does not.
>> The questions are:
>> 1. Why does the interpreter not use a copy of the collection to
>> iterate over it? Are there performance reasons?
>
>
> Because implicit copying would be pointless in most cases.  Most loops don't
> even want to modify the collection, why copy all iterables just in case your
> loop might be one of the tiny few that might change the collection?
>

Okay, I get this point. It makes little sense, especially in view of
memory usage.
But how is the for-each implemented? Is it realised with an iterator, or
is it done with a loop where a counter is incremented while it is less
than the length of the list and the element at the index of the counter
is read?
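
For reference, here is a rough sketch of how the for statement is usually
described: it asks the collection for an iterator with iter() and then
calls next() on it until StopIteration is raised. For a list, the iterator
keeps its own position in the list, so it is an iterator and a counter at
the same time. The helper for_each and the sample data below are made up
for illustration; this is not the interpreter's actual code.

```python
def for_each(iterable, body):
    """Rough equivalent of `for item in iterable: body(item)`."""
    iterator = iter(iterable)      # a separate object from the collection
    while True:
        try:
            item = next(iterator)  # a list iterator advances its own position
        except StopIteration:
            break
        body(item)


flows = [1, 2, 3, 4, 5, 6]   # hypothetical data standing in for self.flows
visited = []


def remove_large(row):
    visited.append(row)
    if row > 2:
        flows.remove(row)    # shifts the following elements one slot left


for_each(flows, remove_large)
print(visited)  # [1, 2, 3, 5]: 4 and 6 were never visited
print(flows)    # [1, 2, 4, 6]: so they were never removed
```

Each removal moves the next element into the slot the iterator has just
left, which is why every element following a removed one is skipped.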

> Of course, if that price is acceptable to you, you could do the copy
> yourself:
>
>     for row in list(self.flows):
>         if row[key] not in value:
>             self.flows.remove(row)
>
>
>> 2. Why is the counter for the iteration not modified?
>
>
> Because the list and the iterator over the list are different objects. I
> suppose the list and the iterator could have been written to update when the
> list is modified, but it could get pretty complicated, even more so if you
> want to do the same for other collections like dictionaries.
>
> The best advice is: don't modify the list, instead make a new list:
>
>     self.flows = [r for r in self.flows if r[key] not in value]
>
How does the interpreter evaluate the list comprehension?
The list I have to work with is not that small. So, to avoid
duplicating parts of it in memory, the function currently looks like this:

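     def keepByValue(self, key=None, value=[]):
         # Move the matching rows into a new list, emptying the old one.
         tmpFlows = []
         while self.flows:
             row = self.flows.pop()  # pops from the end: order is reversed
             if row[key] in value:
                 tmpFlows.append(row)
         tmpFlows.reverse()  # restore the original order
         self.flows = tmpFlows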

If the list comprehension does not duplicate the data in memory, it would
be much more elegant.
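
A small check of what the comprehension duplicates, as far as I understand
it: the result is a new list object, but it holds references to the same
row objects, so the rows themselves are not copied. The keys "proto" and
"payload" are invented sample data, and the size comparison relies on
sys.getsizeof, whose numbers are implementation-specific.

```python
import sys

# Hypothetical rows; each one carries a comparatively large payload.
rows = [{"proto": p, "payload": "x" * 1000}
        for p in ("tcp", "udp", "tcp", "icmp")]

kept = [r for r in rows if r["proto"] in ("tcp",)]

# Every element of the new list is one of the original dict objects.
same_objects = all(any(k is r for r in rows) for k in kept)
print(len(kept), same_objects)

# getsizeof on a list counts only the list's own storage for references,
# not the objects it refers to.
list_is_small = sys.getsizeof(kept) < sys.getsizeof(rows[0]["payload"])
print(list_is_small)
```

So while the comprehension runs, the old and the new list exist side by
side, but the extra cost is one reference per kept row, not a second copy
of the row data.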

> Be careful though, since there might be other references to the list, and
> now you have two.
>
> --Ned.
>
>>
>> Valentin
>>
>
>
> --
> https://mail.python.org/mailman/listinfo/python-list


2013/11/28 MRAB <python at mrabarnett.plus.com>:
> On 28/11/2013 17:20, Ned Batchelder wrote:
>>
>> On 11/28/13 10:49 AM, Valentin Zahnd wrote:
>>>
>>> Hello
>>>
>>> For-each does not iterate over all entries of a collection if one
>>> removes elements during the iteration.
>>>
>>> Example code:
>>>
>>> def keepByValue(self, key=None, value=[]):
>>>      for row in self.flows:
>>>          if not row[key] in value:
>>>              self.flows.remove(row)
>>>
>>> It is clear why it behaves that way. Every time one removes an
>>> element, the length of the collection decreases by one while the
>>> counter of the for-each statement does not.
>>> The questions are:
>>> 1. Why does the interpreter not use a copy of the collection to
>>> iterate over it? Are there performance reasons?
>>
>>
>> Because implicit copying would be pointless in most cases.  Most loops
>> don't even want to modify the collection, why copy all iterables just in
>> case your loop might be one of the tiny few that might change the
>> collection?
>>
>> Of course, if that price is acceptable to you, you could do the copy
>> yourself:
>>
>>       for row in list(self.flows):
>>           if row[key] not in value:
>>               self.flows.remove(row)
>>
>>> 2. Why is the counter for the iteration not modified?
>>
>>
>> Because the list and the iterator over the list are different objects.
>> I suppose the list and the iterator could have been written to update
>> when the list is modified, but it could get pretty complicated, even
>> more so if you want to do the same for other collections like
>> dictionaries.
>>
>> The best advice is: don't modify the list, instead make a new list:
>>
>>       self.flows = [r for r in self.flows if r[key] not in value]
>>
>> Be careful though, since there might be other references to the list,
>> and now you have two.
>>
> The simple solution in that case is:
>
>
>     self.flows[:] = [r for r in self.flows if r[key] not in value]
>
> --
> https://mail.python.org/mailman/listinfo/python-list
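
To convince myself of the difference between the two assignments, a
minimal sketch with made-up data: plain assignment rebinds the name to a
new list and leaves other references pointing at the old one, while slice
assignment replaces the contents of the existing list object.

```python
flows = [1, 2, 3, 4]
alias = flows                    # a second reference to the same list

# Plain assignment: `flows` now names a new list, `alias` keeps the old one.
flows = [r for r in flows if r % 2 == 0]
rebound_alias_unchanged = alias == [1, 2, 3, 4]

# Slice assignment: the existing list is modified in place.
flows = alias                    # start again from the shared list
flows[:] = [r for r in flows if r % 2 == 0]
in_place_alias_updated = alias == [2, 4] and alias is flows

print(rebound_alias_unchanged, in_place_alias_updated)
```

The right-hand side is still built as a complete new list before its
elements are stored into the old one, so the slice form keeps other
references in sync but does not avoid the temporary list.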


