Proper deletion of selected items during map iteration in for loop: Thanks to all

Steven D'Aprano steve+comp.lang.python at pearwood.info
Sat Apr 26 22:14:02 EDT 2014


On Sat, 26 Apr 2014 12:25:27 -0700, Charles Hixson wrote:

> On 04/25/2014 10:53 AM, Charles Hixson wrote:
>> What is the proper way to delete selected items during iteration of a
>> map?  What I want to do is:
>>
>> for (k, v) in m.items():
>>    if f(k):
>>       #  do some processing of v and save result elsewhere 
>>       del m[k]
>>
>> But this gives (as should be expected):
>>         RuntimeError: dictionary changed size during iteration
>> In the past I've accumulated the keys to be deleted in a separate list,
>> but this time there are likely to be a large number of them, so is
>> there some better way?
>>
>>
> Going over the various responses, it looks like saving the "to be
> deleted" keys to a list, and then iterating over that to delete is the
> best answer.  


I don't think that there is any one "best" answer that always applies. 
"My dict has three items, and I'll be deleting most of them" is likely to 
have different performance characteristics from "My dict has three 
billion items, and I'll be deleting two [or two billion] of them".

So how much time do you want to spend tuning this for optimum performance 
(or memory use, or programmer time)? Or is "good enough" good enough?

I think the two obviously good enough approaches are:

- save a "to be deleted" list, then delete those keys;

- copy the "not to be deleted" items into a new dict

My intuition is that the second will probably be faster, unless your dict 
is truly monstrously big. Not millions, but tens or hundreds or thousands 
of millions of items, depending on how much memory your computer has.

You've already got a code snippet using the "to be deleted" list, here's 
how I would do the other way:

new = {}
for k, v in m.items():
    if f(k):
        process(v)
    else:
        new[k] = v
m.clear()
m.update(new)
del new


If you don't care about modifying the existing dict, but can afford to 
write in a more functional style, you don't even need to bother doing 
that m.clear(), m.update(new). Just return the new dict, stop using the 
old one (it will be garbage collected), and use the new one.

Oh, in case you're wondering, this will be more efficient than it may 
seem, because the actual data in the dict isn't copied. The only things 
being copied are references to the keys and values, not the keys and 
values themselves.



-- 
Steven D'Aprano
http://import-that.dreamwidth.org/



More information about the Python-list mailing list