removing redundant elements in a dictionary

Antti Rasinen ars at iki.fi
Wed Aug 1 03:15:36 EDT 2007


On 2007-08-01, at 06:50, Astan Chee wrote:

> Hi,
> I have a dictionary which looks something like this:
>
> elem = {"co":[1,2,3],"cp":[3,2,1],"pp":[5,6,7],"cl":[1,2,3],"qw": 
> [6,7,8],"qa":[8,7,6]}
>
> what Im trying to do is find all keys in the list that have the  
> same value and delete those (except one); thereby removing all  
> redundant keys so that the above dict will look like this:
>
> elem = {"co":[1,2,3],"pp":[5,6,7],"qa":[8,7,6]}
>
> my code so far looks like this:
>
> for out1 in elem.keys():
>     for out2 in elem.keys():
>         if out1 != out2:
>             elem[out1].sort()
>             elem[out2].sort()
>             if elem[out1] == elem[out2]:
>                 del elem[out1]
>
> This returns in a KeyError exception when sorting. What am I  
> missing here?

Consider a dict {'a': [1], 'b': [1], 'c': [1]}.

First time we reach the innerloop, out1 == 'a'. Then we iterate over  
the list ['a', 'b', 'c'].

'a' is skipped. At 'b', we remove 'a' from the dictionary. And then  
it all goes wrong.

The last time we go through the inner loop, out1 is still 'a' -- an  
element which does not exist in the dictionary anymore. Then your  
attempts to reference it fail at both the sorting and comparison  
stages. You should break out of the inner loop immediately after  
you've deleted the item or delete out2 instead..

Furthermore, the algorithm is somewhat inefficient due to the fact  
that it sorts the lists O(n^2) times when you only need to do them n  
times. If your data allows it, you might like to use sets instead of  
lists. This would avoid the need to sort the lists.

However, here is a very quick version that works in my tests:
for out_list in elem.values():
     out_list.sort()

for out1 in elem.keys():
     for out2 in elem.keys():
         if out1 != out2 and elem[out1] == elem[out2]:
             del elem[out1]
             break

There are probably several ways to improve on that, however.

-- 
[ ars at iki.fi <*> Antti Rasinen ]

"Pay the man his wage, he's making paper to fuel the Information Age."


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20070801/ea8cbb4b/attachment.html>


More information about the Python-list mailing list