novice question

Michael Spencer mahs at telcopartners.com
Fri Mar 11 19:04:47 EST 2005


Leeds, Mark wrote:
> I have a dictionary grp that has lists
> for each element ( excuse my terminology if it's
> incorrect).
> 
> So gname might be automobiles, finance, construction etc
> and grp[gname] is a list and this list
> has elements that are strings
> such as ["AAA","BBB","AAA","CCC"]
> 
>
> Is there a quick way to loop through
> the grp dictionary and reduce the lists
> so that each only contains unique elements ?
> 
Sure, several options, depending on exactly what you need.

The usual way to represent a collection of unique items is a set:

  >>> group = ["AAA","BBB","AAA","CCC"]
  >>> set(group)
  set(['AAA', 'BBB', 'CCC'])
Two caveats with this: 1) the items in the list must be 'hashable', and 2) the set 
is unordered.  If your items are not hashable (e.g., if your groups contain 
lists), then the problem is more complicated and the approach described here 
won't work.  If you need to preserve the order, see the recent thread 
"Removing duplicates from a list".

If you need an actual list rather than a set, pass the set back to the list 
constructor:
  >>> list(set(group))
  ['AAA', 'BBB', 'CCC']
  >>>
but note that the order is still unrelated to the order of your original list.
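
If the original order matters, one common approach is to walk the list once and 
keep only the first occurrence of each item.  The following is a minimal sketch 
of that idea, not necessarily what the thread mentioned above recommends; the 
function name unique_in_order is made up for illustration, and the items must 
still be hashable:

```python
def unique_in_order(items):
    """Return a new list without duplicates, keeping first occurrences."""
    seen = set()          # items already emitted (requires hashable items)
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


group = ["AAA", "BBB", "AAA", "CCC"]
print(unique_in_order(group))  # ['AAA', 'BBB', 'CCC']
```

Like list(set(group)), this returns a new list and leaves the original untouched.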

Note also that this is a different list from the one you started with:
  >>> group is list(set(group))
  False

If you want to 'reduce' the original list as you specified then you could use 
the slice operator to copy the contents of the new list back into the original one:
  >>> oldgroup = group
  >>> group[:] = list(set(group))
  >>> oldgroup is group
  True
  >>>
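
The difference between slice assignment and plain rebinding matters when some 
other name still refers to the list.  A small sketch (the name alias is only 
for illustration):

```python
group = ["AAA", "BBB", "AAA", "CCC"]
alias = group                    # second reference to the same list object

group[:] = list(set(group))      # in place: the existing list is rewritten
print(alias is group)            # True
print(sorted(alias))             # ['AAA', 'BBB', 'CCC']

group = list(set(group))         # rebinding: group now names a new list
print(alias is group)            # False
```

After the slice assignment, alias sees the reduced contents; after the plain 
assignment, alias still refers to the old list object.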


To update your dictionary of groups, simply iterate over the dictionary and 
apply one of the above transformations to each value.

For example, you can use:
  >>> d = {1: ["AAA","BBB","AAA","CCC"], 2: ["CCC","AAA","BBB","CCC"]}  # sample data, assumed
  >>> for k in d:
  ...     d[k][:] = list(set(d[k]))
  ...
  >>> d
  {1: ['AAA', 'BBB', 'CCC'], 2: ['AAA', 'BBB', 'CCC']}
  >>>
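
Applied to a dictionary shaped like the one in the question, the loop might look 
like the sketch below.  The group names come from the question, but the list 
contents and the helper name dedupe_groups_in_place are invented for 
illustration, and the print calls assume Python 3:

```python
def dedupe_groups_in_place(grp):
    """Reduce every list in grp to its unique elements, in place."""
    for members in grp.values():
        # Slice assignment keeps each list object; the order is arbitrary.
        members[:] = list(set(members))


# Hypothetical sample data
grp = {
    "automobiles": ["AAA", "BBB", "AAA", "CCC"],
    "finance": ["DDD", "DDD", "EEE"],
}

dedupe_groups_in_place(grp)
for gname in sorted(grp):
    print(gname, sorted(grp[gname]))
```

Sorting is used only to make the printed result predictable; the lists 
themselves end up in whatever order the set yields.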


or, in Python 2.4, you could use:

  >>> d.update((k, list(set(d[k]))) for k in d)
(The argument of d.update is a 'generator expression'.)

Note, however, that this doesn't preserve the identity of the lists: each key 
ends up bound to a new list object.
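
A quick way to see the identity point is to keep a reference to one of the 
lists before calling update.  This is only a sketch with made-up sample data:

```python
d = {1: ["AAA", "BBB", "AAA", "CCC"]}
original = d[1]                  # keep a reference to the old list

# Only values are replaced; the set of keys does not change.
d.update((k, list(set(d[k]))) for k in d)

print(d[1] is original)          # False: the key now maps to a new list
print(original)                  # ['AAA', 'BBB', 'AAA', 'CCC'] (unchanged)
```

Any other code that still holds the old list will therefore keep seeing the 
duplicates.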

HTH

Michael





