Intersection of lists/sets -- with a catch
Carl Banks
invalidemail at aerojockey.com
Tue Oct 18 18:10:43 EDT 2005
James Stroud wrote:
> Hello All,
>
> I find myself in this situation from time to time: I want to compare two lists
> of arbitrary objects and (1) find those unique to the first list, (2) find
> those unique to the second list, (3) find those that overlap. But here is the
> catch: comparison is not straight-forward. For example, I will want to
> compare 2 objects based on a set of common attributes. These two objects need
> not be members of the same class, etc. A function might help to illustrate:
>
> def test_elements(element1, element2):
>     """
>     Returns bool.
>     """
>     # any evaluation can follow
>     return (element1.att_a == element2.att_a) and \
>            (element1.att_b == element2.att_b)
[snip]
> It's probably obvious to everyone that this type of task seems perfect for
> sets. However, it does not seem that sets can be used in the following way,
> using a hypothetical "comparator" function. The "comparator" would be
> analogous to a function passed to the list.sort() method. Such a device would
> crush the previous code to the following very straight-forward statements:
>
> some_set = Set(some_list, comparator=test_elements)
> another_set = Set(another_list, comparator=test_elements)
> overlaps = some_set.intersection(another_set)
> unique_some = some_set.difference(another_set)
> unique_another = another_set.difference(some_set)
>
> I am under the personal opinion that such a modification to the set type would
> make it vastly more flexible, if it does not already have this ability.
>
> Any thoughts on how I might accomplish either technique or any thoughts on how
> to make my code more straightforward would be greatly appreciated.
How about something like this (untested):

class CmpProxy(object):
    """Wrap an object so that sets compare it by att_a and att_b only."""
    def __init__(self, obj):
        self.obj = obj
    def __eq__(self, other):
        return (self.obj.att_a == other.obj.att_a
                and self.obj.att_b == other.obj.att_b)
    def __hash__(self):
        # Must hash the same attributes that __eq__ compares.
        return hash((self.obj.att_a, self.obj.att_b))

set_a = set(CmpProxy(x) for x in list_a)
set_b = set(CmpProxy(y) for y in list_b)
overlaps = [z.obj for z in set_a.intersection(set_b)]

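
For anyone who wants to try the proxy idea end to end, here is a minimal sketch that also covers the two "unique to one list" cases from the original question. The Apple and Orange classes and the sample data are hypothetical stand-ins, invented only to show two unrelated classes sharing att_a and att_b; the __ne__ method is an addition that keeps != consistent with == on older Python versions.

```python
class CmpProxy(object):
    """Proxy that compares wrapped objects by att_a and att_b only."""

    def __init__(self, obj):
        self.obj = obj

    def __eq__(self, other):
        return (self.obj.att_a == other.obj.att_a
                and self.obj.att_b == other.obj.att_b)

    def __ne__(self, other):
        # Not in the original post; keeps != consistent with ==.
        return not self.__eq__(other)

    def __hash__(self):
        # Hash exactly the attributes that __eq__ compares.
        return hash((self.obj.att_a, self.obj.att_b))


# Hypothetical classes: unrelated types that share att_a and att_b.
class Apple(object):
    def __init__(self, att_a, att_b):
        self.att_a = att_a
        self.att_b = att_b


class Orange(object):
    def __init__(self, att_a, att_b):
        self.att_a = att_a
        self.att_b = att_b


# Hypothetical sample data.
list_a = [Apple(1, "x"), Apple(2, "y")]
list_b = [Orange(2, "y"), Orange(3, "z")]

set_a = set(CmpProxy(x) for x in list_a)
set_b = set(CmpProxy(y) for y in list_b)

# Unwrap the proxies to get the original objects back.
overlaps = [z.obj for z in set_a.intersection(set_b)]
unique_a = [z.obj for z in set_a.difference(set_b)]
unique_b = [z.obj for z in set_b.difference(set_a)]
```

Two caveats: the hashed attributes have to be hashable and should not change while the proxies sit in a set, and which of two equivalent objects ends up in overlaps (the one from list_a or the one from list_b) is an implementation detail of the set type, so it is safer not to rely on it.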
Carl Banks