How to subclass sets.Set() to change intersection() behavior?

mkppk barnaclejive at gmail.com
Tue Dec 12 21:23:43 EST 2006


I have a kind of strange change I'd like to make to the sets.Set()
intersection() method.

Normally, intersection returns the items that are in both s1 and s2,
with something like this:  s1.intersection(s2)

I want the item matching to be a bit "looser": that is, items in s2
that match just the beginning of items in s1 would be included in
the result of intersection().

I do not know how intersection() is implemented, so I just kind of
guessed it might have something to do with how it compares set
elements, probably using __eq__ or __cmp__. So I thought that if I
overrode these methods, maybe that would magically affect the way
intersection works. So far, no luck =(
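
A small check of that guess, as a sketch only. It uses the built-in set type rather than sets.Set so that it also runs on newer Pythons, and LoggingSet and calls are hypothetical names made up for the illustration. The point it tries to show: __eq__ defined on the set class compares whole sets, and intersection() does not call it. Membership is decided by the hash and equality of the individual elements.

```python
calls = []  # records every time the set-level __eq__ runs


class LoggingSet(set):
    """Hypothetical subclass that only logs calls to its own __eq__."""

    def __eq__(self, other):
        calls.append("__eq__")
        return set.__eq__(self, other)


a = LoggingSet(['macys', 'installment', 'oil', 'beans'])
b = LoggingSet(['macy', 'oil', 'inst', 'coffee'])

common = a.intersection(b)

# Snapshot the log right after intersection(), before any comparison happens.
eq_calls_during_intersection = list(calls)

print(sorted(common))                 # ['oil']
print(eq_calls_during_intersection)   # []
```

As far as I can tell, sets.Set behaves the same way in this respect, since it also looks elements up by hash, so overriding __eq__ or __cmp__ on the set subclass would not be expected to change the result.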

Please take a look at the little example script below, which tries to
illustrate what I would like to happen when using my subclass. Is my
approach totally wrong, or is there a better way to accomplish this? I
am trying to avoid running through nested loops of lists (see final example).

P.S.
- the lists I am working with are small, like 1-10 items each
- actually, I am not so concerned with the items in the resulting set, I just
want to know that the two sets have at least one item "in common"
- would welcome any other suggestions that would be FAST
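
If only the yes/no answer is needed, one possible sketch is below. The function name has_prefix_match is made up for the illustration. It relies on str.startswith() accepting a tuple of candidate prefixes and on any() stopping at the first hit, both of which need Python 2.5 or later.

```python
def has_prefix_match(full_items, prefixes):
    """Return True if any item in full_items starts with any of the prefixes."""
    # str.startswith accepts a tuple and tests every candidate in one call.
    prefixes = tuple(prefixes)
    # any() stops at the first item that matches.
    return any(item.startswith(prefixes) for item in full_items)


print(has_prefix_match(['macys', 'installment', 'oil', 'beans'],
                       ['macy', 'oil', 'inst', 'coffee']))   # True
print(has_prefix_match(['beans'], ['coffee']))               # False
```

One caveat: an empty string in prefixes would match every item, so it may need to be filtered out first.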




import sets

# the way set intersection normally works
s1=sets.Set(['macys','installment','oil','beans'])
s2=sets.Set(['macy','oil','inst','coffee'])

# prints Set(['oil']), as expected..
print s1.intersection(s2)


# my subclass, mySet - I don't know how to affect the .intersection() method
# my best guess was to change the __eq__ or maybe the __cmp__ methods??
# for now, mySet does nothing special at all but call the functions from sets.Set
class mySet(sets.Set):

	def __init__(self,iterable=None):

		sets.Set.__init__(self,iterable)

	def __eq__(self,other):

		# maybe something here?
		return sets.Set.__eq__(self,other)

	def __cmp__(self,other):

		# or maybe something here?
		return sets.Set.__cmp__(self,other)



# the same sets used in previous example
s3=mySet(['macys','installment','oil','beans'])
s4=mySet(['macy','oil','inst','coffee'])

# and, the same result: mySet(['oil'])
print s3.intersection(s4)

#****************************************************************************
# THE RESULT I WOULD LIKE TO GET WOULD LOOK LIKE THIS
# because I want items of s4 to match to the beginning of items in s3
# actually I am not so concerned with the result of intersection, I just
# want to know that there was at least one item in common between the two sets..
#
# mySet(['macy','inst','oil'])
#****************************************************************************



# this is the list implementation I am trying to avoid because I am
# under the impression that using a set would be faster..(??)
# please let me know if I am wrong about that assumption

L1=['macys','installment','oil','beans']
L2=['macy','oil','inst','coffee']

L3=[]
for x in L1:
	for y in L2:
		if x.startswith(y):
			L3.append(y)
		
# prints ['macy', 'inst', 'oil']
print L3
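
For comparison, here is a rough sketch of a subclass that produces the result wanted above by overriding intersection() itself instead of __eq__ or __cmp__. PrefixSet is a hypothetical name, and the sketch subclasses the built-in set rather than sets.Set so that it runs on newer Pythons too. It still loops over both collections internally: hashing only helps with exact matches, so a set is not expected to make prefix matching any faster than the lists.

```python
class PrefixSet(set):
    """Hypothetical set subclass with a prefix-based intersection()."""

    def intersection(self, other):
        # Keep each item of `other` that is the beginning of some item in self.
        return PrefixSet(
            y for y in other
            if any(x.startswith(y) for x in self)
        )


s3 = PrefixSet(['macys', 'installment', 'oil', 'beans'])
s4 = PrefixSet(['macy', 'oil', 'inst', 'coffee'])

# sorted() is used only because sets have no guaranteed order
print(sorted(s3.intersection(s4)))   # ['inst', 'macy', 'oil']
```

Only the intersection() method is changed here. The & operator goes through __and__ and would still do exact matching unless that is overridden as well.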



