increases sets efficiency by an order of infinity

Hunter Peress hu.peress at mail.mcgill.ca
Fri Feb 7 03:14:17 EST 2003


I think theres a very lacking feature in the 2.3 implementation of sets.

You are locked in to value that the set does its "magic" on.

sets are meant to operate on lists. eg: [1,2,3,4] becomes: S([1,2,3,4])

next imagine we take a list of less granular objects eg:
a=sets.Set([[1,'/home/'],[2,'/home/']][3,'/lala']) and
b=sets.Set([[1,'/bin/'],[2,'/home/']])

suppose we make an intersection on these two to find common items.
Well...this will result in the nullset.

This example suggests the limits of the sets.Set

I think there should be a way to specify what the "key" that the sets.Set
will act on. eg imagine we have a and b from above. We'll use this new
feature to set a function that will be used to select the set key. Now,
this will return things that the Set thinks are same (because we are
placing this filter over its "eyes"), but these things are actually
different. Normally, a set operation, like an intersection will return a
list of objects that are the same in each set. But this is not desired
here because only the keys will be the same, yet we are interested in
preserving the entire object. So, I think the least cumbersome way is to
return a list of lists of objects (it wont ever get any deeper or
shallower than this).

eg normally:
>>>a=sets.Set([1,2,4,3]) ; b = sets.Set([4,2]) ; a.intersection(b)
Set([2,4])

Now, in what i propose, the feature is called sets.Set.setkey , it will
take a "selector" function that assumes its input is 1 object. Using a and
b from above we do:

>>>def setKeySel(inp):
...    return(inp[0])
>>>a.setkey(setKeySel)
>>>b.setkey(setKeySel)
>>>a.intersection(b)
Set([ [[1,'/home/'],[1,'/bin/']], [[2,'/home/'],[2,'/home/']] ])

Cool eh? This lets u infinitely extend the set algorithms that have been
developing over the last year. This is very pythonic ; its similar to the
fact that u can set your own sort function for a list.

This will apply neatly to the other basic set method: union (or
symmetric_difference)

It can even work quite nicely on different flavor sets, eg:
>>>a.intersection(sets.Set([1]))
Set([ [[1],[1,'/home/']] ])

or

>>>a.issuperset(Set([1]))
True

Thats that.

Now, in the interest of being comprehensive, here are some extra technical
requirements that are birthed by this synthesis of ideas:

-there might be a want for a new exception to signify that a given item of
a set doesnt work with sets.Set.setkey

-the new Set that is produced by any set operation will inherit the setkey
function from its parent.(this is crucial and especially for the next
step)

-a function to "boil down" a set using its setkey eg doing this on 'a'
from above would yield Set([1,2,3])

-at runtime, its probably best to simply run setkey to obtain the key that
Sets will use to evaluate every time any set method is called. ; however,
for ImmutableSet, a more efficient technique used such that only on an
object's entry to the Set is the setkey function needed.

Hmm, thats all i can think of right now.

Maybe this is even enough to make a PEP?

So i turn the ball over to the public.




More information about the Python-list mailing list