bags? 2.5.x?

Dan Stromberg dstromberglists at gmail.com
Mon Jan 21 18:13:35 EST 2008


On Thu, 17 Jan 2008 18:18:53 -0800, Raymond Hettinger wrote:

>> >> I keep wanting something like them - especially bags with something
>> >> akin to set union, intersection and difference.
>>
>> > How about this recepie
>> > <URL:http://www.ubookcase.com/book/Oreilly/
>>
>> The author of the bag class said that he was planning to submit bags
>> for inclusion in 2.5 - is there a particular reason why they didn't go
>> in?
> 
> Three reasons:
> 
> 1. b=collections.defaultdict(int) went a long way towards meeting the
> need to for a fast counter.
> 
> 2. It's still not clear what the best API would be. What should list(b)
> return for b.dict = {'a':3, 'b':0, 'c':-3}? Perhaps, [('a', 3), ('b',
> 0), ('c', -3)] or ['a', 'a', 'a']
> or ['a']
> or ['a', 'b', 'c']
> or raise an Error for the negative entry.

I'd suggest that .keys() return the unique list, and that list() return 
the list of tuples.  Then people can use list comprehensions or map() to 
get what they really need.

It might not be a bad thing to have an optional parameter on __init__ 
that would allow the user to specify if they need negative counts or not; 
so far, I've not needed them though.

> 3. I'm still working on it and am not done yet.
> 

Decent reasons.  :)

Thanks!

Here's a diff to bag.py that helped me.  I'd like to think these meanings 
are common, but not necessarily!

$ diff -b /tmp/bag.py.original /usr/local/lib/bag.py
18a19,58
>       def _difference(lst):
>               left = lst[0]
>               right = lst[1]
>               return max(left - right, 0)
>       _difference = staticmethod(_difference)
>
>       def _relationship(self, other, operator):
>               if isinstance(other, bag):
>                       self_keys = set(self._data.keys())
>                       other_keys = set(other._data.keys())
>                       union_keys = self_keys | other_keys
>                       #print 'union_keys is',union_keys
>                       result = bag()
>                       for element in list(union_keys):
>                               temp = operator([ self[element], other
[element] ])
>                               #print 'operator is', operator
>                               #print 'temp is', temp
>                               result[element] += temp
>                       return result
>               else:
>                       raise NotImplemented
>
>       def union(self, other):
>               return self._relationship(other, sum)
>
>       __or__ = union
>
>       def intersection(self, other):
>               return self._relationship(other, min)
>
>       __and__ = intersection
>
>       def maxunion(self, other):
>               return self._relationship(other, max)
>
>       def difference(self, other):
>               return self._relationship(other, self._difference)
>
>       __sub__ = difference
>




More information about the Python-list mailing list