[Python-ideas] collections.Counter multiplication

Matthew Ruffalo mmr15 at case.edu
Thu May 30 01:43:44 CEST 2013


On 05/29/2013 06:47 PM, MRAB wrote:
> On 29/05/2013 21:17, James K wrote:
>> It should work like this
>>
>>      >>> from collections import Counter
>>      >>> Counter({'a': 1, 'b': 2}) * 2 # scalar
>>      Counter({'b': 4, 'a': 2})
>>      >>> Counter({'a': 1, 'b': 2}) * Counter({'c': 1, 'b': 2})  # multiplies matching keys
>>      Counter({'b': 4})
>>
>>
>> This is intuitive behavior and therefore should be added. I am unsure
>> about division as dividing by a non-existing key would be a division by
>> 0, although division by a scalar is straightforward.
>>
> Multiplying by scalars I understand, but by another Counter? That just
> feels wrong to me.
>
> For example:
>
> >>> c = Counter({"apple": 3, "orange": 5})
> >>> # Double everything.
> >>> c * 2
> Counter({'orange': 10, 'apple': 6})
>
> Fine, OK.
>
> But what does _this_ mean?
>
> >>> d = Counter({"orange": 4, "pear": 2})
> >>> c * d
> ???
>

James K is proposing pairwise multiplication of matching elements, with
the usual Counter behavior that a missing element has a count of 0.
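
For concreteness, here is a minimal sketch of that pairwise
interpretation, written as a plain function since Counter.__mul__ does
not exist. The name pairwise_mul is made up for illustration. A key
missing from either operand counts as 0, so only keys present in both
can survive:

```python
from collections import Counter

def pairwise_mul(c, d):
    """Hypothetical helper: multiply the counts of keys present in both Counters."""
    # A key missing from either operand has count 0, so only the
    # intersection of the key sets can give a nonzero product.
    return Counter({k: c[k] * d[k] for k in c.keys() & d.keys()})

result = pairwise_mul(Counter({'a': 1, 'b': 2}), Counter({'c': 1, 'b': 2}))
print(result)  # Counter({'b': 4})
```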

There's another perfectly reasonable interpretation of the * operator, 
however: the Cartesian product of two multisets.

"""
 >>> from collections import Counter
 >>> from itertools import product
 >>> c = Counter(apple=3, orange=10)
 >>> d = Counter(orange=4, pear=2)
 >>> cd = Counter(product(c.elements(), d.elements())) # c * d
 >>> cd
Counter({('orange', 'orange'): 40, ('orange', 'pear'): 20,
         ('apple', 'orange'): 12, ('apple', 'pear'): 6})
"""
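
The version above expands every element first, so for these inputs it
builds 13 * 6 = 78 tuples. Assuming all counts are positive, the same
multiset should be obtainable by multiplying the counts directly. A
sketch (the name cartesian_mul is my own, not an existing API):

```python
from collections import Counter
from itertools import product

def cartesian_mul(c, d):
    """Hypothetical helper: Cartesian product of two multisets."""
    # Each pair (a, b) occurs c[a] * d[b] times in the product,
    # so there is no need to expand the individual elements.
    return Counter({(a, b): ca * cb
                    for (a, ca), (b, cb) in product(c.items(), d.items())})

c = Counter(apple=3, orange=10)
d = Counter(orange=4, pear=2)
cd = cartesian_mul(c, d)

# Same multiset as expanding the elements first.
assert cd == Counter(product(c.elements(), d.elements()))
```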

It would also be nice to define * on plain sets as the Cartesian product:

"""
 >>> s1 = {'a', 'b', 'c'}
 >>> s2 = {'c', 'd'}
 >>> set(product(s1, s2)) # s1 * s2
{('a', 'd'), ('c', 'c'), ('c', 'd'), ('a', 'c'), ('b', 'd'), ('b', 'c')}
"""

The fact that there are two distinct possibilities for Counter.__mul__
seems problematic: Counter already supports both multiset operations
(& and |) and arithmetic (+ and -), and * has a natural meaning in
either context.

Implementing set.__mul__ as a Cartesian product doesn't seem to have any 
obvious drawbacks, though.
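
To show what I have in mind, here is a rough sketch using a subclass,
since the built-in set type can't be patched from Python code. The
class name ProductSet is hypothetical and this is only one possible
shape for the idea:

```python
from itertools import product

class ProductSet(set):
    """Hypothetical set subclass where * is the Cartesian product."""

    def __mul__(self, other):
        # Only defined against other sets; let Python raise TypeError otherwise.
        if not isinstance(other, (set, frozenset)):
            return NotImplemented
        return ProductSet(product(self, other))

s1 = ProductSet({'a', 'b', 'c'})
s2 = {'c', 'd'}
pairs = s1 * s2  # six 2-tuples, one per (element of s1, element of s2)
```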

MMR...

