[Python-Dev] Incorrect length of collections.Counter objects / Multiplicity function

Mark Dickinson dickinsm at gmail.com
Thu May 20 23:18:56 CEST 2010


On Tue, May 18, 2010 at 11:00 PM, Gustavo Narea <me at gustavonarea.net> wrote:
> I've checked the new collections.Counter class and I think I've found a bug:
>
>> >>> from collections import Counter
>> >>> c1 = Counter([1, 2, 1, 3, 2])
>> >>> c2 = Counter([1, 1, 2, 2, 3])
>> >>> c3 = Counter([1, 1, 2, 3])
>> >>> c1 == c2 and c3 not in (c1, c2)
>> True
>> >>> # Perfect, so far. But... There's always a "but":
>> ...
>> >>> len(c1)
>> 3

This is the intended behaviour; it also agrees with what you get when
you iterate over a Counter object:

>>> list(c1)
[1, 2, 3]

As I understand it, there are other uses for Counter objects besides
treating them as multisets; I think the choices for len() and iter()
reflected those other uses.
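
If what you're after is the multiset view, the counts are all there to
work with. A minimal sketch, reusing c1 from your example (the variable
names below are just illustrative): indexing gives the multiplicity of
an element, summing the values gives the total size counting repeats,
and elements() yields each element repeated by its count.

```python
from collections import Counter

c1 = Counter([1, 2, 1, 3, 2])

distinct = len(c1)                # number of distinct elements (keys)
multiplicity_of_1 = c1[1]         # how many times 1 occurs
total = sum(c1.values())          # multiset cardinality, counting repeats
expanded = sorted(c1.elements())  # each element repeated by its count

print(distinct, multiplicity_of_1, total, expanded)
```

Note that indexing with an element that isn't present returns 0 rather
than raising KeyError, which fits the multiplicity reading.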

> Is this the intended behavior? If so, I'd like to propose a proper multiset
> implementation for the standard library (preferably called "Multiset"; should
> I create a PEP?).

Feel free! The proposal should probably go to python-list or
python-ideas rather than here, though.

See also this recent thread on python-list, and in particular the messages
from Raymond Hettinger in that thread:

http://mail.python.org/pipermail/python-list/2010-March/thread.html

Mark

