Real-world use of Counter

Peter Otten __peter__ at web.de
Wed Nov 5 12:34:40 EST 2014


Ethan Furman wrote:

> I'm looking for real-world uses of collections.Counter, specifically to
> see if anyone has been surprised by, or had to spend extra time debugging,
> issues with the in-place operators.
> 
> Background:
> 
> Most Python data types will cause a TypeError to be raised if unusable
> types are passed in:
> 
> --> {'a': 0}.update(5)
> TypeError: 'int' object is not iterable
> 
> --> [1, 2, 3].extend(3.14)
> TypeError: 'float' object is not iterable
> 
> --> from collections import Counter
> --> Counter() + [1, 2, 3]
> TypeError: unsupported operand type(s) for +: 'Counter' and 'list'
> 
> Most Counter in-place methods also behave this way:
> 
> --> c /= [1, 2, 3]
> TypeError: unsupported operand type(s) for /=: 'Counter' and 'list'
> 
> However, in the case of a handful of Counter in-place methods (+=, -=, &=,
> and |=), this is what happens instead:
> 
> --> c = Counter()
> --> c += [1, 2, 3]
> AttributeError: 'list' object has no attribute 'items'
> 
> Even worse (in my opinion) is the case of an empty Counter `and`ed with an
> incompatible type:
> 
> --> c &= [1, 2, 3]
> -->
> 
> No error is raised at all.
> 
> In order to avoid unnecessary code churn (the fix itself is quite simple),
> the maintainer of the collections module wants to know if anybody has
> actually been affected by these inconsistencies, and if so, whether it was
> a minor inconvenience, or a compelling use-case.
> 
> So, if this has bitten you, now is the time to speak up!  :)

Some more:

>>> Counter(a=1, b=2)
Counter({'b': 2, 'a': 1})
>>> Counter(a=1, b=2, iterable=3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.4/collections/__init__.py", line 462, in __init__
    self.update(iterable, **kwds)
  File "/usr/lib/python3.4/collections/__init__.py", line 542, in update
    _count_elements(self, iterable)
TypeError: 'int' object is not iterable
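
The failure is a name clash: in the 3.4 source shown in the traceback, the
constructor's first parameter is itself called iterable, so the keyword is
bound to that parameter instead of being counted as a key. A minimal sketch
of the same clash, using a hypothetical stand-in function (count_sketch is
an illustration only, not part of collections):

```python
from collections import Counter

def count_sketch(iterable=None, **kwds):
    """Hypothetical stand-in with a named first parameter, as in the traceback."""
    result = Counter()
    if iterable is not None:
        # A keyword argument spelled "iterable" lands here, not in kwds.
        for elem in iterable:
            result[elem] += 1
    for key, value in kwds.items():
        result[key] += value
    return result

ok = count_sketch(a=1, b=2)          # both keywords end up in kwds

try:
    count_sketch(a=1, b=2, iterable=3)
except TypeError as exc:
    clash_error = str(exc)           # 3 was bound to the parameter, then iterated
```

Other keyword names are unaffected; only a key that matches the parameter
name cannot be passed this way.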

This is just the first oddity I came up with. I expect there are unlimited 
ways to break classes written in Python, as a consequence of what is, for me, 
Python's greatest gift: duck typing. I hate the sum() implementation, not 
because I ever plan to "add" strings, but because it explicitly tests for 
bad style. Consenting adults, where are you?
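
For reference, the sum() check in question can be seen in a short sketch:
lists may be concatenated with sum(), while a string start value is rejected
up front. The variable names are only for illustration.

```python
# Concatenating lists with sum() is allowed, efficient or not.
lists_joined = sum([[1], [2], [3]], [])

# Strings are singled out: a str start value is refused with a TypeError.
try:
    sum(["a", "b"], "")
except TypeError as exc:
    sum_error = str(exc)

# The spelling that sum() steers you towards instead.
strings_joined = "".join(["a", "b"])
```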

In particular for the &= fix, would

>>> c = Counter([1, 2])
>>> c &= [1, 2, 3]
>>> c
Counter({1: 1, 2: 1})

still be allowed? If not, are there other places where a sequence might 
pretend to be a mapping?
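
To see why the list gets through: an in-place intersection only has to look
up other[elem] for each key already in the counter, and a list answers that
lookup by index. A rough sketch with a hypothetical helper (iand_sketch is
an illustration under that assumption, not the actual Counter code):

```python
from collections import Counter

def iand_sketch(counts, other):
    """Hypothetical illustration: keep the minimum of each count and other[elem]."""
    for elem, count in list(counts.items()):
        other_count = other[elem]      # a mapping lookup *or* a sequence index
        if other_count < count:
            counts[elem] = other_count
    # Drop entries whose count is no longer positive.
    for elem in [e for e, n in counts.items() if n <= 0]:
        del counts[elem]
    return counts

# Keys 1 and 2 are valid list indexes, so the list "pretends" to be a mapping.
as_mapping = iand_sketch(Counter([1, 2]), [1, 2, 3])

# An empty counter never performs a lookup, so nothing can fail.
empty = iand_sketch(Counter(), [1, 2, 3])

# A key that is not a valid index exposes the mismatch.
try:
    iand_sketch(Counter(["a"]), [1, 2, 3])
except TypeError as exc:
    bad_key_error = str(exc)
```

Whether an error shows up therefore depends on the keys that happen to be in
the counter, not on the type of the right-hand operand.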

FTR, as I'm not exactly a power-user of Counter, I think I have only used 
the initializer with one positional iterable, and most_common(), and thus 
have never been bitten by any of the above pitfalls.




