Real-world use of Counter
Peter Otten
__peter__ at web.de
Wed Nov 5 12:34:40 EST 2014
Ethan Furman wrote:
> I'm looking for real-world uses of collections.Counter, specifically to
> see if anyone has been surprised by, or had to spend extra time debugging,
> issues with the in-place operators.
>
> Background:
>
> Most Python data types will cause a TypeError to be raised if unusable
> types are passed in:
>
> --> {'a': 0}.update(5)
> TypeError: 'int' object is not iterable
>
> --> [1, 2, 3].extend(3.14)
> TypeError: 'float' object is not iterable
>
> --> from collections import Counter
> --> Counter() + [1, 2, 3]
> TypeError: unsupported operand type(s) for +: 'Counter' and 'list'
>
> Most Counter in-place methods also behave this way:
>
> --> c /= [1, 2, 3]
> TypeError: unsupported operand type(s) for /=: 'Counter' and 'list'
>
> However, in the case of a handful of Counter in-place methods (+=, -=, &=,
> and |=), this is what happens instead:
>
> --> c = Counter()
> --> c += [1, 2, 3]
> AttributeError: 'list' object has no attribute 'items'
>
> Even worse (in my opinion) is the case of an empty Counter `and`ed with an
> incompatible type:
>
> --> c &= [1, 2, 3]
> -->
>
> No error is raised at all.
>
> In order to avoid unnecessary code churn (the fix itself is quite simple),
> the maintainer of the collections module wants to know if anybody has
> actually been affected by these inconsistencies, and if so, whether it was
> a minor inconvenience, or a compelling use-case.
>
> So, if this has bitten you, now is the time to speak up! :)
Some more:
>>> Counter(a=1, b=2)
Counter({'b': 2, 'a': 1})
>>> Counter(a=1, b=2, iterable=3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.4/collections/__init__.py", line 462, in __init__
    self.update(iterable, **kwds)
  File "/usr/lib/python3.4/collections/__init__.py", line 542, in update
    _count_elements(self, iterable)
TypeError: 'int' object is not iterable
This is just the first oddity I came up with. I expect there are unlimited
ways to break classes written in Python; that is a consequence of what is,
for me, Python's greatest gift: duck typing. I hate the sum()
implementation, not because I ever plan to "add" strings, but because it
explicitly checks for, and rejects, what it considers bad style. Consenting
adults, where are you?
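The collision in the traceback above comes from a keyword argument that
happens to share its name with a named parameter. A minimal sketch of the
same effect follows. The functions make_counts and make_counts_posonly are
hypothetical stand-ins written for illustration, not the actual Counter
code; the second one shows a positional-only parameter (Python 3.8+ syntax)
as one possible way to avoid the clash.

```python
def make_counts(iterable=None, **kwds):
    """Hypothetical initializer with a named 'iterable' parameter."""
    counts = {}
    if iterable is not None:
        # A keyword argument named 'iterable' lands here, not in kwds,
        # so make_counts(iterable=3) tries to iterate over an int.
        for item in iterable:
            counts[item] = counts.get(item, 0) + 1
    for key, value in kwds.items():
        counts[key] = counts.get(key, 0) + value
    return counts


def make_counts_posonly(iterable=None, /, **kwds):
    """Same idea, but 'iterable' can only be passed positionally."""
    counts = {}
    if iterable is not None:
        for item in iterable:
            counts[item] = counts.get(item, 0) + 1
    # Here a keyword named 'iterable' is just another key in kwds.
    for key, value in kwds.items():
        counts[key] = counts.get(key, 0) + value
    return counts
```

With these definitions, make_counts(a=1, b=2, iterable=3) raises TypeError,
while make_counts_posonly(a=1, b=2, iterable=3) treats 'iterable' as an
ordinary key.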
In particular for the &= fix, would
>>> c = Counter([1, 2])
>>> c &= [1, 2, 3]
>>> c
Counter({1: 1, 2: 1})
still be allowed? If not, are there other places where a sequence might
pretend to be a mapping?
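To make the question concrete, here is a rough sketch of what a stricter
check could look like. StrictCounter is a hypothetical subclass written for
illustration, not the proposed patch; it assumes the fix would return
NotImplemented for operands that are not Counters, in which case the list
example above would raise TypeError instead of succeeding.

```python
from collections import Counter


class StrictCounter(Counter):
    """Hypothetical Counter that rejects non-Counter operands for &=."""

    def __iand__(self, other):
        if not isinstance(other, Counter):
            # Let Python raise its usual "unsupported operand type(s)" error.
            return NotImplemented
        return super().__iand__(other)


c = StrictCounter([1, 2])
try:
    c &= [1, 2, 3]          # a list is no longer accepted
except TypeError as exc:
    print("rejected:", exc)

c &= Counter([1, 1, 3])     # a real Counter still works
print(c)
```

Under this assumption, code that relies on a sequence standing in for a
mapping would stop working, which is exactly the trade-off being asked about.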
FTR, as I'm not exactly a power user of Counter, I think I have only used
the initializer with one positional iterable, and most_common(), and thus
have never been bitten by any of the above pitfalls.