collections.Counter surprisingly slow

Tue Jul 30 02:51:48 EDT 2013

Stefan Behnel, 30.07.2013 08:39:
> Serhiy Storchaka, 29.07.2013 21:37:
>> 29.07.13 20:19, Ian Kelly wrote:
>>> On Mon, Jul 29, 2013 at 5:49 AM, Joshua Landau wrote:
>>>> Also, couldn't Counter just extend from defaultdict?
>>>
>>> It could, but I expect the C helper function in 3.4 will be faster
>>> since it doesn't even need to call __missing__ in the first place.
>>
>> I'm surprised, but the Counter constructor is faster with the import of
>> this accelerator commented out (at least for some data).
> 
> Read my post. The accelerator doesn't take the fast path for dicts as
> Counter is only a subtype of dict, not exactly a dict. That means that it
> raises and catches a KeyError exception for each new value that it finds,
> and that is apparently more costly than the overhead of calling get().
> 
> So, my expectation is that it's faster for highly repetitive data and
> slower for mostly unique data.
> 
> Maybe a "fast_dict_lookup" option for the accelerator that forces the fast
> path would fix this. The Counter class, just like many (most?) other
> subtypes of dict, definitely doesn't need the fallback behaviour.

Or rather, drop the fallback path completely. It's not worth having code
duplication if it's not predictable up front (before looking at the data)
whether it will help or not.
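
To illustrate the two lookup strategies discussed above, here is a rough
pure-Python sketch. It is not the C accelerator's actual code; the function
names and the test data are made up for this example, and the timings will
vary by machine and Python version, so none are quoted here.

```python
import timeit
from collections import Counter


def is_exact_dict(obj):
    # An exact-type check, as opposed to isinstance(): a Counter is a
    # subtype of dict, so it fails this test.
    return type(obj) is dict


def count_with_get(iterable):
    # Hypothetical helper: one get() call per item, no exceptions raised.
    counts = {}
    get = counts.get
    for item in iterable:
        counts[item] = get(item, 0) + 1
    return counts


def count_with_keyerror(iterable):
    # Hypothetical helper: raises and catches a KeyError for each new value.
    counts = {}
    for item in iterable:
        try:
            counts[item] += 1
        except KeyError:
            counts[item] = 1
    return counts


# Illustrative data sets: few distinct values vs. all distinct values.
repetitive = [i % 10 for i in range(10000)]
unique = list(range(10000))

if __name__ == "__main__":
    for name, data in (("repetitive", repetitive), ("unique", unique)):
        for func in (count_with_get, count_with_keyerror):
            seconds = timeit.timeit(lambda: func(data), number=5)
            print(name, func.__name__, round(seconds, 4))
```

Both helpers return the same counts as dict(Counter(data)); only the cost
per new key differs, which is why the outcome depends on how repetitive the
input is.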

http://bugs.python.org/issue18594

Stefan