[Python-Dev] counterintuitive behavior (bug?) in Counter with +=

INADA Naoki songofacandy at gmail.com
Mon Oct 3 13:57:45 CEST 2011


+1

Because Counter is a mutable object, I think += should mutate the left-hand
object in place.
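
To illustrate the point, here is a minimal sketch (not taken from the
original report). It assumes only the documented behavior that lists
implement in-place +=, that Counter's binary + returns a new Counter, and
that Counter.update() adds counts into the existing object. The variable
names are made up for illustration.

```python
from collections import Counter

# Built-in mutable containers such as list mutate in place under +=.
xs = [1, 2]
ys = xs
xs += [3]
list_in_place = xs is ys            # the same list object was extended

# Counter's binary + builds and returns a new Counter.
a = Counter([1, 2, 3])
b = a
summed = a + Counter([3, 4, 5])
add_makes_copy = summed is not a    # a itself is left untouched

# update() adds the counts into the existing Counter instead.
a.update(Counter([3, 4, 5]))
update_in_place = a is b            # still the same object as before
```

For positive counts, as here, both routes give the same totals; only the
identity of the resulting object differs.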

On Mon, Oct 3, 2011 at 7:12 PM, Lars Buitinck <L.J.Buitinck at uva.nl> wrote:
> Hello,
>
> [First off, I'm not a member of this list, so please Cc: me in a reply!]
>
> I've found some counterintuitive behavior in collections.Counter while
> hacking on the scikit-learn project [1]. I wanted to use a bunch of
> Counters to do some simple term counting in a set of documents,
> roughly as follows:
>
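>    from collections import Counter
>
>    count_total = Counter()
>    count_per_doc = []   # per-document counts, collected alongside the total
>    for doc in documents:
>        count_current = Counter(analyze(doc))
>        count_total += count_current
>        count_per_doc.append(count_current)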
>
> Because we target Python 2.5+, I implemented a lightweight replacement
> with just the functionality we need, including __iadd__, but then my
> co-developer ran the above code on Python 2.7 and performance was
> horrible. After some digging, I found out that Counter [2] does not
> have __iadd__ and += copies the entire left-hand side in __add__!
>
> I also figured out that I should use the update method instead, which
> I will, but I still find that uglier than +=. I would submit a patch
> to implement __iadd__, but I first want to know if that's considered
> the right behavior, since it changes the semantics of +=:
>
>    >>> from collections import Counter
>    >>> a = Counter([1,2,3])
>    >>> b = a
>    >>> a += Counter([3,4,5])
>    >>> a is b
>    False
>
> would become
>
>    # snip
>    >>> a is b
>    True
>
> TIA,
> Lars
>
>
> [1] https://github.com/scikit-learn/scikit-learn/commit/de6e93094499e4d81b8e3b15fc66b6b9252945af
> [2] http://hg.python.org/cpython/file/tip/Lib/collections/__init__.py#l399
>
>
> --
> Lars Buitinck
> Scientific programmer, ILPS
> University of Amsterdam



-- 
INADA Naoki  <songofacandy at gmail.com>

