[issue25478] Consider adding a normalize() method to collections.Counter()

Allen Downey report at bugs.python.org
Thu May 17 15:06:00 EDT 2018


Allen Downey <allendowney at gmail.com> added the comment:

I'd like to second Raymond's suggestion.  With just a few additional methods, you could support a useful set of operations.  One possible API:

def scaled(self, factor):
    """Returns a new Counter with all values multiplied by factor."""

def normalized(self, total=1):
    """Returns a new Counter with values normalized so their sum is total."""

def total(self):
    """Returns the sum of the values in the Counter."""

These operations would make it easier to use a Counter as a PMF without subclassing.
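
To make the intended semantics concrete, here is a minimal sketch of the three operations written as standalone helper functions rather than Counter methods. The function names mirror the proposed API, but the bodies, the `target` parameter name, and the error raised for an empty Counter are my assumptions, not an agreed design.

```python
from collections import Counter


def total(counter):
    """Return the sum of the values in the Counter."""
    return sum(counter.values())


def scaled(counter, factor):
    """Return a new Counter with all values multiplied by factor."""
    return Counter({key: value * factor for key, value in counter.items()})


def normalized(counter, target=1):
    """Return a new Counter with values rescaled so their sum is target."""
    current = total(counter)
    if current == 0:
        # Assumption: an empty (or zero-sum) Counter cannot be normalized.
        raise ValueError("cannot normalize a Counter whose values sum to zero")
    return scaled(counter, target / current)


counts = Counter({"a": 1, "b": 3})
pmf = normalized(counts)      # values are now probabilities
doubled = scaled(counts, 2)   # values are still counts
```

Each helper returns a new Counter and leaves its argument unchanged, which matches the "Returns a new Counter" wording of the docstrings above.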

I understand there are two arguments against this proposal:

1) If you modify the Counter after normalizing, the result is probably nonsense.

That's true, but it is already the case that some Counter methods don't make sense for some use cases, depending on how you are using the Counter (as a bag, multiset, etc.).

So the new features would come with caveats, but I don't think that's fatal.
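
As a small illustration of the first caveat, the sketch below normalizes a Counter by hand and then updates it as if it still held counts. The data is made up for the example.

```python
from collections import Counter

counts = Counter({"heads": 1, "tails": 3})
n = sum(counts.values())

# Normalize by hand: the values are now probabilities that sum to 1.
pmf = Counter({outcome: count / n for outcome, count in counts.items()})

# update() still treats the values as counts and adds 1 to "heads",
# so the result mixes probabilities with a raw count.
pmf.update(["heads"])
mixed_total = sum(pmf.values())  # no longer 1
```

After the update the values are neither counts nor probabilities, so the caller would have to normalize again (or keep the raw counts and normalize only when needed).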

2) PMF operations are not general enough for core Python; they should be in a stats module.

I think PMFs are used (or would be used) for lots of quick computations that don't require full-fledged stats.

Also, stats libraries tend to focus on analytic distributions; they don't really provide this kind of light-weight empirical PMF.

I think the proposed features have a high ratio of usefulness to implementation effort, without expanding the API unacceptably.


Two thoughts for alternatives/extensions:

1) It might be good to make scaled() available as __mul__, as Peter Norvig suggests.

2) If the argument of scaled() is a mapping type, it might be good to support elementwise scaling.  That would provide an elegant implementation of Raymond's chi-squared example and my inspection paradox example (http://greenteapress.com/thinkstats2/html/thinkstats2004.html#sec33)
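
Both extensions could look roughly like the sketch below. `ScalableCounter` is a hypothetical subclass used only for illustration; treating keys that are missing from the mapping as having a factor of 0 is my assumption, and other choices (leaving them unchanged, or raising an error) would be just as defensible.

```python
from collections import Counter
from collections.abc import Mapping


class ScalableCounter(Counter):
    """Hypothetical subclass for illustration; not part of collections."""

    def scaled(self, factor):
        """Return a new counter with values multiplied by factor.

        If factor is a mapping, each value is multiplied by the factor
        stored under the same key (assumption: missing keys count as 0).
        """
        if isinstance(factor, Mapping):
            return type(self)(
                {key: value * factor.get(key, 0) for key, value in self.items()}
            )
        return type(self)({key: value * factor for key, value in self.items()})

    def __mul__(self, factor):
        return self.scaled(factor)

    # Allow `2 * counter` as well as `counter * 2`.
    __rmul__ = __mul__


# Inspection-paradox style reweighting: one class of 10 and one class of 40.
class_sizes = ScalableCounter({10: 1, 40: 1})

# Weight each class size by the size itself, i.e. the number of students
# who would report it.
as_seen_by_students = class_sizes * {size: size for size in class_sizes}

n = sum(as_seen_by_students.values())
biased_pmf = {size: weight / n for size, weight in as_seen_by_students.items()}
```

In this made-up example the two class sizes are equally common among classes, but a randomly chosen student is four times as likely to be in the class of 40, which is the bias the elementwise scaling is meant to capture.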

Thank you!
Allen

----------
nosy: +Allen Downey

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue25478>
_______________________________________

