[Python-Dev] Fwd: summing a bunch of numbers (or "whatevers")

Guido van Rossum guido@python.org
Sun, 20 Apr 2003 20:58:06 -0400


Thanks to all for a good and quick discussion!

I'm swayed by Alex's argument that a simple sum() builtin answers a
lot of recurring questions, so I'd like to add it.

I've never liked reduce() -- in its full generality it leads to
hard-to-understand code, and I'm glad to see sum() remove probably 80%
of the need for it.
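
For illustration only, here is a minimal sketch of the common case that
sum() would take over from reduce().  It is written against current
Python, where reduce() lives in functools; the variable names are
made up for the example.

```python
from functools import reduce  # reduce() was a builtin in 2003; today it is in functools
import operator

numbers = [1, 2, 3, 4]

# The reduce() spelling: the reader has to work out what the callable does.
total_with_reduce = reduce(operator.add, numbers)

# The sum() spelling: the intent is in the name.
total_with_sum = sum(numbers)
```

Both spellings add up the same list; the second one just says so directly.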

I like sum() best as the name -- that's what it's called in other
systems.

I'm not too concerned about the number of builtins (we should
deprecate some anyway to make room for new ones).

I'm not too worried that people will ask for prod() as well.  And if
they do, maybe we can give them that too; there's not much else along
the same lines (bitwise or/and; ha ha ha) so even if the slope may be
a bit slippery, I'm not worried about sliding too far.

I don't think the signature should be extended to match min() and
max() -- min(a, b) serves a real purpose, but sum(a, b) is just a
redundant way of saying a+b, and ditto for sum(a,b,c) etc.

There's a bunch of statistics functions (avg or mean, sdev etc.) that
should go in a statistics package or module together with more
advanced statistics stuff -- it would be a good idea to form a working
group or SIG to design such a thing with an eye towards usability,
power, and avoiding traps for newbies.

Finally, there's the question of what sum() of an empty sequence
should be.  There are several ways to force it: you can write

  sum(L or [0])

(which avoids the cost of copying in sum(L + [0])), or we can give
sum() an optional second argument.  But still, what should sum([]) do?
I'm sure that the newbies who are asking for it would be surprised by
anything except sum([]) == 0, since they probably want to sum a list
of numbers, and occasionally (albeit through a bug in their program
:-) the list will be empty.  But that means that summing a sequence of
strings ends up with a strange end case.  So perhaps raising an
exception for an empty sequence, like min() and max(), is better: "In
the face of ambiguity, refuse the temptation to guess."  An optional
second argument can then be used to specify a starting point for the
summation.  The semantics of this argument should be the same as for
reduce():

  sum(S, x) == sum([x] + list(S))

and hence

  sum(["a", "b"], "x") == "xab"
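
The semantics described above can be sketched in pure Python.  This is
only an illustration of the proposal as worded in this message, not the
patch itself: the name proposed_sum, the _MISSING sentinel, and the
choice of ValueError (borrowed from what min() and max() raise on an
empty sequence) are all assumptions.

```python
_MISSING = object()  # sentinel so that any value, even None, can be a start


def proposed_sum(seq, start=_MISSING):
    """Hypothetical helper: add up seq, optionally beginning from start.

    With no start, an empty sequence is an error (assumed: ValueError,
    by analogy with min() and max()).
    """
    it = iter(seq)
    if start is _MISSING:
        try:
            total = next(it)  # first item seeds the running total
        except StopIteration:
            raise ValueError("proposed_sum() of an empty sequence") from None
    else:
        total = start
    for item in it:
        total = total + item
    return total
```

Under these assumptions, proposed_sum(S, x) gives the same result as
proposed_sum([x] + list(S)), and proposed_sum(L or [0]) is one way to
get 0 back for an empty list of numbers without passing a start value.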

(A minority view that I can't quite shake off: since the name sum()
strongly suggests it's summing up numbers, sum([]) should be 0 and no
second argument is allowed.  I find using sum() for a sequence of
strings a bit weird anyway, and will probably continue to write
"".join(S) for that case.)

Alex, care to send in your patch?

--Guido van Rossum (home page: http://www.python.org/~guido/)