sum and strings
Tim Chase
python.list at tim.thechases.com
Thu Aug 24 12:12:14 EDT 2006
>> Just because something is slow or sub-optimal doesn't mean it
>> should be an error.
>
> that's not an error because it would be "slow or sub-optimal" to add
> custom objects, that's an error because you don't understand how "sum"
> works.
>
> (hint: sum != reduce)
No, clearly sum != reduce...no dispute there.
So we go ahead and get sum([q1,q2]) working by specifying a
starting value, sum([q1,q2], Q()):
>>> class Q(object):
...     def __init__(self, n=0, i=0, j=0, k=0):
...         self.n = n
...         self.i = i
...         self.j = j
...         self.k = k
...     def __add__(self, other):
...         return Q(self.n+other.n,
...                  self.i+other.i,
...                  self.j+other.j,
...                  self.k+other.k)
...     def __repr__(self):
...         return "<Q(%i,%i,%i,%i)>" % (
...             self.n,
...             self.i,
...             self.j,
...             self.k)
...
>>> q1 = Q(1,2,3,5)
>>> q2 = Q(7,11,13,17)
>>> q1+q2
<Q(8,13,16,22)>
>>> sum([q1,q2])
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: unsupported operand type(s) for +: 'int' and 'Q'
>>> sum([q1,q2], Q())
<Q(8,13,16,22)>
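To spell out why the first call fails: sum() defaults to start=0, so
its first step is effectively 0 + q1, and int does not know how to add
a Q. A minimal sketch of that, using a trimmed two-field stand-in for
the Q class above (the trimming is only for brevity and is my
assumption, not part of the session):

```python
class Q(object):
    """Trimmed two-field stand-in for the Q class above (illustration only)."""
    def __init__(self, n=0, i=0):
        self.n = n
        self.i = i
    def __add__(self, other):
        return Q(self.n + other.n, self.i + other.i)
    def __repr__(self):
        return "<Q(%i,%i)>" % (self.n, self.i)

q1, q2 = Q(1, 2), Q(7, 11)

# sum() defaults to start=0, so its first addition is effectively 0 + q1.
try:
    0 + q1
    first_step_failed = False
except TypeError:
    first_step_failed = True

# With an explicit Q() as the start value, every addition is Q + Q.
total = sum([q1, q2], Q())
print(first_step_failed, total)
```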
Thus, sum() seems to work just fine for objects that define an
__add__ method. Strings define an __add__ method too:
>>> hasattr("", "__add__")
True
yet, using the same pattern...
>>> sum(["hello", "world"], "")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: sum() can't sum strings [use ''.join(seq) instead]
This seems like an arbitrary prejudice against strings, flying
in the face of Python's duck-typing. If it has an __add__
method, duck-typing says you should be able to provide a starting
place and a sequence of things to add to it, and get the sum.
However, a new sum2() function can be created...
>>> def sum2(seq, start=0):
...     for item in seq:
...         start += item
...     return start
...
which does what one would expect the definition of sum() should
be doing behind the scenes.
>>> # generate the expected error, same as above
>>> sum2([q1,q2])
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "<stdin>", line 3, in sum2
TypeError: unsupported operand type(s) for +=: 'int' and 'Q'
>>> # employ the same solution of a proper starting point
>>> sum2([q1,q2], Q())
<Q(8,13,16,22)>
>>> # do the same thing for strings
>>> sum2(["hello", "world"], "")
'helloworld'
and sum2() works just like sum(), only it happily takes strings
without prejudice.
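One caveat, offered as a sketch rather than a correction: because
sum2() uses +=, a mutable start value such as a list is extended in
place, whereas the built-in sum() leaves the caller's start object
alone. The variant below (the name sum3 is hypothetical, not from this
thread) rebinds with plain + instead:

```python
def sum2(seq, start=0):
    # As posted: += may modify `start` in place when it is mutable.
    for item in seq:
        start += item
    return start

def sum3(seq, start=0):
    # Hypothetical variant: plain + builds a new object at each step.
    for item in seq:
        start = start + item
    return start

base2 = []
result2 = sum2([[1], [2]], base2)   # base2 itself gets extended

base3 = []
result3 = sum3([[1], [2]], base3)   # base3 is left untouched

print(base2, base3)
print(sum3(["hello", "world"], ""))
```

For immutable types such as strings, numbers, and the Q objects above,
the two versions behave the same.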
From help(sum):
"Returns the sum of a sequence of numbers (NOT strings) plus the
value of parameter 'start'. When the sequence is empty, returns
start."
It would be as strange as if enumerate() didn't take strings, and
instead forced you to use some other method for enumerating strings:
>>> for i,c in enumerate("hello"): print i,c
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: enumerate() can't enumerate strings [use
"hello".enumerator() instead]
Why the arbitrary breaking of duck-typing for strings in sum()?
Why make them second-class citizens?
The interpreter is clearly smart enough to recognize when the
condition occurs such that it can throw the error...thus, why not
add a few more smarts and have it simply translate it into
"start+''.join(sequence)" to maintain predictable behavior
according to duck-typing?
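As a rough illustration of that proposal, here is a pure-Python
wrapper. The name duck_sum is hypothetical, and dispatching on the
type of start is my assumption about how such a translation would be
triggered; it is a sketch, not a patch to the built-in:

```python
def duck_sum(seq, start=0):
    """Hypothetical wrapper: like sum(), but joins strings instead of refusing them."""
    if isinstance(start, str):
        # The translation suggested above: start + ''.join(sequence).
        return start + "".join(seq)
    # Everything else goes through the built-in unchanged.
    return sum(seq, start)

print(duck_sum(["hello", "world"], ""))
print(duck_sum([1, 2, 3]))
```

The caller writes the same thing for strings, numbers, and custom
objects, while strings still get the efficient join underneath.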
-tkc