strange behaviour of sum()

Jan-Erik Meyer-Lütgens python at meyer-luetgens.de
Wed Nov 19 17:46:07 EST 2003


Ben wrote:
> I'm trying to figure out how how complex map, filter and reduce work
> based on the following piece of code from
> http://www-106.ibm.com/developerworks/linux/library/l-prog.html :
> 
> bigmuls = lambda xs,ys: filter(lambda (x,y):x*y > 25, combine(xs,ys))
> combine = lambda xs,ys: map(None, xs*len(ys), dupelms(ys,len(xs)))
> dupelms = lambda lst,n: reduce(lambda s,t:s+t, map(lambda l,n=n:
> [l]*n, lst))
> print bigmuls((1,2,3,4),(10,15,3,22))
>

Hi all,

I've played with this example, too. While rewriting it to use
zip() and sum(), I noticed a quirk of the sum() function.

# keep pairs whose product is greater than 25.
bigmuls=lambda xs,ys:filter(lambda (x,y): x*y>25, combine(xs, ys))

# compute the cross product of two lists.
# combine([1,2],[3,4]) --> [(1, 3), (2, 3), (1, 4), (2, 4)]
combine=lambda xs,ys:zip(xs*len(ys), dupelms(ys, len(xs)))

# duplicate elements of lst n times.
# dupelms([1,2], 3) --> [1, 1, 1, 2, 2, 2]
dupelms=lambda lst,n:sum(map(lambda element,n=n:[element]*n, lst),[])
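
For readers on Python 3, the lambdas above no longer run unchanged: tuple-unpacking parameters such as lambda (x,y) were removed, and zip(), map() and filter() return iterators. A minimal sketch of the same three helpers, assuming Python 3 and using plain functions instead of lambdas (that choice is mine, not part of the original example):

```python
# Python 3 sketch of the helpers above (assumption: Python 3 semantics).

def dupelms(lst, n):
    # Repeat each element n times. The [] start value makes sum()
    # concatenate the small lists instead of adding them to 0.
    return sum(([element] * n for element in lst), [])

def combine(xs, ys):
    # Pair every x with every y, in the same order as the lambda version.
    return list(zip(list(xs) * len(ys), dupelms(ys, len(xs))))

def bigmuls(xs, ys):
    # Keep pairs whose product is greater than 25.
    return [(x, y) for (x, y) in combine(xs, ys) if x * y > 25]

print(dupelms([1, 2], 3))        # [1, 1, 1, 2, 2, 2]
print(combine([1, 2], [3, 4]))   # [(1, 3), (2, 3), (1, 4), (2, 4)]
print(bigmuls((1, 2, 3, 4), (10, 15, 3, 22)))
```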


I ran into trouble when I used sum() naively:

     sum([ [1], [2], [3] ])

results in:

     TypeError: unsupported operand type(s) for +: 'int' and 'list'


A peek into the library reference:

sum(sequence[, start])
     Sums start and the items of a sequence, from left to right,
     and returns the total. start defaults to 0. The sequence's
     items are normally numbers, and are not allowed to be strings.
     The fast, correct way to concatenate sequence of strings
     is by calling ''.join(sequence). Note that sum(range(n), m)
     is equivalent to reduce(operator.add, range(n), m)
     New in version 2.3.


As we can see:
   sum(seq, start) is equivalent to reduce(operator.add, seq, start)

but:
   sum(seq) is not equivalent to reduce(operator.add, seq)

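because the start value of sum() defaults to 0, whereas reduce()
without an initial value starts from the first item. So we must
explicitly set the start value to the neutral element of the
addition, here the empty list: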

     sum([ [1], [2], [3] ], [])
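
The difference between the two spellings can be checked directly. This is a small sketch, assuming Python 3, where reduce() lives in functools (in Python 2.3 it is a builtin); the variable names are only for illustration:

```python
from functools import reduce  # builtin in Python 2, in functools in Python 3
import operator

seq = [[1], [2], [3]]

# With an explicit start value, both spellings agree.
with_start_sum = sum(seq, [])
with_start_reduce = reduce(operator.add, seq, [])

# Without a start value, reduce() begins with the first item ...
no_start_reduce = reduce(operator.add, seq)

# ... while sum() begins with 0 and fails on 0 + [1].
try:
    sum(seq)
    no_start_sum_error = None
except TypeError as exc:
    no_start_sum_error = str(exc)

print(with_start_sum, with_start_reduce, no_start_reduce)
print(no_start_sum_error)
```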


The other strange behaviour is:

     sum(['my', 'pet', 'fish', 'eric'], '')

results in:

     TypeError: sum() can't sum strings [use ''.join(seq) instead]
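
To see both failure modes side by side, here is a short sketch, assuming Python 3. The helper try_sum() is hypothetical and exists only to capture the error text instead of aborting:

```python
words = ['my', 'pet', 'fish', 'eric']

def try_sum(items, *start):
    # Return the result of sum(), or the error message if it raises.
    try:
        return sum(items, *start)
    except TypeError as exc:
        return 'TypeError: %s' % exc

# A string start value is refused outright.
print(try_sum(words, ''))

# Without a start value, sum() tries 0 + 'my' and fails for a different reason.
print(try_sum(words))

# The spelling that the error message recommends.
print(''.join(words))
```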

If there is special treatment for strings, why doesn't sum()
use ''.join(seq) itself, instead of telling me that I should
use it? In fact, I think sum() should simply call operator.add(),
even for strings, as the following example shows:


from types import StringType

class MyString(StringType):
#    __metaclass__ = type

     def __init__(self, value):
         StringType.__init__(self, value)

     def __str__(self):
         return StringType.__str__(self)

     def __add__(self, other):
         return MyString(str(self) + ' ' + str(other))

my   = MyString('my')
pet  = MyString('pet')
fish = MyString('fish')
eric = MyString('eric')

# different output (with intent):
print my + pet + fish + eric         # --> my pet fish eric
print ''.join([my, pet, fish, eric]) # --> mypetfisheric

# two quirks of sum() here:
# 1. it does not work, because MyString is a subclass of str
# 2. the result would have an extra, unwanted leading space (if it worked)
print sum( [my, pet, fish, eric], MyString('') )
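
The same experiment can be repeated on Python 3, where StringType is gone and the class derives from str directly. This is a sketch under that assumption, not the code from the post. It also shows reduce(operator.add, ...) as one way to get the custom __add__ applied without a start value, and so without the extra space:

```python
from functools import reduce
import operator

class MyString(str):
    # Hypothetical Python 3 port of the MyString class above.
    def __add__(self, other):
        return MyString(str(self) + ' ' + str(other))

items = [MyString(w) for w in ('my', 'pet', 'fish', 'eric')]

# reduce() without an initial value calls MyString.__add__ pairwise.
joined_with_add = reduce(operator.add, items)

# ''.join() ignores __add__ and concatenates the raw characters.
joined_with_join = ''.join(items)

# sum() refuses the str subclass as a start value.
try:
    sum(items, MyString(''))
    sum_error = None
except TypeError as exc:
    sum_error = str(exc)

print(joined_with_add)
print(joined_with_join)
print(sum_error)
```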


I would consider these quirks unpythonic, because the behaviour
of the sum() function is neither generic nor intuitive.

Will this be fixed in future releases of Python?




