[pypy-issue] Issue #1852: sum(arrays,[]) is slow (pypy/pypy)
Anton Dubrau
issues-reply at bitbucket.org
Thu Aug 28 05:09:21 CEST 2014
New issue 1852: sum(arrays,[]) is slow
https://bitbucket.org/pypy/pypy/issue/1852/sum-arrays-is-slow
Anton Dubrau:
sum(strings,"") is very slow (unnecessarily so), but at least there's the "".join(strings) function.
when appending a list of arrays together in a similar fashion, there's no join method.
Looping over an array and growing a result array using extend is much faster, but still kind of slow (it appears it's superlinear).
The fastest way seems to be to pre-allocate a result array and copy the values into that, but it's kind of cumbersome.
I feel that sum(arrays,[]) (and incidentally sum(strings,"")) is idiomatic, and pypy should specialize on it. It can also easily avoid re-allocations in theory, compared to using extend.
Clearly, the time difference between the different approaches is unreasonable:
some profiling:
def sumExtend(arrays):
result = []
for a in arrays:
result.extend(a)
return result
def sumPreallocate(arrays):
length = sum(len(a) for a in arrays)
result = [0]*length
index = 0;
for a in arrays:
result[index:index+len(a)] = a
index += len(a)
return result
>> arrays = [[i%10]*i for i in range(10000)]
>> %time result = sum(arrays,[])
cancelled after two and half hours ... may take days?
>> %time _ = sumPreallocate(arrays)
Wall time: 588 ms
>> %time _ = sumExtend(arrays)
Wall time: 1min 42s
More information about the pypy-issue
mailing list