Real-world Python code 700 times slower than C
Kragen Sitaker
kragen at pobox.com
Wed Jan 23 18:45:27 EST 2002
In an article no longer on my news server, Fernando Perez wrote:
A quick rewrite in Numeric gives me about a 5x speedup, but
there's still a nasty bottleneck: the malloc() call implicit in
every call to RampNum:
    def RampNum(result, size, start, end):
        step = (end-start)/(size-1)
        result[:] = arange(size)*step + start
There's no easy way to do (that I know of) the in-place operation
in Numeric, a very annoying limitation. Numeric will always
compute a new array on the right hand side, unfortunately (with
the associated allocation).
Well, there are actually three allocations going on there:
- one for arange(size)
- one for the multiplication
- one for the addition
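To make the three temporaries visible, here is a minimal sketch that splits the right-hand side into its steps. It assumes modern NumPy (imported as np) as a stand-in for Numeric, which is an assumption on my part and not what the original code used; the names a, b and c are purely illustrative.

```python
import numpy as np

size, start, end = 5, 0.0, 1.0
step = (end - start) / (size - 1)

a = np.arange(size)   # allocation 1: the integer ramp 0..size-1
b = a * step          # allocation 2: a new array holding the product
c = b + start         # allocation 3: a new array holding the sum

# Each step produced a distinct buffer rather than reusing the previous one.
print(np.shares_memory(a, b), np.shares_memory(b, c))
```

Both checks should report that no memory is shared, which is the behaviour the list above describes.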
I think you can reduce this to one, the arange(size) temporary, with the
following untested code in Python 2.x:
    result[:] = arange(size)
    result *= step
    result += start
You can also say:
    result[:] = arange(size)
    Numeric.multiply(result, step, result)
    Numeric.add(result, start, result)
All the binary ufuncs defined in Numeric have a three-argument form in
which the third argument specifies where to put the result. This is
very helpful when you're trying to speed up inner loops of Numpy
programs.
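For anyone who wants to try the three-argument form today, here is a minimal sketch using modern NumPy in place of Numeric. That substitution is an assumption; NumPy ufuncs accept the output array as a third argument in the same way. The function name ramp_inplace is hypothetical, and the sketch assumes result is a one-dimensional float array with at least two elements.

```python
import numpy as np

def ramp_inplace(result, start, end):
    """Fill `result` with a linear ramp from start to end, reusing its buffer."""
    size = result.shape[0]
    step = (end - start) / (size - 1)   # assumes size >= 2
    result[:] = np.arange(size)         # the one remaining temporary
    np.multiply(result, step, result)   # third argument: write the product into result
    np.add(result, start, result)       # third argument: write the sum into result
    return result

buf = np.empty(5, dtype=float)          # float buffer; an integer buffer would fail to cast
ramp_inplace(buf, 0.0, 1.0)
print(buf)
```

The buffer is allocated once by the caller and can be reused across calls, which is where the saving shows up in an inner loop.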