[SciPy-dev] [SciPy-user] some benchmark data for numarray, Numeric and scipy-newcore
Travis Oliphant
oliphant.travis at ieee.org
Sun Dec 4 17:57:05 EST 2005
Gerard Vermeulen wrote:
>I took a look at the difference between arange in numarray and scipy:
>in numarray arange is a Python function which dispatches the real work
>to a type dependent C function, whereas in scipy arange does
>all calculations in C doubles, which are cast to the requested type.
>
>This may explain why numarray's arange is 5 times faster than scipy's
>arange on my system (don't ask me why David's results for numarray are
>so slow).
>
>
I looked into that last night and saw that one. We could very easily
add a "fillarray" function to each data type if that optimization is
seen as useful.
I think something should definitely be done so that a cast is not done
every time. The arange function could be made much faster, for sure.
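To make the difference concrete, here is a rough pure-Python sketch of the two
strategies, not the actual C code in either package. It uses the standard
library array module as a stand-in for typed storage. All names in it
(arange_via_double, fill_int, FILL, arange_typed) are hypothetical and only
illustrate the idea of a per-type "fillarray" hook.

```python
from array import array

INT_CODES = "bBhHiIlLqQ"  # integer typecodes of the stdlib array module


def arange_via_double(start, step, count, typecode):
    """Generic path: do the arithmetic in doubles, then cast every element."""
    to_target = int if typecode in INT_CODES else float
    out = array(typecode)
    for i in range(count):
        value = float(start) + i * float(step)  # always computed as a double
        out.append(to_target(value))            # one cast per element
    return out


def fill_int(start, step, count, typecode):
    """Hypothetical per-type fill: stays in integer arithmetic, no cast."""
    out = array(typecode)
    value = start
    for _ in range(count):
        out.append(value)
        value += step
    return out


# Hypothetical dispatch table: one fill function registered per data type.
FILL = {code: fill_int for code in INT_CODES}


def arange_typed(start, step, count, typecode):
    """Use the type-specific fill when one exists, else the generic path."""
    fill = FILL.get(typecode)
    if fill is None:
        return arange_via_double(start, step, count, typecode)
    return fill(start, step, count, typecode)
```

Both paths produce the same values for integer types; the typed path simply
skips the double computation and the per-element cast.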
The other issue of vector-vector and vector-scalar operations, I'm less
convinced about. Do we really need a whole other class of functions in
the ufunc machinery? If so, I'm inclined to include them in the math
operations for array-scalars, rather than the ufunc machinery.
The major slow-down that does have me wondering whether an algorithm
change (or optimization) is necessary is lines 4 and 7. These are
mixed-type operations which I think are exercising the BUFFER_LOOP
section of the general ufunc code. As the array sizes are larger than
the buffer size (default is 80000 bytes and could be changed), no copy
is made. In Numeric, a copy-cast is done on the entire array which is
the main reason, I think, for its slower performance. In scipy core,
currently, the cast is only done on a filled buffer. Right now, there
are two things happening which could be optimized:
1) Even if an array is not misbehaved, it is still copied over into a
buffer so that the inner loops are performed on the buffers.
Technically, this is not necessary, but otherwise we would have to
figure out a different way to signal that the inner loop should be
called (right now it's when the buffers are filled). The signal would
then have to be some combination of a filled buffer and the more
complicated notion that single-striding is no longer possible for this
array.
2) Items are copied over to the buffer one at a time. We should take
advantage of contiguous chunks where we can.
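As a rough illustration of the buffered approach and of point 2, here is a
pure-Python sketch (again using the standard library array module, not the
real ufunc code). It adds an integer array to a double array, casting only one
buffer's worth of data at a time instead of the whole input. The function name
add_buffered, the chunked flag, and the tiny buffer size counted in elements
are all assumptions made for the example; the real default mentioned above is
80000 bytes.

```python
from array import array


def add_buffered(ints, doubles, bufsize=4, chunked=True):
    """Add an int array to a double array, casting one buffer at a time."""
    n = len(ints)
    out = array("d", [0.0]) * n
    buf = array("d", [0.0]) * bufsize  # reused cast buffer
    for lo in range(0, n, bufsize):
        hi = min(lo + bufsize, n)
        m = hi - lo
        if chunked:
            # Copy-cast a whole contiguous chunk in one step.
            buf[:m] = array("d", ints[lo:hi].tolist())
        else:
            # Copy-cast the items into the buffer one at a time.
            for k in range(m):
                buf[k] = ints[lo + k]
        # Inner loop runs on the filled buffer, never on a full-size copy.
        for k in range(m):
            out[lo + k] = buf[k] + doubles[lo + k]
    return out
```

Either way only bufsize elements are ever held in cast form, which is the
contrast with a copy-cast of the entire array; the chunked branch just fills
the buffer with one slice copy per chunk rather than one copy per item.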
In short, numarray is doing a better job of handling the memory for the
misbehaved cases and we could learn something from that.
-Travis