Numpy slow at vector cross product?

BartC bc at freeuk.com
Mon Nov 21 13:43:49 EST 2016


On 21/11/2016 17:04, Nobody wrote:
> On Mon, 21 Nov 2016 14:53:35 +0000, BartC wrote:
>
>> Also that the critical bits were not implemented in Python?
>
> That is correct. You'll notice that there aren't any loops in numpy.cross.
> It's just a wrapper around a bunch of vectorised operations (*, -, []).
>
> If you aren't taking advantage of vectorisation, there's no reason to
> expect numpy to be any faster than primitive operations, any more than
> you'd expect
>
> 	(numpy.array([1]) + numpy.array([2]))[0]
>
> to be faster than "1+2".
>
> Beyond that, you'd expect a generic function to be at a disadvantage
> compared to a function which makes assumptions about its arguments.
> Given what it does, I wouldn't expect numpy.cross() to be faster for
> individual vectors if it was written in C.

The fastest I can get compiled, native code to do this is at 250 million 
cross-products per second.

The fastest using pure Python executed with Python 2.7 is 0.5 million 
per second.

With pypy, around 8 million per second. (Results will vary by machine, 
version, and OS so this is just one set of timings.)

So numpy, at 0.03 million per second [on a different OS and different 
version], has room for improvement I think!

(In all cases, the loop has been hobbled so that one component 
increments per loop, and one component of the result is summed and then 
displayed at the end.

This is to stop gcc, and partly pypy, from optimising the code out of 
existence; usually you are not calculating the same vector product 
repeatedly. Without the restraint, pypy leaps to 100 million per second, 
and gcc to an infinite number.)
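
For anyone wanting to try the same kind of test, here is a rough sketch of
such a hobbled loop in pure Python. It is not the code behind the figures
above: the names cross() and bench(), the starting vectors, and the
iteration count are made up for illustration, and it assumes Python 3 for
time.perf_counter().

```python
import time

def cross(a, b):
    # Cross product of two 3-component sequences; no axis handling,
    # no checking of lengths or types.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def bench(n):
    # Illustrative starting values only.
    a = [1.0, 2.0, 3.0]
    b = [4.0, 5.0, 6.0]
    total = 0.0
    start = time.perf_counter()
    for _ in range(n):
        a[0] += 1.0               # one component increments per loop
        total += cross(a, b)[2]   # one component of the result is summed
    elapsed = time.perf_counter() - start
    return total, elapsed

n = 200000
total, elapsed = bench(n)
# Displaying the sum at the end keeps the work from being optimised away.
print("sum of z components:", total)
print("cross products per second:", n / elapsed)
```

The rate printed depends on machine, interpreter, and version, so it
shouldn't be expected to match any of the numbers quoted here.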

The tests were with values assumed to be vectors, assumed to have 3 
components, and without any messing about with axes, whatever that code 
does. It's just a pure, streamlined, vector cross product (just as I use 
in my own code).

Such a streamlined version can also be written in Python. (Although it 
would be better with a dedicated 3-component vector type rather than a 
general purpose list or even numpy array.)
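
As a sketch of what a dedicated 3-component type might look like in
Python (the class name Vec3 and its layout are hypothetical, not taken
from my own code or from numpy):

```python
class Vec3(object):
    # Fixed set of attributes: exactly three components, nothing else.
    __slots__ = ("x", "y", "z")

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def cross(self, other):
        # Assumes 'other' is also a Vec3; no axis or shape handling.
        return Vec3(self.y * other.z - self.z * other.y,
                    self.z * other.x - self.x * other.z,
                    self.x * other.y - self.y * other.x)

    def __repr__(self):
        return "Vec3(%r, %r, %r)" % (self.x, self.y, self.z)

v = Vec3(1, 0, 0).cross(Vec3(0, 1, 0))
print(v)
```

Whether this is any quicker than a plain list or tuple under CPython
would need measuring; the point is only that the type can assume three
components instead of checking for them.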

It's still a puzzle why directly executing the code that numpy uses was 
faster than calling numpy.cross itself, when both were run with CPython. 
Unless numpy is perhaps using extra wrappers around numpy.cross.

-- 
Bartc



More information about the Python-list mailing list