Numpy slow at vector cross product?

Mon Nov 21 09:50:16 EST 2016

On Mon, 21 Nov 2016 11:09 pm, BartC wrote:

> On 21/11/2016 02:48, Steve D'Aprano wrote:
[...]
>> However, your code is not a great way of timing code. Timing code is
>> *very* difficult, and can be affected by many things, such as external
>> processes, CPU caches, even the function you use for getting the time.
>> Much of the time you are timing here will be in creating the range(loops)
>> list, especially if loops is big.
> 
> But both loops are the same size. And that overhead can quickly be
> disposed of by measuring empty loops in both cases. (On my machine,
> about 0.006/7 seconds for loops of 100,000.)

No, you cannot make that assumption, not in general. On modern machines, you
cannot assume that the time it takes to execute foo() immediately followed
by bar() is the same as the time it takes to execute foo() and bar()
separately.

Modern machines run multi-tasking operating systems, where there can be
other processes running. Depending on what you use as your timer, you may
be measuring the time that those other processes run. The OS can cache
frequently used pieces of code, which allows it to run faster. The CPU
itself will cache some code.
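
To make the point about timers concrete, here is a minimal sketch (my own
illustration, not part of the benchmark under discussion) comparing a
wall-clock timer with a CPU-time timer. The helper name measure() and the
0.2 second sleep are arbitrary choices; the sleep simply stands in for time
during which the OS is busy running something else.

```python
import time

def measure(func):
    """Return (wall_seconds, cpu_seconds) for a single call of func."""
    wall_start = time.perf_counter()   # wall clock: includes time spent waiting
    cpu_start = time.process_time()    # CPU time used by this process only
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu

def idle():
    # While we sleep, this process uses almost no CPU,
    # but the wall clock keeps ticking.
    time.sleep(0.2)

wall, cpu = measure(idle)
print('wall clock: %.3f s' % wall)
print('CPU time:   %.3f s' % cpu)
```

The two numbers differ a lot, which is why the choice of timer changes what
you are actually measuring.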

The shorter the code snippet, the more these complications are relevant. In
this particular case, we can be reasonably sure that the time it takes to
create a list range(100000) and the overhead of the loop is *probably* quite
a small percentage of the time it takes to perform 100000 vector
multiplications. But that's not a safe assumption for all code snippets.
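
If you want to check that assumption rather than guess, something like the
following rough sketch would do. Note that cross() here is a pure-Python
stand-in that I made up so the example runs without numpy; it is not
np.cross, and the numbers it gives say nothing about numpy itself.

```python
import timeit

def cross(a, b):
    """Pure-Python cross product of two 3-vectors (stand-in for np.cross)."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

LOOPS = 100000
namespace = {'cross': cross, 'x': (1, 2, 3), 'y': (4, 5, 6)}

# Best of five runs for a loop body that does nothing at all...
empty = min(timeit.repeat('pass', repeat=5, number=LOOPS))
# ...and best of five for the same loop doing the real work.
work = min(timeit.repeat('cross(x, y)', repeat=5, number=LOOPS,
                         globals=namespace))

print('empty loop: %.4f s' % empty)
print('with work:  %.4f s' % work)
print('loop overhead is roughly %.1f%% of the total' % (100.0 * empty / work))
```

The percentage will vary from run to run and machine to machine, which is
rather the point.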

This is why the timeit module exists: to do the right thing when it matters,
so that you don't have to think about whether or not it matters. The timeit
module works really really hard to get good quality, accurate timings,
minimizing any potential overhead.

The timeit module automates a bunch of tricky-to-get-right best practices for
timing code. Is that a problem?

>> The best way to time small snippets of code is to use the timeit module.
>> Open a terminal or shell (*not* the Python interactive interpreter, the
>> operating system's shell: you should expect a $ or % prompt) and run
>> timeit from that. Copy and paste the following two commands into your
>> shell prompt:
>>
>>
>> python2.7 -m timeit --repeat 5 -s "import numpy as np" \
>> -s "x = np.array([1, 2, 3])" -s "y = np.array([4, 5, 6])" \
>> -- "np.cross(x, y)"
[...]

> Yes, I can see that typing all the code out again, and remembering all
> those options and putting -s, -- and \ in all the right places, is a
> much better way of doing it! Not error prone at all.

Gosh Bart, how did you manage to write that sentence? How did you remember
all those words, and remember to put the punctuation marks in the right
places?

You even used sarcasm! You must be a genius. (Oh look, I can use it too.)

Seriously Bart? You've been a programmer for how many decades, and you can't
work out how to call a command from the shell? This is about working
effectively with your tools, and a basic understanding of the shell is an
essential tool for programmers.

This was a *simple* command. It was a LONG command, but don't be fooled by
the length or the fact that it went over multiple lines: it was dirt
simple. I'm not saying that every programmer needs to be a greybeard Unix
guru (heaven knows that I'm not!), but they ought to be able to run simple
commands from the command line.

Those who don't are in the same position as carpenters who don't know the
differences between the various kinds of hammers or saws. Sure, you can
still do a lot of work using just one kind of hammer and one kind of saw,
but you'll work better, faster and smarter with the right kind. You don't
use a rip saw to make fine cuts, and you don't use a mash hammer to drive
tacks.

The -m option lets you run a module without knowing the precise location of
the source file. Some of the most useful commands to learn:

python -m unittest ...
python -m doctest ...
python -m timeit ...

The biggest advantage of calling the timeit module is that it will
automatically select the number of iterations you need to run to get good
timing results, without wasting time running excessive loops.
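
The same automatic selection is available from inside Python via
Timer.autorange(), assuming you have Python 3.6 or later. A small sketch,
again using a made-up pure-Python cross() as a stand-in for np.cross so that
it runs without numpy:

```python
import timeit
from functools import partial

def cross(a, b):
    """Pure-Python cross product of two 3-vectors (stand-in for np.cross)."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

t = timeit.Timer(partial(cross, (1, 2, 3), (4, 5, 6)))

# autorange() keeps increasing the loop count (1, 2, 5, 10, 20, ...)
# until a single run takes at least 0.2 seconds in total.
number, total = t.autorange()

print('%d loops, %.3g seconds per call' % (number, total / number))
```

From the command line you get this behaviour for free by leaving out the -n
option.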

(The timeit module is *significantly* improved in Python 3.6, but even in
older versions it's pretty good.)

But if you prefer doing it "old school" from within Python, then:

from timeit import Timer
t = Timer('np.cross(x, y)',  setup="""
import numpy as np
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
""")

# take five measurements of 100000 calls each, and report the fastest
result = min(t.repeat(number=100000, repeat=5))/100000
print(result)  # time in seconds per call

Better?

-- 
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.