[Numpy-discussion] Why is numpy.abs so much slower on complex64 than complex128 under windows 32-bit?

Tue Apr 10 10:55:53 EDT 2012

On 10/04/2012 16:36, Francesc Alted wrote:
> In [10]: timeit c = numpy.complex64(numpy.abs(numpy.complex128(b)))
> 100 loops, best of 3: 12.3 ms per loop
>
> In [11]: timeit c = numpy.abs(b)
> 100 loops, best of 3: 8.45 ms per loop
>
> on your Windows box and see if they give similar results?
>
No, the results are much the same as before: ~40 ms for the first
(upcast/downcast) case and ~150 ms for the direct case (both *much*
slower than yours!). This compares with ~28 ms for operating directly
on double precision.
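
For anyone who wants to repeat the comparison outside IPython, here is a minimal sketch of the three cases discussed above. The array size and random contents are assumptions (the original `b` is not shown in this message), `bench` is a hypothetical helper, and `astype` is used for the casts with the result stored as float32, which is a choice made for this sketch rather than the exact expression quoted above.

```python
import timeit

import numpy as np


def bench(func, repeat=3, number=10):
    """Return the best per-call time in seconds (hypothetical helper)."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


rng = np.random.default_rng(0)
n = 200_000  # assumed size; the original array length is not given
a = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # complex128
b = a.astype(np.complex64)                                # complex64 copy

results = {
    # operate directly on double precision
    "complex128 direct": bench(lambda: np.abs(a)),
    # operate directly on single precision (the slow case reported here)
    "complex64 direct": bench(lambda: np.abs(b)),
    # upcast to complex128, take abs, downcast the real result
    "complex64 via complex128": bench(
        lambda: np.abs(b.astype(np.complex128)).astype(np.float32)
    ),
}

for name, seconds in results.items():
    print(f"{name}: {seconds * 1e3:.3f} ms per call")
```

The absolute numbers will depend on the platform, compiler and numpy build, so only the ratios between the three cases are worth comparing.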

I'm using numexpr in the end, though it is slower than numpy.abs under Linux.

>> On a related note of confusion, the times above are notably (and
>> consistently) shorter than those I get from a naive `st =
>> time.time(); numpy.abs(a); print time.time()-st`. Is this to be expected?
>>
>
> This happens a lot, yes, especially when your code is
> memory-bottlenecked (a very common situation).  The explanation is
> simple: when your datasets are small enough to fit in CPU cache, the
> first time the timing loop runs, it brings all your working set into
> cache, so the second time the computation is evaluated, it does
> not have to fetch data from memory, and by the time you run the loop
> 10 times or more, you are discarding any memory effect.  However, when
> you run the loop only once, you are including the memory fetch time
> too (which is often much more realistic).
Ah, that makes sense. Thanks!
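
The difference between the two ways of timing can be seen with a rough sketch like the one below. The array size is an assumption, and whether the single-shot figure comes out larger depends on the cache size and on what else has touched the data, so no particular output should be expected.

```python
import time
import timeit

import numpy as np

n = 200_000  # assumed size; pick one relevant to your own data
rng = np.random.default_rng(0)
a = (rng.standard_normal(n) + 1j * rng.standard_normal(n)).astype(np.complex64)

# Naive single-shot timing: includes whatever memory fetches are needed
# to bring the data in, plus any one-time setup cost of the call.
st = time.perf_counter()
np.abs(a)
single_shot = time.perf_counter() - st

# timeit-style timing: best of several repeated loops, so a working set
# that fits in cache is already warm for most of the calls.
number = 100
best_of_loops = min(
    timeit.repeat(lambda: np.abs(a), repeat=3, number=number)
) / number

print(f"single shot:   {single_shot * 1e3:.3f} ms")
print(f"best of loops: {best_of_loops * 1e3:.3f} ms per call")
```

`time.perf_counter` is used instead of `time.time` because it has better resolution for short intervals; the comparison itself is the same as the naive one quoted above.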

Cheers,

Henry