[Numpy-discussion] The NumPy Mandelbrot code 16x slower than Fortran

Sun Jan 22 23:39:35 EST 2012

On 22/01/2012 20:01, Ondřej Čertík wrote:
> On Sun, Jan 22, 2012 at 3:13 AM, Sebastian Haase<seb.haase at gmail.com>  wrote:
>> How does the algorithm and timing compare to this one:
>>
>> http://code.google.com/p/priithon/source/browse/Priithon/mandel.py?spec=svna6117f5e81ec00abcfb037f0f9da2937bb2ea47f&r=a6117f5e81ec00abcfb037f0f9da2937bb2ea47f
>>
>> The author of original version is  Dan Goodman
>> # FAST FRACTALS WITH PYTHON AND NUMPY
>
> Thanks Sebastian. This one is much faster ---- 2.7s on my laptop with
> the same dimensions/iterations.
>
> It uses a better datastructures -- it only keeps track of points that
> still need to be iterated --- very clever.
> If I have time, I'll try to provide an equivalent Fortran version too,
> for comparison.

I spent a little while trying to optimise my algorithm using only numpy 
and couldn't get it running much faster than that. Given the relatively 
low number of iterations it's probably not a problem of Python 
overheads, so I guess it is indeed memory access that is the problem. 
One way to get round this using numexpr would be something like this. 
Write f(z)=z^2+c and then f(n+1,z)=f(n,f(z)). Now try out instead of 
computing z->f(z) each iteration, write down the formula for z->f(n,z) 
for a few different n and use that in numexpr, e.g. z->f(2,z) or 
z->(z^2+c)^2+c. This amounts to doing several iterations per step, but 
it means that you'll be spending more time doing floating point ops and 
less time waiting for memory operations so it might get closer to 
fortran/C speeds.

Actually, my curiosity was piqued so I tried it out. On my laptop I get 
that using the idea above gives a maximum speed increase for n=8, and 
after that you start to get overflow errors so it runs slower. At n=8 it 
runs about 4.5x faster than the original version. So if you got the same 
speedup it would be running in about 0.6s compared to your fortran 0.7s. 
However it's not a fair comparison as numexpr is using multiple cores 
(but only about 60% peak on my dual core laptop), but still nice to see 
what can be achieved with numexpr. :)

Code attached.

Dan
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fastermandel.py
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120123/e1d93a12/attachment.ksh>