[Numpy-discussion] The NumPy Mandelbrot code 16x slower than Fortran
Dan Goodman
dg.gmane at thesamovar.net
Sun Jan 22 23:39:35 EST 2012
On 22/01/2012 20:01, Ondřej Čertík wrote:
> On Sun, Jan 22, 2012 at 3:13 AM, Sebastian Haase<seb.haase at gmail.com> wrote:
>> How does the algorithm and timing compare to this one:
>>
>> http://code.google.com/p/priithon/source/browse/Priithon/mandel.py?spec=svna6117f5e81ec00abcfb037f0f9da2937bb2ea47f&r=a6117f5e81ec00abcfb037f0f9da2937bb2ea47f
>>
>> The author of original version is Dan Goodman
>> # FAST FRACTALS WITH PYTHON AND NUMPY
>
> Thanks Sebastian. This one is much faster ---- 2.7s on my laptop with
> the same dimensions/iterations.
>
> It uses a better datastructures -- it only keeps track of points that
> still need to be iterated --- very clever.
> If I have time, I'll try to provide an equivalent Fortran version too,
> for comparison.
I spent a little while trying to optimise my algorithm using only numpy
and couldn't get it running much faster than that. Given the relatively
low number of iterations it's probably not a problem of Python
overheads, so I guess it is indeed memory access that is the problem.
One way to get round this using numexpr would be something like this.
Write f(z)=z^2+c and then f(n+1,z)=f(n,f(z)). Now try out instead of
computing z->f(z) each iteration, write down the formula for z->f(n,z)
for a few different n and use that in numexpr, e.g. z->f(2,z) or
z->(z^2+c)^2+c. This amounts to doing several iterations per step, but
it means that you'll be spending more time doing floating point ops and
less time waiting for memory operations so it might get closer to
fortran/C speeds.
Actually, my curiosity was piqued so I tried it out. On my laptop I get
that using the idea above gives a maximum speed increase for n=8, and
after that you start to get overflow errors so it runs slower. At n=8 it
runs about 4.5x faster than the original version. So if you got the same
speedup it would be running in about 0.6s compared to your fortran 0.7s.
However it's not a fair comparison as numexpr is using multiple cores
(but only about 60% peak on my dual core laptop), but still nice to see
what can be achieved with numexpr. :)
Code attached.
Dan
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fastermandel.py
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20120123/e1d93a12/attachment.ksh>
More information about the NumPy-Discussion
mailing list