[Numpy-discussion] The NumPy Mandelbrot code 16x slower than Fortran

Sat Jan 21 22:55:10 EST 2012

Hi,

I read the Mandelbrot code using NumPy at this page:

http://mentat.za.net/numpy/intro/intro.html

but when I run it, it gives me integer overflows. As such, I have
fixed the code, so that it doesn't overflow here:

https://gist.github.com/1655320

and I have also written an equivalent Fortran program.

You can compare both source codes to see
that that it is pretty much one-to-one translation.
The main idea in the above gist is to take an
algorithm written in NumPy, and translate
it directly to Fortran, without any special
optimizations. So the above is my first try
in Fortran. You can plot the result
using this simple script (you
can also just click on this gist to
see the image there):

https://gist.github.com/1655377

Here are my timings:

               Python  Fortran Speedup
Calculation     12.749  00.784  16.3x
Saving  01.904  01.456  1.3x
Total          14.653   02.240  6.5x

I save the matrices to disk in an ascii format,
so it's quite slow in both cases. The pure computation
is however 16x faster in Fortran (in gfortran,
I didn't even try Intel Fortran, that will probably be
even faster).

As such, I wonder how the NumPy version could be sped up?
I have compiled NumPy with Lapack+Blas from source.

Would anyone be willing to run the NumPy version? Just copy+paste
should do it.

If you want to run the Fortran version, the above gist uses
some of my other modules that I use in my other programs, my goal
was to see how much more complicated the Fortran code gets,
compared to NumPy. As such, I put here

https://gist.github.com/1655350

a single file
with all the dependencies, just compile it like this:

gfortran -fPIC -O3 -march=native -ffast-math -funroll-loops mandelbrot.f90

and run:

$ ./a.out
Iteration 1
Iteration 2
...
Iteration 100
 Saving...
 Times:
 Calculation:  0.74804599999999999
 Saving:   1.3640850000000002
 Total:   2.1121310000000002

Let me know if you figure out something. I think the "mask" thing is
quite slow, but the problem is that it needs to be there, to catch
overflows (and it is there in Fortran as well, see the
"where" statement, which does the same thing). Maybe there is some
other way to write the same thing in NumPy?

Ondrej