[Numpy-discussion] The NumPy Mandelbrot code 16x slower than Fortran
Ondřej Čertík
ondrej.certik at gmail.com
Sat Jan 21 22:55:10 EST 2012
Hi,
I read the Mandelbrot code using NumPy at this page:
http://mentat.za.net/numpy/intro/intro.html
but when I run it, it gives me integer overflows. As such, I have
fixed the code, so that it doesn't overflow here:
https://gist.github.com/1655320
and I have also written an equivalent Fortran program.
You can compare both source codes to see
that that it is pretty much one-to-one translation.
The main idea in the above gist is to take an
algorithm written in NumPy, and translate
it directly to Fortran, without any special
optimizations. So the above is my first try
in Fortran. You can plot the result
using this simple script (you
can also just click on this gist to
see the image there):
https://gist.github.com/1655377
Here are my timings:
Python Fortran Speedup
Calculation 12.749 00.784 16.3x
Saving 01.904 01.456 1.3x
Total 14.653 02.240 6.5x
I save the matrices to disk in an ascii format,
so it's quite slow in both cases. The pure computation
is however 16x faster in Fortran (in gfortran,
I didn't even try Intel Fortran, that will probably be
even faster).
As such, I wonder how the NumPy version could be sped up?
I have compiled NumPy with Lapack+Blas from source.
Would anyone be willing to run the NumPy version? Just copy+paste
should do it.
If you want to run the Fortran version, the above gist uses
some of my other modules that I use in my other programs, my goal
was to see how much more complicated the Fortran code gets,
compared to NumPy. As such, I put here
https://gist.github.com/1655350
a single file
with all the dependencies, just compile it like this:
gfortran -fPIC -O3 -march=native -ffast-math -funroll-loops mandelbrot.f90
and run:
$ ./a.out
Iteration 1
Iteration 2
...
Iteration 100
Saving...
Times:
Calculation: 0.74804599999999999
Saving: 1.3640850000000002
Total: 2.1121310000000002
Let me know if you figure out something. I think the "mask" thing is
quite slow, but the problem is that it needs to be there, to catch
overflows (and it is there in Fortran as well, see the
"where" statement, which does the same thing). Maybe there is some
other way to write the same thing in NumPy?
Ondrej
More information about the NumPy-Discussion
mailing list