[Numpy-discussion] Comparing NumPy/IDL Performance

Mon Sep 26 10:38:39 EDT 2011

Hi Keith,

I do not think that this kind of speed test should be your primary 
concern at this stage:
1/ rest assured that tests of this sort have been performed in other 
contexts, and with some hard work you can always improve performance in 
high-level computing languages like IDL and Python
2/ "premature optimization is the root of all evil" (Knuth)
3/ I believe that your primary motivation is to provide an alternative 
library to proprietary software. If so, then your effort is 
most welcome, and I would suggest first porting a small but interesting 
piece of the IDL solar physics library and then studying the path to speed 
improvements on that concrete use case.

As for your Python time_test3, if it is a benchmark proprietary to 
the IDL codebase, it is no wonder that it performs well there! :)
At any rate, I would suggest simplifying your code with IPython:

In [1]: import numpy as np
In [2]: a = np.zeros([512, 512], dtype=np.uint8)
In [3]: a[200:250, 200:250] = 10
In [4]: from scipy import ndimage
In [5]: %timeit ndimage.filters.median_filter(a, size=(5, 5))
10 loops, best of 3: 98 ms per loop
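
Outside IPython, the same measurement can be sketched as a plain script 
with the standard timeit module. This is only a minimal sketch: the 
helper name best_time_ms is made up for illustration, and it calls 
ndimage.median_filter directly (the top-level name for the same filter). 
The timing you get will of course depend on your machine.

```python
import timeit

import numpy as np
from scipy import ndimage


def best_time_ms(func, repeat=3, number=10):
    """Best per-call time in milliseconds (hypothetical helper).

    Mirrors the "10 loops, best of 3" style of %timeit: run `func`
    `number` times per round, do `repeat` rounds, keep the fastest round.
    """
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number * 1e3


# Same test image as in the IPython session: zeros with a 50x50 block of 10s.
a = np.zeros((512, 512), dtype=np.uint8)
a[200:250, 200:250] = 10

# Time a 5x5 median filter over the whole image.
elapsed = best_time_ms(lambda: ndimage.median_filter(a, size=(5, 5)))
print(f"median_filter 5x5: {elapsed:.1f} ms per call")
```

Taking the minimum over several rounds, rather than the mean, reduces the 
influence of whatever else the machine happens to be doing.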

I am not sure what the unit of your vertical axis is.

best,
Johann

On 09/26/2011 04:19 PM, Keith Hughitt wrote:
> Hi all,
>
> Several colleagues and I have recently started work on a Python 
> library for solar physics <http://www.sunpy.org/>, in order to provide 
> an alternative to the current mainstay for solar physics 
> <http://www.lmsal.com/solarsoft/>, which is written in IDL.
>
> One of the first steps we have taken is to create a Python port 
> <https://github.com/sunpy/sunpy/blob/master/benchmarks/time_test3.py> 
> of a popular benchmark for IDL (time_test3) which measures performance 
> for a variety of (primarily matrix) operations. In our initial 
> attempt, however, Python performs significantly worse than IDL for 
> several of the tests. I have attached a graph which shows the results 
> for one machine: the x-axis is the test # being compared, and the 
> y-axis is the time it took to complete the test, in milliseconds. 
> While it is possible that this is simply due to limitations in 
> Python/NumPy, I suspect that this is due at least in part to our lack 
> of familiarity with NumPy and SciPy.
>
> So my question is, does anyone see any places where we are doing 
> things very inefficiently in Python?
>
> In order to try and ensure a fair comparison between IDL and Python 
> there are some things (e.g. the style of timing and output) which we 
> have deliberately chosen to do a certain way. In other cases, however, 
> it is likely that we just didn't know a better method.
>
> Any feedback or suggestions people have would be greatly appreciated. 
> Unfortunately, due to the proprietary nature of IDL, we cannot share 
> the original version of time_test3, but hopefully the comments in 
> time_test3.py will be clear enough.
>
> Thanks!
> Keith