[SciPy-User] [ANN] Bottleneck 0.2
Dag Sverre Seljebotn
dagss at student.matnat.uio.no
Tue Dec 28 08:42:21 EST 2010
On 12/27/2010 09:04 PM, Keith Goodman wrote:
> Bottleneck is a collection of fast NumPy array functions written in Cython.
>
> The second release of Bottleneck is faster, contains more functions,
> and supports more dtypes.
>
Another special case, if you want one: you could add mode='c' to the
array declarations when the operation runs along the last axis and
arr.flags.c_contiguous is True.
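
A rough Python-level sketch of the check that would guard such a special case. The helper name can_use_c_mode is made up for illustration and is not part of Bottleneck; the mode='c' declaration itself would live in the Cython source, which is not shown here.

```python
import numpy as np

def can_use_c_mode(arr, axis):
    """Return True when a reduction along `axis` runs over the last axis
    of a C-contiguous array (hypothetical helper, for illustration only)."""
    if axis < 0:
        axis += arr.ndim                      # normalise negative axes
    return bool(arr.flags.c_contiguous and axis == arr.ndim - 1)

a = np.zeros((100, 100))
print(can_use_c_mode(a, 1))      # True: C-ordered array, last axis
print(can_use_c_mode(a, 0))      # False: not the last axis
print(can_use_c_mode(a.T, 1))    # False: the transposed view is not C-contiguous
```

Only inputs that pass this check would be routed to the mode='c' variant; everything else would keep using the existing strided declarations.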
Dag Sverre
> Faster:
> - All functions faster (less overhead) when output is not a scalar
> - Faster nanmean() for 2d, 3d arrays containing NaNs when axis is not None
>
> New functions:
> - nanargmin()
> - nanargmax()
> - nanmedian(), 100X faster than SciPy's nanmedian for (100,100) input, axis=0
>
> Enhancements:
> - Added support for float32
> - Fallback to slower, non-Cython functions for unaccelerated ndim/dtype
> - SciPy is no longer a dependency
> - Added support for older versions of NumPy (1.4.1)
> - All functions are now templated for dtype and axis
> - Added a sandbox for prototyping of new Bottleneck functions
> - Rewrote benchmarking code
>
> Breaks from 0.1.0:
> - To run benchmark use bn.bench() instead of bn.benchit()
>
> download
> http://pypi.python.org/pypi/Bottleneck
> docs
> http://berkeleyanalytics.com/bottleneck
> code
> http://github.com/kwgoodman/bottleneck
> mailing list
> http://groups.google.com/group/bottle-neck
> mailing list 2
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
> Bottleneck comes with a benchmark suite that compares the performance
> of the bottleneck functions that have a NumPy/SciPy equivalent. To run
> the benchmark:
>
> >>> bn.bench(mode='fast')
> Bottleneck performance benchmark
> Bottleneck 0.2.0
> Numpy (np) 1.5.1
> Scipy (sp) 0.8.0
> Speed is NumPy or SciPy time divided by Bottleneck time
> NaN means one-third NaNs; axis=0 and float64 are used
> median vs np.median
>     3.59 (10,10)
>     2.43 (1001,1001)
>     2.28 (1000,1000)
>     2.16 (100,100)
> nanmedian vs local copy of sp.stats.nanmedian
>     102.72 (10,10) NaN
>     94.34 (10,10)
>     67.89 (100,100) NaN
>     28.52 (100,100)
>     6.37 (1000,1000) NaN
>     4.41 (1000,1000)
> nanmax vs np.nanmax
>     9.99 (100,100) NaN
>     6.12 (10,10) NaN
>     5.99 (10,10)
>     5.88 (100,100)
>     1.79 (1000,1000) NaN
>     1.76 (1000,1000)
> nanmean vs local copy of sp.stats.nanmean
>     25.95 (100,100) NaN
>     12.85 (100,100)
>     12.26 (10,10) NaN
>     11.89 (10,10)
>     5.15 (1000,1000) NaN
>     3.17 (1000,1000)
> nanstd vs local copy of sp.stats.nanstd
>     16.96 (100,100) NaN
>     15.75 (10,10) NaN
>     15.49 (10,10)
>     9.51 (100,100)
>     3.85 (1000,1000) NaN
>     2.82 (1000,1000)
> nanargmax vs np.nanargmax
>     8.60 (100,100) NaN
>     5.65 (10,10) NaN
>     5.62 (100,100)
>     5.44 (10,10)
>     2.84 (1000,1000) NaN
>     2.58 (1000,1000)
> move_nanmean vs sp.ndimage.convolve1d based function
>     window = 5
>     19.52 (10,10) NaN
>     18.55 (10,10)
>     10.56 (100,100) NaN
>     6.67 (100,100)
>     5.19 (1000,1000) NaN
>     4.42 (1000,1000)
>
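
For anyone who wants to reproduce a single ratio by hand, here is a minimal sketch of how a "reference time divided by candidate time" figure can be measured with timeit. The helpers make_input and speed_ratio are hypothetical and are not the code behind bn.bench(); in particular, placing a NaN at every third element is just one assumed way of getting roughly one-third NaNs.

```python
import timeit
import numpy as np

def make_input(shape, with_nans):
    """Build a float64 test array; optionally set every third element to NaN."""
    arr = np.arange(np.prod(shape), dtype=np.float64).reshape(shape)
    if with_nans:
        arr.flat[::3] = np.nan        # assumed placement, about one-third NaNs
    return arr

def speed_ratio(reference, candidate, arr, axis=0, repeat=3, number=10):
    """Best-of-`repeat` time of `reference` divided by that of `candidate`."""
    t_ref = min(timeit.repeat(lambda: reference(arr, axis=axis),
                              repeat=repeat, number=number))
    t_new = min(timeit.repeat(lambda: candidate(arr, axis=axis),
                              repeat=repeat, number=number))
    return t_ref / t_new

arr = make_input((100, 100), with_nans=True)
# Comparing np.nanmax with itself only demonstrates the mechanics;
# pass bn.nanmax as `candidate` to compare against Bottleneck.
ratio = speed_ratio(np.nanmax, np.nanmax, arr)
print("%.2f %s NaN" % (ratio, arr.shape))
```

Taking the minimum over several repeats is a common way to reduce timing noise; the actual benchmark suite may well do it differently.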
> Under the hood Bottleneck uses a separate Cython function for each
> combination of ndim, dtype, and axis. A lot of the overhead in
> bn.nanmax(), for example, is in checking that the axis is within
> range, converting non-array data to an array, and selecting the
> function to use to calculate the maximum. You can get rid of the
> overhead by calling the underlying Cython function directly.
>
> Benchmarks for the low-level Cython version of each function:
>
> >>> bn.bench(mode='faster')
> Bottleneck performance benchmark
> Bottleneck 0.2.0
> Numpy (np) 1.5.1
> Scipy (sp) 0.8.0
> Speed is NumPy or SciPy time divided by Bottleneck time
> NaN means one-third NaNs; axis=0 and float64 are used
> median_selector vs np.median
>     15.29 (10,10)
>     14.19 (100,100)
>     8.04 (1001,1001)
>     7.32 (1000,1000)
> nanmedian_selector vs local copy of sp.stats.nanmedian
>     352.08 (10,10) NaN
>     340.27 (10,10)
>     185.56 (100,100) NaN
>     138.81 (100,100)
>     8.21 (1000,1000)
>     8.09 (1000,1000) NaN
> nanmax_selector vs np.nanmax
>     21.54 (10,10) NaN
>     19.98 (10,10)
>     12.65 (100,100) NaN
>     6.82 (100,100)
>     1.79 (1000,1000) NaN
>     1.76 (1000,1000)
> nanmean_selector vs local copy of sp.stats.nanmean
>     41.08 (10,10) NaN
>     39.05 (10,10)
>     31.74 (100,100) NaN
>     15.24 (100,100)
>     5.13 (1000,1000) NaN
>     3.16 (1000,1000)
> nanstd_selector vs local copy of sp.stats.nanstd
>     44.55 (10,10) NaN
>     43.49 (10,10)
>     18.66 (100,100) NaN
>     10.29 (100,100)
>     3.83 (1000,1000) NaN
>     2.82 (1000,1000)
> nanargmax_selector vs np.nanargmax
>     17.91 (10,10) NaN
>     17.00 (10,10)
>     10.56 (100,100) NaN
>     6.50 (100,100)
>     2.85 (1000,1000) NaN
>     2.59 (1000,1000)
> move_nanmean_selector vs sp.ndimage.convolve1d based function
>     window = 5
>     55.96 (10,10) NaN
>     50.82 (10,10)
>     11.77 (100,100) NaN
>     6.93 (100,100)
>     5.56 (1000,1000) NaN
>     4.51 (1000,1000)