numarray speed question

David M. Cooke cookedm+news at physics.mcmaster.ca
Tue Aug 10 01:53:46 EDT 2004


At some point, grv575 at hotmail.com (grv) wrote:

> cookedm+news at physics.mcmaster.ca (David M. Cooke) wrote in 
> <qnkn015ujoh.fsf at arbutus.physics.mcmaster.ca>:
>
>>At some point, grv575 at hotmail.com (grv575) wrote:
>
>>> Heh.  Try timing the example I gave (a += 5) using byteswapped vs.
>>> byteswap().  It's fairly fast to do the byteswap.  If you go the
>>> interpretation way (byteswapped) then all subsequent array operations
>>> are at least an order of magnitude slower (5 million elements test
>>> example).
>>
>>You mean something like
>>a = arange(0, 5000000, type=Float64).byteswapped()
>>a += 5
>>
>>vs.
>>a = arange(0, 5000000, type=Float64)
>>a.byteswap()
>>a += 5
>>
>>? I get the same time for the a+=5 in each case -- and it's only twice
>>as slow as operating on a non-byteswapped version. Note that numarray
>>calls the ufunc add routine with non-byteswapped numbers; it takes a
>>block, orders it correctly, then adds 5 to that, does the byteswap on
>>the result, and stores that back. (You're not making a full copy of
>>the array; just a large enough section at a time to do useful work.)
>
> It must be using some sort of cache for the addition.  Seems like on 
> the first run it takes 6 seconds and subsequently 0.05 seconds for 
> either version.

There is: the ufunc for the addition gets cached, so the first run
takes longer (though 6 seconds seems like more than the lookup alone
should account for).
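
Something along these lines shows both effects (a rough sketch in the
same vein as the example above; time_add is just a helper, untested as
written, and timings will vary by machine):

import time
from numarray import arange, Float64

def time_add(a, label):
    # time one add; the first ufunc call pays a one-time setup cost
    # that later calls skip
    t0 = time.time()
    a += 5
    print "%s: %.3f s" % (label, time.time() - t0)

a = arange(0, 5000000, type=Float64)
a.byteswap()                  # swap in place, as in the example above
time_add(a, "byteswap(), first call")
time_add(a, "byteswap(), second call")

b = arange(0, 5000000, type=Float64).byteswapped()
time_add(b, "byteswapped(), first call")
time_add(b, "byteswapped(), second call")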

>>Maybe what you need is a package designed for *small* arrays ( < 1000).
>>Simple C wrappers; just C doubles and ints, no byteswap, non-aligned.
>>Maybe a fixed number of dimensions. Probably easy to throw something
>>together using Pyrex. Or, wrap blitz++ with boost::python.
>
> I'll check out Numeric first.  Would rather have a drop-in solution (which 
> hopefully will get more optimized in future releases) rather than hacking 
> my own wrappers.  Is it some purist mentality that's keeping numarray from 
> dropping to C code for the time-critical routines?  Or can a lot of the 
> speed issues be attributed to the overhead of using objects for the library 
> (numarray does seem more general)?

It's the object overhead in numarray. The developers moved much of the
machinery up into Python, where it's more flexible to work with.
Numeric is faster for small arrays (say, fewer than 3000 elements),
but numarray is much better at large ones. I have some speed
comparisons at
http://arbutus.mcmaster.ca/dmc/numpy/
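
If you want a quick feel for the crossover on your own machine, a
harness like this does it (a hypothetical sketch, assuming both
packages are installed; bench is just a made-up helper):

import time
import Numeric, numarray

def bench(a, reps=1000):
    # average the cost of an add over many repetitions
    # (in-place where the package supports it)
    t0 = time.time()
    for i in range(reps):
        a += 5
    return (time.time() - t0) / reps

n = 100    # well below the crossover
print "Numeric : %.2e s/op" % bench(Numeric.arange(n, typecode='d'))
print "numarray: %.2e s/op" % bench(numarray.arange(n, type=numarray.Float64))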

I did a simple wrapper using Pyrex the other night for a vector of
doubles (it just does addition, so it's not much good :-) It's twice
as fast as Numeric, so I might give it a further try.
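
The skeleton is roughly this (a hypothetical Pyrex sketch, not the
actual code; DVector and add_scalar are made-up names):

cdef extern from "stdlib.h":
    void *malloc(int size)
    void free(void *ptr)

cdef class DVector:
    # a bare C array of doubles: no byteswapping, no striding,
    # no type dispatch
    cdef double *data
    cdef int n

    def __new__(self, int n):
        self.n = n
        self.data = <double *>malloc(n * sizeof(double))

    def __dealloc__(self):
        if self.data != NULL:
            free(self.data)

    def add_scalar(self, double x):
        # the whole loop runs as C, with no per-element objects
        cdef int i
        for i from 0 <= i < self.n:
            self.data[i] = self.data[i] + x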

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca


