numarray speed question

grv575 grv575 at hotmail.com
Wed Aug 25 20:16:31 EDT 2004


Is it true that Python uses doubles for all its internal floating-point
arithmetic even if you're using something like numarray's Complex32?  Is
it possible to do single-precision FFTs in numarray, or not?
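
One way I can think of to check is to do an operation and look at the type
of the result (the same check would apply to whatever the FFT routine
returns).  A rough sketch, assuming numarray's type= keyword and the
.type() method:

    from numarray import arange, array, Float32, Complex32

    a = arange(0, 10, type=Float32)
    b = a + 5.0                      # scalar on the right is a Python float (a C double)
    print b.type()                   # does it stay Float32, or get promoted to Float64?

    c = array([1+2j, 3-4j], type=Complex32)
    print (c * c).type()             # same question for single-precision complex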

cookedm+news at physics.mcmaster.ca (David M. Cooke) wrote in message news:<qnkisbrv8r9.fsf at arbutus.physics.mcmaster.ca>...
> At some point, grv575 at hotmail.com (grv) wrote:
> 
> > cookedm+news at physics.mcmaster.ca (David M. Cooke) wrote in 
> > <qnkn015ujoh.fsf at arbutus.physics.mcmaster.ca>:
> >
> >>At some point, grv575 at hotmail.com (grv575) wrote:
>  
> >>> Heh.  Try timing the example I gave (a += 5) using byteswapped vs.
> >>> byteswap().  It's fairly fast to do the byteswap.  If you go the
> >>> interpretation way (byteswapped) then all subsequent array operations
> >>> are at least an order of magnitude slower (in a 5-million-element test).
> >>
> >>You mean something like
> >>a = arange(0, 5000000, type=Float64).byteswapped()
> >>a += 5
> >>
> >>vs.
> >>a = arange(0, 5000000, type=Float64)
> >>a.byteswap()
> >>a += 5
> >>
> >>? I get the same time for the a+=5 in each case -- and it's only twice
> >>as slow as operating on a non-byteswapped version. Note that numarray
> >>calls the ufunc add routine with non-byteswapped numbers; it takes a
> >>block, orders it correctly, then adds 5 to that, does the byteswap on
> >>the result, and stores that back. (You're not making a full copy of
> >>the array; just a large enough section at a time to do useful work.)
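
For reference, the timing I was doing was roughly this (a sketch only;
exact numbers depend on the machine, and the first call also pays the
one-time setup cost discussed just below):

    import time
    from numarray import arange, Float64

    def timed_add(a, label):
        t0 = time.time()
        a += 5
        print label, time.time() - t0

    a = arange(0, 5000000, type=Float64).byteswapped()   # the "interpretation" route
    timed_add(a, "byteswapped():")

    a = arange(0, 5000000, type=Float64)
    a.byteswap()                                         # swap the data in place first
    timed_add(a, "byteswap():   ")

    a = arange(0, 5000000, type=Float64)                 # native order, for reference
    timed_add(a, "native:       ")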
> >
> > It must be using some sort of cache for the operation.  On the first run
> > it takes 6 seconds, and subsequently .05 seconds, for either version.
> 
> There is. The ufunc for the addition gets cached, so the first call
> takes longer (though caching alone shouldn't account for quite that much).
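
The effect is easy to see by timing the same in-place add twice on a fresh
array; a minimal sketch:

    import time
    from numarray import arange, Float64

    a = arange(0, 5000000, type=Float64)

    t0 = time.time()
    a += 5                       # first call: pays any one-time ufunc setup/lookup cost
    print "first add: ", time.time() - t0

    t0 = time.time()
    a += 5                       # second call: reuses whatever got cached
    print "second add:", time.time() - t0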
> 
> >>Maybe what you need is a package designed for *small* arrays ( < 1000).
> >>Simple C wrappers; just C doubles and ints, no byteswapped or non-aligned data.
> >>Maybe a fixed number of dimensions. Probably easy to throw something
> >>together using Pyrex. Or, wrap blitz++ with boost::python.
> >
> > I'll check out Numeric first.  I'd rather have a drop-in solution (which 
> > hopefully will get more optimized in future releases) than hack my own 
> > wrappers.  Is it some purist mentality that's keeping numarray from 
> > dropping to C code for the time-critical routines?  Or can a lot of the 
> > speed issues be attributed to the overhead of using objects for the library 
> > (numarray does seem more general)?
> 
> It's the object overhead in numarray. The developers moved stuff up to
> Python, where it's more flexible to handle. Numeric is faster for
> small arrays (say < 3000), but numarray is much better at large
> arrays. I have some speed comparisons at
> http://arbutus.mcmaster.ca/dmc/numpy/
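
A rough sketch of that kind of small-vs-large comparison (assuming both
Numeric and numarray are installed; the crossover point will differ by
machine and version):

    import time
    import Numeric
    import numarray

    def bench(a, reps=1000):
        # time repeated whole-array adds; for small n the per-call overhead dominates
        t0 = time.time()
        for i in xrange(reps):
            b = a + 1.0
        return time.time() - t0

    for n in (10, 100, 3000, 100000):
        num = Numeric.arange(n).astype('d')
        na = numarray.arange(n, type=numarray.Float64)
        print n, "Numeric:", bench(num), "numarray:", bench(na)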
> 
> I did a simple wrapper using Pyrex the other night for a vector of
> doubles (it just does addition, so it's not much good :-) It's twice
> as fast as Numeric, so I might give it a further try.
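
For anyone curious, that kind of wrapper might look roughly like this in
Pyrex (a made-up sketch, not David's actual code):

    # dvec.pyx -- hypothetical sketch of a tiny fixed-type double vector in Pyrex
    cdef extern from "stdlib.h":
        void *malloc(int size)
        void free(void *ptr)

    cdef class DVec:
        cdef double *data
        cdef int n

        def __new__(self, int n):
            self.n = n
            self.data = <double *>malloc(n * sizeof(double))

        def __dealloc__(self):
            if self.data != NULL:
                free(self.data)

        def fill(self, double value):
            cdef int i
            for i from 0 <= i < self.n:
                self.data[i] = value

        def add(self, DVec other):
            # element-wise in-place add: no byteswap, alignment, or broadcasting logic
            cdef int i
            for i from 0 <= i < self.n:
                self.data[i] = self.data[i] + other.data[i]

Compiled with pyrexc and built as an extension module, the only per-call
cost left is the method dispatch itself, which is presumably where a
factor of two over Numeric on tiny vectors comes from.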


