numarray speed question
Tim Hochberg
tim.hochberg at ieee.org
Wed Aug 25 20:57:27 EDT 2004
grv575 wrote:
> Is it true that python uses doubles for all its internal floating
> point arithmetic
Yes.
> even if you're using something like numarray's
> Complex32?
No. Numarray is an extension module and can use whatever numeric types
it feels like. Float32 for instance is an array of C floats (assuming
floats are 32 bits on your box, which they almost certainly are).
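
The difference between Python's own floats and a packed array of C floats
can be seen with the standard library alone. This is only an illustration:
it uses the stdlib array module as a stand-in, not numarray, and the
variable names are made up for the example.

```python
import array
import struct

# Python's built-in float is a C double, so packing one takes 8 bytes.
py_value = 0.1
double_bytes = struct.pack("d", py_value)

# Typecode 'f' stores C floats (single precision), much like a Float32
# array would; typecode 'd' stores C doubles.
single = array.array("f", [py_value])
double = array.array("d", [py_value])

print(len(double_bytes))                 # bytes in one packed C double
print(single.itemsize, double.itemsize)  # bytes per element in each array

# Reading an element back always yields a Python float (a double), but the
# precision dropped when storing into the single-precision array is gone.
print(single[0] == py_value)  # False
print(double[0] == py_value)  # True
```

The interpreter still does its own arithmetic in double precision; it is
only the storage inside the extension's array that is single precision.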
> Is it possible to do single precision ffts in numarray or
> no?
I believe so, but I'm not sure off the top of my head. I recommend that
you ask on numpy-discussion <numpy-discussion at lists.sourceforge.net> or
peek at the implementation. It's possible that all FFTs are done in double
precision, but I don't think so.
-tim
>
> cookedm+news at physics.mcmaster.ca (David M. Cooke) wrote in message news:<qnkisbrv8r9.fsf at arbutus.physics.mcmaster.ca>...
>
>>At some point, grv575 at hotmail.com (grv) wrote:
>>
>>
>>>cookedm+news at physics.mcmaster.ca (David M. Cooke) wrote in
>>><qnkn015ujoh.fsf at arbutus.physics.mcmaster.ca>:
>>>
>>>
>>>>At some point, grv575 at hotmail.com (grv575) wrote:
>>
>>
>>
>>>>>Heh. Try timing the example I gave (a += 5) using byteswapped vs.
>>>>>byteswap(). It's fairly fast to do the byteswap. If you go the
>>>>>interpretation way (byteswapped) then all subsequent array operations
>>>>>are at least an order of magnitude slower (5 million elements test
>>>>>example).
>>>>
>>>>You mean something like
>>>>a = arange(0, 5000000, type=Float64).byteswapped()
>>>>a += 5
>>>>
>>>>vs.
>>>>a = arange(0, 5000000, type=Float64)
>>>>a.byteswap()
>>>>a += 5
>>>>
>>>>? I get the same time for the a+=5 in each case -- and it's only twice
>>>>as slow as operating on a non-byteswapped version. Note that numarray
>>>>calls the ufunc add routine with non-byteswapped numbers; it takes a
>>>>block, orders it correctly, then adds 5 to that, does the byteswap on
>>>>the result, and stores that back. (You're not making a full copy of
>>>>the array; just a large enough section at a time to do useful work.)
>>>
>>>It must be using some sort of cache for the multiplication. Seems like on
>>>the first run it takes 6 seconds and subsequently .05 seconds for either
>>>version.
>>
>>There is. The ufunc for the addition gets cached, so the first time
>>takes longer (but not that much???)
>>
>>
>>>>Maybe what you need is a package designed for *small* arrays ( < 1000).
>>>>Simple C wrappers; just C doubles and ints, no byteswap, non-aligned.
>>>>Maybe a fixed number of dimensions. Probably easy to throw something
>>>>together using Pyrex. Or, wrap blitz++ with boost::python.
>>>
>>>I'll check out Numeric first. Would rather have a drop-in solution (which
>>>hopefully will get more optimized in future releases) rather than hacking
>>>my own wrappers. Is it some purist mentality that's keeping numarray from
>>>dropping to C code for the time-critical routines? Or can a lot of the
>>>speed issues be attributed to the overhead of using objects for the library
>>>(numarray does seem more general)?
>>
>>It's the object overhead in numarray. The developers moved stuff up to
>>Python, where it's more flexible to handle. Numeric is faster for
>>small arrays (say < 3000), but numarray is much better at large
>>arrays. I have some speed comparisons at
>>http://arbutus.mcmaster.ca/dmc/numpy/
>>
>>I did a simple wrapper using Pyrex the other night for a vector of
>>doubles (it just does addition, so it's not much good :-) It's twice
>>as fast as Numeric, so I might give it a further try.
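
The block-at-a-time handling of byteswapped data described in the quoted
text above (take a block, put it in native order, add, swap back, store)
can be sketched in plain Python. This is a rough illustration using the
stdlib array module, not numarray's actual code; the function name and the
block size are hypothetical.

```python
import array


def add_in_place_swapped(buf, value, block=4096):
    """Add value to every element of buf, an array('d') whose elements are
    stored in non-native byte order, working on one block at a time."""
    for start in range(0, len(buf), block):
        chunk = buf[start:start + block]  # copies one block, not the array
        chunk.byteswap()                  # convert block to native order
        for i in range(len(chunk)):
            chunk[i] += value             # ordinary arithmetic on native data
        chunk.byteswap()                  # convert back to the stored order
        buf[start:start + block] = chunk  # write the block back in place


# Build a small array and put it into the "wrong" byte order.
a = array.array("d", [float(i) for i in range(10)])
a.byteswap()

add_in_place_swapped(a, 5.0, block=4)

# Swap back to native order only to inspect the result.
a.byteswap()
print(a.tolist())
```

Only one block is ever copied and swapped at a time, which is why the
extra cost stays at a small constant factor rather than a full copy of the
array.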