numarray speed question

David M. Cooke cookedm+news at physics.mcmaster.ca
Thu Aug 5 16:35:33 EDT 2004


At some point, grv575 at hotmail.com (grv) wrote:

> squirrel at WPI.EDU (Christopher T King) wrote in
> <Pine.LNX.4.44.0408040845120.27254-100000 at ccc6.wpi.edu>: 
>
>>On Wed, 4 Aug 2004, grv wrote:
>>
>>> So it is supposed to be very fast to have an array of say 5 million 
>>> integers stored in a binary file and do
>>> 
>>> a = numarray.fromfile('filename', (2, 2, 2))
>>> numarray.add(a, 9, a)
>>> 
>>> but how is that faster than reading the entire file into memory and
>>> then having a for loop in C:
>>>    (loop over range) {
>>>       *p++ += 9      }
>>> 
>>> or is that essentially what's going on?
>>
>>That's essentially what's going on ;) The point of numarray isn't to be 
>>hyper-fast, but to be as fast as the equivalent C (or Fortran, or 
>>what-have-you) implementation.  In many cases, it's faster, because 
>>numarray is designed with several speed hacks in mind, but it's nothing 
>>you can't do (without a little work) in C.
>>
>
> Yes but see I'm interested in what speed hacks can actually be done to 
> improve the above code.  I just don't see anything that can iterate and add 
> over that memory region faster.

Well, numarray probably isn't faster for this case (adding a scalar to
a vector). In fact, the relevant numarray code looks like this:

static int add_Float64_vector_scalar(long niter, long ninargs, long noutargs,
                                     void **buffers, long *bsizes) {
    long i;
    Float64 *tin1     = (Float64 *) buffers[0];
    Float64 tscalar   = *(Float64 *) buffers[1];
    Float64 *tout    = (Float64 *) buffers[2];
    
    for (i=0; i<niter; i++, tin1++, tout++) {
        *tout = *tin1 + tscalar;
    }
    return 0;
}
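
To make the calling convention concrete, here is a minimal sketch of driving that loop by hand. It is not how numarray itself dispatches to the kernel. The helper add_scalar_in_place is hypothetical, the Float64 typedef is an assumption (taken to be a C double), and the contents of bsizes are a guess, since the kernel shown never reads them. Passing the same pointer as input and output mirrors numarray.add(a, 9, a).

```c
typedef double Float64;  /* assumption: numarray's Float64 is a C double */

/* Same loop as the kernel quoted above. */
static int add_Float64_vector_scalar(long niter, long ninargs, long noutargs,
                                     void **buffers, long *bsizes)
{
    long i;
    Float64 *tin1   = (Float64 *) buffers[0];
    Float64 tscalar = *(Float64 *) buffers[1];
    Float64 *tout   = (Float64 *) buffers[2];

    (void) ninargs; (void) noutargs; (void) bsizes;  /* unused by this kernel */

    for (i = 0; i < niter; i++, tin1++, tout++) {
        *tout = *tin1 + tscalar;
    }
    return 0;
}

/* Hypothetical helper: add `scalar` to data[0..n-1] in place. */
int add_scalar_in_place(Float64 *data, long n, Float64 scalar)
{
    void *buffers[3];
    long bsizes[3];

    buffers[0] = data;     /* input vector */
    buffers[1] = &scalar;  /* scalar operand */
    buffers[2] = data;     /* output aliases the input */

    /* Assumed to be byte sizes; the kernel above ignores them. */
    bsizes[0] = n * (long) sizeof(Float64);
    bsizes[1] = (long) sizeof(Float64);
    bsizes[2] = n * (long) sizeof(Float64);

    return add_Float64_vector_scalar(n, 2, 1, buffers, bsizes);
}
```

Calling add_scalar_in_place(a, n, 9.0) on a plain array of doubles does the same work as the hand-written `*p++ += 9` loop, which is the point: for this case there is nothing faster hiding inside.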

What you *do* get with numarray is:

1) transparent handling of byteswapped, misaligned, discontiguous, or
   type-mismatched data (say, a memory-mapped file generated on a
   system with a different byte order, holding single-precision values
   where you want double-precision).

2) ease of use. Those two lines of Python code above are _it_ (except
   for an 'import numarray' statement). Your C code isn't anywhere
   near complete enough to use. You would need to add routines to
   read the file, etc.

3) interactive use. You can do all this in the Python command line. If
   you want to multiply instead of add, an up-arrow and some editing
   will do that. With C, you'd have to recompile.
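
To give a feel for what point 1 saves you, here is a rough sketch of the same "add a scalar" operation in C when the input is big-endian single-precision data at an arbitrary byte offset and the result should be native doubles. Both function names are hypothetical. The sketch assumes that float is a 32-bit IEEE 754 type and that the host stores floats in the same byte order as integers. It still leaves out reading the file and handling discontiguous data.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode one big-endian IEEE 754 single from 4 raw bytes.
   Building the bit pattern with shifts works on any host byte order,
   and reading byte by byte means `p` need not be aligned. */
static float load_be_float32(const unsigned char *p)
{
    uint32_t bits = ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
                    ((uint32_t) p[2] << 8)  |  (uint32_t) p[3];
    float value;

    memcpy(&value, &bits, sizeof value);  /* reinterpret bits without aliasing issues */
    return value;
}

/* Hypothetical helper: `raw` holds n big-endian float32 values;
   `out` receives n native doubles, each with `scalar` added. */
void add_scalar_from_be_float32(const unsigned char *raw, size_t n,
                                double scalar, double *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        out[i] = (double) load_be_float32(raw + 4 * i) + scalar;
    }
}
```

None of this is hard, but it is a separate loop for every combination of byte order, alignment and type, and numarray picks the right one for you.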

If you need the best possible speed (after doing it in numarray and
finding it isn't fast enough), you can write an extension module to
do that bit in C, or look into scipy.weave for inlining C code, or into
f2py for linking Fortran code to Python.

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke
|cookedm(at)physics(dot)mcmaster(dot)ca


