General Numerical Python question

Tue Oct 14 01:58:27 EDT 2003

2mc wrote:

> Michael Ressler <ressler at cheetah.jpl.nasa.gov> wrote in message news:<slrnbolkk2.89o.ressler at cheetah.jpl.nasa.gov>...
> 
>>The real question is - why do you want to run a loop over an array?
>>The whole point of Numeric is that you want to eliminate loops
>>entirely. Keeping things in the array domain is infinitely faster than
>>running explicit loops. You may need to come up with some clever
>>expressions to do it, but most loops can be gotten rid of with clever
>>uses of put(), take(), and the like.
>>
>>Loops are evil.
>>
>>Mike
> 
> 
> For me, the key thought in your post is "you may need to come up with
> some clever expressions to do it, but most loops can be gotten rid of
> with clever uses of put(), take(), and the like."
> 
> This is what I'm looking for.  I'm so used to explicitly declaring
> loops that it is hard for me to "speak" in Numerical Python.
> 
> Suppose I have 2 very large arrays of serial data.  To make it simple,
> let's assume each has 10s of thousands of rows with one column/field
> of data.  Further assume at some point later in the program I am going
> to compare the data in the two arrays - the comparison being on chunks
> of 25 rows throughout the array.  But, before I do that, I have to
> "normalize" the data in both arrays in order to make the comparisons
> valid.
> 
> Assume the way I make the comparisons is to find the size of the range
> between the highest and the lowest value in each 25 row 'chunk' and
> normalize each data point as: (datapoint - lowestvalue) /
> (highestvalue - lowestvalue) * 100.

Something like:

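import Numeric as np # Personal preference
# Assumes len(data) is a multiple of 25; reshape fails otherwise.
# If data is an integer array, convert it to float first so the
# division below is not truncated.
chunked = np.reshape(data, (-1, 25)) # chunked is an (n/25) x 25 array
chunk_max = np.maximum.reduce(chunked, 1) # Reduce along axis 1
chunk_min = np.minimum.reduce(chunked, 1)
# NewAxis changes the per-chunk arrays from shape [n/25] to shape
# [n/25,1]. The length-1 axis gets broadcast across the 25 columns.
normalized = ((chunked - chunk_min[:, np.NewAxis]) /
               (chunk_max - chunk_min)[:, np.NewAxis] * 100)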
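
For readers on present-day NumPy rather than Numeric, the same idea might
look roughly like the sketch below. The function name normalize_chunks,
the chunk_size parameter, and the sample data are made up for
illustration; only the reshape/min/max/broadcast approach comes from the
snippet above.

```python
import numpy as np  # assumption: modern NumPy, not the Numeric module

def normalize_chunks(data, chunk_size=25):
    """Scale every chunk of `chunk_size` values to the range 0..100."""
    # Work in float so integer input is not truncated by the division.
    chunked = np.asarray(data, dtype=float).reshape(-1, chunk_size)
    # keepdims=True keeps shape (n_chunks, 1), which broadcasts over each row.
    chunk_min = chunked.min(axis=1, keepdims=True)
    chunk_max = chunked.max(axis=1, keepdims=True)
    return (chunked - chunk_min) / (chunk_max - chunk_min) * 100

# Hypothetical sample data: 50 values, i.e. two chunks of 25.
data = np.arange(50.0) ** 2
normalized = normalize_chunks(data)
```

Note that a chunk whose values are all equal has a zero range, so the
division yields nan for that row; real data may need a guard for that.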

> Then assume I want to find the slope of the linear regression through
> each 25 row 'chunk.'  It is this slope that I will ultimately be
> comparing later in the program.

For this you might want to use the LinearAlgebra module (it comes with
Numeric). I'm not as familiar with its interface, though, so you'll
have to check the docs or hope someone else can help you.
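
If only the slope is needed, one alternative that avoids a general solver
is the closed-form least-squares slope, sum((x - mean(x)) * (y - mean(y)))
/ sum((x - mean(x))**2), computed for all chunks at once. The sketch
below uses present-day NumPy rather than Numeric's LinearAlgebra, and it
assumes the x values are simply the row positions 0..24 within each
chunk; chunk_slopes and the sample data are hypothetical.

```python
import numpy as np  # assumption: modern NumPy, not Numeric/LinearAlgebra

def chunk_slopes(chunks):
    """Least-squares slope of each row of `chunks` against x = 0, 1, 2, ..."""
    chunks = np.asarray(chunks, dtype=float)
    n = chunks.shape[1]
    # Assumption: x is the position within the chunk, centered on its mean.
    x_centered = np.arange(n, dtype=float) - (n - 1) / 2.0
    y_centered = chunks - chunks.mean(axis=1, keepdims=True)
    # One matrix-vector product gives the numerator for every chunk.
    return (y_centered @ x_centered) / (x_centered @ x_centered)

# Hypothetical check data: two exact lines with slopes 2.0 and -0.5.
x = np.arange(25)
chunks = np.array([2.0 * x + 1.0, -0.5 * x + 3.0])
slopes = chunk_slopes(chunks)
```

Fed the (n/25) x 25 array of normalized chunks, this returns one slope per
chunk with no explicit loop.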

Hope that gets you started.

-tim

> This is the kind of programming I was hoping I could use Numerical
> Python for.  It is the syntax of such a program that I'm grappling
> with.  If someone could help me with the above scenario, then I could
> write the program using the real comparisons I want (which are
> considerably more complicated than above).
> 
> Thank you for your kind response.  If you have any comments on the
> above I would appreciate hearing them.
> 
> Matt