General Numerical Python question

2mc mcrider at bigfoot.com
Tue Oct 14 09:41:53 EDT 2003


Tim Hochberg <tim.hochberg at ieee.org> wrote in message news:<dgMib.24925$gi2.17821 at fed1read01>...
> > Suppose I have 2 very large arrays of serial data.  To make it simple,
> > let's assume each has 10s of thousands of rows with one column/field
> > of data.  Further assume at some point later in the program I am going
> > to compare the data in the two arrays - the comparison being on chunks
> > of 25 rows throughout the array.  But, before I do that, I have to
> > "normalize" the data in both arrays in order to make the comparisons
> > valid.
> > 
> > Assume the way I make the comparisons is to find the size of the range
> > between the highest and the lowest value in each 25 row 'chunk' and
> > normalize each data point as: (datapoint - lowestvalue) /
> > (highestvalue - lowestvalue) * 100.
> 
> Something like:
> 
> import Numeric as np # Personal preference
> chunked = np.reshape(data, (-1, 25)) # Chunked is a n/25 x 25 array
> chunk_max = np.maximum.reduce(chunked, 1) # Reduce along axis 1
> chunk_min = np.minimum.reduce(chunked, 1)
> # NewAxis changes the relevant arrays from shape [n/25] to shape
> # [n/25,1]. The length-1 axis then gets broadcast across the 25 columns.
> normalized = ((chunked - chunk_min[:, np.NewAxis]) /
>               (chunk_max - chunk_min)[:, np.NewAxis] * 100)
> 
> > Then assume I want to find the slope of the linear regression through
> > each 25 row 'chunk.'  It is this slope that I will ultimately be
> > comparing later in the program.
> 
> For this you might want to use the LinearAlgebra module (it comes with 
> Numeric). I'm not as familiar with the interface for this though, so 
> you'll have to check the docs or hope someone else can help you.
> 
> 
> Hope that gets you started.
> 
> -tim

Thanks a million.  I appreciate your kind and thoughtful response.  I
have found members of this board to be very prompt with help and very
courteous.  I hope that once I'm a little more savvy with the language
I can return the favor by posting help for someone else.

May I ask you for a little more help?  The example you gave was very
good and it was something I hadn't thought of.  However, I need the 25
row "window" to move through the entire array one row at a time.  In
other words each 25 row 'chunk' of data will contain 24 rows of the
previous 'chunk'.  Unless I misunderstood your code, each 'chunk' has
a unique set of rows - there is no overlapping.
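
To make that concrete, here is a minimal sketch of the result I am
after, written with exactly the kind of explicit loops I would like to
avoid.  It uses plain Python lists in place of Numeric arrays, and the
names sliding_windows, normalize, slope and window_slopes are my own
placeholders, not anything from Numeric.  Treating a flat window
(highest equals lowest) as all zeros is also just an assumption to
avoid dividing by zero.

```python
def sliding_windows(data, size=25):
    """Return every overlapping run of `size` consecutive values."""
    return [data[i:i + size] for i in range(len(data) - size + 1)]


def normalize(window):
    """Scale a window to 0..100 using (x - low) / (high - low) * 100."""
    low, high = min(window), max(window)
    span = float(high - low)
    if span == 0:
        # Assumption: a flat window has no range, so map it to all zeros.
        return [0.0 for _ in window]
    return [(x - low) / span * 100.0 for x in window]


def slope(window):
    """Least-squares slope of the values against their row index 0..n-1."""
    n = len(window)
    x_mean = (n - 1) / 2.0
    y_mean = sum(window) / float(n)
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(window))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def window_slopes(data, size=25):
    """Slope of the normalized data in each overlapping window."""
    return [slope(normalize(w)) for w in sliding_windows(data, size)]
```

With 30 rows and a window of 25 this gives 6 slopes, one for each
starting row, and each window shares 24 rows with the one before it.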

Do you have any ideas how I could do this without loops?

Matt



