[SciPy-User] re[SciPy-user] moving for loops...

Thu Jun 10 14:56:59 EDT 2010

Well, let's try a more direct example.  I am going to create a 4d array of
random values to illustrate.  I know the length of the dimensions won't be
exactly the same as yours, but the example will still be valid.

In this example, I will be able to calculate *all* of the monthly averages
for *all* of the variables for *all* of the grid points without a single
loop.

> jules = np.random.random((132, 10, 50, 3))
> print jules.shape
(132, 10, 50, 3)

> jules_5d = np.reshape(jules, (-1, 12) + jules.shape[1:])
> print jules_5d.shape
(11, 12, 10, 50, 3)

> jules_5d = np.ma.masked_array(jules_5d, mask=jules_5d < 0.0)

> jules_means = np.mean(jules_5d, axis=0)
> print jules_means.shape
(12, 10, 50, 3)

voila! This matrix has a mean for each month across all eleven years for
each datapoint in each of the 10 variables at each (I am assuming) level in
the atmosphere.

So, if you want to operate on a subset of your jules matrix (for example,
you need to do special masking for each variable), then you can just work
off of a slice of the original matrix, and many of these same concepts in
this example and the previous example still applies.

Ben Root

On Thu, Jun 10, 2010 at 1:08 PM, mdekauwe <mdekauwe at gmail.com> wrote:

>
> Hi,
>
> No if I am honest I am a little confused how what you are suggesting would
> work. As I see it the array I am trying to average from has dims
> jules[(numyears * nummonths),1,numpts,0]. Where the first dimension (132)
> is
> 12 months x 11 years. And as I said before I would like to average the jan
> from the first, second, third years etc. Then the same for the feb and so
> on.
>
> So I don't see how you get to your 2d array that you mention in the first
> line? I thought what you were suggesting was I could precompute the step
> that builds the index for the months e.g
>
> mth_index = np.zeros(0)
> for month in xrange(nummonths):
>     mth_index = np.append(mth_index, np.arange(month, numyears * nummonths,
> nummonths))
>
> and use this as my index to skip the for loop. Though I still have a for
> loop I guess!
>
>
>
>
>
>
> Benjamin Root-2 wrote:
> >
> > Correction for me as well.  To mask out the negative values, use masked
> > arrays.  So we will turn jules_2d into a masked array (second line), then
> > all subsequent commands will still work as expected.  It is very similar
> > to
> > replacing negative values with nans and using nanmin().
> >
> >> jules_2d = jules.reshape((-1, 12))
> >> jules_2d = np.ma.masked_array(jules_2d, mask=jules_2d < 0.0)
> >> jules_monthly = np.mean(jules_2d, axis=0)
> >> print jules_monthly.shape
> >   (12,)
> >
> > Ben Root
> >
> > On Tue, Jun 8, 2010 at 7:49 PM, Benjamin Root <ben.root at ou.edu> wrote:
> >
> >> The np.mod in my example caused the data points to stay within [0, 11]
> in
> >> order to illustrate that these are months.  In my example, months are
> >> column, years are rows.  In your desired output, months are rows and
> >> years
> >> are columns.  It makes very little difference which way you have it.
> >>
> >> Anyway, let's imagine that we have a time series of data "jules".  We
> can
> >> easily reshape this like so:
> >>
> >> > jules_2d = jules.reshape((-1, 12))
> >> > jules_monthly = np.mean(jules_2d, axis=0)
> >> > print jules_monthly.shape
> >>   (12,)
> >>
> >> voila!  You have 12 values in jules_monthly which are means for that
> >> month
> >> across all years.
> >>
> >> protip - if you want yearly averages just change the ax parameter in
> >> np.mean():
> >> > jules_yearly = np.mean(jules_2d, axis=1)
> >>
> >> I hope that makes my previous explanation clearer.
> >>
> >> Ben Root
> >>
> >>
> >> On Tue, Jun 8, 2010 at 5:41 PM, mdekauwe <mdekauwe at gmail.com> wrote:
> >>
> >>>
> >>> OK...
> >>>
> >>> but if I do...
> >>>
> >>> In [28]: np.mod(np.arange(nummonths*numyears), nummonths).reshape((-1,
> >>> nummonths))
> >>> Out[28]:
> >>> array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
> >>>       [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
> >>>       [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
> >>>       [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
> >>>        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]])
> >>>
> >>> When really I would be after something like this I think?
> >>>
> >>> array([  0,  12,  24,  36,  48,  60,  72,  84,  96, 108, 120],
> >>>        [  1,  13,  25,  37,  49,  61,  73,  85,  97, 109, 121],
> >>>        [  2,  14,  26,  38,  50,  62,  74,  86,  98, 110, 122]
> >>>        etc, etc
> >>>
> >>> i.e. so for each month jump across the years.
> >>>
> >>> Not quite sure of this example...this is what I currently have which
> >>> does
> >>> seem to work, though I guess not completely efficiently.
> >>>
> >>> for month in xrange(nummonths):
> >>>         tmp = jules[xrange(0, numyears * nummonths, nummonths),VAR,:,0]
> >>>        tmp[tmp < 0.0] = np.nan
> >>>        data[month,:] = np.mean(tmp, axis=0)
> >>>
> >>>
> >>>
> >>>
> >>> Benjamin Root-2 wrote:
> >>> >
> >>> > If you want an average for each month from your timeseries, then the
> >>> > sneaky
> >>> > way would be to reshape your array so that the time dimension is
> split
> >>> > into
> >>> > two (month, year) dimensions.
> >>> >
> >>> > For a 1-D array, this would be:
> >>> >
> >>> >> dataarray = numpy.mod(numpy.arange(36), 12)
> >>> >> print dataarray
> >>> > array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11,  0,  1,  2,
>  3,
> >>>  4,
> >>> >         5,  6,  7,  8,  9, 10, 11,  0,  1,  2,  3,  4,  5,  6,  7,
>  8,
> >>>  9,
> >>> >        10, 11])
> >>> >> datamatrix = dataarray.reshape((-1, 12))
> >>> >> print datamatrix
> >>> > array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
> >>> >        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
> >>> >        [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]])
> >>> >
> >>> > Hope that helps.
> >>> >
> >>> > Ben Root
> >>> >
> >>> >
> >>> > On Fri, May 28, 2010 at 3:28 PM, mdekauwe <mdekauwe at gmail.com>
> wrote:
> >>> >
> >>> >>
> >>> >> OK so I just need to have a quick loop across the 12 months then,
> >>> that
> >>> is
> >>> >> fine, just thought there might have been a sneaky way!
> >>> >>
> >>> >> Really appreciated, getting there slowly!
> >>> >>
> >>> >>
> >>> >>
> >>> >> josef.pktd wrote:
> >>> >> >
> >>> >> > On Fri, May 28, 2010 at 4:14 PM, mdekauwe <mdekauwe at gmail.com>
> >>> wrote:
> >>> >> >>
> >>> >> >> ok - something like this then...but how would i get the index for
> >>> the
> >>> >> >> month
> >>> >> >> for the data array (where month is 0, 1, 2, 4 ... 11)?
> >>> >> >>
> >>> >> >> data[month,:] = array[xrange(0, numyears * nummonths,
> >>> >> nummonths),VAR,:,0]
> >>> >> >
> >>> >> > you would still need to start at the right month
> >>> >> > data[month,:] = array[xrange(month, numyears * nummonths,
> >>> >> > nummonths),VAR,:,0]
> >>> >> > or
> >>> >> > data[month,:] = array[month: numyears * nummonths :
> >>> nummonths),VAR,:,0]
> >>> >> >
> >>> >> > an alternative would be a reshape with an extra month dimension
> and
> >>> >> > then sum only once over the year axis. this might be faster but
> >>> >> > trickier to get the correct reshape .
> >>> >> >
> >>> >> > Josef
> >>> >> >
> >>> >> >>
> >>> >> >> and would that be quicker than making an array months...
> >>> >> >>
> >>> >> >> months = np.arange(numyears * nummonths)
> >>> >> >>
> >>> >> >> and you that instead like you suggested x[start:end:12,:]?
> >>> >> >>
> >>> >> >> Many thanks again...
> >>> >> >>
> >>> >> >>
> >>> >> >> josef.pktd wrote:
> >>> >> >>>
> >>> >> >>> On Fri, May 28, 2010 at 3:53 PM, mdekauwe <mdekauwe at gmail.com>
> >>> wrote:
> >>> >> >>>>
> >>> >> >>>> Ok thanks...I'll take a look.
> >>> >> >>>>
> >>> >> >>>> Back to my loops issue. What if instead this time I wanted to
> >>> take
> >>> >> an
> >>> >> >>>> average so every march in 11 years, is there a quicker way to
> go
> >>> >> about
> >>> >> >>>> doing
> >>> >> >>>> that than my current method?
> >>> >> >>>>
> >>> >> >>>> nummonths = 12
> >>> >> >>>> numyears = 11
> >>> >> >>>>
> >>> >> >>>> for month in xrange(nummonths):
> >>> >> >>>>    for i in xrange(numpts):
> >>> >> >>>>        for ym in xrange(month, numyears * nummonths,
> nummonths):
> >>> >> >>>>            data[month, i] += array[ym, VAR, land_pts_index[i],
> >>> 0]
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> x[start:end:12,:] gives you every 12th row of an array x
> >>> >> >>>
> >>> >> >>> something like this should work to get rid of the inner loop, or
> >>> you
> >>> >> >>> could directly put
> >>> >> >>> range(month, numyears * nummonths, nummonths) into the array
> >>> instead
> >>> >> >>> of ym and sum()
> >>> >> >>>
> >>> >> >>> Josef
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>>
> >>> >> >>>> so for each point in the array for a given month i am jumping
> >>> >> through
> >>> >> >>>> and
> >>> >> >>>> getting the next years month and so on, summing it.
> >>> >> >>>>
> >>> >> >>>> Thanks...
> >>> >> >>>>
> >>> >> >>>>
> >>> >> >>>> josef.pktd wrote:
> >>> >> >>>>>
> >>> >> >>>>> On Wed, May 26, 2010 at 5:03 PM, mdekauwe <mdekauwe at gmail.com
> >
> >>> >> wrote:
> >>> >> >>>>>>
> >>> >> >>>>>> Could you possibly if you have time explain further your
> >>> comment
> >>> >> re
> >>> >> >>>>>> the
> >>> >> >>>>>> p-values, your suggesting I am misusing them?
> >>> >> >>>>>
> >>> >> >>>>> Depends on your use and interpretation
> >>> >> >>>>>
> >>> >> >>>>> test statistics, p-values are random variables, if you look at
> >>> >> several
> >>> >> >>>>> tests at the same time, some p-values will be large just by
> >>> chance.
> >>> >> >>>>> If, for example you just look at the largest test statistic,
> >>> then
> >>> >> the
> >>> >> >>>>> distribution for the max of several test statistics is not the
> >>> same
> >>> >> as
> >>> >> >>>>> the distribution for a single test statistic
> >>> >> >>>>>
> >>> >> >>>>> http://en.wikipedia.org/wiki/Multiple_comparisons
> >>> >> >>>>>
> http://www.itl.nist.gov/div898/handbook/prc/section4/prc47.htm
> >>> >> >>>>>
> >>> >> >>>>> we also just had a related discussion for ANOVA post-hoc tests
> >>> on
> >>> >> the
> >>> >> >>>>> pystatsmodels group.
> >>> >> >>>>>
> >>> >> >>>>> Josef
> >>> >> >>>>>>
> >>> >> >>>>>> Thanks.
> >>> >> >>>>>>
> >>> >> >>>>>>
> >>> >> >>>>>> josef.pktd wrote:
> >>> >> >>>>>>>
> >>> >> >>>>>>> On Sat, May 22, 2010 at 6:21 AM, mdekauwe
> >>> <mdekauwe at gmail.com>
> >>> >> >>>>>>> wrote:
> >>> >> >>>>>>>>
> >>> >> >>>>>>>> Sounds like I am stuck with the loop as I need to do the
> >>> >> comparison
> >>> >> >>>>>>>> for
> >>> >> >>>>>>>> each
> >>> >> >>>>>>>> pixel of the world and then I have a basemap function call
> >>> which
> >>> >> I
> >>> >> >>>>>>>> guess
> >>> >> >>>>>>>> slows it down further...hmm
> >>> >> >>>>>>>
> >>> >> >>>>>>> I don't see much that could be done differently, after a
> >>> brief
> >>> >> look.
> >>> >> >>>>>>>
> >>> >> >>>>>>> stats.pearsonr could be replaced by an array version using
> >>> >> directly
> >>> >> >>>>>>> the formula for correlation even with nans. wilcoxon looks
> >>> slow,
> >>> >> and
> >>> >> >>>>>>> I
> >>> >> >>>>>>> never tried or seen a faster version.
> >>> >> >>>>>>>
> >>> >> >>>>>>> just a reminder, the p-values are for a single test, when
> you
> >>> >> have
> >>> >> >>>>>>> many of them, then they don't have the right size/confidence
> >>> >> level
> >>> >> >>>>>>> for
> >>> >> >>>>>>> an overall or joint test. (some packages report a Bonferroni
> >>> >> >>>>>>> correction in this case)
> >>> >> >>>>>>>
> >>> >> >>>>>>> Josef
> >>> >> >>>>>>>
> >>> >> >>>>>>>
> >>> >> >>>>>>>>
> >>> >> >>>>>>>> i.e.
> >>> >> >>>>>>>>
> >>> >> >>>>>>>> def compareSnowData(jules_var):
> >>> >> >>>>>>>>    # Extract the 11 years of snow data and return
> >>> >> >>>>>>>>    outrows = 180
> >>> >> >>>>>>>>    outcols = 360
> >>> >> >>>>>>>>    numyears = 11
> >>> >> >>>>>>>>    nummonths = 12
> >>> >> >>>>>>>>
> >>> >> >>>>>>>>    # Read various files
> >>> >> >>>>>>>>    fname="world_valid_jules_pts.ascii"
> >>> >> >>>>>>>>    (numpts, land_pts_index, latitude, longitude, rows,
> cols)
> >>> =
> >>> >> >>>>>>>> jo.read_land_points_ascii(fname, 1.0)
> >>> >> >>>>>>>>
> >>> >> >>>>>>>>    fname = "globalSnowRun_1985_96.GSWP2.nsmax0.mon.gra"
> >>> >> >>>>>>>>    jules_data1 = jo.readJulesOutBinary(fname,
> numrows=15238,
> >>> >> >>>>>>>> numcols=1,
> >>> >> >>>>>>>> \
> >>> >> >>>>>>>>                       timesteps=132, numvars=26)
> >>> >> >>>>>>>>    fname = "globalSnowRun_1985_96.GSWP2.nsmax3.mon.gra"
> >>> >> >>>>>>>>    jules_data2 = jo.readJulesOutBinary(fname,
> numrows=15238,
> >>> >> >>>>>>>> numcols=1,
> >>> >> >>>>>>>> \
> >>> >> >>>>>>>>                       timesteps=132, numvars=26)
> >>> >> >>>>>>>>
> >>> >> >>>>>>>>    # grab some space
> >>> >> >>>>>>>>    data1_snow = np.zeros((nummonths * numyears, numpts),
> >>> >> >>>>>>>> dtype=np.float32)
> >>> >> >>>>>>>>    data2_snow = np.zeros((nummonths * numyears, numpts),
> >>> >> >>>>>>>> dtype=np.float32)
> >>> >> >>>>>>>>    pearsonsr_snow = np.ones((outrows, outcols),
> >>> >> dtype=np.float32)
> >>> >> *
> >>> >> >>>>>>>> np.nan
> >>> >> >>>>>>>>    wilcoxStats_snow = np.ones((outrows, outcols),
> >>> >> dtype=np.float32)
> >>> >> >>>>>>>> *
> >>> >> >>>>>>>> np.nan
> >>> >> >>>>>>>>
> >>> >> >>>>>>>>    # extract the data
> >>> >> >>>>>>>>    data1_snow = jules_data1[:,jules_var,:,0]
> >>> >> >>>>>>>>    data2_snow = jules_data2[:,jules_var,:,0]
> >>> >> >>>>>>>>    data1_snow = np.where(data1_snow < 0.0, np.nan,
> >>> data1_snow)
> >>> >> >>>>>>>>    data2_snow = np.where(data2_snow < 0.0, np.nan,
> >>> data2_snow)
> >>> >> >>>>>>>>    #for month in xrange(numyears * nummonths):
> >>> >> >>>>>>>>    #    for i in xrange(numpts):
> >>> >> >>>>>>>>    #        data1 =
> >>> >> >>>>>>>> jules_data1[month,jules_var,land_pts_index[i],0]
> >>> >> >>>>>>>>    #        data2 =
> >>> >> >>>>>>>> jules_data2[month,jules_var,land_pts_index[i],0]
> >>> >> >>>>>>>>    #        if data1 >= 0.0:
> >>> >> >>>>>>>>    #            data1_snow[month,i] = data1
> >>> >> >>>>>>>>    #        else:
> >>> >> >>>>>>>>    #            data1_snow[month,i] = np.nan
> >>> >> >>>>>>>>    #        if data2 > 0.0:
> >>> >> >>>>>>>>    #            data2_snow[month,i] = data2
> >>> >> >>>>>>>>    #        else:
> >>> >> >>>>>>>>    #            data2_snow[month,i] = np.nan
> >>> >> >>>>>>>>
> >>> >> >>>>>>>>    # exclude any months from *both* arrays where we have
> >>> dodgy
> >>> >> >>>>>>>> data,
> >>> >> >>>>>>>> else
> >>> >> >>>>>>>> we
> >>> >> >>>>>>>>    # can't do the correlations correctly!!
> >>> >> >>>>>>>>    data1_snow = np.where(np.isnan(data2_snow), np.nan,
> >>> >> data1_snow)
> >>> >> >>>>>>>>    data2_snow = np.where(np.isnan(data1_snow), np.nan,
> >>> >> data1_snow)
> >>> >> >>>>>>>>
> >>> >> >>>>>>>>    # put data on a regular grid...
> >>> >> >>>>>>>>    print 'regridding landpts...'
> >>> >> >>>>>>>>    for i in xrange(numpts):
> >>> >> >>>>>>>>        # exclude the NaN, note masking them doesn't work in
> >>> the
> >>> >> >>>>>>>> stats
> >>> >> >>>>>>>> func
> >>> >> >>>>>>>>        x = data1_snow[:,i]
> >>> >> >>>>>>>>        x = x[np.isfinite(x)]
> >>> >> >>>>>>>>        y = data2_snow[:,i]
> >>> >> >>>>>>>>        y = y[np.isfinite(y)]
> >>> >> >>>>>>>>
> >>> >> >>>>>>>>        # r^2
> >>> >> >>>>>>>>        # exclude v.small arrays, i.e. we need just less
> over
> >>> 4
> >>> >> >>>>>>>> years
> >>> >> >>>>>>>> of
> >>> >> >>>>>>>> data
> >>> >> >>>>>>>>        if len(x) and len(y) > 50:
> >>> >> >>>>>>>>            pearsonsr_snow[((180-1)-(rows[i]-1)),cols[i]-1]
> =
> >>> >> >>>>>>>> (stats.pearsonr(x, y)[0])**2
> >>> >> >>>>>>>>
> >>> >> >>>>>>>>        # wilcox signed rank test
> >>> >> >>>>>>>>        # make sure we have enough samples to do the test
> >>> >> >>>>>>>>        d = x - y
> >>> >> >>>>>>>>        d = np.compress(np.not_equal(d,0), d ,axis=-1) #
> Keep
> >>> all
> >>> >> >>>>>>>> non-zero
> >>> >> >>>>>>>> differences
> >>> >> >>>>>>>>        count = len(d)
> >>> >> >>>>>>>>        if count > 10:
> >>> >> >>>>>>>>            z, pval = stats.wilcoxon(x, y)
> >>> >> >>>>>>>>            # only map out sign different data
> >>> >> >>>>>>>>            if pval < 0.05:
> >>> >> >>>>>>>>
> >>>  wilcoxStats_snow[((180-1)-(rows[i]-1)),cols[i]-1]
> >>> >> =
> >>> >> >>>>>>>> np.mean(x - y)
> >>> >> >>>>>>>>
> >>> >> >>>>>>>>    return (pearsonsr_snow, wilcoxStats_snow)
> >>> >> >>>>>>>>
> >>> >> >>>>>>>>
> >>> >> >>>>>>>> josef.pktd wrote:
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>> On Fri, May 21, 2010 at 10:14 PM, mdekauwe <
> >>> mdekauwe at gmail.com>
> >>> >> >>>>>>>>> wrote:
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>> Also I then need to remap the 2D array I make onto
> another
> >>> >> grid
> >>> >> >>>>>>>>>> (the
> >>> >> >>>>>>>>>> world in
> >>> >> >>>>>>>>>> this case). Which again I had am doing with a loop (note
> >>> >> numpts
> >>> >> >>>>>>>>>> is
> >>> >> >>>>>>>>>> a
> >>> >> >>>>>>>>>> lot
> >>> >> >>>>>>>>>> bigger than my example above).
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>> wilcoxStats_snow = np.ones((outrows, outcols),
> >>> >> dtype=np.float32)
> >>> >> >>>>>>>>>> *
> >>> >> >>>>>>>>>> np.nan
> >>> >> >>>>>>>>>> for i in xrange(numpts):
> >>> >> >>>>>>>>>>        # exclude the NaN, note masking them doesn't work
> >>> in
> >>> >> the
> >>> >> >>>>>>>>>> stats
> >>> >> >>>>>>>>>> func
> >>> >> >>>>>>>>>>        x = data1_snow[:,i]
> >>> >> >>>>>>>>>>        x = x[np.isfinite(x)]
> >>> >> >>>>>>>>>>        y = data2_snow[:,i]
> >>> >> >>>>>>>>>>        y = y[np.isfinite(y)]
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>>        # wilcox signed rank test
> >>> >> >>>>>>>>>>        # make sure we have enough samples to do the test
> >>> >> >>>>>>>>>>        d = x - y
> >>> >> >>>>>>>>>>        d = np.compress(np.not_equal(d,0), d ,axis=-1) #
> >>> Keep
> >>> >> all
> >>> >> >>>>>>>>>> non-zero
> >>> >> >>>>>>>>>> differences
> >>> >> >>>>>>>>>>        count = len(d)
> >>> >> >>>>>>>>>>        if count > 10:
> >>> >> >>>>>>>>>>            z, pval = stats.wilcoxon(x, y)
> >>> >> >>>>>>>>>>            # only map out sign different data
> >>> >> >>>>>>>>>>            if pval < 0.05:
> >>> >> >>>>>>>>>>
> >>> >> wilcoxStats_snow[((180-1)-(rows[i]-1)),cols[i]-1]
> >>> >> >>>>>>>>>> =
> >>> >> >>>>>>>>>> np.mean(x - y)
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>> Now I think I can push the data in one move into the
> >>> >> >>>>>>>>>> wilcoxStats_snow
> >>> >> >>>>>>>>>> array
> >>> >> >>>>>>>>>> by removing the index,
> >>> >> >>>>>>>>>> but I can't see how I will get the individual x and y pts
> >>> for
> >>> >> >>>>>>>>>> each
> >>> >> >>>>>>>>>> array
> >>> >> >>>>>>>>>> member correctly without the loop, this was my attempt
> >>> which
> >>> >> of
> >>> >> >>>>>>>>>> course
> >>> >> >>>>>>>>>> doesn't work!
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>> x = data1_snow[:,:]
> >>> >> >>>>>>>>>> x = x[np.isfinite(x)]
> >>> >> >>>>>>>>>> y = data2_snow[:,:]
> >>> >> >>>>>>>>>> y = y[np.isfinite(y)]
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>> # r^2
> >>> >> >>>>>>>>>> # exclude v.small arrays, i.e. we need just less over 4
> >>> years
> >>> >> of
> >>> >> >>>>>>>>>> data
> >>> >> >>>>>>>>>> if len(x) and len(y) > 50:
> >>> >> >>>>>>>>>>    pearsonsr_snow[((180-1)-(rows-1)),cols-1] =
> >>> >> (stats.pearsonr(x,
> >>> >> >>>>>>>>>> y)[0])**2
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>> If you want to do pairwise comparisons with
> stats.wilcoxon,
> >>> >> then
> >>> >> >>>>>>>>> you
> >>> >> >>>>>>>>> might be stuck with the loop, since wilcoxon takes only
> two
> >>> 1d
> >>> >> >>>>>>>>> arrays
> >>> >> >>>>>>>>> at a time (if I read the help correctly).
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>> Also the presence of nans might force the use a loop.
> >>> >> stats.mstats
> >>> >> >>>>>>>>> has
> >>> >> >>>>>>>>> masked array versions, but I didn't see wilcoxon in the
> >>> list.
> >>> >> >>>>>>>>> (Even
> >>> >> >>>>>>>>> when vectorized operations would work with regular arrays,
> >>> nan
> >>> >> or
> >>> >> >>>>>>>>> masked array versions still have to loop in many cases.)
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>> If you have many columns with count <= 10, so that
> wilcoxon
> >>> is
> >>> >> not
> >>> >> >>>>>>>>> calculated then it might be worth to use only array
> >>> operations
> >>> >> up
> >>> >> >>>>>>>>> to
> >>> >> >>>>>>>>> that point. If wilcoxon is calculated most of the time,
> >>> then
> >>> >> it's
> >>> >> >>>>>>>>> not
> >>> >> >>>>>>>>> worth thinking too hard about this.
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>> Josef
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>> thanks.
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>> mdekauwe wrote:
> >>> >> >>>>>>>>>>>
> >>> >> >>>>>>>>>>> Yes as Zachary said index is only 0 to 15237, so both
> >>> methods
> >>> >> >>>>>>>>>>> work.
> >>> >> >>>>>>>>>>>
> >>> >> >>>>>>>>>>> I don't quite get what you mean about slicing with axis
> >
> >>> 3.
> >>> >> Is
> >>> >> >>>>>>>>>>> there
> >>> >> >>>>>>>>>>> a
> >>> >> >>>>>>>>>>> link you can recommend I should read? Does that mean
> >>> given
> >>> I
> >>> >> >>>>>>>>>>> have
> >>> >> >>>>>>>>>>> 4dims
> >>> >> >>>>>>>>>>> that Josef's suggestion would be more advised in this
> >>> case?
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>> There were several discussions on the mailing lists (fancy
> >>> >> slicing
> >>> >> >>>>>>>>> and
> >>> >> >>>>>>>>> indexing). Your case is safe, but if you run in future
> into
> >>> >> funny
> >>> >> >>>>>>>>> shapes, you can look up the details.
> >>> >> >>>>>>>>> when in doubt, I use np.arange(...)
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>> Josef
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>>>>
> >>> >> >>>>>>>>>>> Thanks.
> >>> >> >>>>>>>>>>>
> >>> >> >>>>>>>>>>>
> >>> >> >>>>>>>>>>>
> >>> >> >>>>>>>>>>> josef.pktd wrote:
> >>> >> >>>>>>>>>>>>
> >>> >> >>>>>>>>>>>> On Fri, May 21, 2010 at 10:55 AM, mdekauwe <
> >>> >> mdekauwe at gmail.com>
> >>> >> >>>>>>>>>>>> wrote:
> >>> >> >>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>> Thanks that works...
> >>> >> >>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>> So the way to do it is with np.arange(tsteps)[:,None],
> >>> that
> >>> >> >>>>>>>>>>>>> was
> >>> >> >>>>>>>>>>>>> the
> >>> >> >>>>>>>>>>>>> step
> >>> >> >>>>>>>>>>>>> I
> >>> >> >>>>>>>>>>>>> was struggling with, so this forms a 2D array which
> >>> >> replaces
> >>> >> >>>>>>>>>>>>> the
> >>> >> >>>>>>>>>>>>> the
> >>> >> >>>>>>>>>>>>> two
> >>> >> >>>>>>>>>>>>> for
> >>> >> >>>>>>>>>>>>> loops? Do I have that right?
> >>> >> >>>>>>>>>>>>
> >>> >> >>>>>>>>>>>> Yes, but as Zachary showed, if you need the full index
> >>> in
> >>> a
> >>> >> >>>>>>>>>>>> dimension,
> >>> >> >>>>>>>>>>>> then you can use slicing. It might be faster.
> >>> >> >>>>>>>>>>>> And a warning, mixing slices and index arrays with 3 or
> >>> more
> >>> >> >>>>>>>>>>>> dimensions can have some surprise switching of axes.
> >>> >> >>>>>>>>>>>>
> >>> >> >>>>>>>>>>>> Josef
> >>> >> >>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>> A lot quicker...!
> >>> >> >>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>> Martin
> >>> >> >>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>> josef.pktd wrote:
> >>> >> >>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>> On Fri, May 21, 2010 at 8:59 AM, mdekauwe
> >>> >> >>>>>>>>>>>>>> <mdekauwe at gmail.com>
> >>> >> >>>>>>>>>>>>>> wrote:
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>> Hi,
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>> I am trying to extract data from a 4D array and
> store
> >>> it
> >>> >> in
> >>> >> >>>>>>>>>>>>>>> a
> >>> >> >>>>>>>>>>>>>>> 2D
> >>> >> >>>>>>>>>>>>>>> array,
> >>> >> >>>>>>>>>>>>>>> but
> >>> >> >>>>>>>>>>>>>>> avoid my current usage of the for loops for speed,
> as
> >>> in
> >>> >> >>>>>>>>>>>>>>> reality
> >>> >> >>>>>>>>>>>>>>> the
> >>> >> >>>>>>>>>>>>>>> arrays
> >>> >> >>>>>>>>>>>>>>> sizes are quite big. Could someone also try and
> >>> explain
> >>> >> the
> >>> >> >>>>>>>>>>>>>>> solution
> >>> >> >>>>>>>>>>>>>>> as
> >>> >> >>>>>>>>>>>>>>> well
> >>> >> >>>>>>>>>>>>>>> if they have a spare moment as I am still finding it
> >>> >> quite
> >>> >> >>>>>>>>>>>>>>> difficult
> >>> >> >>>>>>>>>>>>>>> to
> >>> >> >>>>>>>>>>>>>>> get
> >>> >> >>>>>>>>>>>>>>> over the habit of using loops (C convert for my
> >>> sins).
> >>> I
> >>> >> get
> >>> >> >>>>>>>>>>>>>>> that
> >>> >> >>>>>>>>>>>>>>> one
> >>> >> >>>>>>>>>>>>>>> could
> >>> >> >>>>>>>>>>>>>>> precompute the indices's i and j i.e.
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>> i = np.arange(tsteps)
> >>> >> >>>>>>>>>>>>>>> j = np.arange(numpts)
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>> but just can't get my head round how i then use
> >>> them...
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>> Thanks,
> >>> >> >>>>>>>>>>>>>>> Martin
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>> import numpy as np
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>> numpts=10
> >>> >> >>>>>>>>>>>>>>> tsteps = 12
> >>> >> >>>>>>>>>>>>>>> vari = 22
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>> data = np.random.random((tsteps, vari, numpts, 1))
> >>> >> >>>>>>>>>>>>>>> new_data = np.zeros((tsteps, numpts),
> >>> dtype=np.float32)
> >>> >> >>>>>>>>>>>>>>> index = np.arange(numpts)
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>> for i in xrange(tsteps):
> >>> >> >>>>>>>>>>>>>>>    for j in xrange(numpts):
> >>> >> >>>>>>>>>>>>>>>        new_data[i,j] = data[i,5,index[j],0]
> >>> >> >>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>> The index arrays need to be broadcastable against
> each
> >>> >> other.
> >>> >> >>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>> I think this should do it
> >>> >> >>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>> new_data = data[np.arange(tsteps)[:,None], 5,
> >>> >> >>>>>>>>>>>>>> np.arange(numpts),
> >>> >> >>>>>>>>>>>>>> 0]
> >>> >> >>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>> Josef
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>> --
> >>> >> >>>>>>>>>>>>>>> View this message in context:
> >>> >> >>>>>>>>>>>>>>>
> >>> >>
> http://old.nabble.com/removing-for-loops...-tp28633477p28633477.html
> >>> >> >>>>>>>>>>>>>>> Sent from the Scipy-User mailing list archive at
> >>> >> Nabble.com.
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>> _______________________________________________
> >>> >> >>>>>>>>>>>>>>> SciPy-User mailing list
> >>> >> >>>>>>>>>>>>>>> SciPy-User at scipy.org
> >>> >> >>>>>>>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>> _______________________________________________
> >>> >> >>>>>>>>>>>>>> SciPy-User mailing list
> >>> >> >>>>>>>>>>>>>> SciPy-User at scipy.org
> >>> >> >>>>>>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>> --
> >>> >> >>>>>>>>>>>>> View this message in context:
> >>> >> >>>>>>>>>>>>>
> >>> >>
> http://old.nabble.com/removing-for-loops...-tp28633477p28634924.html
> >>> >> >>>>>>>>>>>>> Sent from the Scipy-User mailing list archive at
> >>> >> Nabble.com.
> >>> >> >>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>> _______________________________________________
> >>> >> >>>>>>>>>>>>> SciPy-User mailing list
> >>> >> >>>>>>>>>>>>> SciPy-User at scipy.org
> >>> >> >>>>>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>>>>>>>>>>>>
> >>> >> >>>>>>>>>>>> _______________________________________________
> >>> >> >>>>>>>>>>>> SciPy-User mailing list
> >>> >> >>>>>>>>>>>> SciPy-User at scipy.org
> >>> >> >>>>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>>>>>>>>>>>
> >>> >> >>>>>>>>>>>>
> >>> >> >>>>>>>>>>>
> >>> >> >>>>>>>>>>>
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>> --
> >>> >> >>>>>>>>>> View this message in context:
> >>> >> >>>>>>>>>>
> >>> >>
> http://old.nabble.com/removing-for-loops...-tp28633477p28640656.html
> >>> >> >>>>>>>>>> Sent from the Scipy-User mailing list archive at
> >>> Nabble.com.
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>>> _______________________________________________
> >>> >> >>>>>>>>>> SciPy-User mailing list
> >>> >> >>>>>>>>>> SciPy-User at scipy.org
> >>> >> >>>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>>>>>>>>>
> >>> >> >>>>>>>>> _______________________________________________
> >>> >> >>>>>>>>> SciPy-User mailing list
> >>> >> >>>>>>>>> SciPy-User at scipy.org
> >>> >> >>>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>>
> >>> >> >>>>>>>>
> >>> >> >>>>>>>> --
> >>> >> >>>>>>>> View this message in context:
> >>> >> >>>>>>>>
> >>> >>
> http://old.nabble.com/removing-for-loops...-tp28633477p28642434.html
> >>> >> >>>>>>>> Sent from the Scipy-User mailing list archive at
> Nabble.com.
> >>> >> >>>>>>>>
> >>> >> >>>>>>>> _______________________________________________
> >>> >> >>>>>>>> SciPy-User mailing list
> >>> >> >>>>>>>> SciPy-User at scipy.org
> >>> >> >>>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>>>>>>>
> >>> >> >>>>>>> _______________________________________________
> >>> >> >>>>>>> SciPy-User mailing list
> >>> >> >>>>>>> SciPy-User at scipy.org
> >>> >> >>>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>>>>>>
> >>> >> >>>>>>>
> >>> >> >>>>>>
> >>> >> >>>>>> --
> >>> >> >>>>>> View this message in context:
> >>> >> >>>>>>
> >>> >>
> http://old.nabble.com/removing-for-loops...-tp28633477p28686356.html
> >>> >> >>>>>> Sent from the Scipy-User mailing list archive at Nabble.com.
> >>> >> >>>>>>
> >>> >> >>>>>> _______________________________________________
> >>> >> >>>>>> SciPy-User mailing list
> >>> >> >>>>>> SciPy-User at scipy.org
> >>> >> >>>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>>>>>
> >>> >> >>>>> _______________________________________________
> >>> >> >>>>> SciPy-User mailing list
> >>> >> >>>>> SciPy-User at scipy.org
> >>> >> >>>>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>>>>
> >>> >> >>>>>
> >>> >> >>>>
> >>> >> >>>> --
> >>> >> >>>> View this message in context:
> >>> >> >>>>
> >>> http://old.nabble.com/removing-for-loops...-tp28633477p28711249.html
> >>> >> >>>> Sent from the Scipy-User mailing list archive at Nabble.com.
> >>> >> >>>>
> >>> >> >>>> _______________________________________________
> >>> >> >>>> SciPy-User mailing list
> >>> >> >>>> SciPy-User at scipy.org
> >>> >> >>>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>>>
> >>> >> >>> _______________________________________________
> >>> >> >>> SciPy-User mailing list
> >>> >> >>> SciPy-User at scipy.org
> >>> >> >>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>>
> >>> >> >>>
> >>> >> >>
> >>> >> >> --
> >>> >> >> View this message in context:
> >>> >> >>
> >>> http://old.nabble.com/removing-for-loops...-tp28633477p28711444.html
> >>> >> >> Sent from the Scipy-User mailing list archive at Nabble.com.
> >>> >> >>
> >>> >> >> _______________________________________________
> >>> >> >> SciPy-User mailing list
> >>> >> >> SciPy-User at scipy.org
> >>> >> >> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >>
> >>> >> > _______________________________________________
> >>> >> > SciPy-User mailing list
> >>> >> > SciPy-User at scipy.org
> >>> >> > http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >> >
> >>> >> >
> >>> >>
> >>> >> --
> >>> >> View this message in context:
> >>> >>
> http://old.nabble.com/removing-for-loops...-tp28633477p28711581.html
> >>> >> Sent from the Scipy-User mailing list archive at Nabble.com.
> >>> >>
> >>> >> _______________________________________________
> >>> >> SciPy-User mailing list
> >>> >> SciPy-User at scipy.org
> >>> >> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >>
> >>> >
> >>> > _______________________________________________
> >>> > SciPy-User mailing list
> >>> > SciPy-User at scipy.org
> >>> > http://mail.scipy.org/mailman/listinfo/scipy-user
> >>> >
> >>> >
> >>>
> >>> --
> >>> View this message in context:
> >>> http://old.nabble.com/removing-for-loops...-tp28633477p28824023.html
> >>> Sent from the Scipy-User mailing list archive at Nabble.com.
> >>>
> >>> _______________________________________________
> >>> SciPy-User mailing list
> >>> SciPy-User at scipy.org
> >>> http://mail.scipy.org/mailman/listinfo/scipy-user
> >>>
> >>
> >>
> >
> > _______________________________________________
> > SciPy-User mailing list
> > SciPy-User at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-user
> >
> >
>
> --
> View this message in context:
> http://old.nabble.com/removing-for-loops...-tp28633477p28846602.html
> Sent from the Scipy-User mailing list archive at Nabble.com.
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.scipy.org/pipermail/scipy-user/attachments/20100610/5badc37a/attachment.html>