[Numpy-discussion] very simple iteration question.

Damian Eads eads at soe.ucsc.edu
Wed Apr 30 04:40:44 EDT 2008


Hi Alex,

a g wrote:
> Hi.  This is a very basic question, sorry if it's irritating.  If i
> didn't find the answer written already somewhere on the site, please
> point me to it.  That'd be great.

You should look at any of the documents below and read up on array 
slicing. It is perhaps the most important and pervasive concept of Numpy 
and should be understood by all users.

     Numpy Tutorial: http://www.scipy.org/Tentative_NumPy_Tutorial
     Numpy for MATLAB users: http://www.scipy.org/NumPy_for_Matlab_Users
     Guide to Numpy

> OK: how do i iterate over an axis other than 0?
> 
> I have a 3D array of data[year, week, location].  I want to iterate
> over each year at each location and run a series of stats on the
> columns (on the 52 weeks in a particular year at a particular location).
>  'for years in data:' will get the first one, but then how do i not
> iterate over the 1 axis and iterate over the 2 axis instead?

It is not clear to me whether you want to slice or iterate over an 
array. Assuming you are fixing the year and location, the following code 
iterates over the weeks one value at a time.

for week in range(data.shape[1]):
     value = data[year, week, loc]   # do something with value

Slicing is more efficient and you should use it if you can. Fixing the 
year and location, the following computes the mean and standard 
deviation across all weeks. All of the statements below yield scalars.

     data[year, :, loc].mean()   # mean of the data across weeks
     data[year, :, loc].std()    # standard deviation of the data across weeks
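
To visit every year at every location, as in the original question, the 
same slice can be taken inside two loops over axes 0 and 2. This is only 
a rough sketch: the array shape (3 years, 52 weeks, 4 locations), the 
random data, and the name stats are assumptions made for illustration.

```python
import numpy as np

# Hypothetical data: 3 years, 52 weeks, 4 locations (shape is an assumption).
rng = np.random.default_rng(0)
data = rng.random((3, 52, 4))

num_years, num_weeks, num_locs = data.shape

# Collect (mean, std) of the weekly values for each (year, location) pair.
stats = {}
for year in range(num_years):
    for loc in range(num_locs):
        weeks = data[year, :, loc]   # 1-D array holding that year's weeks
        stats[(year, loc)] = (weeks.mean(), weeks.std())
```

Looping over indices this way leaves axis 1 (the weeks) intact in each 
slice, so any per-column statistic can be swapped in for mean and std.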

You should download IPython and type help(numpy.ndarray) to see the 
methods you can call on the result of a slice (sum, min, etc.).

Although I don't know what statistics you are computing for sure, the 
following code might be useful since it computes a statistic across all 
weeks for each year and location value.

     data.mean(axis=1)

It yields a num_years by num_locations array mu where mu[y, l] is the 
average data value across all weeks for year y and loc l.
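
As a quick check of that claim, here is a small sketch comparing the 
axis-based reduction with the explicit slice. The array shape and random 
data are assumptions for illustration, and sigma is a name introduced 
here, not something from your code.

```python
import numpy as np

# Hypothetical data: 3 years, 52 weeks, 4 locations (shape is an assumption).
rng = np.random.default_rng(0)
data = rng.random((3, 52, 4))

# Reduce over axis 1 (weeks); the result is indexed by [year, location].
mu = data.mean(axis=1)
sigma = data.std(axis=1)

# Each entry agrees with the statistic of the corresponding slice.
y, l = 1, 2
same_mean = np.isclose(mu[y, l], data[y, :, l].mean())
same_std = np.isclose(sigma[y, l], data[y, :, l].std())
```

Other reductions such as sum, min, and max accept the same axis argument, 
so no explicit loop over years and locations is needed for them.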

I hope this helps.

Damian


