numpy and filtering (was: Fastest way to store ints and floats on disk)

Laszlo Nagy gandalf at shopzeus.com
Fri Aug 8 07:06:36 EDT 2008


Attached is an example program that requires only numpy. At the 
end I have two numpy arrays:

rdims:

[[3 1 1]
 [0 0 4]
 [1 3 0]
 [2 2 0]
 [3 3 3]
 [0 0 2]]


rmeas:

[[100000.0 254.0]
 [40000.0 200.0]
 [50000.0 185.0]
 [5000.0 160.0]
 [150000.0 260.0]
 [20000.0 180.0]]


I would like to use numpy to compute statistics, for example the mean 
value of the prices:

 >>> rmeas[:,0] # Prices of cars
array([100000.0, 40000.0, 50000.0, 5000.0, 150000.0, 20000.0], dtype=float96)
 >>> rmeas[:,0].mean() # Mean price
60833.3333333333333321

However, I only want to do this for a subset of the rows, e.g. 
'color=yellow' or 'year=2003, make=Ford'. Is there a built-in numpy 
method that can select rows matching a set of values, i.e. create a 
view of the original array, or a new array, that contains only the 
matching rows? I know how to do it in plain Python with iterators, but 
I wonder if there is a better way to do it in numpy. (I'm new to numpy, 
so please forgive me if this is a dumb question.)
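
To make the request concrete, here is a minimal sketch of the kind of 
row filtering meant above, written with numpy boolean-mask indexing. 
The mapping of the rdims columns to color, year and make, and the index 
values used for "yellow" and the like, are assumptions made up for 
illustration only; float64 is used in place of float96 so the snippet 
runs on any platform.

```python
import numpy as np

# Same data as the rdims / rmeas arrays shown above.
rdims = np.array([[3, 1, 1],
                  [0, 0, 4],
                  [1, 3, 0],
                  [2, 2, 0],
                  [3, 3, 3],
                  [0, 0, 2]])
rmeas = np.array([[100000.0, 254.0],
                  [40000.0, 200.0],
                  [50000.0, 185.0],
                  [5000.0, 160.0],
                  [150000.0, 260.0],
                  [20000.0, 180.0]])

# Hypothetical layout: column 0 of rdims is the color index and the
# value 3 stands for "yellow". The real meaning is not stated here.
COLOR_COL, YELLOW = 0, 3

# Boolean array with one entry per row: True where the row matches.
mask = rdims[:, COLOR_COL] == YELLOW

# Indexing with a boolean mask returns a new array (a copy, not a view).
yellow_meas = rmeas[mask]
mean_yellow_price = yellow_meas[:, 0].mean()

# Two conditions combined element-wise with & (hypothetical layout:
# column 1 is the year index, column 2 is the make index).
mask2 = (rdims[:, 1] == 0) & (rdims[:, 2] == 4)
mean_price2 = rmeas[mask2, 0].mean()

print(mean_yellow_price, mean_price2)
```

Note that the parentheses around each comparison are needed, because & 
binds more tightly than == in Python.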

Thanks,

   Laszlo




-------------- next part --------------
A non-text attachment was scrubbed...
Name: test001.py
Type: text/x-python
Size: 1735 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-list/attachments/20080808/99a2dd4f/attachment.py>

