numpy and filtering (was: Fastest way to store ints and floats on disk)

Robert Kern robert.kern at gmail.com
Fri Aug 8 15:59:04 EDT 2008


Laszlo Nagy wrote:
> Attached is an example program that only requires numpy. At the 
> end I have two numpy arrays:
> 
> rdims:
> 
> [[3 1 1]
> [0 0 4]
> [1 3 0]
> [2 2 0]
> [3 3 3]
> [0 0 2]]
> 
> 
> rmeas:
> 
> [[100000.0 254.0]
> [40000.0 200.0]
> [50000.0 185.0]
> [5000.0 160.0]
> [150000.0 260.0]
> [20000.0 180.0]]
> 
> 
> I would like to use numpy to compute statistics, for example the mean 
> value of the prices:
> 
>  >>> rmeas[:,0] # Prices of cars
> array([100000.0, 40000.0, 50000.0, 5000.0, 150000.0, 20000.0], 
> dtype=float96)
>  >>> rmeas[:,0].mean() # Mean price
> 60833.3333333333333321
> 
> However, I only want to do this for 'color=yellow' or 'year=2003, 
> make=Ford' etc. I wonder if there is a built-in numpy method that can 
> filter out rows using a set of values, e.g. create a view of the 
> original array or a new array that contains only the filtered rows. I 
> know how to do it from Python with iterators, but I wonder if there is a 
> better way to do it in numpy. (I'm new to numpy; please forgive me if 
> this is a dumb question.)

It's not, but you will get more help on the numpy-discussion mailing list than here.

   http://www.scipy.org/Mailing_Lists

I would normally answer your question, too, but I'm on vacation and have to run 
off to a party right now.
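
In the meantime, here is a minimal sketch of the usual approach, boolean-mask
indexing. It is not a full answer. The arrays are copied from your message, but
which column of rdims stands for which dimension (and what the integer codes
mean) is purely my assumption, as are the variable names. np.isin also assumes
a reasonably recent numpy.

```python
import numpy as np

# Data copied from the question. The meaning of each rdims column
# (e.g. color, year, make) is NOT given there; treat it as an assumption.
rdims = np.array([[3, 1, 1],
                  [0, 0, 4],
                  [1, 3, 0],
                  [2, 2, 0],
                  [3, 3, 3],
                  [0, 0, 2]])
rmeas = np.array([[100000.0, 254.0],
                  [40000.0, 200.0],
                  [50000.0, 185.0],
                  [5000.0, 160.0],
                  [150000.0, 260.0],
                  [20000.0, 180.0]])

# One condition: rows whose first dimension has code 3.
mask_one = rdims[:, 0] == 3
mean_one = rmeas[mask_one, 0].mean()

# Two conditions combined with & (the parentheses are required).
mask_two = (rdims[:, 0] == 0) & (rdims[:, 1] == 0)
mean_two = rmeas[mask_two, 0].mean()

# Membership in a set of values for one column.
mask_set = np.isin(rdims[:, 2], [0, 4])
rows_set = rmeas[mask_set]
```

Note that indexing with a boolean mask gives you a new array (a copy), not a
view of the original.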

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco



