[Numpy-discussion] numpy sum table by category

Marc Schwarzschild ms at TheBrookhavenGroup.com
Tue Jan 12 15:33:02 EST 2010



I have a csv file like this:

    Account, Symbol, Quantity, Price
    One,SPY,5,119.00
    One,SPY,3,120.00
    One,SPY,-2,125.00
    One,GE,...
    One,GE,...
    Two,SPY, ...
    Three,GE, ...
     ...

The real data is much larger, perhaps 10,000 records.  I can load
it into a numpy record array using matplotlib.mlab.csv2rec().  I
have learned several useful numpy functions and have been reading
lots of documentation.  However, I have not found a way to create
a unique list of symbols and the sum of their respective Quantity
values.  I also want to do various calculations on the data, like
pulling out all the records for a given Account.  The actual data
has many more columns, and sometimes I may want to compute
sum(Quantity*Price) by Account and Symbol.
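
To make the request concrete, here is a minimal sketch of the three
operations described above, using only core numpy calls
(np.unique with return_inverse, np.bincount with weights, and a
boolean mask).  The sample rows, the lowercase field names, and all
variable names are assumptions made up for illustration; they are
not taken from the real file, and this is one possible approach
rather than a definitive one.

```python
import numpy as np

# Hypothetical sample records; the GE, Two and Three rows are invented
# because the original listing elides their values.
records = np.array(
    [
        ("One", "SPY", 5, 119.00),
        ("One", "SPY", 3, 120.00),
        ("One", "SPY", -2, 125.00),
        ("One", "GE", 10, 16.50),
        ("Two", "SPY", 4, 121.00),
        ("Three", "GE", -1, 16.00),
    ],
    dtype=[("account", "U10"), ("symbol", "U10"),
           ("quantity", "i8"), ("price", "f8")],
)

# 1. Unique symbols and the sum of Quantity for each one.
#    `sym_idx` maps every row to the position of its symbol in `symbols`.
symbols, sym_idx = np.unique(records["symbol"], return_inverse=True)
qty_by_symbol = np.bincount(sym_idx, weights=records["quantity"])

# 2. All records for a given Account, via a boolean mask.
one_rows = records[records["account"] == "One"]

# 3. sum(Quantity * Price) by (Account, Symbol).
#    Encode each column as integer codes, then merge the two codes
#    into a single integer key per row.
accounts, acct_idx = np.unique(records["account"], return_inverse=True)
pair_key = acct_idx * len(symbols) + sym_idx
pair_codes, pair_idx = np.unique(pair_key, return_inverse=True)
value_by_pair = np.bincount(
    pair_idx, weights=records["quantity"] * records["price"]
)

# Decode the merged keys back into (account, symbol) labels.
totals = {
    (str(accounts[code // len(symbols)]), str(symbols[code % len(symbols)])): total
    for code, total in zip(pair_codes, value_by_pair)
}

for sym, qty in zip(symbols, qty_by_symbol):
    print(sym, qty)
for key, total in sorted(totals.items()):
    print(key, total)
```

The same field-based indexing should apply to a record array loaded
from the CSV file, with whatever field names the loader assigns.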

I'm attracted to numpy for speed but would welcome alternative
suggestions.
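
As one example of such an alternative, the sketch below does the
same grouping with only the Python standard library (csv and
collections.defaultdict).  For roughly 10,000 records this may well
be fast enough, though that is an assumption and has not been
measured.  The sample text is hypothetical and stands in for the
real file; skipinitialspace=True is used because the header row has
a space after each comma.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for the real CSV file.
sample = """Account, Symbol, Quantity, Price
One,SPY,5,119.00
One,SPY,3,120.00
One,SPY,-2,125.00
One,GE,10,16.50
Two,SPY,4,121.00
Three,GE,-1,16.00
"""

# skipinitialspace strips the blank after each comma in the header,
# so the column names come out as "Symbol", not " Symbol".
reader = csv.DictReader(io.StringIO(sample), skipinitialspace=True)

qty_by_symbol = defaultdict(int)      # Symbol -> sum of Quantity
value_by_pair = defaultdict(float)    # (Account, Symbol) -> sum of Quantity*Price

for row in reader:
    quantity = int(row["Quantity"])
    price = float(row["Price"])
    qty_by_symbol[row["Symbol"]] += quantity
    value_by_pair[(row["Account"], row["Symbol"])] += quantity * price

for symbol in sorted(qty_by_symbol):
    print(symbol, qty_by_symbol[symbol])
for key in sorted(value_by_pair):
    print(key, value_by_pair[key])
```

To read from disk instead, the io.StringIO object would be replaced
by an open file handle.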

I tried unsuccessfully to install PyTables on my Ubuntu system
and abandoned that avenue.

Can anyone provide some examples of how to do this, or point me
to the relevant documentation?

Much appreciated. 

_________________________________________________________
Marc Schwarzschild              The Brookhaven Group, LLC
1-212-580-1175         Analytics for Hedge Fund Investors
                 Risk it, carefully!
               



