Numpy Performance

Fri Apr 24 09:05:57 EDT 2009

Thanks for your replies.

@Peter - My arrays are not sparse at all, but I'll take a quick look
at scipy.  I also should have mentioned that my numpy arrays are of
Object type as each data point (row) has one or more text labels for
categorization.

@Robert - Thanks for the comments about how numpy is optimized for
bulk operations.  Most of the processing I'm doing is with
individual elements.

Essentially, I'm testing tens of thousands of scenarios on a
relatively small number of test cases.  Each scenario requires all
elements of each test case to be scored, then summarized, then sorted
and grouped with some top scores captured for reporting.
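For illustration, the score / summarize / sort steps above could be done as
whole-array operations rather than element by element. This is only a rough
sketch under assumptions that are not part of my real setup: the scoring is
taken to be a simple weighted sum, the data is random, and every name in it
(elements, weights, top_k and so on) is made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the real numbers would come from the application.
n_scenarios, n_elements, n_features, top_k = 1000, 50, 4, 5

# Hypothetical data: one test case with n_elements rows of numeric features,
# and one weight vector per scenario (assumed linear scoring).
elements = rng.random((n_elements, n_features))
weights = rng.random((n_scenarios, n_features))

# Score every element under every scenario with a single matrix product.
scores = weights @ elements.T            # shape (n_scenarios, n_elements)

# Summarize each scenario, here as the mean score over all elements.
summary = scores.mean(axis=1)            # shape (n_scenarios,)

# Sort scenarios by summary score, best first, and keep the top few.
top = np.argsort(summary)[::-1][:top_k]

for idx in top:
    print(idx, summary[idx])
```

Whether this helps depends on whether the real scoring rule can be written
as array arithmetic at all; if it cannot, the per-element loop remains.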

It seems like I can either work toward a procedure that features
indexed categorization so that my arrays are of integer type and a
design that will allow each scenario to be handled in bulk numpy
fashion, or expand RectangularArray with custom data handling methods.
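As a sketch of the first option, text labels can be mapped to integer codes
with np.unique, after which grouping becomes integer arithmetic. The labels
and values here are invented for the example, and it assumes one label per
row (rows with several labels would need one code column per label).

```python
import numpy as np

# Hypothetical text labels and numeric values, one per data point (row).
labels = np.array(["red", "blue", "red", "green", "blue"])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# categories holds the sorted distinct labels; codes holds one integer per
# row such that categories[codes] reproduces the original labels.
categories, codes = np.unique(labels, return_inverse=True)

# Group-wise sums in one call: bincount adds up values that share a code.
sums = np.bincount(codes, weights=values)

for name, total in zip(categories, sums):
    print(name, total)
```

The categories array is kept around so that the integer codes can be turned
back into text for reporting.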

Any other recommended approaches to working with tabular data in
Python?

Cheers,

Tim