Fastest way to store ints and floats on disk

Laszlo Nagy gandalf at shopzeus.com
Fri Aug 8 05:25:14 EDT 2008


>
> Hmm... I wrote a browser-based analysis tool and used the working 
> name pyvot...
Is this for the public domain?
>
> I found Numeric to provide the best balance of memory footprint and 
> speed. I also segregated data prep into a separate process to avoid 
> excessive memory use at run time. Turns out python
Do you mean numpy? (Numeric is not actively supported, numarray is 
deprecated.)
>
> For the site I'm at, I've got 10 years sales history recapped from 
> 4327846 detail records into 458197 item by customer by month records 
> and top shows a 240 MB memory footprint. I've got 21 cross-indexed 
> selection fields, and can display up to six data types (qty, price, 
> sqft, cost, gp%, avg). At another site I've got approx 8.5M records 
> recapped into 1M records with 15 indexes and 5 years monthly history 
> living in a 540 MB memory footprint.
>
> It's reasonably quick: a query like 'select san mateo, foster city and 
> san carlos accounts, sort by customer and product category and display 
> this year's sales by month' selects 260 records and renders in the 
> browser in about 2 seconds. Or on the larger installation 'Show sales 
> for the past five years for product group 12 sorted by city within 
> route' selects 160 records and renders in about 3 seconds.
Incredible. :-)
> My objective was to keep the info in memory for fast response times.
Premature optimization is the root of all evil. (Who said that?) All 
right, first I'll try to keep it in memory and come back if that doesn't 
work out.
> I played a lot of games getting this all to work well, including some 
> c extensions, but Numeric's take, sum, tostring and fromstring ended 
> up with 'pivotal' roles. :)
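For the archive, here is a rough sketch of the idea as I understand it: 
dump homogeneous ints and floats as raw bytes, read them back in one 
call, then select and sum in memory. It uses only the standard library 
"array" module as a stand-in, not Numeric itself; numpy's tobytes, 
frombuffer, take and sum play the same roles. The data, the file names 
and the save/load/take helpers are all made up for illustration. Raw 
bytes written this way are machine-specific (byte order, item size).

```python
import os
import tempfile
from array import array

# Hypothetical data: quantities (8-byte ints) and prices (8-byte floats).
qty = array("q", [12, 0, 7, 30, 5])
price = array("d", [9.99, 0.0, 14.5, 3.25, 20.0])

def save(path, arr):
    """Write the raw machine-format bytes of arr to path."""
    with open(path, "wb") as f:
        arr.tofile(f)

def load(path, typecode):
    """Read a whole file back into an array of the given typecode."""
    arr = array(typecode)
    with open(path, "rb") as f:
        arr.frombytes(f.read())
    return arr

def take(arr, indexes):
    """Select items by position, in the spirit of Numeric's take()."""
    return array(arr.typecode, (arr[i] for i in indexes))

tmpdir = tempfile.mkdtemp()
qty_path = os.path.join(tmpdir, "qty.bin")
price_path = os.path.join(tmpdir, "price.bin")
save(qty_path, qty)
save(price_path, price)

qty2 = load(qty_path, "q")
price2 = load(price_path, "d")

# Positions that an index lookup might have produced (assumed here).
selected = [0, 2, 3]
total_qty = sum(take(qty2, selected))
```

The typecode is not stored in the file, so the reader has to know it 
in advance; one file per column keeps that simple.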
Thank you,

Laszlo



