Replacement for the shelve module?

Thomas Jollans t at jollybox.de
Fri Aug 19 11:54:29 EDT 2011


On 19/08/11 17:31, Forafo San wrote:
> Folks,
> What might be a good replacement for the shelve module, but one that
> can handle a few gigs of data. I'm doing some calculations on daily
> stock prices and the result is a nested list like:
> 
> [[date_1, floating result 1],
>  [date_2, floating result 2],
> ...
>  [date_n, floating result n]]
> 
> However, there are about 5,000 lists like that, one for each stock
> symbol. Using the shelve module I could easily save them to a file
> (myshelvefile['symbol_1'] = symbol_1_list) and likewise retrieve the
> data. But shelve is deprecated AND when a lot of data is written
> shelve was acting weird (refusing to write, filesizes reported with an
> "ls" did not make sense, etc.).
> 
> Thanks in advance for your suggestions.

Firstly, since when is shelve deprecated? Shouldn't there be a
deprecation warning on http://docs.python.org/dev/library/shelve.html ?

If you want to keep your current approach of having an object containing
all the data for each symbol, you will have to think about how to
serialise the data, as well as how to store the documents/objects
individually. For the serialisation, you can use pickle (as shelve does)
or JSON (probably better because it's easier to edit directly, and
therefore easier to debug).
To store these documents, you could use a huge pickled Python
dictionary (bad idea), a UNIX database (the dbm module, anydbm in
Python 2; this is what shelve uses), or simply the file system: one
file per serialised object.
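
As a rough sketch of the "one file per serialised object" option with
JSON: the function names save_symbol and load_symbol and the
<symbol>.json file naming are made up for illustration, and I'm
assuming the dates are already strings (e.g. ISO "YYYY-MM-DD"), since
JSON cannot store date objects directly.

```python
import json
import os


def save_symbol(directory, symbol, rows):
    """Write one symbol's [date, value] rows to <directory>/<symbol>.json.

    Assumes dates are plain strings; JSON has no native date type.
    """
    path = os.path.join(directory, symbol + ".json")
    with open(path, "w") as f:
        json.dump(rows, f)


def load_symbol(directory, symbol):
    """Read back the rows previously written by save_symbol()."""
    path = os.path.join(directory, symbol + ".json")
    with open(path) as f:
        return json.load(f)
```

Each file stays small and can be opened in a text editor, which is the
debugging advantage over pickle mentioned above.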

Looking at your use case, however, I think what you really should use is
an SQL database. SQLite support ships with Python (the sqlite3 module in
the standard library) and will do the job nicely. Just use a single
table with three columns: symbol, date, value.
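
Something along these lines might do, using only the standard sqlite3
module. The table name "results", the helper function names, and the
choice to store dates as ISO strings are my assumptions, not anything
your data requires; adjust to taste.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        " symbol TEXT NOT NULL,"
        " date TEXT NOT NULL,"   # assumed ISO 'YYYY-MM-DD' so it sorts correctly
        " value REAL NOT NULL,"
        " PRIMARY KEY (symbol, date))"
    )
    return conn


def save_results(conn, symbol, rows):
    """Store a list of [date, value] pairs for one symbol."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO results (symbol, date, value)"
            " VALUES (?, ?, ?)",
            [(symbol, d, v) for d, v in rows],
        )


def load_results(conn, symbol):
    """Return the [date, value] pairs for one symbol, ordered by date."""
    cur = conn.execute(
        "SELECT date, value FROM results WHERE symbol = ? ORDER BY date",
        (symbol,),
    )
    return [[d, v] for d, v in cur]
```

Pass a file name to open_db() instead of the default ":memory:" to keep
the data on disk. The primary key on (symbol, date) also gives you an
index, so pulling out one symbol's rows should stay fast with a few gigs
of data.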

Thomas


