[SciPy-user] Fast saving/loading of huge matrices

Ryan Krauss ryanlists at gmail.com
Thu Apr 19 15:19:46 EDT 2007


I just changed from simply reading a text file with io.read_array to
cPickle and got a factor of 4 or 5 speedup for my medium-sized array.
But the cPickle file is quite large (about twice the size of the
ASCII file - I don't think the ASCII file has very many digits).
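
For concreteness, this is roughly what I'm doing now (the filename
and array are just placeholders for my real data):

    import cPickle
    import numpy as np

    A = np.random.rand(2000, 2000)   # stand-in for my real matrix

    # save with the binary pickle protocol (protocol 2)
    f = open('matrix.pkl', 'wb')
    cPickle.dump(A, f, 2)
    f.close()

    # load it back
    f = open('matrix.pkl', 'rb')
    B = cPickle.load(f)
    f.close()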

I thought there used to be some built-in functions, called something
like shelve, that stored dictionaries fairly quickly and compactly.
Are those functions still around and am I just remembering the name
wrong?  Or have they been done away with?  I vaguely remember that
they stored data in 3 separate files - a Python file that could later
be imported, a .dat file (I think), and something else.
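
In case it helps to be concrete, here's a minimal sketch of the
standard-library shelve module as I remember it (the filename and key
are made up; the on-disk file layout depends on the dbm backend):

    import shelve
    import numpy as np

    # save - values are pickled under string keys
    d = shelve.open('arrays', protocol=2)
    d['A'] = np.random.rand(500, 500)   # stand-in array
    d.close()

    # load
    d = shelve.open('arrays')
    A = d['A']
    d.close()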

The cPickle approach seems fast; I just wish there were some way to
make the files smaller.  Is there a good way to do this that doesn't
slow down the read time too much?
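
One idea I may try is pickling through gzip, which should shrink the
file at the cost of some CPU time on read and write (just a sketch,
not timed - the compression ratio will depend on the data):

    import gzip
    import cPickle
    import numpy as np

    A = np.random.rand(2000, 2000)   # stand-in for my real matrix

    # save: cPickle can write straight to a GzipFile
    f = gzip.open('matrix.pkl.gz', 'wb')
    cPickle.dump(A, f, 2)
    f.close()

    # load
    f = gzip.open('matrix.pkl.gz', 'rb')
    B = cPickle.load(f)
    f.close()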

Thanks,

Ryan

On 4/19/07, Gael Varoquaux <gael.varoquaux at normalesup.org> wrote:
> On Thu, Apr 19, 2007 at 09:23:08AM -0500, Robert Kern wrote:
> > I think we've found that a simple pickle using protocol 2 works the
> > fastest. At the time (a year or so ago) this was faster than PyTables
> > for loading the entire array of about 1GB size. PyTables might be
> > better now, possibly because of the new numpy support.
>
> Thank you Robert. This is useful to know.
>
> Gaël