[Numpy-discussion] cannot pickle large numpy objects when memory resources are already stressed
Francesc Altet
faltet at carabos.com
Wed Mar 14 13:17:59 EDT 2007
On Wed, 14 Mar 2007 at 09:46 -0700, Travis Oliphant wrote:
> Glen W. Mabey wrote:
>
> >Hello,
> >
> >After running a simulation that took 6 days to complete, my script
> >proceeded to attempt to write the results out to a file, pickled.
> >
> >The operation failed even though there was 1G of RAM free (4G machine).
> >I've since reconsidered using the pickle format for storing data sets
> >that include large numpy arrays. However, somehow I assumed that one
> >would be able to pickle anything that you already had in memory, but I
> >see now that this was a rash assumption.
> >
> >Ought there to be a way to do this, or should I forget about being able
> >to bundle large numpy arrays and other objects in a single pickle?
If you can afford to use another package for I/O, perhaps PyTables
can save your day. It is optimized for saving and retrieving very large
amounts of data with ease. In particular, it can save your in-memory
arrays without making another copy in memory (provided the array
is contiguous). It also compresses the data transparently, without
needing additional memory.
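To make the suggestion concrete, here is a minimal sketch of writing a large
array to an HDF5 file with transparent compression. It assumes the current
PyTables 3.x names (open_file, create_carray, Filters), which differ from the
API available in 2007. The helper names, the node name "results", the file
name and the compression level are made up for illustration.

```python
import os
import tempfile

import numpy as np
import tables  # assumes PyTables 3.x naming (open_file, create_carray)


def save_array(path, arr, complevel=5):
    """Write `arr` to an HDF5 file as a chunked, zlib-compressed array."""
    filters = tables.Filters(complevel=complevel, complib="zlib")
    with tables.open_file(path, mode="w") as h5:
        # obj= lets PyTables take the shape and dtype from the array itself.
        # "results" is a hypothetical node name chosen for this sketch.
        h5.create_carray(h5.root, "results", obj=arr, filters=filters)


def load_array(path):
    """Read the whole array back into memory."""
    with tables.open_file(path, mode="r") as h5:
        return h5.root.results.read()


# Hypothetical data standing in for the simulation results.
path = os.path.join(tempfile.mkdtemp(), "results.h5")
data = np.arange(1_000_000, dtype=np.float64).reshape(1000, 1000)
save_array(path, data)
restored = load_array(path)
```

Other objects that accompany the arrays would need to be stored separately
(for instance as additional nodes or attributes); this sketch covers only the
array itself.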
Furthermore, an optimization introduced in the 2.0 branch a week
ago also allows *updating* an array on disk without making copies.
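As an illustration of updating an array that already lives on disk, the
sketch below reopens a file in append mode and overwrites one row through
slice assignment. It again assumes the PyTables 3.x names; the file and node
names are hypothetical, and the sketch shows only the usage, not how many
copies PyTables makes internally.

```python
import os
import tempfile

import numpy as np
import tables  # assumes PyTables 3.x naming (open_file, create_array)

# Hypothetical file and node names for this sketch.
path = os.path.join(tempfile.mkdtemp(), "update_demo.h5")

# Create a small array on disk first.
with tables.open_file(path, mode="w") as h5:
    h5.create_array(h5.root, "results", obj=np.zeros((4, 5)))

# Reopen in append mode and overwrite a single row in place.
with tables.open_file(path, mode="a") as h5:
    node = h5.root.results
    node[1, :] = np.arange(5.0)  # only this slice is rewritten

# Read everything back to check the update.
with tables.open_file(path, mode="r") as h5:
    updated = h5.root.results.read()
```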
HTH,
--
Francesc Altet | Be careful about using the following code --
Carabos Coop. V. | I've only proven that it works,
www.carabos.com | I haven't tested it. -- Donald Knuth