[Numpy-discussion] saving groups of numpy arrays to disk

Anthony Scopatz scopatz at gmail.com
Wed Aug 24 12:22:10 EDT 2011


On Sun, Aug 21, 2011 at 7:24 AM, Pauli Virtanen <pav at iki.fi> wrote:

> On Sat, 20 Aug 2011 16:18:55 -0700, Chris Withers wrote:
> > I've got a tree of nested dicts that at their leaves end in numpy arrays
> > of identical sizes.
> >
> > What's the easiest way to persist these to disk so that I can pick up
> > with them where I left off?
>
> Depends on your requirements.
>
> If you do *not* have a requirement for:
>
> - real persistence, i.e., being able to easily read the data years later
> - a standard data format
> - access from non-Python programs
> - safety against malicious parties (unpickling can execute some code
>  in the input -- although this is possible to control)
>
> then you can use Python pickling:
>
>        import pickle
>
>        # Write the tree; pickle.dump takes the object first, then the file.
>        with open('out.pck', 'wb') as f:
>            pickle.dump(tree, f, protocol=pickle.HIGHEST_PROTOCOL)
>
>        # Read it back.
>        with open('out.pck', 'rb') as f:
>            tree = pickle.load(f)
>
> This should just work (TM) directly with your tree-of-dicts-and-arrays.
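
On the safety point in the list above: one way to control what unpickling
may execute is to subclass pickle.Unpickler and override find_class so that
only an explicit allow-list of globals can be resolved. A minimal sketch
follows. The names ALLOWED and restricted_loads are made up for
illustration, and the allow-list here covers only a few builtins. Loading
NumPy arrays this way would also need NumPy's own reconstruction helpers
added to the allow-list, which I have not listed here.

```python
import io
import pickle

# Hypothetical allow-list of (module, name) pairs that may be resolved
# while unpickling. Anything else is rejected.
ALLOWED = {
    ("builtins", "set"),
    ("builtins", "frozenset"),
    ("builtins", "complex"),
}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global object.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            "global %s.%s is forbidden" % (module, name)
        )


def restricted_loads(data):
    """Like pickle.loads, but only allow-listed globals can be loaded."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain dicts, lists, strings and numbers need no globals at all, so they
load unchanged, while a stream that references something like
builtins.eval raises pickle.UnpicklingError.
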
>
> > What's the most "correct" way to do so?
> >
> > I'm using IPython if that makes things easier...
> >
> > I had wondered about PyTables, but that seems a bit too heavyweight for
> > this, unless I'm missing something?
>
> If I had one or more of the requirements listed above, I'd use the HDF5
> format, via either PyTables or h5py. If I just needed to cache the trees,
> then I'd use pickling.
>
> I think the only reason to worry about how heavyweight these libraries
> are is distribution: does your target audience already have them
> installed (they are pre-installed in several Python-for-science
> distributions), and how difficult would it be for you to ship them with
> your stuff, or to require the users to install them?
>

+1 to PyTables or h5py.
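
For a tree of nested dicts with arrays at the leaves, the mapping onto HDF5
is fairly direct: each dict becomes a group and each array becomes a
dataset. A rough sketch with h5py, assuming all keys are strings and all
leaves are array-like. The helper names save_tree and load_tree, the sample
tree and the file name are made up for illustration.

```python
import os
import tempfile

import h5py
import numpy as np


def save_tree(group, tree):
    """Write nested dicts as HDF5 groups and array leaves as datasets."""
    for key, value in tree.items():
        if isinstance(value, dict):
            save_tree(group.create_group(key), value)
        else:
            group.create_dataset(key, data=np.asarray(value))


def load_tree(group):
    """Read groups back into nested dicts and datasets into arrays."""
    out = {}
    for key, item in group.items():
        if isinstance(item, h5py.Group):
            out[key] = load_tree(item)
        else:
            out[key] = item[()]  # read the whole dataset into memory
    return out


# Sample data: arrays of identical size at the leaves of nested dicts.
tree = {
    "run1": {"x": np.arange(5), "y": np.ones(5)},
    "run2": {"x": np.zeros(5)},
}

path = os.path.join(tempfile.mkdtemp(), "tree.h5")

with h5py.File(path, "w") as f:
    save_tree(f, tree)

with h5py.File(path, "r") as f:
    restored = load_tree(f)
```

The resulting file can also be opened from non-Python tools that read HDF5,
which is the main thing pickling does not give you.
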


>
> --
> Pauli Virtanen
>