[IPython-dev] storing variables *in* the notebook

Steve Holden steve at holdenweb.com
Thu Jan 26 16:43:01 EST 2017


Just a quick note about one comment re: distribution of data and notebooks
as separate files. At one point, when asked why this would not be
convenient, Zoltan said:

> 1. In many cases, I have multiple notebooks in a single directory, simply
> because most of the time, I really don't need a separate folder for just a
> single file. So, I have all notebooks that belong to a particular subject in
> a single folder, and I don't necessarily want to share all of them with
> others.


Mindless tasks like "pick these files out of this directory and put them in
a tar/zip file" are easy to automate, and doing so would be far less
troublesome than modifying a complex architecture to accommodate something
not considered in its (in fact, rather careful) design.

Two ways come to mind: the first involves mostly shell interactions (I
include the Windows shells available, though at present I am ignorant of
them); the second would be a good exercise for a first-year undergraduate,
and therefore the kind of thing a working scientist (i.e. one who is there
to use programs to advance their research) might like to treat as
competence practice (if they have time; that isn't mandatory ;-).

In the first case, for each separate distribution you want to make you can
create a directory parallel to the one holding the notebooks and data, and
in it create symbolic links to point to the required files. These
directories can then be bundled with the standard *tar* utility, commanded
to copy the real files after following the links.
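
For anyone who would rather not touch the shell at all, the same link-and-
bundle idea can be sketched in Python. This is only an illustration under
some assumptions: the function names (build_link_dir, bundle) are made up
for this note, the links are named after the files they point to, and the
platform must allow you to create symbolic links. The dereference=True
argument plays the part of tar's -h option, so the archive holds the real
files rather than the links.

```python
import os
import tarfile
from pathlib import Path


def build_link_dir(link_dir, files):
    """Create link_dir and fill it with symbolic links to the given files."""
    link_dir = Path(link_dir)
    link_dir.mkdir(parents=True, exist_ok=True)
    for f in files:
        target = Path(f).resolve()
        link = link_dir / target.name  # assumption: link keeps the file's name
        if not link.is_symlink() and not link.exists():
            os.symlink(target, link)
    return link_dir


def bundle(link_dir, archive_path):
    """Archive link_dir, storing the linked-to files instead of the links."""
    link_dir = Path(link_dir)
    # dereference=True makes tarfile follow symbolic links, like `tar -h`.
    with tarfile.open(archive_path, "w:gz", dereference=True) as tar:
        tar.add(link_dir, arcname=link_dir.name)
    return archive_path
```

One link directory per distribution keeps the choice of files visible in
the filesystem: listing the directory shows exactly what will be shipped.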

In the second case, each distribution would be represented as a data file
containing the paths to the files required, and they would be processed by
a Python program that essentially duplicates the same process as above.
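
A minimal sketch of such a program follows. The manifest format is my own
assumption for illustration (one relative path per line, blank lines and
lines starting with # ignored), and the names read_manifest and
build_distribution are hypothetical. It copies the listed files straight
into a gzipped tar archive, so no link directory is needed.

```python
import tarfile
from pathlib import Path


def read_manifest(manifest_path):
    """Return the paths listed in a manifest, skipping blanks and # comments."""
    paths = []
    for line in Path(manifest_path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            paths.append(line)
    return paths


def build_distribution(manifest_path, archive_path, base_dir="."):
    """Write the files named in the manifest into a gzipped tar archive."""
    base = Path(base_dir)
    paths = read_manifest(manifest_path)
    # Fail early, before writing anything, if a listed file is missing.
    missing = [p for p in paths if not (base / p).is_file()]
    if missing:
        raise FileNotFoundError(f"listed in manifest but not found: {missing}")
    with tarfile.open(archive_path, "w:gz") as tar:
        for p in paths:
            # arcname keeps the relative path from the manifest in the archive.
            tar.add(base / p, arcname=p)
    return paths
```

A manifest for one distribution might then be nothing more than a comment
line saying who it is for, followed by lines such as analysis.ipynb and
results.csv; each distribution gets its own manifest file.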

I would personally prefer the latter process because, being data driven,
its configuration data can be put under change control, which (with proper
configuration metadata included) enhances repeatability and allows you to
reproduce any distribution on demand.

As a jobbing computational scientist who has spent long years discovering
wrong ways to do things, I will just point out that trying to push a design
beyond its intended limits is likely to impede the development towards the
main goal (though many improvements are also the result of user suggestions
and requests). Often there are much lower-complexity solutions available
that will satisfy your specific needs without imposing their cost on others.

This note is offered in a spirit of scientific sharing. I know that people
often struggle to use computers, because the activity of doing so is
peripheral to research. I learned to do things this way through long years
of experience. Ignorance is not a crime, and fortunately (unlike stupidity)
it can be cured in rational people by the application of information.

Anyway, back to work ...

regards
 Steve

Steve Holden

On Thu, Jan 26, 2017 at 9:06 PM, Klymak Jody <jklymak at gmail.com> wrote:

>
> I think this goes far beyond what I had in mind. I think this function or
> whatever would just be
>
> In [221]: x = long_calculation()  # x is 42
>                %store_in_notebook x
>
> and in the new session
>
> In [1]: %store_in_notebook -restore_variables
> In [2]: x
> Out [2]: 42
>
>
> For my taste, I’d just save that result in a file (`pickle` or `shelve`, or
> netcdf if I wanted to be formal).  It's a lot more transparent what is going
> on.
>
> Imagine this case:  I `%store_in_notebook` the results of a long
> calculation, and then remove that code from the notebook for some reason.
> I might very well wonder a year from now why my notebook is 50 Gb, and have
> no documentation of how it got that way.
>
> However, if you do have a whole slew of variables you suddenly want to
> save, did you try `dill`?
>
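> import dill
> import numpy as np
>
> filename = 'globalsave.pkl'
>
> # Flip to False in a later session to restore instead of save.
> if True:
>     x = np.arange(20)
>     dill.dump_session(filename)  # pickle the whole interpreter session
> else:
>     dill.load_session(filename)  # bring back x and everything else saved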
>
> Cheers,  Jody