[IPython-dev] History

Fri Feb 18 12:01:59 EST 2011

I too am a bit hesitant to go the SQLite route on this without some
careful thinking.

Cheers,

Brian

On Fri, Feb 18, 2011 at 7:48 AM, Fernando Perez <fperez.net at gmail.com> wrote:
> Hi all,
>
> sorry that this will be brief and not very thought-through, but I can
> only sneak in short periods while at the conference...
>
> Many thanks to Thomas for getting this work going!  But I think, since
> we now have a bit more manpower and good momentum going, it's worth
> thinking a little about the key points we want to hit so we end up
> with something really solid.  Comments below...
>
> On Wed, Feb 16, 2011 at 5:10 PM, Thomas Kluyver <takowl at gmail.com> wrote:
>
>> - Each command is stored instantly, so we do away with the need for an
>> autosave timer thread. A crash at any stage should leave your entire history
>> intact up to the last command completed.
>
> Instant saving has one problem: frequent disk usage prevents hard
> drives from spinning down when on battery.  The idea of an auto-save
> thread on a timer with a user-controllable delay has the advantage
> that the user can control their power consumption profile to fit their
> needs.
>
> On an international flight when you're trying to squeeze every last
> bit out of your battery, this matters a lot.  We don't want to turn
> ipython into the thing that drains your battery, not because of the
> very simple interactive commands you run, which in principle are
> purely CPU/memory resident, but because we generate lots of sideband
> disk activity.
>
> So we should keep this consideration in mind in the design.  If we
> don't think about it now, it will be much harder to retrofit a decent
> power profile later on.
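
A minimal sketch of the timer-style saving discussed above, assuming
history is kept as a list of dicts and written out as JSON.  The class
name `BufferedHistory`, its `delay` parameter and the injectable `clock`
are hypothetical, not existing IPython code.  Entries stay in memory and
the file is only written at the first input after `delay` seconds have
passed, so a longer delay means fewer disk wake-ups:

```python
import json
import time


class BufferedHistory:
    """Keep commands in memory; write to disk at most once per `delay` seconds."""

    def __init__(self, path, delay=60.0, clock=time.monotonic):
        self.path = path
        self.delay = delay      # user-controllable: larger means fewer disk writes
        self._clock = clock     # injectable so the behaviour can be tested
        self._entries = []
        self._dirty = False
        self._last_flush = clock()

    def append(self, raw, translated):
        """Record one command; flush only if the delay has elapsed."""
        self._entries.append({"raw": raw, "translated": translated})
        self._dirty = True
        if self._clock() - self._last_flush >= self.delay:
            self.flush()

    def flush(self):
        """Write all entries to disk. Returns False if nothing was unsaved."""
        if not self._dirty:
            return False
        with open(self.path, "w") as f:
            json.dump(self._entries, f)
        self._dirty = False
        self._last_flush = self._clock()
        return True
```

The trade-off is the one described in the thread: a crash loses whatever
was entered since the last flush, so an explicit `flush()` at exit would
still be needed.
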
>
>> - We store only raw history on disk (I think raw history is what we're
>> looking for 90% of the time). If we want translated history, we redo the
>> translation on the fly (this should require minimal computation, unless I've
>> missed something).
>> I've been having a think about how we store history:
>>
>> At present: Commands entered are stored in two lists (raw and "translated" -
>> i.e. turning magic commands into function calls). These are persisted to
>> disk at the first input after each 60-second interval elapses, in a JSON
>> file, which is reloaded into the same lists (and into readline history) when
>> starting IPython. Each command entered is also immediately persisted in the
>> "shadow history", a collection of files in .ipython managed by the
>> pickleshare DB. The output objects are also stored in a dictionary by prompt
>> number, but I'm less concerned with that here.
>>
>> Uses:
>> - Readline history (getting previous commands via up arrow)
>> - Various magic commands (save, macro, hist) can access ranges of input,
>> using the prompt numbers from the current session.
>> - %hist -g allows searching shadow and current history with glob syntax.
>> - %rep can access ranges of this session's history, or single lines from
>> shadow history.
>> (These are all I've found so far - please let me know if there are others)
>
> No, we must store the translated history for two reasons:
>
> 1. Some translations are dynamic and context-dependent, so they
> cannot be recomputed later (though these are the minority).
>
> 2. More importantly, the translation process is relatively CPU
> intensive, while disk space is the absolutely cheapest resource in
> existence (at least at the data storage volumes we're talking about
> here).  So it makes sense to store on disk these results once we have
> computed them, rather than recomputing them all over later on reload.
>
>> - History is indexed by session number and prompt number. This provides a
>> sensible behaviour if we have two IPython shells open together - the second
>> one to be opened will be the latter session (and will be able to access
>> commands entered in the other session as soon as they are completed).
>> - For magic commands, accessing a line from a previous session could look
>> like "-1#9" (9th line of immediately previous session).
>> - On starting IPython, we load the last (~40 lines/~2 sessions) from the
>> database into readline history.
>>
>> Thoughts? Have I overlooked some key reason we use the existing system? Is
>> there a better alternative to SQLite? Would you design it differently? I've
>> not written any code for this yet, so I'm open to ideas. But if people think
>> that makes sense, I'm volunteering to make it happen.
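
A minimal sketch of the session/prompt indexing proposed above, using
the standard-library `sqlite3` module.  No code existed for the proposal
at this point, so the table layout and the function names
(`open_history`, `start_session`, `store`, `lookup`) are assumptions
made for illustration only:

```python
import sqlite3


def open_history(path=":memory:"):
    """Open (or create) a history database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS sessions (
            id INTEGER PRIMARY KEY AUTOINCREMENT
        );
        CREATE TABLE IF NOT EXISTS history (
            session INTEGER NOT NULL,
            line    INTEGER NOT NULL,
            source  TEXT    NOT NULL,
            PRIMARY KEY (session, line)
        );
    """)
    return conn


def start_session(conn):
    """Allocate the next session number; a later shell gets a later number."""
    cur = conn.execute("INSERT INTO sessions DEFAULT VALUES")
    conn.commit()
    return cur.lastrowid


def store(conn, session, line, source):
    """Store one command immediately, as in the 'stored instantly' proposal."""
    conn.execute("INSERT INTO history VALUES (?, ?, ?)", (session, line, source))
    conn.commit()


def lookup(conn, current_session, ref):
    """Resolve a reference such as "-1#9": line 9 of the previous session."""
    offset, line = ref.split("#")
    row = conn.execute(
        "SELECT source FROM history WHERE session = ? AND line = ?",
        (current_session + int(offset), int(line)),
    ).fetchone()
    return row[0] if row else None
```

Here "previous session" simply means the session number one lower than
the current one, which is one possible reading of the "-1#9" syntax.
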
>
> Finally, but importantly, I'm somewhat reluctant to go to SQLite until
> we've fully shown that accomplishing our design goals is
> hard/impossible with a simple JSON-persisted data structure.  While
> SQLite is indeed lightweight and available to us, it's also a more
> complex API to program against than simply storing a list/dict on
> disk.  I'm OK accepting that complexity price *if we need it*, but I'm
> not convinced we do yet.  Specifically, we need to answer: what
> precisely is the functionality we want that is hard/impossible to
> implement on JSON and easy/possible with SQLite?
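
To make that question concrete, here is a rough sketch of how two of
the uses listed earlier (lookup by session and prompt number, and glob
search) might look over a plain JSON-style list.  The entry layout
(`session`, `line`, `raw` keys) and the function names are assumptions
for illustration, not an existing format:

```python
import fnmatch
import json


def load_history(path):
    """Load the whole history list from a JSON file."""
    with open(path) as f:
        return json.load(f)


def get_line(entries, session, line):
    """Return the raw source for (session, line), or None if absent."""
    for entry in entries:
        if entry["session"] == session and entry["line"] == line:
            return entry["raw"]
    return None


def glob_search(entries, pattern):
    """Return entries whose raw source matches a glob pattern."""
    return [e for e in entries if fnmatch.fnmatchcase(e["raw"], pattern)]
```

Both operations are a linear scan over a list that has been read fully
into memory, which is simple to write; what this sketch does not address
is two shells writing to the same file at the same time.
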
>
> Cheers,
>
> f

-- 
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu
ellisonbg at gmail.com