[IPython-dev] Implementing inline images in a kernel

Fernando Perez fperez.net at gmail.com
Wed Jan 29 18:03:25 EST 2014


Hi all,

A bit of additional context, not in response to any particular message but
just providing some extra info on where we come from...

I think it's worth looking at where the one-notebook-one-kernel assumption
we're trying to maintain comes from, why that's our stance, and why it does
*not* really limit what others can do in terms of exploring new approaches
to interactive computing with IPython.

By keeping the notion of a *single*, 'official' kernel attached to a
notebook, we have a very clear and manageable model for our entire backend
infrastructure in terms of how the KernelManager has to be written, what
metadata has to go into the notebook files, how a new kernel can be
registered (see here -
https://hackpad.com/IPython-Winter-2014-Development-Meeting-fKrExqKCWmC#:h=Full,-integrated-support-for-n),
etc.
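
As a concrete illustration of how light that model keeps the notebook
format: a notebook file only has to carry one small metadata record naming
its kernel, something along these lines (purely illustrative; pinning down
the exact schema is part of the kernel-registration discussion linked
above):

    "metadata": {
      "kernelspec": {
        "name": "python3",
        "display_name": "Python 3",
        "language": "python"
      }
    }

One record, one process to manage; a kernel-group design would multiply
every one of those touch points.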

If we were to allow a notebook to directly expose potentially multiple
kernels, then *we* have to carry the burden of managing a cohort of
processes, tracking/killing all of them, handling signals to all of them,
having our server manage groups of sockets (a complete set per kernel), and
so on. We would have to record all of that in the notebook files, define
specs for kernel groups, etc. That complexity actually propagates pretty
deeply across all
the pieces of IPython, and if there's one thing that unfortunately can't be
said about IPython, it's that it is in any way, shape or form, a 'simple'
project :)

Remember: we have a small team trying to build something very ambitious,
something that walks a delicate line. This is a research project exploring
new modes of computational interaction and communication (from individuals
playing with their code to teaching, publication and book writing). But
*simultaneously*, it is a *production* tool being used at high schools,
universities, research facilities and companies.  That tension is always
there for us: we want to be able to try and explore novel ideas, yet we
can't simply hack something crazy that turns out to be a maintenance
nightmare in the future, because this isn't just 'research code', it's a
real-world production tool.

But in this context, we've seen how even the *current* design doesn't
preclude offering multiple languages to the end user, and even doing so in
various forms. IPython provides a number of magics (either built-in or
through extensions) that allow you to mix and match languages in several
ways:

- %%bash, %%perl and the like fire a new subprocess for each call, with
zero persistent state between calls, simply capturing stdout/stderr and
printing it back (see the short example after this list).

- %%julia, %%R and similar initialize a persistent sub-interpreter
in-process, with various degrees of in-memory data sharing.

- %%matlab and others initialize their interpreters out-of-process,
communicating with them via pipes, http or whatever other IPC mechanism
each uses.

- %%cython, %%fortran and similar just offer *compilation* of static code
fragments that then get loaded into the current interactive namespace for
reuse.
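
For concreteness, here is roughly what the first and last of these look
like in a terminal IPython session (output will vary by machine, and the
Cython magic currently has to be loaded with %load_ext cythonmagic first):

    In [1]: %%bash
       ...: for i in 1 2 3; do echo "pass $i"; done
    pass 1
    pass 2
    pass 3

    In [2]: %load_ext cythonmagic

    In [3]: %%cython
       ...: def cfib(int n):
       ...:     cdef int a = 0, b = 1
       ...:     for _ in range(n):
       ...:         a, b = b, a + b
       ...:     return a

    In [4]: cfib(10)   # compiled code, now loaded into the Python namespace
    Out[4]: 55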

Because we chose the approach of explicit %%cell magics, each language is
tagged at the top of its cell, and it's easy to tell what you're doing just
from looking at the cell. We're happy with that model, as it provides easy
access to lots of languages in a single session when appropriate. For
example, I often sprinkle in shell calls, and I know people whose toolchains
involve many command-line tools they can't abandon who use this extensively.

Now, Calico offers a different model, where the 'main' kernel is a much
thinner layer that basically just multiplexes, but where all its 'real'
kernels coexist in-process. Given it's JVM-based, I actually imagine that
with some careful coding all subkernels could work asynchronously just
fine, as long as they all correctly manage access to the necessary
messaging queues. The JVM has great threading support, and if each
subkernel runs in its own thread and doesn't block the 'main' kernel, the
whole thing could work even with full asynchronous feature support.
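
To make the multiplexing idea concrete, here is a deliberately tiny,
hypothetical sketch of the pattern (in Python rather than anything
JVM-based, and nothing like Calico's actual code): a thin 'front' kernel
that only routes each cell to a per-language subkernel running on its own
thread, so a slow cell in one language need not block the others:

    import queue
    import threading

    class SubKernel(threading.Thread):
        """One worker thread per language; serves (source, reply) requests."""
        def __init__(self, evaluate):
            super().__init__(daemon=True)
            self.evaluate = evaluate        # callable: source -> result
            self.requests = queue.Queue()

        def run(self):
            while True:
                source, reply = self.requests.get()
                reply.put(self.evaluate(source))

    class MultiplexingKernel:
        """Thin front kernel: routes cells to subkernels by language tag."""
        def __init__(self):
            self.subkernels = {}

        def register(self, language, evaluate):
            kernel = SubKernel(evaluate)
            kernel.start()
            self.subkernels[language] = kernel

        def execute(self, language, source):
            reply = queue.Queue()
            self.subkernels[language].requests.put((source, reply))
            return reply.get()              # blocks only this caller

    mux = MultiplexingKernel()
    mux.register("python", lambda src: eval(src))   # toy 'subkernels'
    mux.register("shout", lambda src: src.upper())
    print(mux.execute("python", "2 + 3"))           # -> 5
    print(mux.execute("shout", "hello"))            # -> HELLO

The real work, of course, is in everything the sketch waves away: giving
each subkernel a correct, non-blocking view of the messaging queues.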

I think this shows that our current approach balances what we want
reasonably well:

1. the model we expose to end users is simple (one kernel), and the
complexity stays contained on our side as a project.

2. the IPython kernel, via %%magics, provides a pragmatic (if limited: no
async operations in sublanguages, etc.) solution for many common cases of
language mix-and-match that have real-world value.

3. other projects can still explore more ambitious and complex models of
execution, as Calico is rapidly demonstrating.


Honestly, I would say the above is *in general* our philosophy with IPython
(not just regarding kernels). I would rephrase the above three points as our
project's guiding principles:

1. Build a coherent, as-simple-as-possible set of common abstractions we
can manage robustly and sustainably for the long haul.

2. Provide default, out-of-the-box tools on top of these abstractions that
solve many real-world problems, shipped officially as part of IPython.

3. Have these foundational abstractions be rich enough to enable
third-party projects to explore novel, more ambitious directions on top of
them.  Some of those explorations may even come from our own team, but they
will happen outside the core codebase.


I hope this helps everyone understand better where we're coming from...

I should end by saying that I, at least, am *thrilled* to see this kind of
exploration, and we're always happy to see ideas like these. We're also
receptive to them potentially influencing the future of the core, but only
once it becomes clear that an exploratory direction proves to be generic,
widely applicable and easy to manage and maintain.

Best,

f