[IPython-dev] Uniform way of integrating event loops among different IDE's

Fernando Perez fperez.net at gmail.com
Mon Aug 30 03:14:39 EDT 2010


Hi Almar,

On Fri, Aug 27, 2010 at 4:10 PM, Almar Klein <almar.klein at gmail.com> wrote:
> I've read some of the documentation for the new stuff you're working on. It
> all sounds really well thought through, and I'm looking forward to the
> results. I've got a couple of questions, though.

Thanks!  I hope it's well thought out; it's definitely *very* thought
out, but unfortunately those two things aren't always the same :)
Critical feedback very, very welcome.

> - I see the possibilities of distributed computing by connecting multiple
> kernels to a single client. However, I don't get why you would want to
> connect multiple clients to a single kernel at the same time?

Collaboration: you're working on a problem and would like to discuss
it with a colleague.  She opens a frontend pointed to your same kernel
and voilà, you're sharing a set of data and code you can both work on,
type code into, make plots from, etc.  Think of it like desktop
sharing but for code.

Ad-hoc monitoring of a computation: you have a kernel you left in the
office running a long computation. From the bar, you log in with your
Android frontend, view the information it's printing, and log out
knowing that everything is OK.  Or you stop it when you realize
something went crazy.

Ad-hoc continuation of work: you go home for the day and leave a
session open at work.  All of a sudden you have an idea and would like
to test it, but it depends on a bunch of long computations you've
already run at work and variables that are sitting in that session.
No problem, just connect to it, try something out and disconnect again
when satisfied.

Monitoring: you can set up a 'read-only' client that monitors a kernel
and publishes its output somewhere (logs, http, sms, whatever).

There's plenty more, I'm sure.  These are just a few that quickly come to mind.

> - I saw an example in which you're kind of going towards a Mathematica/Sage
> type of UI. Is this what you're really aiming at, or is this one possible
> front end? I'm asking because IEP has more of a Matlab kind of UI, with an
> editor from which the user can run code (selected lines or cells: code
> between two lines starting with '##'). Would that be compatible with the
> kernel you're designing?

Absolutely!  We want *both* types of interface.  Evan's frontend is
more of a terminal widget that could be embedded in an IDE, while
Gerardo's has more of the feel of a Qt-based notebook.  And obviously as
soon as an HTTP layer is written, something like the Sage notebook
becomes the next step.  Several of us are long-time Mathematica users
and use Sage regularly, so those interfaces have obviously shaped our
views a lot.  But what we're trying to build is a *protocol* and
infrastructure to make multiple types of client possible.

> - About the heartbeat thing to detect whether kernels are still alive. I use
> a similar concept in the channels module. I actually never realized that
> this would fail if Python is running extension code. However, I do run
> Cython code that takes about a minute to run without problems. Is that
> because it's Cython and the Python interpreter is still involved? I'll do
> some tests running Cython and C code next week.

The question is whether your messaging layer can continue to function
if you have a long-running computation that's not in Python.  You can
easily see that by just calling a large SVD, eigenvalue decomposition
or FFT from scipy, things that are easy to make big and that are
locked inside some Fortran routine for a long time.  In that scenario,
your program will not return to the Python layer until the Fortran (or
pure C) code finishes.  Whether that's detrimental to your overall app
depends on how the other parts handle one component being unresponsive
for a while.
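
For instance, something like this stays inside compiled LAPACK code for
a long time (the matrix size here is arbitrary; just make it big enough):

    import numpy as np
    from scipy import linalg

    # This call spends its time inside a compiled LAPACK routine; a message
    # loop running in the same thread can't be serviced until it returns.
    a = np.random.rand(3000, 3000)
    u, s, vt = linalg.svd(a)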

In our case obviously the kernel itself remains unresponsive, but the
important part is that the networking doesn't suffer.  So we have
enough information to take action even in the face of an unresponsive
kernel.
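
As a rough sketch of what a client-side check can look like (the address
and wiring here are invented for illustration, assuming a ZeroMQ-style
socket that simply echoes heartbeat messages back):

    import zmq

    ctx = zmq.Context()
    hb = ctx.socket(zmq.REQ)
    hb.connect("tcp://127.0.0.1:5555")  # address/port invented for the example

    hb.send(b"ping")
    if hb.poll(timeout=1000):  # wait up to one second for the echo
        hb.recv()
        print("heartbeat OK: the networking layer is alive")
    else:
        print("no echo: the kernel process itself is probably gone")

The key point is that the echo happens in the messaging layer rather
than in the code-execution thread, so it keeps working while the kernel
is busy.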

> Since I think it's interesting that we've taken rather different
> approaches to doing (more or less) the same thing, I'll share some background
> on what I do in IEP:
>
> I use one Channels instance from the channels.py module, which means all
> communication goes over one socket. However, I can use as many as 128
> different channels each way. Instead of a messaging format, I use a channel
> for each task. By the way, I'm not saying my method is better; yours is
probably more "scalable", while mine requires little or no message processing. So
> from the kernel's perspective, I have one receiving channel for stdin, two
> sending for stdout and stderr, one receiving for control (mostly debugging
> at the moment) and one sending for status messages (whether busy/ready, and
> debug info). Lastly there's one receiving and one sending channel for
> introspection requests and responses.

Interesting... But I imagine each channel requires a socket pair,
right?  In that case you'll definitely have problems if you want
to have hundreds or thousands of kernels, as you'll eventually run out
of ports for connections.  Since that's a key part of IPython, we need
a design that scales well in that direction from the get-go.  But I
see how your approach provides you with important benefits in certain
contexts.
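
Just to illustrate the contrast, here's a toy sketch of the multiplexed
approach: several logical channels share one socket, with each message
tagged by a channel name (the names and port are invented for the example):

    import zmq

    ctx = zmq.Context()
    sock = ctx.socket(zmq.PUB)
    sock.bind("tcp://127.0.0.1:5556")  # port invented for the example

    def publish(channel, payload):
        # One connection, many logical channels: the 'channel' field plays
        # the role a separate socket plays in the channel-per-task design.
        sock.send_json({"channel": channel, "content": payload})

    publish("stdout", "hello\n")
    publish("status", {"state": "busy"})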

> To receive code, sys.stdin is replaced with a receivingChannel from
> channels.py, which is non-blocking. The readline() method (which is what
> raw_input() uses) *is* blocking, so that raw_input() behaves appropriately.
>
> The remote process runs an interpreter loop. Each iteration, the interpreter
> checks stdin (non-blocking) for a command to be run. If there is one, it
> runs it using almost the same code as in code.py. Next, it processes GUI
> events (if required), then produces a prompt if necessary and sends status. In another
> thread, there is a loop that listens for introspection requests
> (auto-completion, calltips, docs).
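
If I'm reading that right, the stdin replacement looks roughly like the
hypothetical sketch below (I'm guessing at the channel object's API):

    import sys

    class ChannelStdin:
        # Hypothetical stand-in for IEP's receiving channel: the interpreter
        # loop polls it without blocking, but readline() blocks, so that
        # raw_input() behaves as users expect.
        def __init__(self, channel):
            self.channel = channel

        def poll_command(self):
            # non-blocking check used by the interpreter loop
            return self.channel.read(block=False)      # assumed channel API

        def readline(self):
            # blocking read; raw_input() ends up here
            return self.channel.read_line(block=True)  # assumed channel API

    # sys.stdin = ChannelStdin(some_channel)  # wiring left to the kernel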

What happens if the user wants to execute code in the remote process that
itself calls raw_input()?  For example, can one call pdb in
post-mortem mode in the remote process?


In any case, thanks a lot for your interest!

Especially now with your license change, it would be wonderful if the
two projects could collaborate more closely.

All the best,

f
