[IPython-dev] Fwd: Kernel-client communication

Thu Sep 2 09:16:57 EDT 2010

<snip>

> Getting a robust and efficient message transport layer written is not
> easy work.  It takes expertise and detailed knowledge, coupled with
> extensive real-world experience, to do it right.  We simply decided to
> piggy back on some of the best that was out there, rather than trying
> to rewrite our own.  The features we gain from zmq (it's not just the
> low-level performance, it's also the simple but powerful semantics of
> their various socket types, which we've baked into the very core of
> our design) are well worth the price of a C dependency in this case.
>

I totally agree with you. And the more I read about zmq, the more I like it.
However, the C dependency is a larger hurdle for me than it is for you. You
see, IEP is an IDE that's independent of any Python installation. One can
just install IEP, and use it for any python version installed on the system.
I'd like to keep it that way, and that means I can only use pure Python in my
kernel code (and the code should work on Python 2 and Python 3).

> > Further, am I right that the heartbeat is not necessary when communicating
> > between processes on the same box using 'localhost' (since some network
> > layers are bypassed)? That would give a short term solution for IEP.
>
> Yes, on local host you can detect the process via other mechanisms.
> The question is whether the system recovers gracefully from dropped
> messages or incomplete connections.  You do need to engineer that into
> the code itself, so that you don't lock up your client when the kernel
> becomes unresponsive, for example.
>
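
One way to keep a client from locking up on an unresponsive kernel, as the
quoted advice suggests, is to track heartbeats against a clock instead of
blocking on them. A minimal sketch, assuming Python 3; the HeartbeatMonitor
class, its parameters and the thresholds are hypothetical and are not taken
from IPython or IEP:

```python
import time


class HeartbeatMonitor:
    """Hypothetical client-side tracker for kernel heartbeats."""

    def __init__(self, interval=1.0, max_missed=5, clock=time.monotonic):
        self.interval = interval      # expected seconds between heartbeats
        self.max_missed = max_missed  # beats we tolerate missing (assumption)
        self.clock = clock            # injectable so the logic can be tested
        self.last_beat = clock()

    def beat(self):
        """Call whenever a heartbeat arrives from the kernel."""
        self.last_beat = self.clock()

    def missed(self):
        """Number of whole heartbeat intervals elapsed since the last beat."""
        return int((self.clock() - self.last_beat) // self.interval)

    def is_responsive(self):
        """False once more than max_missed intervals passed without a beat."""
        return self.missed() <= self.max_missed
```

The client would call is_responsive() from its own event loop and update the
UI accordingly, so a kernel that is busy in extension code only shows up as a
few missed beats and never blocks the client.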

But is it at all possible to lose a connection when you connect two
processes using 'localhost'? Since it skips some of the lower layers of the
networking (http://docs.python.org/howto/sockets.html), I'd expect much less
can go wrong.

In conclusion, I think I'll stick to my implementation for the time being. For
now, IEP will communicate one-to-one with a kernel, so I think it's pretty
safe as long as the kernel runs on the same box and I use 'localhost'.
Later, when I implement remote processing and other more advanced stuff, I
might use zmq (and require users to install it in order to use
remote/parallel processing).

These discussions with you, and the stuff I read about zmq, have got me
thinking, and I'll probably improve a few things here and there. I
definitely want to do some more testing (what happens if a connection is
lost? Maybe I can try to recover it...).

I should probably also explain that I do not use a request/reply pattern on
the socket level. I just connect the socket pair as the kernel starts up,
and then it keeps sending data in both directions. So if the kernel is busy
running extension code, it will not be able to call socket.send() or
socket.recv(). This means:

  * For the messages this should not be a problem: the kernel will send the
queued messages once the program returns from the extension code.
  * There will be a couple of missed heartbeats though (but on the same box
that should not be a problem, right?).
  * I'm not sure what happens when the client tries to send large amounts of
data. Will network buffers overflow, or will this be correctly handled by
TCP? (I use "select" to check whether I can send/recv over the socket.)
  * If the connection is lost, I'll get a socket error. So maybe I don't
even need a heartbeat signal. However, I won't be able to distinguish a lost
connection from the process being killed.
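
The select-based approach in the bullets above can be sketched as follows.
This is only an illustration, assuming Python 3 and a local socket pair; the
Channel class, its method names and the transfer() helper are hypothetical and
are not IEP's actual code:

```python
import select
import socket


class Channel:
    """Hypothetical helper: one end of a bidirectional byte stream."""

    def __init__(self, sock):
        self.sock = sock
        self.sock.setblocking(False)
        self.outgoing = b""   # bytes queued but not yet accepted by the OS
        self.incoming = b""   # bytes received so far
        self.closed = False

    def queue(self, data):
        self.outgoing += data

    def pump(self, timeout=0.0):
        """Do at most one recv and one send, waiting at most `timeout` s."""
        if self.closed:
            return
        want_write = [self.sock] if self.outgoing else []
        readable, writable, _ = select.select(
            [self.sock], want_write, [], timeout)
        try:
            if readable:
                chunk = self.sock.recv(65536)
                if not chunk:
                    # Empty read: the peer's end of the stream was closed.
                    self.closed = True
                    return
                self.incoming += chunk
            if writable:
                # send() may accept only part of the data when the OS buffer
                # is nearly full; the rest stays queued for the next pump.
                sent = self.sock.send(self.outgoing)
                self.outgoing = self.outgoing[sent:]
        except BlockingIOError:
            pass                  # not ready after all; retry on next pump
        except OSError:
            self.closed = True    # reset or broken pipe: the peer is gone


def transfer(payload):
    """Push payload through a local socket pair; return (data, saw_close)."""
    left, right = socket.socketpair()
    sender, receiver = Channel(left), Channel(right)
    sender.queue(payload)
    while len(receiver.incoming) < len(payload) and not receiver.closed:
        sender.pump()
        receiver.pump(0.1)
    left.close()
    receiver.pump(0.1)   # the close shows up as an empty read
    right.close()
    return receiver.incoming, receiver.closed
```

With a large payload the loop needs several passes, because send() only
accepts as much as the socket buffer can hold; on a stream socket the
remainder is held back rather than dropped, so it has to stay queued in the
application. Note that the empty read only says the other end was closed. As
the last bullet points out, it does not reveal whether the peer closed the
socket deliberately or the process was killed.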

> Thanks a lot for sharing your ideas, it's always super useful to look
> at these questions from multiple perspectives.
>

And thank you for discussing this stuff with me. I appreciate it a lot!

  Almar