[IPython-dev] Qt/Curses interfaces future: results of the weekend mini-sprint (or having fun with 0mq)

Brian Granger ellisonbg at gmail.com
Wed Mar 24 23:41:35 EDT 2010


Mikhail,

> IMHO it is a great idea to separate the main IPython engine from the
> frontend.
> But while implementing an RPC framework over 0mq from the ground up
> should not be a very difficult task and will definitely bring you a lot
> of fun, have you considered something preexisting like RPyC
> (http://rpyc.wikidot.com/), for example?

We have considered everything :).  The story of how we have arrived at
0MQ is pretty interesting and worth recording.  We have had
implementations based on XML-RPC, Twisted (numerous protocols, HTTP,
PB, Foolscap) and raw sockets. I have played with earlier versions of
RPyC as well.

There are a few issues we keep running into with *every* solution
we have tried (except for 0MQ):

* The GIL kills.  Because IPython is designed to execute arbitrary
user code, and our users often run wrapped C/C++ libraries, it is not
uncommon for code that never releases the GIL to be run in IPython.
When this happens, every other Python thread *completely stops*,
including the ones handling network traffic.  When you are building
robust distributed systems, you simply can't have this.  As far as I
know, all Python-based networking and RPC libraries suffer from this
exact issue.  Note: it is not enough that the underlying socket
send/recv happen with the GIL released.

* Performance.  We need network protocols that have near-ping latencies
but can also easily handle messages ranging from many MB to GB in size
at the same time.  Prior to 0MQ I had not seen a network protocol that
can do both.  Our experiments with 0MQ have been shocking.  We see
near-ping latencies for small messages and can send massive messages
without even thinking about it.  All of this happens while CPU and
memory usage stay minimal.  One of the difficulties that networking
libraries in Python face (at least currently) is that they all use
strings for network buffers.  The problem with this is that you end up
copying them all over the place.  With Twisted, we have to go to
incredible lengths to avoid this.  Is the situation different with RPyC?

* Messaging, not RPC.  As we have developed a distributed architecture
that is more and more complex, we have realized something quite
significant: we are not really doing RPC, we are sending messages in
various patterns, and 0MQ encodes these patterns extremely well.
Examples are request/reply and pub/sub, but other more complex
messaging patterns are possible as well - and we need those.  In my
mind, the key difference between messaging and RPC is the presence of
message queues in the architecture.  Multiprocessing has some of this
actually, but I haven't looked at what they are doing under the hood.
I encourage you to look at the example Fernando described.  It really
shows in significant ways that we are not doing RPC.
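
To make the request/reply and pub/sub distinction concrete, here is a
minimal sketch that uses only the Python standard library.  It is not
0MQ and not IPython's code: queue.Queue stands in for a 0MQ socket, and
the names Publisher, kernel and run_demo are hypothetical.  The point
is only that both patterns sit on top of message queues, and that no
remote function is ever "called".

```python
import queue
import threading


class Publisher:
    """Toy pub/sub hub: each subscriber owns a queue and gets every message."""

    def __init__(self):
        self._subscribers = []
        self._lock = threading.Lock()

    def subscribe(self):
        q = queue.Queue()
        with self._lock:
            self._subscribers.append(q)
        return q

    def publish(self, msg):
        with self._lock:
            targets = list(self._subscribers)
        for q in targets:
            q.put(msg)


def kernel(requests, pub):
    """Toy 'kernel': take requests off a queue, broadcast output, then reply."""
    while True:
        msg = requests.get()
        if msg is None:  # sentinel: shut down
            break
        reply_queue, numbers = msg
        total = sum(numbers)
        pub.publish(("output", total))   # pub/sub: every frontend sees it
        reply_queue.put(("ok", total))   # request/reply: only the sender


def run_demo():
    requests = queue.Queue()
    pub = Publisher()
    frontend_a = pub.subscribe()
    frontend_b = pub.subscribe()

    worker = threading.Thread(target=kernel, args=(requests, pub))
    worker.start()

    reply_queue = queue.Queue()
    requests.put((reply_queue, [1, 2, 3]))   # send a request message
    reply = reply_queue.get(timeout=5)       # wait for the reply message
    seen_a = frontend_a.get(timeout=5)
    seen_b = frontend_b.get(timeout=5)

    requests.put(None)
    worker.join()
    return reply, seen_a, seen_b


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the requesting frontend gets a private reply while both
subscribed frontends see the broadcast output, which is the kind of
combination that is awkward to express as a plain remote call.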

> The reason is that IPython already has a lot of useful and exciting
> functionality and yet another RPC framework is somewhat too much. Plus,
> you don't have to think about these too low level details like communication
> protocols, serialization etc.

0MQ is definitely not another RPC framework.  If you know that RPyC
addresses some or all of the issues I have brought up above, I would
seriously love to know.  One of these days, I will probably try to do
some benchmarks that compare Twisted, multiprocessing, RPyC and 0MQ
for things like latency and throughput.  That would be quite
interesting.
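
As a rough idea of what such a latency measurement could look like,
here is a hedged sketch of a round-trip timer.  It does not compare any
of the libraries named above: it only times an echo over a local
socket.socketpair() from the standard library, and the helper names
(recv_exact, echo_server, round_trip_times) are made up for
illustration.  Receiving into a preallocated buffer with recv_into is
one way to avoid the repeated string copies mentioned earlier.

```python
import socket
import threading
import time


def recv_exact(sock, nbytes):
    """Receive exactly nbytes into one preallocated buffer."""
    buf = bytearray(nbytes)
    view = memoryview(buf)
    received = 0
    while received < nbytes:
        n = sock.recv_into(view[received:])
        if n == 0:
            raise ConnectionError("peer closed the connection early")
        received += n
    return buf


def echo_server(sock, nbytes, rounds):
    """Echo back `rounds` messages of `nbytes` bytes each."""
    for _ in range(rounds):
        sock.sendall(recv_exact(sock, nbytes))


def round_trip_times(nbytes, rounds=20):
    """Return a list of round-trip times (seconds) for nbytes-sized messages."""
    a, b = socket.socketpair()
    server = threading.Thread(target=echo_server, args=(b, nbytes, rounds))
    server.start()
    payload = bytes(nbytes)
    times = []
    try:
        for _ in range(rounds):
            start = time.perf_counter()
            a.sendall(payload)
            echoed = recv_exact(a, nbytes)
            times.append(time.perf_counter() - start)
            assert len(echoed) == nbytes
    finally:
        a.close()        # unblocks the server if we failed part-way
        server.join()
        b.close()
    return times


if __name__ == "__main__":
    for size in (1, 1_000_000):
        best = min(round_trip_times(size))
        print(f"{size:>9} bytes: best round trip {best * 1e6:.0f} us")
```

A real comparison would swap the socket pair for each library's own
transport and run the two ends in separate processes or on separate
machines; the numbers from this in-process version say nothing about
Twisted, multiprocessing, RPyC or 0MQ.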

Another important part of 0MQ is that it runs over transports other
than TCP and over interconnects like InfiniBand.  The performance on
InfiniBand is quite impressive.

Great question.

Cheers,

Brian

> Regards,
> --
> Mikhail Terekhov



-- 
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu
ellisonbg at gmail.com


