[Python-Dev] Cloning threading.py using processes

Josiah Carlson jcarlson at uci.edu
Wed Oct 11 18:38:48 CEST 2006


"M.-A. Lemburg" <mal at egenix.com> wrote:
> 
> Josiah Carlson wrote:
> > Fredrik Lundh <fredrik at pythonware.com> wrote:
> >> Josiah Carlson wrote:
> >>
> >>> Presumably with this library you have created, you have also written a
> >>> fast object encoder/decoder (like marshal or pickle).  If it isn't any
> >>> faster than cPickle or marshal, then users may bypass the module and opt
> >>> for fork/etc. + XML-RPC
> >> XML-RPC isn't close to marshal and cPickle in performance, though, so 
> >> that statement is a bit misleading.
> > 
> > You are correct, it is misleading, and relies on a few unstated
> > assumptions.
> > 
> > In my own personal delving into process splitting, RPC, etc., I usually
> > end up with one of two cases: I need really fast call/return, or I need
> > not-slow call/return.  The not-slow call/return is (in my opinion)
> > satisfactorily solved with XML-RPC.  But I've personally not been
> > satisfied with the speed of any remote 'fast call/return' packages, as
> > they usually rely on cPickle or marshal, which are slow compared to
> > even moderately fast 100mbit network connections.  When we are talking
> > about local connections, I have even seen cases where the
> > cPickle/marshal calls can make it so that forking the process is faster
> > than encoding the input to a called function.
> 
> This is hard to believe. I've been in that business for a few
> years and so far have not found an OS/hardware/network combination
> with the mentioned features.
> 
> Usually the worst part in performance breakdown for RPC is network
> latency, i.e. time to connect, waiting for the packets to come through,
> etc. and this parameter doesn't really depend on the OS or hardware
> you're running the application on, but is more a factor of which
> network hardware, architecture and structure is being used.

I agree, that is usually the case.  But for pre-existing connections,
remote or local (whether via socket or Unix domain socket), pickling
slows things down significantly.  What do I mean?  Set up a daemon that
reads and discards whatever is sent to it as fast as possible.  Then
start sending it plain strings (constructed via something like
32768*'\0').  Compare that to a pickle-as-you-go sender with payloads of
roughly the same size.  Maybe I'm just not doing it right, but I always
end up with a slowdown that makes me want to write my own fast
encoder/decoder.
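
For anyone who wants to try the comparison, here is a rough sketch of
the experiment described above.  It is not the original test code: it
assumes a modern Python 3 (pickle instead of cPickle), uses
socket.socketpair() plus a reader thread in place of a separate daemon,
and the names drain and timed_send are made up for illustration.  The
pickled payload is not tuned to be exactly 32768 bytes, so the script
prints the byte counts next to the timings.

```python
import pickle
import socket
import threading
import time


def drain(sock, counter):
    """Read and discard everything until the peer closes its end."""
    while True:
        chunk = sock.recv(65536)
        if not chunk:
            break
        counter[0] += len(chunk)


def timed_send(make_payload, rounds=200):
    """Send `rounds` payloads over a local socket pair.

    Returns (elapsed_seconds, bytes_received_by_the_reader).
    """
    sender, receiver = socket.socketpair()
    counter = [0]
    reader = threading.Thread(target=drain, args=(receiver, counter))
    reader.start()
    start = time.perf_counter()
    for _ in range(rounds):
        # The payload is rebuilt on every round, so any encoding cost
        # is paid inside the timed loop ("pickle as you go").
        sender.sendall(make_payload())
    sender.close()
    reader.join()
    elapsed = time.perf_counter() - start
    receiver.close()
    return elapsed, counter[0]


RAW = b"\0" * 32768
DATA = 2048 * ["hello world", 18, {1: 2}, 7.382]  # illustrative payload

raw_time, raw_bytes = timed_send(lambda: RAW)
pickle_time, pickle_bytes = timed_send(
    lambda: pickle.dumps(DATA, pickle.HIGHEST_PROTOCOL))

print("raw:    %d bytes in %.4f s" % (raw_bytes, raw_time))
print("pickle: %d bytes in %.4f s" % (pickle_bytes, pickle_time))
```

Because the reader runs as a thread in the same process, the numbers
are only indicative; a separate receiving process would be closer to
the daemon setup.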


> It also depends a lot on what you send as arguments, of course,
> but I assume that you're not pickling a gazillion objects :-)

According to tests on one of the few non-emulated Linux machines I can
get my hands on, forking a child process takes on the order of
0.0004-0.00055 seconds.  On that same machine, pickling...
    128*['hello world', 18, {1:2}, 7.382]
...takes ~0.0005 seconds.  512 somewhat mixed elements isn't a gazillion,
though in my case, I believe it was originally a list of tuples or
somesuch.
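
A minimal sketch of how such a measurement could be repeated, assuming
Python 3 on a Unix-like system.  This is not the script behind the
numbers above.  The helper names are hypothetical, and the fork timing
here also includes the child exiting and the parent waiting for it, so
it overstates the cost of the fork call itself.

```python
import os
import pickle
import timeit

# The same 512-element list as in the message.
DATA = 128 * ["hello world", 18, {1: 2}, 7.382]


def time_pickle(repeats=1000):
    """Average seconds for one pickle.dumps() call on DATA."""
    total = timeit.timeit(
        lambda: pickle.dumps(DATA, pickle.HIGHEST_PROTOCOL),
        number=repeats)
    return total / repeats


def fork_once():
    """Fork a child that exits immediately, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: leave without running any cleanup handlers
    os.waitpid(pid, 0)


def time_fork(repeats=200):
    """Average seconds for one fork + exit + wait cycle."""
    return timeit.timeit(fork_once, number=repeats) / repeats


pickle_seconds = time_pickle()
# os.fork does not exist on Windows, so skip that half there.
fork_seconds = time_fork() if hasattr(os, "fork") else None

print("pickle: %.6f s per call" % pickle_seconds)
if fork_seconds is not None:
    print("fork:   %.6f s per call" % fork_seconds)
```

Results from a current interpreter and machine will differ from the
2006 figures, since both the pickle implementation and the hardware
have changed.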

> > I've had an idea for a fast object encoder/decoder (with limited support
> > for certain built-in Python objects), but I haven't gotten around to
> > actually implementing it as of yet.
> 
> Would be interesting to look at.

It would basically be something along the lines of cPickle, but would
only support these basic types: int, long, float, str, unicode, tuple,
list, and dict.
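
To make the idea concrete, here is one possible wire format for such a
restricted encoder, sketched in pure Python 3.  It is an illustration
only, not the design referred to above: the one-byte tags, the 4-byte
length prefixes, and the function names encode and decode are all
assumptions.  In Python 3 the int/long and str/unicode pairs collapse
into int, bytes, and str.  A pure-Python version like this will not
outrun the C pickle implementation; it only shows the shape of the
format.

```python
import struct


def encode(obj, out=None):
    """Encode a limited set of built-in types into bytes."""
    top = out is None
    if top:
        out = bytearray()
    kind = type(obj)  # exact type match, so bool and subclasses are rejected
    if kind is int:
        raw = obj.to_bytes((obj.bit_length() + 8) // 8, "big", signed=True)
        out += b"i" + struct.pack(">I", len(raw)) + raw
    elif kind is float:
        out += b"f" + struct.pack(">d", obj)
    elif kind is bytes:
        out += b"s" + struct.pack(">I", len(obj)) + obj
    elif kind is str:
        raw = obj.encode("utf-8")
        out += b"u" + struct.pack(">I", len(raw)) + raw
    elif kind is tuple or kind is list:
        out += (b"t" if kind is tuple else b"l") + struct.pack(">I", len(obj))
        for item in obj:
            encode(item, out)
    elif kind is dict:
        out += b"d" + struct.pack(">I", len(obj))
        for key, value in obj.items():
            encode(key, out)
            encode(value, out)
    else:
        raise TypeError("unsupported type: %s" % kind.__name__)
    return bytes(out) if top else None


def _decode(buf, pos):
    """Decode one value starting at pos; return (value, next_pos)."""
    tag = buf[pos:pos + 1]
    pos += 1
    if tag == b"f":
        return struct.unpack_from(">d", buf, pos)[0], pos + 8
    (n,) = struct.unpack_from(">I", buf, pos)  # byte length or item count
    pos += 4
    if tag == b"i":
        return int.from_bytes(buf[pos:pos + n], "big", signed=True), pos + n
    if tag == b"s":
        return buf[pos:pos + n], pos + n
    if tag == b"u":
        return buf[pos:pos + n].decode("utf-8"), pos + n
    if tag == b"t" or tag == b"l":
        items = []
        for _ in range(n):
            item, pos = _decode(buf, pos)
            items.append(item)
        return (tuple(items) if tag == b"t" else items), pos
    if tag == b"d":
        result = {}
        for _ in range(n):
            key, pos = _decode(buf, pos)
            value, pos = _decode(buf, pos)
            result[key] = value
        return result, pos
    raise ValueError("unknown tag: %r" % tag)


def decode(buf):
    """Decode a bytes object produced by encode()."""
    value, _ = _decode(buf, 0)
    return value


sample = 128 * ["hello world", 18, {1: 2}, 7.382]
assert decode(encode(sample)) == sample
```

Unlike pickle, this format does no memoization, so shared or recursive
structures are not preserved; that is part of what "limited support"
would mean in practice.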

> BTW, did you know about http://sourceforge.net/projects/py-xmlrpc/ ?

I did not know about it.  But it looks interesting.  I'll have to
compile it for my (ancient) 2.3 installation and see how it does.  Thank
you for the pointer.


 - Josiah


