[IPython-dev] First Performance Result

Sun Jul 25 23:49:36 EDT 2010

On Sun, Jul 25, 2010 at 14:49, Brian Granger <ellisonbg at gmail.com> wrote:

> Min,
>
> Thanks for this!  Sorry I have been so quiet, I have been sick for the last
> few days.
>
> On Thu, Jul 22, 2010 at 2:22 AM, MinRK <benjaminrk at gmail.com> wrote:
>
>> I have the basic queue built into the controller, and a kernel embedded
>> into the Engine, enough to make a simple performance test.
>>
>> I submitted 32k simple execute requests in a row (round robin to engines,
>> explicit multiplexing), then timed the receipt of the results (tic each 1k).
>> I did it once with 2 engines, once with 32. (still on a 2-core machine, all
>> over tcp on loopback).
>>
>> Messages went out at an average of 5400 msgs/s, and the results came back
>> at ~900 msgs/s.
>> So that's 32k jobs submitted in 5.85s, and the last job completed and
>> returned its result 43.24s  after the submission of the first one (37.30s
>> for 32 engines). On average, a message is sent and received every 1.25 ms.
>> When sending a very small number of requests (1-10) in this way to just one
>> engine, it gets closer to 1.75 ms round trip.
>>
>>
> This is great!  For reference, what is your ping time on localhost?
>

ping on localhost is 50-100 us
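
As a rough sanity check on the figures quoted above, the arithmetic can be redone in a few lines. This is only a sketch: it assumes "32k" means 32000 messages, and the variable names are made up here rather than taken from the test script.

```python
# Figures quoted above; "32k" is assumed to mean 32000 messages.
n_msgs = 32000
submit_time = 5.85         # s, time to submit all requests
total_2_engines = 43.24    # s, first submission to last result, 2 engines
total_32_engines = 37.30   # s, same measurement with 32 engines

# Outgoing rate in msgs/s.
submit_rate = n_msgs / submit_time

# Average wall-clock time per message, in ms.
per_msg_2 = 1e3 * total_2_engines / n_msgs
per_msg_32 = 1e3 * total_32_engines / n_msgs
per_msg_avg = (per_msg_2 + per_msg_32) / 2

# Single-engine round trip (1.75 ms) relative to the 50-100 us ping.
round_trip = 1.75e-3
ratio_slow_ping = round_trip / 100e-6
ratio_fast_ping = round_trip / 50e-6

print("submit rate: %.0f msgs/s" % submit_rate)
print("per message: %.2f ms (2 engines), %.2f ms (32 engines), %.2f ms (mean)"
      % (per_msg_2, per_msg_32, per_msg_avg))
print("round trip is %.1fx to %.1fx the ping time"
      % (ratio_slow_ping, ratio_fast_ping))
```

So the 1.25 ms figure is the mean of the 2-engine and 32-engine runs, and a single round trip costs a few tens of ping times.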

>
>
>> In all, it seems to be a good order of magnitude quicker than the Twisted
>> implementation for these small messages.
>>
>>
> That is what I would expect.
>
>
>> Identifying the cost of json for small messages:
>>
>> Outgoing messages go at 9500/s if I use cPickle for serialization instead
>> of json. Round trip to 1 engine for 32k messages: 35s. Round trip to 1
>> engine for 32k messages with json: 53s.
>>
>> It would appear that json is adding about 50% to the overall run time.
>>
>>
> Seems like we know what to do about json now, right?
>

I believe we do: 1. cjson, 2. cPickle, 3. json/simplejson, 4. pickle.
Also: never use integer keys in message internals, and never use json for
user data.
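
To illustrate the integer-key point: json objects only have string keys, so an integer key does not survive a round trip, while pickle preserves it. Below is a minimal sketch using the standard-library json and pickle modules (in current Python, pickle takes the place of cPickle). The message dict and the roundtrip helper are hypothetical, made up for illustration, and are not the actual message format.

```python
import json
import pickle
import timeit


def roundtrip(serializer, msg):
    """Serialize msg and load it back with the given module."""
    return serializer.loads(serializer.dumps(msg))


# Hypothetical message, not the real wire format; note the integer key.
msg = {"msg_type": "execute_request", "content": {"code": "a=5"}, 1: "int key"}

via_json = roundtrip(json, msg)
via_pickle = roundtrip(pickle, msg)

# json turns the integer key 1 into the string "1"; pickle keeps it as 1.
print("keys after json:  ", sorted(via_json, key=str))
print("keys after pickle:", sorted(via_pickle, key=str))

# Crude timing in the spirit of %timeit x.loads(x.dumps(msg)).
# The numbers depend heavily on the machine and the Python version.
n = 10000
for mod in (json, pickle):
    t = timeit.timeit(lambda: roundtrip(mod, msg), number=n)
    print("%s: %.1f us per round trip" % (mod.__name__, 1e6 * t / n))
```

A receiver that looks up msg[1] after a json round trip would get a KeyError, which is why string keys are the safer choice for message internals.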

>
>
>> With %timeit x.loads(x.dumps(msg))
>> on a basic message, I find that json is ~15x slower than cPickle.
>> And by these crude estimates, with json, we spend about 35% of our time
>> serializing, as opposed to just 2.5% with pickle.
>>
>> I attached a bar plot of the average replies per second over each 1000 msg
>> block, overlaying numbers for 2 engines and for 32. I did the same comparing
>> pickle and json for 1 and 2 engines.
>>
>> The messages are small, but a tiny amount of work is done in the kernel.
>> The jobs were submitted like this:
>>         # 32000 requests in total, spread round robin over the engines
>>         for i in xrange(32000 // len(engines)):
>>           for eid, key in engines.iteritems():
>>             thesession.send(queue, "execute_request",
>>                             dict(code='id=%i' % (int(eid) + i)),
>>                             ident=str(key))
>>
>>
>>
>
> One thing that is *really* significant is that the requests per second go
> up with 2 engines connected!  Not sure why this is the case, but my guess is
> that 0MQ does the queuing/networking in a separate thread and it is able to
> overlap logic and communication.  This is wonderful and bodes well for us.
>

Yes, I only ran it for 1,2,32, but it's still a little faster at 32 than 2,
even on a 2 core machine.

> Cheers,
>
> Brian
>
>
>
>
> --
> Brian E. Granger, Ph.D.
> Assistant Professor of Physics
> Cal Poly State University, San Luis Obispo
> bgranger at calpoly.edu
> ellisonbg at gmail.com
>