High performance IO on non-blocking sockets

Jp Calderone exarkun at intarweb.us
Sat Mar 15 11:41:33 EST 2003


On Sat, Mar 15, 2003 at 12:33:57PM +0100, Troels Walsted Hansen wrote:
> Jp Calderone wrote:
> >> send_buffer = buffer(self.data, self.offset)
> >> sent = self.socket.send(send_buffer)
> >> self.offset += sent
> >  Have you timed this, vs the original, naive code?  
> 
> I have to admit that I haven't timed it. I've only looked at the Python
> source and based my statements on that.
> 
> > Slicing a string shouldn't copy any bytes, only create a new string
> > object with a modified starting pointer and length.
> 
> I believe this is wrong, at least for Python 2.2.2, which I'm working
> with. Only if the slice covers the entire original string do you get an
> optimized slice without any copying. See the source snippet below.
> 
> The buffer object is the one that implements read-only slices in the
> manner that you describe.
> 
> static PyObject *
> string_slice(register PyStringObject *a, register int i, register int j)
> {
>         [...]
>         if (i == 0 && j == a->ob_size && PyString_CheckExact(a)) {
>                 /* It's the same as a */
>                 Py_INCREF(a);
>                 return (PyObject *)a;
>         }
>         [...]
>         return PyString_FromStringAndSize(a->ob_sval + i, (int) (j-i));
> 

  Woops, I misinterpreted this code.
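
  For anyone following along, a quick interactive check makes the distinction
clear (Python 2.x here; the 'is' comparisons only illustrate the identity
shortcut in string_slice above, they aren't part of any real send loop):

    >>> s = "x" * 65536
    >>> s[:] is s          # full slice: the same object comes back, no copy
    True
    >>> s[1:] is s         # partial slice: a new string object, bytes copied
    False
    >>> b = buffer(s, 1)   # buffer: a read-only view into s, no byte copying
    >>> len(b)
    65535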

> > This seems as if it would be about as expensive as creating a new buffer
> > object, but has the advantage of running more of the work in C, rather
> > than Python (no name lookup for buffer, for example).
> >
> >  I ask because I can't seem to squeeze any speedup out of Twisted by
> > making this change (in fact, I find a significant slowdown, from 12309.8 KB/s
> > using the original "buf = buf[sent:]" code down to 9408.2 KB/s using the
> > buffer() approach).
> >
> >  I'm hoping I've just screwed something up, of course, and would love to
> > hear that the buffer() approach is, in fact, much faster :)
> 
> Your numbers are surprising and very interesting. Would you care to test 
> a modified send loop with the Twisted framework for me?
> 
>   if self.offset:
>       sent = self.socket.send(buffer(self.data, self.offset))
>   else:
>       sent = self.socket.send(self.data)
>   self.offset += sent
> 
> The idea here is to avoid the cost of creating a buffer object for short 
> sends that fit into the kernel's socket buffer.
> 

  Good call.  This approach does show a speedup.  With the same settings as
for the previous numbers, the throughput rises to 12750 KB/s.
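
  In case it helps anyone else following the thread, here is roughly how the
conditional buffer() send fits into a non-blocking write handler.  This is
only a minimal sketch, not Twisted's actual transport code; the Writer class,
its attribute names, and the error handling are all just illustrative:

    import errno
    import socket

    class Writer:
        def __init__(self, sock, data):
            self.socket = sock   # an already-connected, non-blocking socket
            self.data = data     # the bytes still waiting to be written
            self.offset = 0      # how much of self.data has been sent so far

        def handle_write(self):
            # Skip the buffer() object in the common case where nothing has
            # been consumed yet and the whole string may fit into the
            # kernel's socket buffer in a single send().
            if self.offset:
                chunk = buffer(self.data, self.offset)
            else:
                chunk = self.data
            try:
                sent = self.socket.send(chunk)
            except socket.error, e:
                if e.args[0] == errno.EWOULDBLOCK:
                    return       # kernel buffer is full; retry when writable
                raise
            self.offset += sent
            if self.offset >= len(self.data):
                self.data, self.offset = '', 0   # all written; wait for more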

> How large is self.data in your test?

  I tried three different file sizes (0.5MB, 10MB, 100MB), but since I used
the standard Twisted.web server, these got chunked up into 65KB pieces.  I
thought that this limited chunk size would reduce the effectiveness of the
optimization, so in addition to the change you suggested above, I benchmarked
the server with a ~4MB chunk size instead.  This showed only a very minor
further speedup, to 12778 KB/s, quite possibly within the margin of error for
this benchmark.

  For my edification, what are common sizes for kernel socket buffers, or do
they vary too widely to answer for anything but specific systems?
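
  (The per-socket value is at least easy to check with getsockopt; a minimal
sketch, in case it is useful to anyone:)

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_SNDBUF is the kernel's send buffer size for this socket, in bytes;
    # setsockopt() with the same option can usually be used to raise it.
    print s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    s.close()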

  Jp

-- 
"One World, one Web, one Program." - Microsoft(R) promotional ad
"Ein Volk, ein Reich, ein Fuhrer." - Adolf Hitler
-- 
 up 12 days, 7:59, 8 users, load average: 0.12, 0.17, 0.24




