Determine the best buffer sizes when using socket.send() and socket.recv()

Greg Copeland gtcopeland at gmail.com
Fri Nov 14 21:34:57 EST 2008


On Nov 14, 1:58 pm, "Giampaolo Rodola'" <gne... at gmail.com> wrote:
> On Nov 14, 5:27 pm, Greg Copeland <gtcopel... at gmail.com> wrote:
>
>
>
> > On Nov 14, 9:56 am, "Giampaolo Rodola'" <gne... at gmail.com> wrote:
>
> > > Hi,
> > > I'd like to know if there's a way to determine which is the best
> > > buffer size to use when you have to send() and recv() some data over
> > > the network.
> > > I have an FTP server application which, on data channel, uses 8192
> > > bytes as buffer for both incoming and outgoing data.
> > > Some time ago I received a report from a guy [1] who stated that
> > > changing the buffers from 8192 to 4096 results in a drastic speed
> > > improvement.
> > > I tried to make some tests by using different buffer sizes, from 4 Kb
> > > to 256 Kb, but I'm not sure which one to use as default in my
> > > application since I noticed they can vary from different OSes.
> > > Is there a recommended way to determine the best buffer size to use?
>
> > > Thanks in advance
>
> > > [1]http://groups.google.com/group/pyftpdlib/browse_thread/thread/f13a82b...
>
> > > --- Giampaolo
> > > http://code.google.com/p/pyftpdlib/
>
> > As you stated, the answer is obviously OS/stack dependent. Regardless,
> > I believe you'll likely find the best answer is between 16K-64K. Once
> > you consider the various TCP stack improvements which are now
> > available and the rapid increase of available bandwidth, you'll likely
> > want to use the largest buffers which do not impose scalability issues
> > for your system/application. Unless you have reason to use a smaller
> > buffer, use 64K buffers and be done with it. This helps minimize the
> > number of context switches and helps ensure the stack always has data
> > to keep pumping.
>
> > To look at it another way, 64k buffers require 1/8 as many system
> > calls as 8k buffers, and less time is actually spent in Python code.
>
> > If as you say someone actually observed a performance improvement when
> > changing from 8k buffers to 4k buffers, it likely has something to do
> > with python's buffer allocation overhead but even that seems contrary
> > to my expectation. The referenced article was not available to me so I
> > was not able to follow and read.
>
> > Another possibility is that 4k buffers require less fragmentation
> > and are likely to perform better on lossy connections. Is it
> > possible he/she was testing on a highly lossy connection? In short,
> > performance-wise, TCP stinks on lossy connections.
>
> Thanks for the valuable advice.
> The discussion I was talking about is this one (sorry for the broken
> link, I didn't notice that):
> http://groups.google.com/group/pyftpdlib/browse_thread/thread/f13a82b...
>
> --- Giampaolo
> http://code.google.com/p/pyftpdlib/

I read the provided link. There really isn't enough information to
explain what he observed. It is safe to say that his report is contrary
to common performance expectations and my own experience. Since he also
reported large swings in bandwidth far below his potential max, I'm
inclined to say he was suffering from some type of network
abnormality. To be clear, that's just a guess. For all we know some
script kiddie was attempting to scan/hack his system at that given
time - or any number of other variables. One can only be left making
wild assumptions about his operating environment and it's not even
clear if his results are reproducible. Lastly, keep in mind, many
people do not know how to properly benchmark simple applications, let
alone accurately measure bandwidth.
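For what it's worth, a quick-and-dirty way to compare buffer sizes
yourself is to time a transfer over a loopback socket pair. This is a
toy sketch, not a proper benchmark - real measurements need a real
network, warm-up, and repeated runs:

```python
import socket
import threading
import time

def throughput(bufsize, total=8 * 1024 * 1024):
    """Rough loopback throughput (bytes/sec) for a given recv buffer
    size. Toy measurement only: loopback behaves nothing like a real,
    possibly lossy, network path."""
    a, b = socket.socketpair()
    payload = b"x" * 65536

    def sender():
        sent = 0
        while sent < total:
            a.sendall(payload)
            sent += len(payload)
        a.close()  # signals EOF to the receiver

    t = threading.Thread(target=sender)
    start = time.perf_counter()
    t.start()
    received = 0
    while True:
        data = b.recv(bufsize)
        if not data:
            break
        received += len(data)
    t.join()
    b.close()
    elapsed = time.perf_counter() - start
    return received / elapsed

for size in (4096, 8192, 65536):
    print(size, "%.0f MB/s" % (throughput(size) / 1e6))
```

On loopback the differences between sizes are usually modest; it's the
per-recv() call overhead, not the wire, that you end up measuring.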

Keep in mind, Python can typically saturate a 10Mb link even on fairly
low-end systems, so it's not likely your application was his problem.
For now, use large buffers unless you can prove otherwise.
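As a rough sketch of what "use large buffers" means in practice (the
64 KiB figure is the suggestion from upthread, not a hard rule):

```python
import socket

BUF_SIZE = 64 * 1024  # 64 KiB, per the advice above

def recv_all(conn):
    """Read from a connected socket until the peer closes, using one
    large recv buffer per call."""
    chunks = []
    while True:
        data = conn.recv(BUF_SIZE)
        if not data:  # empty bytes means the peer closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)
```

Note that recv(BUF_SIZE) may return fewer bytes than requested; the
size only caps how much one call can hand back, which is why the loop
is needed regardless of buffer size.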


