Slow network reading?

Sat May 13 08:59:45 EDT 2006

Ivan Voras wrote:
> Andrew MacIntyre wrote:
> 
>> Comparative CPU & memory utilisation statistics, not to mention platform 
>> and version of Python, would be useful hints...
> 
> During benchmarking, all versions cause all CPU to be used, but Python 
> version has ~1.5x more CPU time allocated to it than PHP. Python is 2.4.1

A pretty fair indication of the Python interpreter doing a lot more work...

>> Note that the file-like object returned by makefile() has significant
>> portions of heavy lifting code in Python rather than C which can be a
>> drag on ultimate performance...  If on a Unix platform, it may be worth
>> experimenting with os.fdopen() on the socket's fileno() to see whether
>> the core Python file object (implemented in C) can be used in place of
>> the lookalike returned from the makefile method.

> That's only because I need the .readline() function. In C, I'm using 
> fgets() (with the expectation that iostream will buffer data).

The readline method of the file object lookalike returned by makefile
implements all of the line splitting logic in Python code, which is very
likely where the extra process CPU time is going.  Note that this code is
in Python for portability reasons, as Windows socket handles cannot be
used as file handles the way socket handles on Unix systems can be.
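
On a Unix platform, the os.fdopen() experiment could look something like
the sketch below. It is only an illustration: the helper name
open_line_reader and the socketpair() demo are made up for the example,
and it assumes a blocking socket (no timeout set). The descriptor is
dup()'d so that closing the file object leaves the socket itself open.

```python
import os
import socket


def open_line_reader(sock, bufsize=8192):
    """Wrap a connected, blocking socket in a regular file object (Unix only).

    Hypothetical helper for illustration; not part of the original benchmark.
    """
    # Duplicate the descriptor so closing the file object does not close
    # the descriptor that the socket object itself owns.
    fd = os.dup(sock.fileno())
    return os.fdopen(fd, "rb", bufsize)


def demo():
    # socketpair() stands in for a real client/server connection.
    a, b = socket.socketpair()
    try:
        a.sendall(b"first line\nsecond line\n")
        reader = open_line_reader(b)
        try:
            # readline() here is the core file object's method, not the
            # lookalike returned by sock.makefile().
            return [reader.readline(), reader.readline()]
        finally:
            reader.close()
    finally:
        a.close()
        b.close()
```

Whether this is actually faster than makefile() for your workload is
something only a benchmark on your own platform can show.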

If you are running on Windows, a fair bit of work will be required to
improve performance, as the line splitting logic needs to be moved to
native code. I wonder whether psyco could do anything with this?

>> Even without that, you are specifying a buffer size smaller than the
>> default (8k - see Lib/socket.py). 16k might be even better.
> 
> The benchmark is such that all of data is < 200 bytes. I estimate that 
> in production almost all protocol data will be < 4KB.

A matter of taste perhaps, but that seems to me like another reason not
to bother with a non-default buffer size.
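
In code, that just means leaving the buffer size argument off the
makefile() call. A rough sketch, where read_short_replies and the
socketpair() demo are invented for illustration and the reply format is
an assumption rather than your actual protocol:

```python
import socket


def read_short_replies(sock, count):
    """Read `count` newline-terminated replies with makefile()'s default buffer.

    Hypothetical helper for illustration only.
    """
    reader = sock.makefile("rb")  # no explicit buffer size: use the default
    try:
        return [reader.readline() for _ in range(count)]
    finally:
        reader.close()


def demo():
    # socketpair() stands in for a real client/server connection.
    client, server = socket.socketpair()
    try:
        # Replies well under 200 bytes each, as in the benchmark description.
        server.sendall(b"OK 1\nOK 2\n")
        return read_short_replies(client, 2)
    finally:
        client.close()
        server.close()
```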

>> Although they're only micro-optimisations, I'd be interested in the
>> relative performance of the query method re-written as:
> 
> The change (for the better) is minor (3-5%).

Given your comments above about how much data is actually involved, I'm
a bit surprised that the tweaked version actually produced a measurable
gain.

-------------------------------------------------------------------------
Andrew I MacIntyre                     "These thoughts are mine alone..."
E-mail: andymac at bullseye.apana.org.au  (pref) | Snail: PO Box 370
        andymac at pcug.org.au             (alt) |        Belconnen ACT 2616
Web:    http://www.andymac.org/               |        Australia