[Python-Dev] how to debug httplib slowness

Chris Withers chris at simplistix.co.uk
Fri Sep 4 13:11:46 CEST 2009


Guido van Rossum wrote:
>>> You might see a pattern. Is this on Windows?
>> Well, yes, but I'm not 100%. The problematic machine is a Windows box, but
>> there are no non-windows boxes on that network and vpn'ing from one of my
>> non-windows boxes slows things down enough that I'm not confident what I'd
>> be seeing was indicative of the same problem...
> 
> Time to set up a more conclusive test. Do you have something like curl
> or wget available on the same box?

Time taken with IE: ~2 seconds
Time taken with wget: 2.2 seconds
Time taken with Python [1]: 20-30 minutes

I did a run of the script through cProfile and got the following:

pstats.Stats('download.profile').strip_dirs().sort_stats('time').print_stats(10)

          1604545 function calls in 1956.057 CPU seconds

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1 1950.767 1950.767 1955.952 1955.952 httplib.py:544(_read_chunked)
     85125    1.235    0.000    1.235    0.000 {method 'recv' of '_socket.socket' objects}
     85838    1.031    0.000    2.246    0.000 socket.py:313(read)
     85838    0.787    0.000    3.386    0.000 httplib.py:601(_safe_read)
     42928    0.614    0.000    1.779    0.000 socket.py:373(readline)
    128775    0.344    0.000    0.344    0.000 {method 'write' of 'cStringIO.StringO' objects}
    200796    0.206    0.000    0.206    0.000 {method 'seek' of 'cStringIO.StringO' objects}
     85838    0.179    0.000    0.179    0.000 {min}
    128767    0.135    0.000    0.135    0.000 {cStringIO.StringIO}
     72735    0.116    0.000    0.116    0.000 {method 'read' of 'cStringIO.StringO' objects}

...which isn't what I was expecting!

Am I right in reading this as most of the time being spent in the body 
of httplib's HTTPResponse._read_chunked itself, and almost none of it in 
the methods it calls?
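
For anyone checking that reading of the table: tottime excludes time spent 
in callees, while cumtime includes it. Here is a minimal sketch of the 
difference, assuming a current Python 3 interpreter rather than the 2.6.2 
used here. The functions outer and helper are hypothetical stand-ins, and 
the stats attribute of pstats.Stats is an internal-looking dict that I'm 
assuming keeps its (cc, nc, tottime, cumtime, callers) layout.

```python
import cProfile
import pstats


def helper():
    # Work done in a callee: counted in outer's cumtime, not its tottime.
    return sum(range(1000))


def outer():
    total = 0
    for _ in range(200):
        total += helper()
    # Work done in outer's own body: counted in outer's tottime.
    for i in range(200000):
        total += i
    return total


profiler = cProfile.Profile()
profiler.runcall(outer)
stats = pstats.Stats(profiler)

# stats.stats maps (filename, lineno, funcname) to
# (primitive calls, total calls, tottime, cumtime, callers).
timings = {key[2]: (value[2], value[3]) for key, value in stats.stats.items()}
outer_tottime, outer_cumtime = timings['outer']
helper_tottime, helper_cumtime = timings['helper']

print('outer  tottime=%.4f cumtime=%.4f' % (outer_tottime, outer_cumtime))
print('helper tottime=%.4f cumtime=%.4f' % (helper_tottime, helper_cumtime))
```

In the table above, _read_chunked has a tottime of 1950.767 against a 
cumtime of 1955.952, so only about 5 seconds are attributed to the 
functions it calls.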

If so, is there a better way than a bunch of print statements to find 
where in that method the time is being spent?
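
One option that avoids editing the method is to time each line with 
sys.settrace. What follows is only a rough sketch under assumptions: it 
targets a current Python 3 interpreter (time.perf_counter does not exist 
in 2.6), the names profile_lines and slow_concat are hypothetical, and 
slow_concat is just a stand-in workload, not a claim about what 
_read_chunked actually does. Time spent in callees is charged to the line 
that made the call, and recursive functions are not handled.

```python
import sys
import time
from collections import defaultdict


def profile_lines(func, *args, **kwargs):
    """Run func and return (result, {lineno: seconds}) for func's own lines."""
    code = func.__code__
    line_times = defaultdict(float)
    state = {'line': None, 'start': None}

    def charge(now):
        # Attribute the elapsed time to the line that was executing.
        if state['line'] is not None:
            line_times[state['line']] += now - state['start']

    def local_trace(frame, event, arg):
        now = time.perf_counter()
        if event == 'line':
            charge(now)
            state['line'] = frame.f_lineno
            state['start'] = time.perf_counter()
        elif event == 'return':
            charge(now)
            state['line'] = None
        return local_trace

    def global_trace(frame, event, arg):
        # Trace only frames that execute the target function's code object.
        if event == 'call' and frame.f_code is code:
            return local_trace
        return None

    sys.settrace(global_trace)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(None)
    return result, dict(line_times)


def slow_concat(n):
    # Hypothetical workload used only to exercise the tracer.
    value = ''
    for i in range(n):
        value += 'x'
    return value


result, times = profile_lines(slow_concat, 1000)
for lineno, seconds in sorted(times.items(), key=lambda kv: -kv[1]):
    print('line %d: %.6f s' % (lineno, seconds))
```

The tracing overhead is large, so the absolute numbers mean little; the 
point is only which line dominates.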

cheers,

Chris

[1] Python 2.6.2 on Windows Server 2003 R2 running this script:

from base64 import encodestring
from httplib import HTTPConnection
from datetime import datetime

conn = HTTPConnection('servername')
headers = {}
a = 'Basic '+encodestring('username:password').strip()
headers['Authorization']=a
t = datetime.now()
print t
conn.request('GET','/some/big/file',None,headers)
print 'request:',datetime.now()-t
response = conn.getresponse()
print 'response:',datetime.now()-t
data = response.read()
if len(data)<2000: print data
print 'read:',datetime.now()-t


-- 
Simplistix - Content Management, Batch Processing & Python Consulting
            - http://www.simplistix.co.uk

