[Python-Dev] how to debug httplib slowness

Fri Sep 4 20:22:21 CEST 2009

On Fri, Sep 4, 2009 at 4:28 AM, Simon Cross <hodgestar+pythondev at gmail.com> wrote:
> On Fri, Sep 4, 2009 at 1:11 PM, Chris Withers<chris at simplistix.co.uk> wrote:
>> Am I right in reading this as most of the time is being spent in httplib's
>> HTTPResponse._read_chunked and none of the methods it calls?
>>
>> If so, is there a better way than a bunch of print statements to find where
>> in that method the time is being spent?
>
> Well, since the source for _read_chunked includes the comment
>
>        # XXX This accumulates chunks by repeated string concatenation,
>        # which is not efficient as the number or size of chunks gets big.
>
> you might gain some speed improvement with minimal effort by gathering
> the read data chunks into a list and then returning "".join(chunks) at
> the end.

+1 on trying this. Constructing a 116MB string by concatenating 1KB
buffers surely must take forever. (116MB divided by 85125 recv() calls
gives 1365 bytes per chunk, which is awful.) The HTTP/1.0 business looks
like a red herring.
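
The list-and-join suggestion quoted above can be sketched roughly as
follows. This is only an illustration, not the httplib source:
read_all_by_join, read_all_by_concat and the chunk_size parameter are
hypothetical names, an in-memory io.BytesIO stands in for the socket
file object, and the code uses bytes and Python 3 syntax for convenience.

```python
import io


def read_all_by_concat(fp, chunk_size=1024):
    """Accumulate reads by repeated concatenation (the pattern the XXX comment warns about)."""
    value = b""
    while True:
        data = fp.read(chunk_size)
        if not data:  # empty read means end of stream
            break
        value += data  # may copy the whole accumulated result each time
    return value


def read_all_by_join(fp, chunk_size=1024):
    """Collect reads in a list and join them once at the end."""
    chunks = []
    while True:
        data = fp.read(chunk_size)
        if not data:  # empty read means end of stream
            break
        chunks.append(data)  # cheap: only stores a reference to the chunk
    return b"".join(chunks)  # single pass over all the collected chunks


# Hypothetical stand-in for a response body: 256 KB read in 1 KB pieces.
payload = b"x" * (256 * 1024)
joined = read_all_by_join(io.BytesIO(payload))
concatenated = read_all_by_concat(io.BytesIO(payload))
```

Both functions return the same bytes. The difference is that the join
version copies each chunk once at the end, whereas the concatenating
version may recopy the growing result on every iteration.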

Also agreed that this is an embarrassment.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)