[Python-Help] Reading from socket file handle took too long

Etienne Desautels tiit at sympatico.ca
Fri Mar 3 14:57:41 EST 2006


Hi,

Thanks, Matthew, for the answer. It helped me think a little bit more.

I found a solution and I think I found the culprit but I would need to 
do more investigation to be sure.

My solution is to work directly with the socket library instead of
using the higher-level urllib2 library. Now it takes only around
0.005 s to read and process each chunk of data (a chunk is the data
that arrived in the last 1/24 s) instead of between 0.04 s and 0.1 s.
That's a lot faster!
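
A minimal sketch of the approach described above, written for current
Python 3 rather than the interpreter in use at the time. The names
open_stream and read_chunk, the HTTP/1.0 request, and the 1000-byte
limit are assumptions for illustration, not the code actually used:

```python
import socket


def open_stream(host, port=80, path="/"):
    """Connect and send a minimal HTTP/1.0 GET; return the raw socket."""
    sock = socket.create_connection((host, port))
    request = "GET {} HTTP/1.0\r\nHost: {}\r\n\r\n".format(path, host)
    sock.sendall(request.encode("ascii"))
    return sock


def read_chunk(sock, limit=1000):
    """Return whatever has arrived so far, at most `limit` bytes.

    recv() blocks only until some data is available, so a short chunk
    comes back right away. An empty result means the peer closed.
    """
    return sock.recv(limit)
```

With this approach the HTTP response headers arrive on the same socket,
so the caller has to skip past them before treating the bytes as stream
data.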

I think the real culprit for the slowdown is httplib, which urllib2
uses, and the problem probably comes from the way httplib reads from
the socket. It looks like httplib has some trouble reading when
there's no EOF, or something like that.
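
One possible explanation, not confirmed here, is that a buffered file
object's read(n) keeps waiting until n bytes have arrived or the stream
ends, while recv(n) returns as soon as any data is available. The sketch
below shows that difference on a local socket pair in Python 3. The
function name compare_reads and the timings are assumptions for
illustration, and it does not exercise httplib itself:

```python
import socket
import threading
import time


def compare_reads(delay=0.2):
    """Time recv() against a file-object read() on a slow local stream."""
    writer_end, reader_end = socket.socketpair()

    def writer():
        # Two small bursts, then EOF, with pauses in between.
        writer_end.sendall(b"x" * 10)
        time.sleep(delay)
        writer_end.sendall(b"y" * 10)
        time.sleep(delay)
        writer_end.close()

    thread = threading.Thread(target=writer)
    thread.start()

    start = time.perf_counter()
    first = reader_end.recv(1000)   # returns once the first burst arrives
    recv_time = time.perf_counter() - start

    fh = reader_end.makefile("rb")
    start = time.perf_counter()
    rest = fh.read(1000)            # waits for 1000 bytes or EOF
    read_time = time.perf_counter() - start

    thread.join()
    fh.close()
    reader_end.close()
    return first, recv_time, rest, read_time
```

On a stream that never reaches EOF and delivers less than the requested
amount per interval, the read() call would be the one that stalls.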

Etienne

On 06-02-28, at 12:58, Matthew Dixon Cowles wrote:

> Dear Etienne,
>
>> Hi,
>
> Hi!
>
>> I monitored every call and I found that the culprit is when I read
>> from the socket file handle. It's the only bottleneck in my code. I
>> tried different approaches to the problem but I always hit the same
>> problem.
>
>> I don't think it's normal that read() takes more than 0.04 sec to
>> read 1000 bytes from memory (the socket file handle). How can I
>> avoid this problem?
>
>> temp = self.fh.read(self.limit)	# <-- TAKE A LOT OF TIME
>
> I agree with you that reading 1000 bytes from a socket shouldn't take
> that long. Especially since it should really amount to copying the
> data from one buffer to another.
>
> It's just a guess, but I would suspect that the reason that it takes
> that long to read a particular amount of data from the socket is that
> there isn't that much data available when you start the read.
>
> It ought to be easy enough to check that by reducing the amount you
> try to read by a lot and seeing if the call finishes a lot faster.
>
> If that turns out to be what's slowing you down, it might be useful
> to separate the job of reading from the job of decoding and doing
> whatever else you need to do with the data. To me, the obvious way to
> try to do that would be with threads. I've never used Twisted, but I
> imagine that it provides some mechanism for something like that.
>
> I hope that other folks here will also answer if they have a guess
> about what's happening.
>
> Regards,
> Matt
>



