More urllib timeout issues.

John Nagle nagle at animats.com
Sat Apr 28 14:08:08 EDT 2007


Steve Holden wrote:
> John Nagle wrote:

>> Then we'd have a reasonable network timeout system.
>> We have about half of the above now, but it's not consistent.
>>
>> Comments?
>>
> The only comments I'll make for now are
> 
> 1) There is work afoot to build timeout arguments into network libraries 
> for 2.6, and I know Facundo Batista has been involved, you might want to 
> Google or email Facundo about that.

> 2) The main reason why socket.setdefaulttimeout is unsuitable for many 
> purposes is its thread-unsafe property, so all threads must use the same 
> default timeout or have it randomly change according to the whim of the
> last thread to alter it.

    It has other problems.  Setting that value also changes the
blocking/non-blocking mode of newly created sockets, which can mess up
M2Crypto, causing it to report "Peer did not return certificate".
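
     To make the distinction concrete, here's a minimal sketch (against
the Python 2 socket module of the time; the 30-second figures are
arbitrary) of how the module-wide default leaks into every new socket,
versus a per-socket timeout:

    import socket

    # socket.setdefaulttimeout() changes one module-wide global; every
    # socket created afterwards, in any thread, silently inherits it.
    socket.setdefaulttimeout(30.0)

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print s.gettimeout()    # 30.0 -- inherited from the module-wide default
    # A socket with a timeout is no longer a plain blocking socket
    # internally, which is what can surprise wrappers such as M2Crypto.

    # The per-socket alternative touches only this one connection and
    # leaves sockets created by other threads alone.
    s2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s2.settimeout(30.0)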

> 3) This is important and sensible work and if properly followed through 
> will likely lead to serious quality improvements in the network libraries.

    Agreed.

> regards
>  Steve

     I took a look at Facundo Batista's work in the tracker, and he
currently seems to be trying to work out a good way to test the
existing SSL module.  It has to connect to something to be tested,
of course.  Testing network functionality is tough; to do it right,
you need a little test network to talk to, one that forces some of
the error cases.  And network testing doesn't have the repeatability
upon which the Python test system/buildbot depends.

     It's really tough to test this stuff properly.  The best I've
been able to do so far is to run the 11,000 site list from the
Webspam Challenge through our web spider.

     Here's a list of URLs from our error log which
have given us connection trouble of one kind or another.
Most of these open an HTTP transaction but, for some reason, don't
carry it through to completion, resulting in a long stall in urllib.


blaby.gov.uk
boys-brigade.org.uk
cam.ac.uk
essex.ac.uk
gla.ac.uk
open.ac.uk
soton.ac.uk
uea.ac.uk
ulster.ac.uk


So that's a short but useful set of timeout test cases.  Those are the ones
that timed out after, not during, TCP connection opening.
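
For what it's worth, a rough harness along these lines (Python 2 style,
since urllib.urlopen() had no timeout argument then; the 20-second
figure is an arbitrary choice) is one way to poke at that list:

    import socket
    import urllib

    HOSTS = [
        "blaby.gov.uk", "boys-brigade.org.uk", "cam.ac.uk", "essex.ac.uk",
        "gla.ac.uk", "open.ac.uk", "soton.ac.uk", "uea.ac.uk", "ulster.ac.uk",
    ]

    # urllib.urlopen() offers no per-call timeout, so the only way to
    # bound the stall is the process-wide default discussed above.
    socket.setdefaulttimeout(20.0)

    for host in HOSTS:
        url = "http://%s/robots.txt" % host
        try:
            data = urllib.urlopen(url).read()
            print "%-22s ok, %d bytes" % (host, len(data))
        except (IOError, socket.error), e:
            print "%-22s failed: %s" % (host, e)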

It's interesting that this problem appears for the root domains of many
UK universities.  Presumably they all run the same server software.

Some of these fail because "robotparser", which uses "urllib", hangs
for minutes trying to read the "robots.txt" file associated with the
domain.
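
The only obvious stopgap is to set the process-wide default before calling
read(), with all the drawbacks discussed above.  A rough sketch (the host
and the 15-second figure are just for illustration):

    import socket
    import robotparser

    socket.setdefaulttimeout(15.0)  # global workaround; hits every new socket

    rp = robotparser.RobotFileParser()
    rp.set_url("http://cam.ac.uk/robots.txt")
    try:
        rp.read()                   # the call that can otherwise stall for minutes
    except (IOError, socket.error), e:
        print "robots.txt fetch failed:", e
    else:
        print rp.can_fetch("*", "http://cam.ac.uk/")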

This isn't something that requires a major redesign.  These are bugs. 


				John Nagle

			


