threads and urllib

Sandy Norton sandskyfly at hotmail.com
Wed Feb 20 15:28:56 EST 2002


This simple little program retrieves web pages using threads and
urllib. It seems to work fine with many websites; however, more
often than not it just hangs, and the thread (or is it the
socket?) never times out and dies.

Is there any way to specify a time limit within which the operation
must complete, and to enforce it in the code?
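[Editor's note: one possible way to impose such a limit is at the socket
level. Later Python versions grew socket.setdefaulttimeout(), which makes
every blocking call on a newly created socket raise a timeout exception
instead of hanging forever. The sketch below (in current Python syntax,
not the Python 2 of the original post) uses a local server socket that
never replies as a stand-in for an unresponsive web server, so it runs
without network access.]

```python
import socket

# Every socket created after this call gets a one-second limit on
# blocking operations (connect, recv, ...).
socket.setdefaulttimeout(1.0)

# A local listening socket that accepts the connection but never sends
# anything -- a stand-in for a hung web server.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))   # let the OS pick a free port
server.listen(1)
host, port = server.getsockname()

client = socket.create_connection((host, port))
try:
    client.recv(1024)           # the "server" never replies...
    timed_out = False
except socket.timeout:          # ...so this fires after ~1 second
    timed_out = True
finally:
    client.close()
    server.close()

print(timed_out)
```

Because the timeout is a process-wide default, it also applies to the
sockets that urllib opens internally, without changing the fetching code.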

Any help will be much appreciated.

cheers,

Sandy

--------snip-----------------------------------------------------------

import urllib
import threading
import time

t1 = time.time()

class SerialAgent:
    # Fetch self.url and keep the page body in self.webpage.
    def run(self):
        self.webpage = urllib.urlopen(self.url).read()
        #print self.webpage

class Agent(SerialAgent, threading.Thread):
    # A thread that runs SerialAgent.run in the background.
    def __init__(self, url):
        self.url = url
        threading.Thread.__init__(self, name=url)
        self.webpage = ''

alist = []

for url in ['http://www.python.org', 'http://www.zope.org']:
    a = Agent(url)
    print a
    a.start()
    print a
    alist.append(a)

for a in alist:
    a.join(20.0)
    print a, len(a.webpage)
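[Editor's note: one thing the snippet above relies on is join(timeout),
and it is worth knowing that join(timeout) only stops *waiting* after the
timeout -- it does not kill the thread, which keeps running (and keeps
its socket open) in the background. The sketch below (again in current
Python syntax) demonstrates this with a worker that sleeps, standing in
for a hung urlopen call.]

```python
import threading
import time

def hang():
    time.sleep(2.0)   # pretend urlopen is stuck here

t = threading.Thread(target=hang)
t.start()

t.join(0.5)                     # give up waiting after half a second
still_running = t.is_alive()
print(still_running)            # join returned, but the thread lives on

t.join()                        # now wait for it to actually finish
print(t.is_alive())
```

So after `a.join(20.0)` returns, `a.is_alive()` (isAlive() in older
Pythons) tells you whether the fetch actually finished or is still hung.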
