urllib threads?

Andres Jaime mirv123 at yahoo.com
Wed Mar 27 11:18:02 EST 2002


I'm creating a thread as follows:

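from threading import Thread
import get  # the module that defines Get(), shown below

class t(Thread):
    def __init__(self, url, out_file_name, proxy):
        Thread.__init__(self)
        self.url = url
        self.out_file_name = out_file_name
        self.proxy = proxy

    def run(self):
        # Executed in the new thread when start() is called.
        get.Get(self.out_file_name, self.url, HTTP_PROXY=self.proxy)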

where get.Get is defined as follows:

(I found this code here in comp.lang.python and subsequently modified
it to log)
def Get(out_file_name, url, HTTP_PROXY=None, ACCEPT_HEADER=None,
        SHOW=None, **fields):
    proxy = None
    if HTTP_PROXY: proxy = { 'http': HTTP_PROXY }
    d = []
    for key, value in fields.items():
        d.append('%s=%s' % (key, urllib.quote_plus(value)))

    args = string.join(d, '&')
    if args:
        request = url + '?' + args
    else:
        request = url
    try:
        u = urllib.FancyURLopener(proxy)
        if ACCEPT_HEADER:
            if type(ACCEPT_HEADER) != type(''):
                ACCEPT_HEADER = string.join(ACCEPT_HEADER, ',')
            u.addheader('Accept', ACCEPT_HEADER)
        fn, h = u.retrieve(request)
        u.cleanup()
    except:
        str = ''
        urlcleanup()
        if HTTP_PROXY == None:
            str = 'error getting url: ' + url + '\n'
        else:
            str = ('error getting url: ' + url + ' through proxy: ' +
                   HTTP_PROXY + '\n')
        outfile = open(out_file_name, 'a')
        outfile.write(str)
        outfile.close()
    else:
        str = ''
        if HTTP_PROXY == None:
            str = 'success getting url: ' + url + '\n'
        else:
            str = ('success getting url: ' + url + ' through proxy: ' +
                   HTTP_PROXY + '\n')
        outfile = open(out_file_name, 'a')
        outfile.write(str)
        outfile.close()

When I create an instance of the class t and call t.run(), the main
thread blocks until the page has been requested, and then the function
get.Get() logs either error or success in the log file.
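
To make the run()/start() distinction concrete, here is a minimal sketch
that records which thread actually executes run(). The names Worker and
ran_in are hypothetical and exist only for this illustration; no network
access is involved. It uses threading.current_thread(), the spelling in
later Python versions (older releases spell it currentThread()).

```python
import threading

class Worker(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.ran_in = None

    def run(self):
        # Record the name of whichever thread actually executes run().
        self.ran_in = threading.current_thread().name

main_name = threading.current_thread().name

direct = Worker()
direct.run()      # plain method call: executes in the caller's thread

spawned = Worker()
spawned.start()   # creates a new thread, which then calls run()
spawned.join()    # wait for it, so ran_in is guaranteed to be set
```

Calling run() directly never creates a thread, which would explain why
that call blocks; only start() hands run() to a new thread.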

On the other hand, when I create an instance of class t and call
t.start() (because it inherits from Thread, start() spawns a new thread
that subsequently calls run()), I see evidence of neither success nor
failure in the log file, which leads me to believe that the thread is
either not being spawned correctly or is somehow dying in u.retrieve().
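
One way to narrow this down, sketched under assumptions: wrap the body of
run() so that any exception raised inside the thread is appended to the
log file, and have the main thread join() the worker before inspecting
the log. LoggedThread, failing_job and log_path are hypothetical names
for this sketch, and failing_job merely simulates an error; it is not a
claim about what u.retrieve() actually does.

```python
import os
import tempfile
import threading
import traceback

class LoggedThread(threading.Thread):
    def __init__(self, target_func, log_file_name):
        threading.Thread.__init__(self)
        self.target_func = target_func
        self.log_file_name = log_file_name

    def run(self):
        try:
            self.target_func()
        except Exception:
            # Append the full traceback so a failure inside the thread
            # leaves evidence in the log instead of vanishing.
            log = open(self.log_file_name, 'a')
            try:
                traceback.print_exc(file=log)
            finally:
                log.close()

def failing_job():
    # Stand-in for the real work; always fails for demonstration.
    raise ValueError('simulated failure inside the thread')

log_path = os.path.join(tempfile.mkdtemp(), 'thread.log')
worker = LoggedThread(failing_job, log_path)
worker.start()
worker.join()   # wait, so the log is complete before it is read

log = open(log_path)
logged = log.read()
log.close()
```

If the log stays empty even with this wrapper and a join(), that would
suggest the thread never reached run() at all.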

Does anyone have any insight into this problem (issues with urllib and
multithreading, etc.)?

-andres

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Andres Jaime
Lucira Technologies, Inc.         
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


