Urllib2/threading errors under Cygwin

Jacek Trzmiel sc0rp at hot.pl
Fri Apr 30 22:57:52 EDT 2004


Hi,

I have a problem using urllib2 together with the threading module under
Cygwin.

$ cygcheck -cd cygwin python
Cygwin Package Information
Package              Version        
cygwin               1.5.5-1        
python               2.3.2-1 

Here is a minimal app that reproduces the errors:

--- MtUrllib2Test.py -------------------------------------------------
#!/usr/bin/env python

import urllib2
import threading
import sys
import time


def FetchPage():
#    time.sleep(3)
#    return
    opener = urllib2.build_opener()
    urlFile = opener.open( 'http://google.com/' )
    pageData = urlFile.read()


def IncCounterAndPrint( count=[0] ):
    count[0] += 1
    print count[0]
#    sys.stdout.flush()


def Main():
    noOfThreads = 1
    for unused in range(noOfThreads):
        thread = threading.Thread( target=FetchPage, args=() )
        thread.start()
#        time.sleep(0.2)
        IncCounterAndPrint()

    while(threading.activeCount()>1):
        IncCounterAndPrint()
        time.sleep(0.5)
    IncCounterAndPrint()


if __name__ == "__main__":
    Main()
--- MtUrllib2Test.py -------------------------------------------------

0. Simple case.  Here everything looks ok:

$ python MtUrllib2Test.py
1
2
3
4
5


1. First error.

$ python MtUrllib2Test.py | tee out.txt 
3
4
5

The leading prints have been eaten somewhere (1 and 2 are missing).
Uncommenting the disabled code in ANY one of the functions makes the
output correct, but none of these solutions looks good to me:

a) IncCounterAndPrint() - sys.stdout.flush()
As I understand it, if stdout is not a console then output gets buffered
(i.e. it's not flushed automatically).  Adding a flush call makes the
output correct, but this looks like a kludge to me, not a real fix.  I am
writing to stdout from only one thread, so everything should be fine
without calling flush, shouldn't it?

b) Main() - time.sleep(0.2)
Adding a short sleep after starting the thread makes the output correct
too.  To me it looks like a race condition either in urllib2 or in
Cygwin.  Or am I completely off here?

c) FetchPage() - time.sleep(3), return
Disabling the calls to urllib2 makes the problem go away, too.
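
For reference, variant (a) can be sketched roughly as below.  This is
only a minimal sketch: print_and_flush and inc_counter_and_print are
hypothetical names that do not appear in MtUrllib2Test.py, and the
optional stream argument is an assumption added so the helper can be
tried against something other than stdout.  It only works around the
symptom; it does not explain why the leading lines go missing.

```python
import sys


def print_and_flush(value, stream=None):
    # Hypothetical helper, not part of MtUrllib2Test.py.
    # Write one line and flush immediately, so the text does not stay in
    # the stream's buffer when stdout is a pipe or a file.
    if stream is None:
        stream = sys.stdout
    stream.write("%s\n" % value)
    stream.flush()


def inc_counter_and_print(count=[0], stream=None):
    # Same mutable-default counter as IncCounterAndPrint() in the script,
    # but every line is flushed as soon as it is written.
    count[0] += 1
    print_and_flush(count[0], stream)
    return count[0]
```

Calling inc_counter_and_print() in place of IncCounterAndPrint() should
behave like uncommenting the sys.stdout.flush() line in the script.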


2. Second error.

If I increase the number of threads:
   noOfThreads = 20
and run this program (you may need to run it several times, or raise the
number of threads further, to reproduce), then it sometimes fails this way:

$ python MtUrllib2Test.py | tee out.txt 
      4 [win] python 1744 Winmain: Cannot register window class
C:\cygwin\bin\python2.3.exe: *** WFSO failed, Win32 error 6

or hangs this way:

$ python MtUrllib2Test.py | tee out.txt 
    243 [win] python 1696 Winmain: Cannot register window class
    520 [win] python 1696 Winmain: Cannot register window class



Can anyone help me with these two?

Best regards,
Jacek.



