A problem while using urllib

Johnny Lee johnnyandfiona at hotmail.com
Wed Oct 12 03:37:52 EDT 2005


Steve Holden wrote:
> Johnny Lee wrote:
> > Alex Martelli wrote:
> >
> >>Johnny Lee <johnnyandfiona at hotmail.com> wrote:
> >>   ...
> >>
> >>>   try:
> >>>      webPage = urllib2.urlopen(url)
> >>>   except urllib2.URLError:
> >>
> >>   ...
> >>
> >>>   webPage.close()
> >>>   return True
> >>>----------------------------------------------------
> >>>
> >>>   But every time the program reaches the 70th to 75th url (that is,
> >>>after 70-75 urls have been tested this way), it crashes, and all the
> >>>remaining urls raise urllib2.URLError until the program exits. I tried
> >>>many ways to work around it: using urllib, and adding a sleep(1) in the
> >>>filter (I thought the sheer number of urls was crashing the program).
> >>>But none of them works. BTW, if I set the base url to the url at which
> >>>the program crashed, the program still crashes at the 70th-75th url.
> >>>How can I solve this problem? Thanks for your help.
> >>
> >>Sure looks like a resource leak somewhere (probably leaving a file open
> >>until your program hits some wall of maximum simultaneously open files),
> >>but I can't reproduce it here (MacOSX, tried both Python 2.3.5 and
> >>2.4.1).  What version of Python are you using, and on what platform?
> >>Maybe a simple Python upgrade might fix your problem...
> >>
> >>
> >>Alex
> >
> >
> > Thanks for the info you provided. I'm using 2.4.1 on cygwin of WinXP.
> > If you want to reproduce the problem, I can send the source to you.
> >
> > This morning I found that this is caused by urllib2. When I use urllib
> > instead of urllib2, it doesn't crash any more. But the problem is that I
> > want to catch the HTTP 404 error, which is handled by FancyURLopener in
> > urllib.urlopen(), so I can't catch it.
> >
>
> I'm using exactly that configuration, so if you let me have that source
> I could take a look at it for you.
>
> regards
>   Steve
> --
> Steve Holden       +44 150 684 7255  +1 800 494 3119
> Holden Web LLC                     www.holdenweb.com
> PyCon TX 2006                  www.python.org/pycon/


I've sent the source, thanks for your help.
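
For anyone following the thread, here is a minimal sketch of how the 404
case might be caught with urllib2 while making sure every response gets
closed. It is not the source I sent, only an illustration. The names
url_is_alive and opener are made up for this example, and the import
fallback is there because urllib2's names live in urllib.request and
urllib.error on Python 3. The "except ... as" syntax assumes Python 2.6
or later.

```python
try:
    # Python 3 locations of the urllib2 names
    from urllib.request import urlopen
    from urllib.error import HTTPError, URLError
except ImportError:
    # Python 2 names, as used earlier in this thread
    from urllib2 import urlopen, HTTPError, URLError


def url_is_alive(url, opener=urlopen):
    """Return True if url opens, False on an HTTP error (e.g. 404)
    or on any other URLError.

    opener is a hypothetical hook so the function can be exercised
    without touching the network; by default it is urlopen.
    """
    try:
        page = opener(url)
    except HTTPError as err:
        # HTTPError is a subclass of URLError, so it has to be caught
        # first. err.code holds the status, e.g. 404. The error object
        # is itself file-like, so close it instead of leaving it open.
        err.close()
        return False
    except URLError:
        # No response at all (DNS failure, refused connection, ...).
        return False
    page.close()
    return True
```

Whether unclosed error responses are what actually exhausts the open
files in my program is still only a guess.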

Regards,
Johnny



