FTP hangs with urllib.urlretrieve

Oleg Broytmann phd at phd.russ.ru
Tue Mar 7 05:15:43 EST 2000


Hello!

   I am developing/running a URL checker (sorry, yet another one :), and
I have found that it consistently hangs on some FTP URLs.

   Here is the test program I am using to reproduce my results - just a
URLopener and urlretrieve, nothing magical. Python is stock 1.5.2, and it
hangs on all platforms I am using: Pentium Linux, Sparc Solaris, Pentium
FreeBSD:


----------
#! /usr/local/bin/python -O


import sys, urllib
urllib._urlopener = urllib.URLopener()

# Some sites allow only Mozilla-compatible browsers; way to stop robots?
server_version = "Mozilla/3.0 (compatible; Python-urllib/%s)" % urllib.__version__
urllib._urlopener.addheaders[0] = ('User-agent', server_version)


url = sys.argv[1]
print "Testing", url

try:
   fname, headers = urllib.urlretrieve(url)
   print fname
   print headers

except Exception, msg:
   print msg
   import traceback; traceback.print_exc()
----------

   The program always hangs on some (but not all) FTP URLs. One is
well known to the Python community: ftp://starship.python.net/pub/crew/jam/ :)
   Others are:

ftp://ftp.sai.msu.su/
ftp://ftp.radio-msu.net/
ftp://ftp.relcom.ru/pub/
ftp://ftp.sunet.se/pub/
ftp://ftp.cs.wisc.edu/
ftp://ftp.cert.org/pub/

   I've tested these sites with FTP clients (Midnight Commander, Netscape
Navigator, ncftp) - all are accessible. It looks like a bug or bugs in
ftplib.
   The first two are very near to me in terms of Internet distance (hop
counts), so timeouts should not be the problem. They are sites near my ISP
(Radio MSU, at Moscow State University).
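To narrow down whether the hang is in ftplib itself rather than urllib, one could request a directory listing with ftplib directly. One hypothesis worth testing is active- vs. passive-mode transfers: in active mode the server opens the data connection back to the client, which can stall behind firewalls or masquerading routers. This is a hedged sketch; the helper name is mine, and `set_pasv` / `retrlines` are the standard ftplib calls:

```python
import ftplib

def list_ftp_dir(ftp, path="/"):
    # Force passive mode, so the client (not the server) opens the data
    # connection - a common fix when active-mode transfers hang.
    ftp.set_pasv(1)
    lines = []
    ftp.retrlines("LIST " + path, lines.append)
    return lines

# Usage against one of the hanging sites (anonymous login):
#   ftp = ftplib.FTP("ftp.sunet.se")
#   ftp.login()
#   for line in list_ftp_dir(ftp, "/pub"):
#       print(line)
#   ftp.quit()
```

If the listing succeeds in passive mode but hangs in active mode, the problem is the transfer mode rather than ftplib as such.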

   Can anyone with better knowledge of the FTP protocol take a look and
help? Does the latest Python (from CVS) perform better (if anyone is
willing to test)?

Oleg.
---- 
    Oleg Broytmann      Foundation for Effective Policies      phd at phd.russ.ru
           Programmers don't die, they just GOSUB without RETURN.
