Python FTP timeout value not effective

Terry Reedy tjreedy at udel.edu
Mon Sep 2 20:04:59 EDT 2013


On 9/2/2013 1:43 PM, John Nagle wrote:
>      I'm reading files from an FTP server at the U.S. Securities and
> Exchange Commission.  This code has been running successfully for
> years.  Recently, they imposed a consistent connection delay
> of 20 seconds at FTP connection, presumably because they're having
> some denial of service attack.  Python 2.7 urllib2 doesn't
> seem to use the timeout specified.  After 20 seconds, it
> gives up and times out.
>
> Here's the traceback:
>
> Internal error in EDGAR update: <urlopen error ftp error: [Errno 110]
> Connection timed out>
> ....
>    File "./edgar/edgarnetutil.py", line 53, in urlopen
>
>    File "/opt/python27/lib/python2.7/socket.py", line 571, in
> create_connection
> ...
>      raise err
> URLError: <urlopen error ftp error: [Errno 110] Connection timed out>
>
> Periodic update completed in 21.1 seconds.
> ----------------------------------------------
>
> Here's the relevant code:
>
> TIMEOUTSECS = 60	## give up waiting for server after 60 seconds
> ...
> def urlopen(url,timeout=TIMEOUTSECS) :
>      if url.endswith(".gz") :	# gzipped file, must decompress first
>          nd = urllib2.urlopen(url,timeout=timeout)	# get connection
> 	... # (NOT .gz FILE, DOESN'T TAKE THIS PATH)
>      else :
> 	return(urllib2.urlopen(url,timeout=timeout)) # (OPEN FAILS)

I looked at the 3.3 urllib.request.urlopen code. The timeout is passed 
through a couple of layers, but it is hard to see whether it reaches the 
socket connection call. I would also try Python 3.3, as the timeout 
handling may have changed a bit.

There are some 'timeout' issues on the tracker, such as
http://bugs.python.org/issue4079
http://bugs.python.org/issue18417
but these do not obviously apply to an explicitly passed timeout.

I would also try using ftplib, which cuts out lots of the general-purpose 
layers of urlopen. FTP.__init__ stores timeout in self.timeout and, when 
given a host, calls connect(), which passes self.timeout to 
socket.create_connection.

 >>> import ftplib
 >>> ftp = ftplib.FTP("ftp.sec.gov")
 >>> ftp.login()
'230-Anonymous access granted, restrictions apply\n \n Please read the 
file README.txt\n230    it was last modified on Tue Aug 15 14:29:31 2000 
- 4765 days ago'
 >>> ftp.sendcmd('help')
"214-The following commands are recognized (* =>'s unimplemented):\n CWD 
     XCWD    CDUP    XCUP    SMNT*   QUIT    PORT    PASV    \n EPRT 
EPSV    ALLO*   RNFR    RNTO    DELE    MDTM    RMD     \n XRMD    MKD 
    XMKD    PWD     XPWD    SIZE    SYST    HELP    \n NOOP    FEAT 
OPTS    AUTH*   CCC*    CONF*   ENC*    MIC*    \n PBSZ*   PROT*   TYPE 
    STRU    MODE    RETR    STOR    STOU    \n APPE    REST    ABOR 
USER    PASS    ACCT*   REIN*   LIST    \n NLST    STAT    SITE    MLSD 
    MLST    \n214 Direct comments to root at clone11.sec.gov"
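
The session above relies on the default timeout. Here is a minimal sketch 
of passing one explicitly; I have not run it against the SEC server. The 
helper name open_ftp, its port parameter, and the 60 second value are my 
own choices for illustration and come from neither ftplib nor John's code.

```python
import ftplib

TIMEOUT_SECS = 60  # assumed value, mirroring TIMEOUTSECS in the quoted code


def open_ftp(host, port=0, timeout=TIMEOUT_SECS):
    """Connect and log in anonymously with an explicit timeout (hypothetical helper)."""
    # No host is given here, so this only stores the timeout in ftp.timeout.
    ftp = ftplib.FTP(timeout=timeout)
    # Port 0 selects the default FTP port; connect() hands ftp.timeout
    # to socket.create_connection.
    ftp.connect(host, port)
    # Anonymous login, as in the interactive session above.
    ftp.login()
    return ftp
```

Calling open_ftp("ftp.sec.gov", timeout=60) should then show whether an 
explicit timeout survives the 20 second connection delay when the urllib 
layers are out of the picture.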

I tried to read README.txt, but I do not know how to use the commands or 
the local FTP methods.
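
Going by the ftplib documentation, retrlines() appears to be the method 
for fetching a text file such as README.txt. A small sketch, untested 
against this server: fetch_text_file is a hypothetical helper name, and 
ftp is assumed to be an ftplib.FTP object that has already logged in.

```python
import ftplib  # the caller supplies a connected, logged-in ftplib.FTP instance


def fetch_text_file(ftp, filename):
    """Return the lines of a remote text file (hypothetical helper)."""
    lines = []
    # retrlines() sends the RETR command in ASCII mode and calls the
    # callback once per line, with the line ending already stripped.
    ftp.retrlines("RETR " + filename, lines.append)
    return lines
```

With the ftp object from the session above, 
print("\n".join(fetch_text_file(ftp, "README.txt"))) should display the 
file.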

> TIMEOUTSECS used to be 20 seconds, and I increased it to 60. It didn't
> help.
>
> This isn't an OS problem. The above traceback was on a Linux system.
> On Windows 7, it fails with
>
> "URLError: <urlopen error ftp error: [Errno 10060] A connection attempt
> failed because the connected party did not properly respond after a
> period of time, or established connection failed because connected host
> has failed to respond>"
>
> But in both cases, the command line FTP client will work, after a
> consistent 20 second delay before the login prompt.  So the
> Python timeout parameter isn't working.


-- 
Terry Jan Reedy



