Python FTP timeout value not effective
Terry Reedy
tjreedy at udel.edu
Mon Sep 2 20:04:59 EDT 2013
On 9/2/2013 1:43 PM, John Nagle wrote:
> I'm reading files from an FTP server at the U.S. Securities and
> Exchange Commission. This code has been running successfully for
> years. Recently, they imposed a consistent connection delay
> of 20 seconds at FTP connection, presumably because they're having
> some denial of service attack. Python 2.7 urllib2 doesn't
> seem to use the timeout specified. After 20 seconds, it
> gives up and times out.
>
> Here's the traceback:
>
> Internal error in EDGAR update: <urlopen error ftp error: [Errno 110]
> Connection timed out>
> ....
> File "./edgar/edgarnetutil.py", line 53, in urlopen
>
> File "/opt/python27/lib/python2.7/socket.py", line 571, in
> create_connection
...
> raise err
> URLError: <urlopen error ftp error: [Errno 110] Connection timed out>
>
> Periodic update completed in 21.1 seconds.
> ----------------------------------------------
>
> Here's the relevant code:
>
> TIMEOUTSECS = 60 ## give up waiting for server after 60 seconds
> ...
> def urlopen(url,timeout=TIMEOUTSECS) :
>     if url.endswith(".gz") : # gzipped file, must decompress first
>         nd = urllib2.urlopen(url,timeout=timeout) # get connection
>         ... # (NOT .gz FILE, DOESN'T TAKE THIS PATH)
>     else :
>         return(urllib2.urlopen(url,timeout=timeout)) # (OPEN FAILS)
I looked at the 3.3 urllib.request.urlopen code. The timeout is passed
through a couple of layers, but it is hard to see whether it reaches the
socket connection call. I would also try Python 3.3, as the timeout
handling may have changed a bit.
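One way to check whether a timeout passed to urlopen actually takes
effect, without depending on the remote server, is to point it at a
local socket that accepts connections but never sends the FTP greeting.
This is only a sketch for Python 3: silent_listener and time_ftp_urlopen
are made-up helper names, and the local listener imitates a server that
is slow to greet, not one that is slow to accept the TCP connection.

```python
import socket
import time
import urllib.error
import urllib.request


def silent_listener():
    """Listen on a free local port but never send an FTP greeting."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    srv.listen(1)                # the kernel completes connects; we never reply
    return srv


def time_ftp_urlopen(timeout):
    """Return (seconds elapsed, URLError or None) for one ftp:// open attempt."""
    srv = silent_listener()
    url = "ftp://127.0.0.1:%d/README.txt" % srv.getsockname()[1]
    # An empty ProxyHandler keeps any ftp_proxy environment setting out of the check.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    error = None
    start = time.monotonic()
    try:
        opener.open(url, timeout=timeout)
    except urllib.error.URLError as exc:
        error = exc              # expected: the greeting never arrives
    finally:
        elapsed = time.monotonic() - start
        srv.close()
    return elapsed, error
```

If the timeout is honored, time_ftp_urlopen(2) should return an elapsed
time close to 2 seconds together with a URLError. A much longer wait
would suggest the value is being lost somewhere in the urlopen layers.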
There are some 'timeout' issues on the tracker, such as
http://bugs.python.org/issue4079
http://bugs.python.org/issue18417
but these do not obviously apply to an explicitly passed timeout.
I would also try using ftplib, which cuts out many of the general-purpose
layers of urlopen. FTP.__init__ stores timeout in self.timeout and
calls connect(), which passes self.timeout to socket.create_connection.
>>> import ftplib
>>> ftp = ftplib.FTP("ftp.sec.gov")
>>> ftp.login()
'230-Anonymous access granted, restrictions apply\n \n Please read the
file README.txt\n230 it was last modified on Tue Aug 15 14:29:31 2000
- 4765 days ago'
>>> ftp.sendcmd('help')
"214-The following commands are recognized (* =>'s unimplemented):\n CWD
XCWD CDUP XCUP SMNT* QUIT PORT PASV \n EPRT
EPSV ALLO* RNFR RNTO DELE MDTM RMD \n XRMD MKD
XMKD PWD XPWD SIZE SYST HELP \n NOOP FEAT
OPTS AUTH* CCC* CONF* ENC* MIC* \n PBSZ* PROT* TYPE
STRU MODE RETR STOR STOU \n APPE REST ABOR
USER PASS ACCT* REIN* LIST \n NLST STAT SITE MLSD
MLST \n214 Direct comments to root at clone11.sec.gov"
I tried to read README.txt, but I do not know how to use the commands or
the local FTP methods.
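For what it is worth, a text file can usually be fetched with
FTP.retrlines, which issues a RETR command and hands each line to a
callback. The following is a minimal Python 3 sketch, not something run
against the SEC server: read_ftp_text is a made-up helper name, and it
assumes README.txt sits in the directory the anonymous login lands in.

```python
import ftplib


def read_ftp_text(host, path, timeout=60):
    """Return the lines of a text file fetched over anonymous FTP."""
    lines = []
    # The timeout is given to the FTP constructor, which uses it when
    # opening the connection; leaving the with-block sends QUIT.
    with ftplib.FTP(host, timeout=timeout) as ftp:
        ftp.login()                                   # anonymous login
        ftp.retrlines("RETR " + path, lines.append)   # one callback per line
    return lines


# Hypothetical usage (needs network access, so it is left commented out):
# for line in read_ftp_text("ftp.sec.gov", "README.txt", timeout=60):
#     print(line)
```

Passing the timeout explicitly here also makes it easy to see whether
a 60 second value survives a 20 second connection delay.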
> TIMEOUTSECS used to be 20 seconds, and I increased it to 60. It didn't
> help.
>
> This isn't an OS problem. The above traceback was on a Linux system.
> On Windows 7, it fails with
>
> "URLError: <urlopen error ftp error: [Errno 10060] A connection attempt
> failed because the connected party did not properly respond after a
> period of time, or established connection failed because connected host
> has failed to respond>"
>
> But in both cases, the command line FTP client will work, after a
> consistent 20 second delay before the login prompt. So the
> Python timeout parameter isn't working.
--
Terry Jan Reedy