Speeding up ftp, async/threadpool?

Kjetil Jacobsen kjetilja at cs.uit.no
Thu Feb 13 11:18:05 EST 2003


vvainio at tp.spt.fi (Ville Vainio) wrote in message news:<ad496f8.0302122343.2c3d794a at posting.google.com>...
> Trying to do a mget-tish thing with ftplib for a large number of small
> files, one can see how slow a protocol ftp is. I've thought of
> optimizing this with a threadpool (which is easy to implement), but
> OTOH it seems that an async ftp client could juggle hundreds of ftp
> downloads/uploads at a time, effectively saturating the network. How
> come such a thing hasn't been implemented yet? I'm thinking of an API
> like:
> 
> a = asyncftp(connections=200)
> a.login("host1.com","myuid","mypasswd")
> a.login("host2.com","myuid2","mypasswd2")
> 
> a.getfile("host1.com","/home/myuid/foo.txt",target="/tmp/foo.txt")
> a.getfile("host2.com","/home/myuid/bar.txt",target="/tmp/bar.txt")
> 
> a.go()
> a.waitforall()
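The threadpool variant mentioned above really is easy to implement with just the standard library. Here is a minimal sketch; the `fetch` worker, the `run_pool` helper, and all their names are illustrative assumptions, not an existing API:

```python
import queue
import threading
from ftplib import FTP

def fetch(host, user, passwd, remote_path, target):
    # Hypothetical worker: one FTP connection per file download.
    ftp = FTP(host)
    ftp.login(user, passwd)
    with open(target, "wb") as out:
        ftp.retrbinary("RETR " + remote_path, out.write)
    ftp.quit()

def run_pool(jobs, worker, nthreads=10):
    # Generic thread pool: each thread pulls argument tuples from a
    # queue and applies `worker` until the queue is drained.
    q = queue.Queue()
    for job in jobs:
        q.put(job)
    results = []
    lock = threading.Lock()

    def loop():
        while True:
            try:
                job = q.get_nowait()
            except queue.Empty:
                return
            r = worker(*job)
            with lock:
                results.append(r)

    threads = [threading.Thread(target=loop) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Demonstrate the pool on a trivial worker; for the mget case you would
# pass `fetch` and (host, user, passwd, remote_path, target) tuples.
results = run_pool([(n,) for n in range(5)], lambda n: n * n, nthreads=3)
print(sorted(results))
```

With `fetch` as the worker, each thread holds its own FTP connection, so ten threads overlap ten transfers; that hides much of the per-file protocol latency without needing an async client.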

If you don't mind using an extension module, then perhaps the pycurl module
could be of help.  pycurl wraps the curl library and supports many protocols,
including ftp.  Since all the transfer code is in C, pycurl is typically
faster than ftplib.

pycurl can be found at http://pycurl.sf.net/

Have a look at the retriever.py example, which with some tweaking could be
used to download over ftp instead of http.
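As a rough idea of what such a tweak boils down to, here is a minimal single-file download helper with pycurl; the function name and comments are my assumptions, not code from retriever.py:

```python
def ftp_get(url, target):
    """Hypothetical helper: download one FTP URL to a local file via pycurl."""
    import pycurl  # extension module wrapping libcurl

    c = pycurl.Curl()
    with open(target, "wb") as out:
        c.setopt(pycurl.URL, url)        # e.g. "ftp://host1.com/pub/foo.txt"
        c.setopt(pycurl.WRITEDATA, out)  # libcurl writes the body here
        c.perform()                      # blocking transfer, done in C
    c.close()

# usage (not run here):
# ftp_get("ftp://host1.com/pub/foo.txt", "/tmp/foo.txt")
```

Since libcurl treats the URL scheme uniformly, the same call works for http and ftp alike; only the URL changes.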

regards,

    - kjetil
