Any python scripts to do parallel downloading?

Carl J. Van Arsdall cvanarsdall at mvista.com
Wed Jan 31 14:31:41 EST 2007


Michele Simionato wrote:
> On Jan 31, 5:23 pm, "Frank Potter" <could.... at gmail.com> wrote:
>   
>> I want to find a multithreaded downloading lib in python,
>> can someone recommend one for me, please?
>> Thanks~
>>     
>
> Why do you want to use threads for that? Twisted is the
> obvious solution for your problem, but you may use any
> asynchronous framework, as for instance the good ol
>   
Well, since the work will be I/O-bound, why not use threads?  They are
easy to use and would do the job just fine.  You can then layer whatever
other technology you need on top of that.
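
To make the I/O point concrete, here is a minimal sketch (Python 3, not
part of the original exchange) that uses time.sleep() as a stand-in for a
blocking network read. The names blocking_io and run_in_threads are made
up for illustration. The waits overlap, so the total should be close to
one wait rather than the sum of all of them.

```python
import threading
import time


def blocking_io(seconds):
    # Stand-in for a network read: a sleeping thread releases the GIL,
    # much as a thread blocked on a socket does.
    time.sleep(seconds)


def run_in_threads(count, seconds):
    """Run `count` simulated I/O waits concurrently; return elapsed seconds."""
    threads = [threading.Thread(target=blocking_io, args=(seconds,))
               for _ in range(count)]
    started = time.monotonic()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.monotonic() - started


elapsed = run_in_threads(count=5, seconds=0.2)
print("5 waits of 0.2 s took %.2f s in threads" % elapsed)
```

Real downloads will not overlap this cleanly, since bandwidth and the
remote server also limit throughput.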

You could go as far as using wget via os.system() in a thread, if the 
app is simple enough. 

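import os
import threading

def getSite(site):
    # Fetch one URL by shelling out to wget.  The URL is interpolated
    # into a shell command line, so only pass trusted URLs.
    os.system('wget %s' % site)

# websiteList is assumed to be a list of URL strings defined elsewhere.
threadList = []
for site in websiteList:
    threadList.append(threading.Thread(target=getSite, args=(site,)))

# Start every download, then wait for all of them to finish.
for thread in threadList:
    thread.start()

for thread in threadList:
    thread.join()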
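
If shelling out to wget is not an option, roughly the same pattern works
with the standard library alone. The sketch below is only an illustration
and assumes Python 3 with urllib.request; fetch and download_all are
hypothetical names, not anything from an existing library. Each URL gets
its own thread, and a failure is stored as the exception object so one bad
URL does not hide the others.

```python
import threading
import urllib.request


def fetch(url, timeout=30):
    """Return the body of `url` as bytes (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, fetch_one=fetch):
    """Fetch each URL in its own thread.

    Returns a dict mapping url -> bytes, or url -> the exception raised.
    Duplicate URLs collapse into a single dict entry.
    """
    results = {}
    lock = threading.Lock()

    def worker(url):
        try:
            outcome = fetch_one(url)
        except Exception as exc:  # record the failure instead of losing it
            outcome = exc
        with lock:
            results[url] = outcome

    threads = [threading.Thread(target=worker, args=(url,)) for url in urls]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling download_all(websiteList) would then return every page body keyed
by URL. One thread per URL is fine for a handful of sites; for a long list
you would probably want to cap the number of threads.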

> Tkinter:
>
> """
> Example of asynchronous programming with Tkinter. Download 10 times
> the same URL.
> """
>
> import sys, urllib, itertools, Tkinter
>
> URL = 'http://docs.python.org/dev/lib/module-urllib.html'
>
> class Downloader(object):
>     chunk = 1024
>
>     def __init__(self, urls, frame):
>         self.urls = urls
>         self.downloads = [self.download(i) for i in range(len(urls))]
>         self.tkvars = []
>         self.tklabels = []
>         for url in urls:
>             var = Tkinter.StringVar(frame)
>             lbl = Tkinter.Label(frame, textvar=var)
>             lbl.pack()
>             self.tkvars.append(var)
>             self.tklabels.append(lbl)
>         frame.pack()
>
>     def download(self, i):
>         src = urllib.urlopen(self.urls[i])
>         size = int(src.info()['Content-Length'])
>         for block in itertools.count():
>             chunk = src.read(self.chunk)
>             if not chunk: break
>             percent = block * self.chunk * 100/size
>             msg = '%s: downloaded %2d%% of %s K' % (
>                 self.urls[i], percent, size/1024)
>             self.tkvars[i].set(msg)
>             yield None
>         self.tkvars[i].set('Downloaded %s' % self.urls[i])
>
> if __name__ == '__main__':
>     root = Tkinter.Tk()
>     frame = Tkinter.Frame(root)
>     downloader = Downloader([URL] * 10, frame)
>     def next(cycle):
>         try:
>             cycle.next().next()
>         except StopIteration:
>             pass
>         root.after(50, next, cycle)
>     root.after(0, next, itertools.cycle(downloader.downloads))
>     root.mainloop()
>
>
>     Michele Simionato
>
>   


-- 

Carl J. Van Arsdall
cvanarsdall at mvista.com
Build and Release
MontaVista Software



