how to start thread by group?

bieffe62 at gmail.com bieffe62 at gmail.com
Mon Oct 6 10:24:51 EDT 2008


On 6 Oct, 15:24, oyster <lepto.pyt... at gmail.com> wrote:
> my code is not right, can sb give me a hand? thanx
>
> for example, I have 1000 urls to be downloaded, but only 5 thread at one time
> def threadTask(ulr):
>   download(url)
>
> threadsAll=[]
> for url in all_url:
>      task=threading.Thread(target=threadTask, args=[url])
>      threadsAll.append(task)
>
> for every5task in groupcount(threadsAll,5):
>     for everytask in every5task:
>         everytask.start()
>
>     for everytask in every5task:
>         everytask.join()
>
>     for everytask in every5task:        #this does not run ok
>         while everytask.isAlive():
>             pass

Thread.join() blocks until the thread has finished. Your join loop
waits for the threads in the order in which they were started, whatever
order they actually terminate in, so it returns only when the slowest
thread of the group is done (and the isAlive() loop after it has nothing
left to wait for). Moreover, you wait for all 5 threads of a group to
complete before starting the next 5, while I believe your intention was
to keep a full load of 5 threads downloading at all times.
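
To make the point about join() concrete, here is a small sketch of one
group of threads being started and then joined in start order. The
worker() function and the delays are made up for illustration (sleeping
stands in for downloading); the point is only that the group takes as
long as its slowest member, no matter in which order the threads finish.

```python
import threading
import time

def worker(delay, finished, name):
    # Hypothetical stand-in for a download: just sleep for `delay` seconds.
    time.sleep(delay)
    finished.append(name)  # record the order in which workers complete

def run_batch(delays):
    """Start one thread per delay, join them in start order, time the batch."""
    finished = []
    threads = [threading.Thread(target=worker, args=(d, finished, i))
               for i, d in enumerate(delays)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # blocks until this thread is done; instant if already done
    elapsed = time.monotonic() - start
    return finished, elapsed, threads

# The first thread is the slowest, so the very first join() already waits
# for (almost) the whole batch; the later joins return quickly.
finish_order, elapsed, threads = run_batch([0.3, 0.1, 0.2])
print("finish order:", finish_order)
print("batch took %.2f s" % elapsed)
```

With groups of 5 downloads this means four threads may sit idle while
the slowest one of the group finishes.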

I would restructure the code along these lines (WARNING: the following
code is ABSOLUTELY UNTESTED and should be considered only as pseudo-code
expressing my idea of the algorithm, which could also be wrong :-) ):


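import threading, time

MAX_THREADS = 5
DELAY = 0.01 # or whatever

def task_function( url ):
    download( url )    # download() is your own function

def start_thread( url ):
    # create the thread AND start it; Thread() alone does not run anything
    task = threading.Thread(target=task_function, args=[url])
    task.start()
    return task

def main():
    all_urls = load_urls()    # load_urls() is your own function
    all_threads = []
    # go on until every url has been handed out and every thread is done
    while all_urls or all_threads:
        # top up to MAX_THREADS running threads while urls remain
        while all_urls and len(all_threads) < MAX_THREADS:
            url = all_urls.pop(0)
            all_threads.append( start_thread( url ) )
        # iterate over a copy, since the list is modified inside the loop
        for t in all_threads[:]:
            if not t.is_alive():    # spelled isAlive() on older Pythons
                t.join()
                all_threads.remove(t)
        time.sleep( DELAY )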
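
On a recent Python the same "never more than 5 at a time" behaviour can
also be had without a polling loop, by letting the standard library's
concurrent.futures.ThreadPoolExecutor do the bookkeeping. This is only a
sketch of that alternative, not what the code above does: fake_download()
and the example.com URLs are hypothetical stand-ins for your download()
and your list of urls, and the counters exist only to show that the
limit is respected.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 5

_lock = threading.Lock()
_active = 0        # downloads running right now
peak_active = 0    # highest number seen running at the same time

def fake_download(url):
    """Hypothetical stand-in for download(url): sleeps instead of fetching."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # pretend to transfer data
    with _lock:
        _active -= 1
    return "contents of " + url

def download_all(urls, max_threads=MAX_THREADS):
    # The pool keeps at most max_threads workers busy and hands a worker
    # the next url as soon as it becomes free.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # map() yields the results in the same order as the input urls.
        return list(pool.map(fake_download, urls))

all_urls = ["http://example.com/page%d" % i for i in range(20)]
results = download_all(all_urls)
print(len(results), "downloads done, at most", peak_active, "at a time")
```

The with block returns only after every submitted download has
finished, so no explicit join() calls are needed.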


HTH

Ciao
-----
FB


