Question about thread

Jp Calderone exarkun at divmod.com
Fri Nov 19 10:10:53 EST 2004


On Fri, 19 Nov 2004 22:50:17 +0800, Valkyrie <valkyrie at cuhk.edu.hk> wrote:
>To be more precise, what I want to do is to have a threaded program to handle
> some jobs concurrently (while those jobs are fetching  information from the
> Internet, so threading could speed up the process if the network is slow)
> 
>             start
>               | (blahblahblah...)
>               v
>   +-----+-----+-----+-----+
>   |     |     |     |     |
> --+-- --+-- --+-- --+-- --+--
> |   | |   | |   | |   | |   |
> | A | | B | | C | | D | | E |
> |   | |   | |   | |   | |   |
> --+-- --+-- --+-- --+-- --+--
>   |     |     |     |     |
>   +-----+-----+-----+-----+
>               | (blahblahblah...)
>               v
>            finish!
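
  For reference, the fan-out/join shape in that diagram can be sketched with the stdlib threading module. This is illustrative only: the `fetch` function below is a stand-in for real network code, and the names A..E are placeholders.

```python
import threading

def fetch(name, results, lock):
    # Stand-in for a real network fetch; in the real program this
    # call would block on IO while the other threads keep running.
    data = 'result-' + name
    with lock:                       # dict writes guarded by a lock
        results[name] = data

results = {}
lock = threading.Lock()
threads = [threading.Thread(target=fetch, args=(name, results, lock))
           for name in 'ABCDE']

for t in threads:                    # fan out: start all jobs
    t.start()
for t in threads:                    # join back up: wait for all jobs
    t.join()

print(sorted(results))               # all five jobs have finished
```

  But note the locking and join bookkeeping this requires, which is exactly the overhead the asynchronous approach below avoids.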

  If your goal is efficient network concurrency, threads are a second-rate solution.  Asynchronous IO is the winner: http://www.twistedmatrix.com/

  Here's an example (untested as usual):

    from twisted.web.client import downloadPage
    from twisted.internet import defer, reactor

    # Map URLs to files to which to save them
    URLs = {'http://www.google.com/': 'google',
            # ...
            'http://wigu.com/': 'wigu',
           }

    # Initiate the download and save of each page
    downloads = []
    for url, filename in URLs.items():
        # downloadPage accepts a filename and writes the page to it
        downloads.append(downloadPage(url, filename))

    # Wait for all of the downloads to finish, then stop
    defer.DeferredList(downloads).addCallback(lambda r: reactor.stop())

    # Start the reactor
    reactor.run()

  Jp



More information about the Python-list mailing list