[Python-ideas] pool threads

Scott Dial scott+python-ideas at scottdial.com
Tue Nov 16 11:19:22 CET 2010


On 11/9/2010 11:07 PM, Guido van Rossum wrote:
> Have you looked at PEP 3148?

I have been rewriting a web interface for a cluster of servers, and I
wanted to run batch queries against them in parallel. The first thing
that came to mind was to try out this futures module (I'm using the
version from PyPI with Python 2.6).

I had some difficulty getting started using the module due to a lack of
examples of all the different ways of using pools (in particular, there
is no example showing something like what Kristján wants -- a
set-it-and-forget-it pool). However, once I got past the learning curve,
I made some code that does what I want.
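
For readers looking for the missing example, here is one possible reading
of a "set-it-and-forget-it" pool. It is a minimal sketch under stated
assumptions, not something taken from the module's documentation. It
assumes the stdlib spelling concurrent.futures (Python 3.2+) rather than
the PyPI backport, and work() and record() are hypothetical names
invented for illustration. The idea is to attach a callback to each
future so the submitting code never has to keep the future around.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

results = []
results_lock = threading.Lock()

def work(n):
    # Hypothetical stand-in for a real job, e.g. querying one server.
    return n * n

def record(future):
    # Called once the future has finished; the submitter does not need
    # to hold a reference to the future or poll it.
    with results_lock:
        results.append(future.result())

executor = ThreadPoolExecutor(max_workers=4)
for n in range(5):
    # Submit and forget: the callback picks up the result.
    executor.submit(work, n).add_done_callback(record)

# Only needed at shutdown; blocks until the queued work has drained.
executor.shutdown(wait=True)
```

Callbacks may run in a worker thread, hence the lock around the shared
list.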

Despite the code working, it wasn't as fast as I thought it should be,
so I started profiling the code, and I noticed some craziness. Looking
closer at it, I hardly believe anyone else has used this for anything
but toy examples. For clarity, the code I am profiling is basically this:

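import futures  # the PyPI package implementing PEP 3148

# One worker thread per server, so every query can run concurrently.
executor = futures.ThreadPoolExecutor(len(servers))
futures_list = []
for address in servers:
    # Bind the current address as a default argument so each closure
    # keeps its own value instead of the loop variable's final one.
    def call(address=address):
        client = None
        load = None
        try:
            client = Client(address)
            load = client.load()
        finally:
            # Close the connection even if load() raised.
            if client is not None:
                client.close()
        return (address, load)
    futures_list.append(executor.submit(call))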

c.servers = []
for future in futures.as_completed(futures_list):
    result = future.result()
    if result:
        address, load = result
        c.servers.append({'server': address, 'load': load})

Running this yields an insane number of wait()s on an Event() object.

.../futures/_base.py:149 as_completed x36     2673.00ms
.../threading.py:391     wait         x421111 1720.10ms

Looking closer at as_completed(), the call to wait is wrong:

            waiter.event.wait(timeout)

should be:

            waiter.event.wait(wait_timeout)
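
To illustrate why the distinction matters: in a loop with an overall
deadline, passing the original timeout to every wait() lets the total
wait exceed the caller's budget, whereas the remaining time shrinks on
each pass. A minimal sketch follows; remaining() is a hypothetical
helper written for illustration and is not the module's actual code.

```python
import time

def remaining(end_time, now=None):
    """Return the seconds left before end_time, or None for no deadline."""
    if end_time is None:
        return None
    if now is None:
        now = time.time()
    # Never return a negative timeout once the deadline has passed.
    return max(0.0, end_time - now)

# A 10-second budget that started at t=0, checked again at t=4,
# leaves 6 seconds to wait, not the original 10.
left = remaining(10.0, now=4.0)
```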

However, that isn't my problem. More importantly, the Event() itself
represents the wrong synchronization primitive:

        waiter = _create_and_install_waiters(fs, FIRST_COMPLETED)

This creates an Event() that is set() when the first future in fs
completes and is never cleared. From then on every wait() returns
immediately, so as_completed() spins until the remaining futures finish.
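
The effect of an event that stays set can be reproduced in isolation
with threading.Event. This is a small sketch, assuming a modern Python 3
where Event.wait() returns the flag; the function names are made up for
illustration and nothing here is taken from the futures module itself.

```python
import threading
import time

def count_immediate_wakeups(iterations=1000, timeout=5.0):
    event = threading.Event()
    event.set()  # stands in for "the first future completed"
    wakeups = 0
    start = time.time()
    for _ in range(iterations):
        # Returns True at once because nothing ever calls event.clear(),
        # so the generous timeout is never actually used.
        if event.wait(timeout):
            wakeups += 1
    return wakeups, time.time() - start

def blocks_after_clear(timeout=0.01):
    event = threading.Event()
    event.set()
    event.clear()  # re-arm the event after handling a wake-up
    # True means the wait really blocked and then timed out.
    return not event.wait(timeout)

wakeups, elapsed = count_immediate_wakeups()
```

Every one of the waits in the first function returns without blocking,
which matches the hundreds of thousands of wait() calls in the profile
above. Clearing the event after each wake-up is one way to make the
next wait() block again.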

Obviously, I will file a bug report (w/patch) about this, but I post my
experience here because it pretty much exemplifies why I believe even
small libraries should see use in the wild before being tossed into the
stdlib -- I'm glad 3.2 is only alpha. It might be nice if the release
page mentioned this module to encourage more use of it.

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu


