Efficiency, threading, and concurrent.futures

Akira Li 4kir4.1i at gmail.com
Wed Aug 20 13:17:28 EDT 2014


Rob Gaddi <rgaddi at technologyhighland.invalid> writes:

> I've got a situation where I'll be asking an I/O bound process to do
> some work (querying an RS-232 device) while my main code is off
> running a sleep() bound process.  Everyone always talks about how
> expensive thread creation is, so I figured I'd test it out in an
> IPython notebook. 
>
> #####
>
> import threading
> from concurrent.futures import ThreadPoolExecutor as TPE
> from time import sleep
>
> def fn():
>   sleep(0.001)
>
> %%timeit -r 50 -n 1000
> thr = threading.Thread(target=fn)
> thr.start()
> thr.join()
> 1000 loops, best of 5: 1.24 ms per loop
>
> %%timeit -r 50 -n 1000 ex=TPE(1)
> fut=ex.submit(fn)
> fut.result()
> 1000 loops, best of 5: 1.26 ms per loop
>
> #####
>
> Now, my understanding is that the ThreadPoolExecutor spawns all its
> threads at the outset, then stuffs requests into one queue and
> fishes results out of another, which should be substantially faster than
> having create new threads each time.  And yet those were pretty dead on
> even. Any idea what I'm seeing here?

To see any difference, you should submit more than one job per worker to
the ThreadPoolExecutor and avoid waiting for each result synchronously.
With a single job submitted and immediately awaited, you measure the
same per-job overhead either way, so the pool gains you nothing.
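A sketch of what that comparison could look like (the job count and
worker count here are arbitrary illustrative values, not anything from
the original benchmark): spawn-and-join one thread per job, versus
submitting all jobs to one reused pool and collecting results afterwards.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def fn():
    time.sleep(0.001)

N = 200  # illustrative number of jobs

# One new thread per job, joined immediately -- what the original
# %%timeit cells effectively measure, repeated N times.
start = time.perf_counter()
for _ in range(N):
    t = threading.Thread(target=fn)
    t.start()
    t.join()
per_thread = time.perf_counter() - start

# One pool, many outstanding jobs; results collected only after
# everything has been submitted, so the workers overlap the sleeps.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as ex:
    futures = [ex.submit(fn) for _ in range(N)]
    for fut in as_completed(futures):
        fut.result()
pooled = time.perf_counter() - start

print(per_thread, pooled)
```

With the jobs overlapping across four workers, the pooled run should
come out well ahead of the thread-per-job loop, which pays both the
creation cost and the full serialized sleep for every job.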

I don't know whether ThreadPoolExecutor starts all its workers at once
in the current CPython implementation. The name max_workers suggests
that it may start them on demand.
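One way to check in your own interpreter, rather than relying on the
implementation details: count live threads before the executor is
created, right after, and after the first submit(). This observes
behaviour empirically; it is not a documented guarantee.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

before = threading.active_count()

ex = ThreadPoolExecutor(max_workers=8)
after_create = threading.active_count()   # workers spawned eagerly would show up here

fut = ex.submit(lambda: None)
fut.result()
after_submit = threading.active_count()   # at least one worker must exist by now

print(before, after_create, after_submit)
ex.shutdown()
```

On the CPython versions I have tried, after_create equals before
(no workers exist yet) and only the first submit() starts a worker
thread, i.e. the pool grows on demand up to max_workers.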


--
Akira




More information about the Python-list mailing list