Simple thread pools

Steve Holden steve at holdenweb.com
Mon Nov 8 15:33:43 EST 2004


Josiah Carlson wrote:

> Steve Holden <steve at holdenweb.com> wrote:
> 
>>Josiah Carlson wrote:
>>
>>
>>>Jacob Friis <lists at debpro.webcom.dk> wrote:
>>>
>>>
>>>>I have built a script inspired by a post on Speno's Pythonic Avocado:
>>>>http://www.pycs.net/users/0000231/weblog/2004/01/04.html#P10
>>>>
>>>>I'm setting NUM_FEEDERS to 1000.
>>>>Is that crazy?
>>>
>>>
>>>Not crazy, but foolish.  Thread scheduling in Python reduces performance
>>>beyond a few dozen threads.  If you are doing system calls (socket.recv,
>>>file.read, etc.), your performance will be poor.
>>>
>>
>>Is this speculative, or do you have some hard evidence to support it? I 
>>recently rewrote a billing program that delivers statements by email. 
>>The number of threads it uses is a parameter to the program, and we are 
>>currently running at 200 with every evidence of satisfaction - this 
>>month's live run sent something over 10,000 emails an hour.
> 
> 
> There is a slowdown (perhaps 'poor' was a bad description).
> 
Well, I suspect it didn't adequately convey the meaning you wanted to 
communicate.
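
For concreteness, the shape of the thing is roughly as follows - a sketch 
only, not the real billing code; the statements data and the lookup_mx() 
helper are stand-ins I've made up for illustration:

#send_statements.py (illustrative sketch, not the production program)
import Queue
import threading
import smtplib

NUM_SENDERS = 200                 # the thread count is a program parameter

statements = [('someone@example.com', 'Subject: Statement\r\n\r\n...')]
                                  # stand-in: (recipient, message) pairs

def lookup_mx(addr):
    # stand-in: a real run would resolve the recipient's mail exchanger
    return 'localhost'

work = Queue.Queue()
for addr, text in statements:
    work.put((addr, text))

def sender():
    # Each worker pulls one statement at a time off the queue and
    # delivers it; nearly all of its time is spent blocked on the network.
    while True:
        try:
            addr, text = work.get_nowait()
        except Queue.Empty:
            return
        server = smtplib.SMTP(lookup_mx(addr))
        try:
            server.sendmail('billing@example.com', [addr], text)
        finally:
            server.quit()

threads = [threading.Thread(target=sender) for i in xrange(NUM_SENDERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
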

> 
> >>> i = 1
> >>> import os
> >>> while i < 256:
> ...     t = os.system('test_thread1.py %i'%i)
> ...     i *= 2
> ...
> 0.0 8.45300006866 204800000
> 0.0 7.625 204800000
> 0.0 9.65600013733 204800000
> 0.0150001049042 11.2969999313 204800000
> 0.0159997940063 15.8280000687 204800000
> 0.0780000686646 16.6719999313 204800000
> 0.172000169754 17.2029998302 204734464
> 0.125 18.7189998627 204734464
> 
> 
> Back in the days of Python 2.0, I had written what would now be called a
> P2P framework.  I initially used blocking threads for communication, and
> observed that as my number of connections and threads increased, I saw a
> marked reduction in throughput, and an increase in latency (even on a
> local machine).  In switching to an asynchronous framework (heavily
> derived from asyncore), I ended up with a system that had nearly constant
> throughput regardless of the number of connections.
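
[For anyone who hasn't met asyncore: the style Josiah describes looks 
roughly like the sketch below - one select() loop in a single thread 
driving a dispatcher object per connection. This is only the general 
shape, an echo server of my own invention, not his framework.

import asyncore
import socket

class EchoHandler(asyncore.dispatcher):
    # One dispatcher object per accepted connection; asyncore.loop()
    # multiplexes every handler in a single thread via select().
    def __init__(self, sock):
        asyncore.dispatcher.__init__(self, sock)
        self.buffer = ''

    def handle_read(self):
        self.buffer += self.recv(1024)

    def writable(self):
        return len(self.buffer) > 0

    def handle_write(self):
        sent = self.send(self.buffer)
        self.buffer = self.buffer[sent:]

class EchoServer(asyncore.dispatcher):
    def __init__(self, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.set_reuse_addr()
        self.bind(('', port))
        self.listen(5)

    def handle_accept(self):
        pair = self.accept()
        if pair is not None:
            conn, addr = pair
            EchoHandler(conn)

if __name__ == '__main__':
    EchoServer(8007)
    asyncore.loop()

The point of the design is that nothing ever blocks in recv() or send(); 
the loop only touches a socket when select() says it is ready.]
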
> 
> 
> 
> 
>>>>Is there a better solution?
>>>
>>>
>>>Fewer threads.  Try running at 10-30.  If you are finding that you
>>>aren't able to handle the load with those threads, then your
>>>processor/disk/etc isn't fast enough to handle the load.
>>>
>>
>>I'm tempted to say "rubbish", but that would be rude, so instead I'll 
>>just ask for some evidence :-). Don't forget that in network-based tasks 
>>the time spent waiting for connection turnarounds can dominate the 
>>elapsed time for execution - did you perhaps overlook that?
> 
> 
> Evidence has been provided.
> 
>  - Josiah
> 
> 
> 
> #test_thread1.py
> # Creates sys.argv[1] reader/writer thread pairs, each joined by a pipe.
> # All threads block on a Condition until the main thread fires notifyAll,
> # then each writer pushes its share of 1KB blocks and each reader drains
> # them; ds accumulates the bytes moved in both directions.
> import socket
> import time
> import threading
> import sys
> import os
> 
> paircount = int(sys.argv[1])
> c = threading.Condition()   # start signal shared by all threads
> 
> l = threading.Lock()        # protects the ds counter
> ds = 0L                     # total bytes written plus bytes read
> 
> def reader(n, p):
>     o_r = os.read
>     global ds
>     c.acquire()
>     c.wait()                # block until the main thread says "go"
>     c.release()
>     ld = 0
>     for i in xrange(n):
>         ld += len(o_r(p, 1024))
>     l.acquire()
>     ds += ld
>     l.release()
> 
> s = 1024*'\0'               # the 1KB block every writer sends
> def writer(n, p):
>     o_w = os.write
>     global ds
>     c.acquire()
>     c.wait()                # block until the main thread says "go"
>     c.release()
>     ld = 0
>     for i in xrange(n):
>         ld += o_w(p, s)
>     l.acquire()
>     ds += ld
>     l.release()
> 
> 
> count = 100000
> blks = count/paircount      # integer division: blks*paircount can fall
>                             # slightly short of count, hence the smaller
>                             # byte totals at the higher pair counts
> for i in xrange(paircount):
>     r,w = os.pipe()
>     threading.Thread(target=reader, args=(blks, r)).start()
>     threading.Thread(target=writer, args=(blks, w)).start()
> 
> time.sleep(1)               # give every thread time to reach c.wait()
> t = time.time()
> c.acquire()
> c.notifyAll()               # release all reader/writer pairs at once
> c.release()
> print time.time()-t,        # first column: time to wake the threads
> t = time.time()
> while len(threading.enumerate()) > 1:
>     time.sleep(.05)         # wait for all worker threads to finish
> print time.time()-t, ds     # second column: elapsed time; third: bytes moved
> 
> 
> 
It's not particularly surprising that communicating the same amount of 
information across more threads (and pipes) on the same machine 
shows the thread-management overhead starting to become significant.

However, in the case where I'm trying to send customer statements out by 
email I still maintain that it's quicker (i.e. a given number of mails 
will be sent out in less elapsed time) to have 200 threads running in 
parallel (each typically communicating with a separate mail server) than 
it is to use (say) 30 threads.

While I agree that overall I may end up using more local CPU, I'm happy 
to use it because it means I can send over 10,000 emails an hour. Are 
you suggesting it would go more quickly with fewer threads? This 
certainly contradicts my testing results.
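
To put rough numbers on it (these are illustrative figures, not 
measurements): suppose a typical delivery ties up a thread for something 
like 25 seconds waiting on DNS, connection set-up and SMTP turnaround, 
with negligible CPU. Thirty threads then cap you at roughly one delivery 
a second - about 4,000 an hour - while 200 threads raise the ceiling to 
around 8 a second, comfortably above the 3 a second that 10,000 an hour 
requires. The interpreter lock hardly enters into it, because the threads 
spend nearly all their time blocked in the operating system.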

Although your program imports the socket library, it doesn't appear to 
use it, so I remain unconvinced of what you say. I do accept that we may 
be talking at cross purposes, however, since I'm unable to get 
www.pycs.net to respond and show me the original code on which the OP's 
question was based.

regards
  Steve
-- 
http://www.holdenweb.com
http://pydish.holdenweb.com
Holden Web LLC +1 800 494 3119



