Simple thread pools

Josiah Carlson jcarlson at uci.edu
Mon Nov 8 19:22:52 EST 2004


Steve Holden <steve at holdenweb.com> wrote:

[snip prior portions of the conversation as it was getting long]

> It's not particularly surprising that communicating the same amount of 
> information across more threads (and pipelines) on the same machine 
> shows the thread-management activity starting to become significant.
> 
> However, in the case where I'm trying to send customer statements out by 
> email I still maintain that it's quicker (i.e. a given number of mails 
> will be sent out in less elapsed time) to have 200 threads running in 
> parallel (each typically communicating with a separate mail server) than 
> it is to use (say) 30 threads.
> 
> While I agree that overall I may end up using more local CPU, I'm happy 
> to use it because it means I can send over 10,000 emails an hour. Are 
> you suggesting it would go more quickly with fewer threads? This 
> certainly contradicts my testing results.
> 
> Although your program imports the socket library it doesn't appear to 
> use it, so I remain unconvinced of what you say. I do accept that we may 
> be talking at cross purposes, however, since I'm unable to get 
> www.pycs.net to respond and show me the original code on which the OP's 
> question was based.

I had initially planned to create a listening socket and generate a
bunch of local sockets.  Then I remembered os.pipe and said to myself,
"to hell with it, pipes should be faster, since they bypass the network
stack".
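
For readers without the original benchmark to hand, here is a minimal
sketch of moving data between two threads over os.pipe, with no socket
involved.  It is not the code from the earlier post; the function name
send_through_pipe and the 4096-byte read size are assumptions made for
illustration.

```python
import os
import threading

def send_through_pipe(payload: bytes) -> bytes:
    """Push payload through an OS pipe from a writer thread to the caller."""
    read_fd, write_fd = os.pipe()

    def writer():
        # os.write may accept fewer bytes than offered, so loop until done.
        view = memoryview(payload)
        while view:
            written = os.write(write_fd, view)
            view = view[written:]
        # Closing the write end is what signals EOF to the reader.
        os.close(write_fd)

    thread = threading.Thread(target=writer)
    thread.start()

    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:  # empty read means the writer closed its end
            break
        chunks.append(chunk)

    thread.join()
    os.close(read_fd)
    return b"".join(chunks)
```

Whether a pipe actually beats a loopback socket depends on the platform,
so treat the "faster" claim as something to measure rather than assume.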


As they sometimes say, "there is more than one way to skin a cat",
though let us hope that there isn't any cat skinning.

If your processor is maxed out by your script, then you may do better
by reducing the thread count (you are processor limited, not
bandwidth/latency limited).  As the thread count increases, more
processor time goes to thread-handling overhead.  If the processor isn't
maxed out, and you are running at the file handle limit and/or the
bandwidth limit, congrats.
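
One rough way to see which regime you are in is to time the same batch
of work at different pool sizes.  The sketch below covers only the
latency-limited case: time.sleep stands in for waiting on a remote mail
server, and the names fake_send and elapsed_for, along with the job
count and latency, are made up for illustration.  It uses
concurrent.futures from current Python, which did not exist when this
thread was written.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_send(latency):
    # Stand-in for a network round trip: the thread just waits, using
    # almost no processor time.
    time.sleep(latency)
    return True

def elapsed_for(workers, jobs=40, latency=0.05):
    """Return wall-clock seconds to run `jobs` fake sends on `workers` threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fake_send, [latency] * jobs))
    assert all(results)
    return time.perf_counter() - start

if __name__ == "__main__":
    for workers in (2, 10, 40):
        print(workers, "threads:", round(elapsed_for(workers), 3), "s")
```

With work that only waits, more threads finish the batch sooner.  Swap
fake_send for something that burns processor time and the picture
should change, which is the processor-limited case described above.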


Now, just because you are using fewer threads doesn't mean that you
can't get equivalent throughput.  Heck, using a heavily modified variant
of asyncore, we've been able to handle 50,000 POP3 account checks (login,
stat, and if necessary: list, uidl, download email, delete email,
disconnect) every 15 minutes from a laptop.  Our biggest issue is the
latency of our connection, but even then, we do well considering that
this is all with a single thread.
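
To illustrate the single-thread approach in general terms, here is a
small sketch in which one thread waits on many connections at once.  It
is not the modified asyncore code mentioned above, which is not shown in
this thread.  It uses the selectors module instead (asyncore was removed
from the standard library in Python 3.12), and socket.socketpair stands
in for real POP3 servers.  The function name poll_all and the "+OK"
banners are assumptions for the example.

```python
import selectors
import socket

def poll_all(n_connections):
    """Collect one reply from each of n_connections sockets in one thread."""
    sel = selectors.DefaultSelector()
    remotes = []
    for i in range(n_connections):
        local, remote = socket.socketpair()
        local.setblocking(False)
        # Simulated server greeting; a real server would answer over
        # the network with its own latency.
        remote.sendall(b"+OK %d\r\n" % i)
        sel.register(local, selectors.EVENT_READ, data=i)
        remotes.append(remote)

    replies = {}
    while len(replies) < n_connections:
        # select() returns only the sockets that are ready, so the
        # thread never blocks on one slow connection.
        for key, _events in sel.select(timeout=1.0):
            replies[key.data] = key.fileobj.recv(1024)
            sel.unregister(key.fileobj)
            key.fileobj.close()

    for remote in remotes:
        remote.close()
    sel.close()
    return replies
```

A real client would keep per-connection state and step each one through
login, stat and so on as its socket becomes readable or writable, but
the waiting loop has the same shape.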

 - Josiah



