Recommended number of threads? (in CPython)

Falcolas garrickp at gmail.com
Thu Oct 29 12:03:46 EDT 2009


On Oct 29, 9:56 am, mk <mrk... at gmail.com> wrote:
> Hello everyone,
>
> I wrote a run-of-the-mill program for concurrent execution of an ssh
> command over a large number of hosts. (Someone may ask why reinvent the
> wheel when there's pssh and shmux around -- I'm not happy with the
> working details and the lack of some options in either program.)
>
> The program has a working queue of threads so that no more than
> maxthreads threads are created and working at any particular time.
>
> But this raises the question: what is the recommended number of threads
> working concurrently? If it depends on the task, the task is: open an
> ssh connection and execute a command (the main thread then loops over
> the queue and, if a thread is finished, closes its ssh connection and
> calls .join() on the thread).
>
> I found that using more than several hundred threads causes weird
> exceptions to be thrown *sometimes* (rarely actually, but it happens
> from time to time). That might depend on the modules used in the
> threads, though (I'm using paramiko, which is claimed to be thread
> safe).

Since you're creating OS threads when doing this, your issue is
probably related more to your OS's implementation of threads than to
Python. That said, several hundred threads, whether or not the GIL is
blocking them, sounds like a recipe for trouble on most machines,
but as usual YMMV.
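
One way to keep the thread count well below "several hundred" is to
cap it with a fixed-size pool. This is only a sketch: it uses
concurrent.futures from the standard library of current Python 3, and
run_on_hosts, task and the default of 32 workers are hypothetical
names and values chosen for illustration, not a recommendation from
this thread.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_hosts(hosts, task, max_workers=32):
    """Call task(host) for every host, using at most max_workers threads.

    `task` is a placeholder for whatever opens the ssh connection and
    runs the command; it is an assumption, not part of the original post.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `hosts`.
        return dict(zip(hosts, pool.map(task, hosts)))
```

The right value for max_workers still has to be found by measuring on
the machine in question.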

If you're running into problems with a large number of connections
(not related to a socket limit), you might look into doing it
asynchronously - loop over a list of connections and do non-blocking
reads to see if your command has completed. I've done this
successfully with pexpect, and didn't run into any issues with the
underlying OS.
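
A rough sketch of that kind of loop follows. It is not the pexpect code
mentioned above: it assumes a Unix-like OS, uses the standard selectors
module, and uses plain child processes as stand-ins for the ssh
sessions. The name run_all and the command mapping are hypothetical.

```python
import os
import selectors
import subprocess
import sys


def run_all(commands):
    """Run all commands at once from a single thread.

    `commands` maps a label (e.g. a host name) to an argv list.
    Returns a dict mapping each label to the bytes it wrote to stdout.
    """
    sel = selectors.DefaultSelector()
    output = {}
    procs = []
    for name, argv in commands.items():
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
        procs.append(proc)
        output[name] = b""
        sel.register(proc.stdout, selectors.EVENT_READ, data=name)

    # Loop until every pipe has reported end-of-file.
    while sel.get_map():
        for key, _events in sel.select(timeout=1.0):
            # The selector said this pipe is readable, so os.read
            # returns whatever is available without waiting for more.
            chunk = os.read(key.fileobj.fileno(), 4096)
            if chunk:
                output[key.data] += chunk
            else:
                # Empty read means the command closed its stdout.
                sel.unregister(key.fileobj)
                key.fileobj.close()

    sel.close()
    for proc in procs:
        proc.wait()  # reap the finished children
    return output
```

For example, run_all({"a": [sys.executable, "-c", "print('hi')"]})
collects the output of one child Python process; with real hosts each
argv would be an ssh command line instead.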

Garrick


