How to schedule system calls with Python

Wed Oct 21 21:03:02 EDT 2009

Jorgen Grahn wrote:
> On Fri, 2009-10-16, Jeremy wrote:
>> On Oct 15, 6:32 pm, MRAB <pyt... at mrabarnett.plus.com> wrote:
>>> TerryP wrote:
>>>> On Oct 15, 7:42 pm, Jeremy <jlcon... at gmail.com> wrote:
>>>>> I need to write a Python script that will call some command line
>>>>> programs (using os.system).  I will have many such calls, but I want
>>>>> to control when the calls are made.  I won't know in advance how long
>>>>> each program will run and I don't want to have 10 programs running
>>>>> when I only have one or two processors.  I want to run one at a time
>>>>> (or two if I have two processors), wait until it's finished, and then
>>>>> call the next one.
> ...
>>> You could use multithreading: put the commands into a queue; start the
>>> same number of worker threads as there are processors; each worker
>>> thread repeatedly gets a command from the queue and then runs it using
>>> os.system(); if a worker thread finds that the queue is empty when it
>>> tries to get a command, then it terminates.
>> Yes, this is it.  If I have a list of strings which are system
>> commands, this seems like a more intelligent way of approaching it.
>> My previous response will work, but won't take advantage of multiple
>> cpus/cores in a machine without some manual manipulation.  I like this
>> idea.
> 
> Note that you do not *need* multithreading for this. To me it
> seems silly to have N threads sitting just waiting for one process
> each to die -- those threads contribute nothing to the multiprocessing
> you want.
> 
> In Unix, you can have one process fork() and exec() as many programs
> as you like, have them run on whatever CPUs you have, and wait for
> them to die and reap them using wait() and related calls.  (Not sure
> what the equivalent is in non-Unix OSes or portable Python.)
> 
> /Jorgen
> 
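
For reference, the queue-and-workers scheme MRAB describes above might look roughly like the sketch below. The function name run_all and the num_workers parameter are made up for illustration, and I have swapped os.system() for subprocess.call() with an argument list so the example does not depend on shell quoting; treat it as one possible reading of the idea rather than anyone's actual code.

```python
import queue
import subprocess
import sys
import threading


def run_all(commands, num_workers=2):
    """Run each command, at most num_workers at a time; return exit codes."""
    pending = queue.Queue()
    for index, command in enumerate(commands):
        pending.put((index, command))
    exit_codes = [None] * len(commands)

    def worker():
        while True:
            try:
                index, command = pending.get_nowait()
            except queue.Empty:
                return  # queue is empty, so this worker terminates
            # subprocess.call() blocks until the program has exited
            exit_codes[index] = subprocess.call(command)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return exit_codes


if __name__ == "__main__":
    # Placeholder commands: each one just starts Python and sleeps briefly.
    nap = [sys.executable, "-c", "import time; time.sleep(0.1)"]
    print(run_all([nap] * 4, num_workers=2))
```

Because every command is queued before the workers start, an empty queue really does mean there is nothing left to do, so a worker can simply return.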

Another way to approach this, if you do want to use threads, is to use a
counting semaphore. Initialize it to the maximum number of worker threads
you want running at any one time, then start the workers from a loop in
the main thread. acquire() the semaphore before starting each worker;
once the count reaches 0, the next acquire() blocks the main thread. Each
worker thread should release() the semaphore when it exits, which lets
the main thread go on to start the next worker.
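
A minimal sketch of that loop, assuming the commands are given as argument lists and run with subprocess.call() in place of os.system(). The names run_throttled and max_running are hypothetical, picked only for this example.

```python
import subprocess
import sys
import threading


def run_throttled(commands, max_running=2):
    """Start one thread per command, but never more than max_running at once."""
    slots = threading.BoundedSemaphore(max_running)
    exit_codes = [None] * len(commands)
    threads = []

    def worker(index, command):
        try:
            # Blocks this worker thread until the external program exits.
            exit_codes[index] = subprocess.call(command)
        finally:
            slots.release()  # give the slot back, even if the call failed

    for index, command in enumerate(commands):
        slots.acquire()  # blocks here while max_running workers are active
        t = threading.Thread(target=worker, args=(index, command))
        threads.append(t)
        t.start()

    for t in threads:
        t.join()
    return exit_codes


if __name__ == "__main__":
    # Placeholder commands: each one just starts Python and sleeps briefly.
    nap = [sys.executable, "-c", "import time; time.sleep(0.1)"]
    print(run_throttled([nap] * 4, max_running=2))
```

Unlike a fixed pool of workers, this creates a new thread for every command; the semaphore only limits how many are alive at the same moment.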

This doesn't address the assignment of threads to CPU cores, but I have 
used this technique many times, and it is simple and fairly easy to 
implement. You have to make sure, though, that you catch all exceptions 
in the worker threads; if a thread exits without releasing the 
semaphore, you will have a "semaphore leak". And, of course, there are
the usual threading subtleties to worry about: for instance, put a mutex
around any print statements so that output from different threads
doesn't get interleaved.
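
To make those two precautions concrete, here is a small sketch. The helper names safe_print and guarded_worker are invented for this example, and job stands for whatever callable the worker is supposed to run.

```python
import threading

print_lock = threading.Lock()  # the mutex shared by every thread that prints


def safe_print(*args):
    """Print while holding the lock so lines from different threads stay whole."""
    with print_lock:
        print(*args)


def guarded_worker(slots, job, name):
    """Run job(), report the outcome, and always release the semaphore."""
    try:
        result = job()
        safe_print(name, "finished with", result)
    except Exception as exc:
        # Catch everything so a failing job cannot kill the thread silently.
        safe_print(name, "failed:", repr(exc))
    finally:
        slots.release()  # runs on success and on failure: no semaphore leak


if __name__ == "__main__":
    slots = threading.Semaphore(1)
    for name, job in [("good", lambda: 42), ("bad", lambda: 1 / 0)]:
        slots.acquire()
        threading.Thread(target=guarded_worker, args=(slots, job, name)).start()
    slots.acquire()  # wait for the last worker to hand its slot back
```

The finally clause is what prevents the leak; the except clause is only there so that failures are reported through the same locked print path.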