Multiple thread program problem

Cameron Simpson cs at zip.com.au
Thu Jun 4 00:42:22 EDT 2015


On 03Jun2015 19:59, M2 <mohan.mohta at gmail.com> wrote:
>On Wednesday, June 3, 2015 at 7:38:22 PM UTC-5, Cameron Simpson wrote:
>> I would be passing only "line" to proc, not "f" at all.
>>
>> Suggestion: move your main code into its own function. That will make all the
>> variables in it "local". Your proc function is presently relying on "line"
>> being global, which is generally bad and a recipe for disaster in multithreaded
>> code.
>>
>> Moving the main code into its own function will (1) get rid of the global
>> variables and (2) force you to consider exactly what you need to pass to
>> "proc", and that will help reveal various logic issues.
[...]
>
>Thanks Cameron.
>I do not see the duplication in the execution now.
>I do see it is not consistent by executing all the threads ; it might be due to the fact I am using
>        subprocess.call(co,shell=True)
>Per my understanding the above does not keep track of threads it just spawns a thread and leaves it there.

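subprocess does not start a thread at all. It starts a new external process.
Your variable "t" is what refers to the thread; strictly, thread.start_new_thread
returns a thread identifier rather than a threading.Thread object.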

>I might need to use the function start(), join() to ensure it picks up all the argument

Because you have called .start_new_thread, you do not need to call .start().

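Whether you call .join() on a Thread later is up to you; note that .join() is a
method of threading.Thread objects, so you would need to create the thread with
threading.Thread instead of thread.start_new_thread to have something to join.
When you call .join(), the caller will block until the Thread terminates. So
you won't want to do that immediately if you hope to run several subprocesses
in parallel with this setup.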
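
A minimal sketch of that start-first, join-later pattern, assuming you use
threading.Thread objects (which have .join()) instead of
thread.start_new_thread. The commands here are hypothetical stand-ins that just
run the Python interpreter, not your ssh lines, and run_all is a name I made up
for illustration:

```python
import subprocess
import sys
import threading

def proc(cmd):
    # .call() blocks until the subprocess exits, so each call gets its own Thread
    subprocess.call(cmd)

def run_all(commands):
    threads = []
    for cmd in commands:
        t = threading.Thread(target=proc, args=(cmd,))
        t.start()          # dispatch now; do not join yet
        threads.append(t)
    for t in threads:      # only once everything is running do we wait
        t.join()
    return threads

# hypothetical placeholder commands, one per host in the real program
commands = [[sys.executable, "-c", "pass"] for _ in range(3)]
threads = run_all(commands)
```

Joining inside the first loop would serialise the work, because each .join()
would wait for that subprocess before the next Thread was started.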

Thank you for including your current code; most helpful. I have some other
remarks below the code, which I'm keeping here for reference:

>For the record now my new code is
>#! /usr/bin/python
>import os
>import subprocess
>import thread
>import threading
>import sys
>from thread import start_new_thread
>
>def proc(col) :
>        subprocess.call(col,shell=True)
>        return
>
>f = open('/tmp/python/1')
>for line in f:
>        com1="ssh -B "
>        com2=line.strip('\n')
>        com3= " uname -a  "
>        co=str("ssh -B ")+ str(com2) + str(" uname -a")
>        t=thread.start_new_thread(proc,(co,))
>f.close()

First up, your open/read/close stuff is normally written like this in Python:

  with open('/tmp/python/1') as f:
    for line in f:
      ... loop body here ...

This is because the return from open() is in fact a "context manager", an 
object designed to work with the "with" statement. Specifically, the "exit" step 
of this particular context manager calls .close() for you. Aside from being 
more succinct, a context manager has the other advantage that the exit action 
is _always_ called when control leaves the scope of the "with". Consider this 
example function:

  def foo(filename):
    with open(filename) as f:
      for line in f:
        ... do stuff with line ...
        if "quit" in line:
          return

That will return from the function if the string "quit" occurs on one of the 
lines in the file, and not process any following lines. The important point 
here is that if you use "with" then the open file will be closed automatically 
when the return happens. With your original open/loop/close code, the "return" 
would bypass the .close() call, leaving the file open.
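
You can check this for yourself with a small sketch. The function name and the
scratch file are made up for the demonstration, and the function hands back the
file object only so that its .closed attribute can be inspected afterwards:

```python
import os
import tempfile

def first_quit_line(filename):
    with open(filename) as f:
        for line in f:
            if "quit" in line:
                # early return: the "with" still closes f on the way out
                return f, line
    return f, None

# hypothetical scratch file standing in for your real input
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as out:
    out.write("one\nquit now\nthree\n")

f, line = first_quit_line(path)
os.remove(path)
print(f.closed, repr(line))   # prints: True 'quit now\n'
```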

The next remark I would make is that while Threads are very nice (I use them a 
lot), if all your thread is doing is dispatching a subprocess then you do not 
need a thread. The separate process that is made by subprocess runs 
automatically, immediately, without waiting for your program. So you can invoke 
a bunch of subprocesses in parallel, and not go near a thread.

However, because you have invoked the subprocess with .call(), your "proc" 
function inherently waits for the subprocess to complete before returning and 
therefore you need a thread to do these in parallel.

The .call() function is a convenience function; the alternative is to use the 
.Popen() constructor directly:

  def proc(col):
    P = subprocess.Popen(col, shell=True)
    return P

Then you can have your main loop replace "t=..." with:

  P = proc(co)

At this point, proc starts the subprocess but does not wait for it (versus 
"call", which does the wait for you). So you can dispatch all these 
subprocesses in parallel. Then after your loop you can wait for them to finish 
at your leisure.
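
As a rough sketch of the whole dispatch-then-wait shape: the commands below are
hypothetical placeholders (they just run the Python interpreter) passed as
argument lists, but Popen accepts a shell=True string like your "co" in the
same way.

```python
import subprocess
import sys

def proc(cmd):
    # Popen returns as soon as the subprocess has been started
    return subprocess.Popen(cmd)

# hypothetical placeholder commands, one per line of your input file
commands = [[sys.executable, "-c", "pass"] for _ in range(3)]

# dispatch everything first, keeping each Popen object
procs = [proc(cmd) for cmd in commands]

# then collect the exit status of each; .wait() blocks until that one exits
exit_codes = [P.wait() for P in procs]
```

Keeping the Popen objects in a list is what lets you wait for them later; the
exit codes also tell you which commands failed.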

Cheers,
Cameron Simpson <cs at zip.com.au>

Nonsense. Space is blue, and birds fly through it.      - Heisenberg


