How to force a thread to stop

Thu Jul 27 03:21:33 EDT 2006

Carl J. Van Arsdall wrote:
> bryanjugglercryptographer at yahoo.com wrote:
> > Carl J. Van Arsdall wrote:
> >
> >> bryanjugglercryptographer at yahoo.com wrote:
> >>
> >>> Carl J. Van Arsdall wrote:
> >>>
> >>> I don't get what threading and Twisted would do for
> >>> you. The problem you actually have is that you sometimes
> >>> need to terminate these other processes running other programs.
> >>> Use spawn, fork/exec* or maybe one of the popens.
> >>>
> >>>
> >> I have a strong need for shared memory space in a large distributed
> >> environment.
> >>
> >
> > Distributed shared memory is a tough trick; only a few systems simulate
> > it.
> >
> Yea, this I understand, maybe I chose some poor words to describe what I
> wanted.

Ya' think?  Looks like you have no particular need for shared
memory, in your small distributed system.

> I think this conversation is getting hairy and confusing, so I'm
> going to try and paint a better picture of what's going on.  Maybe this
> will help you understand exactly what's going on or at least what I'm
> trying to do, because I feel like we're just running in circles.
[...]

So step out of the circles already. You don't have a Python thread
problem. You don't have a process overhead problem.

[...]
> So, I have a distributed build system. [...]

Not a trivial problem, but let's not pretend we're pushing the
state of the art here.

Looks like the system you inherited already does some things
smartly: you have ssh set up so that a controller machine can
launch various build steps on a few dozen worker machines.

[...]
> The threads invoke a series
> of calls that look like
>
> os.system(ssh <host> <command>)
>
> (or, for more complex operations, they would just spawn a process that ran
> another Python script)
>
> os.system(ssh <host> <script>)
[...]
> Alright, so this scheme that was first put in place kind of worked.
> There were some problems, for example when someone did something like
> os.system(ssh <host> <script>)  we had no good way of knowing what the
> hell happened in the script.

Yeah, that's one thing we've been telling you. The os.system()
function gives you neither enough information nor enough control.
Use one of the alternatives we've suggested -- probably the
subprocess.Popen class.
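
Something along these lines, as a rough sketch rather than a drop-in
for your system. It is written for a current Python 3, and the name
run_step and the timeout handling are my own illustration, not
anything from your build code. It runs one command, hands back the
exit status plus whatever the command wrote to stdout and stderr, and
kills the child if it runs too long:

```python
import subprocess


def run_step(argv, timeout=None):
    """Run one build step and return (returncode, stdout, stderr).

    argv is a list such as ["ssh", host, command]; host and command
    are placeholders the caller supplies.  returncode is None if the
    step was killed for exceeding `timeout` seconds.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,  # decode output to str instead of bytes
    )
    try:
        out, err = proc.communicate(timeout=timeout)
        return proc.returncode, out, err
    except subprocess.TimeoutExpired:
        # This is the control os.system() never gives you:
        # the parent can stop the child on demand.
        proc.kill()
        out, err = proc.communicate()  # reap the child, collect partial output
        return None, out, err
```

A thread that calls run_step(["ssh", host, command], timeout=600)
always gets its answer back in bounded time, so nobody has to force
the thread itself to stop. One caveat: killing the local ssh client
may not stop the command on the remote host, so check that
separately.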

[...]
> So, I feel like I have a couple options,
>
>  1) Try moving everything to a process-oriented configuration - we think
> this would be bad from a resource standpoint, and it would also make
> things more difficult to move to a fully distributed system later, when
> I get my army of code monkeys.
>
> 2) Suck it up and go straight for the distributed system now - managers
> don't like this, but maybe it's easier than I think it's going to be, I dunno
>
> 3) See if we can find some other way of getting the threads to terminate.
>
> 4) Kill it and clean it up by hand or helper scripts - we don't want to
> do this either, it's one of the major things we're trying to get away from.

The more you explain, the sillier that feeling looks -- that those
are your options. Focus on the problems you actually have. Track
what build steps worked as expected; log what useful information
you have about the ones that did not.
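
For the tracking and logging, a minimal sketch, again assuming a
current Python 3. The function name run_and_record, the logger name
"build" and the results dict are all made up for illustration:

```python
import logging
import subprocess

log = logging.getLogger("build")  # hypothetical logger name


def run_and_record(name, argv, results):
    """Run one step, store its exit status in `results`, log the outcome.

    Returns True if the step exited with status 0.
    """
    done = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    results[name] = done.returncode
    if done.returncode == 0:
        log.info("step %s succeeded", name)
    else:
        # Keep the evidence: exit status and whatever the step printed
        # on stderr before it died.
        log.error("step %s failed (exit %d): %s",
                  name, done.returncode, done.stderr.strip())
    return done.returncode == 0
```

After a build, the results dict tells you which steps worked as
expected, and the log holds the details for the ones that did not.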

That "resource standpoint" thing doesn't really make sense. Those
os.system() calls launch *at least* one more process. Some
implementations will launch a process to run a shell, and the
shell will launch another process to run the named command. Even
so, efficiency on the controller machine is not a problem given
the scale you have described.
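
If the extra shell process ever did matter, passing the command as a
list lets subprocess start it directly, with no shell in between. A
small sketch to show the difference, assuming a current Python 3. The
helper name run_without_shell is mine, and the child here is just a
Python one-liner that echoes its first argument:

```python
import subprocess
import sys


def run_without_shell(argv):
    """Run argv directly (no intermediate shell); return (exit_code, stdout)."""
    done = subprocess.run(argv, stdout=subprocess.PIPE, text=True)
    return done.returncode, done.stdout


# A child that prints its first argument and exits.
echo_arg = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

# A shell would treat ';' as a command separator.  Here the whole
# string reaches the child as a single argument, untouched.
code, out = run_without_shell(echo_arg + ["a b; echo hi"])
print(code, out.strip())  # prints: 0 a b; echo hi
```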

-- 
--Bryan