[Tutor] running multiple concurrent processes

richard kappler richkappler at gmail.com
Tue Oct 30 22:10:17 CET 2012


Oscar, thanks for the link, though I must say with all due respect, if it
was "obvious" I wouldn't have had to ask the question. Good link though. I
suspect the reason I didn't find it is that I searched under threading
rather than multiprocessing.

Dave, no offense taken, great write-up. Now I understand more, but am quite
possibly even more confused; still, I will take what you wrote to guide me
through further research. This is enough to get me started though. I'm
thinking multi-threading might be the way to go, but I want to re-read the
docs on multi-processing. Our internet went out for a while so I was
reading that on my phone, want to read again on a big screen.

For the record, I'm running Ubuntu Linux 12.04 (not quite a noob anymore,
been on Ubuntu since Maverick, pretty handy with the shell) on an HP G62
which has a 2.5GHz dual core Turion II (upgraded to 8 gig memory if that
matters, I'm thinking it doesn't).

I'm also going to investigate running it as a monolithic program instead of
several smaller ones running concurrently as suggested by another poster.
Frankly I don't think that would work as well, and it would involve a great
many loops that might erroneously end up running nigh-continuously, so I see
danger there, but I will investigate.

I was at one point actually thinking of setting things up using Timer from
the threading module, so that parameters would update either event-driven
(e.g. a sensor crossing a threshold) or time-driven (e.g. polling sensors
every 30 seconds). That seems doable, but poor form to me. Rather I should
say that
seems a low level way out that is attractive only due to my fairly low
level of Python knowledge at this point. Either threading or
multi-processing seems both more elegant and more efficient.
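For what it's worth, the Timer idea can be sketched roughly like this; the
poll_sensors function and the fake reading of 42 are placeholders I made up,
and the 0.05 s interval stands in for the real 30 s:

```python
import threading
import time

readings = []
stop = threading.Event()

def poll_sensors(interval):
    """Record a (fake) sensor value, then re-arm a Timer to run again."""
    if stop.is_set():
        return
    readings.append(42)  # placeholder for a real sensor read
    t = threading.Timer(interval, poll_sensors, args=(interval,))
    t.daemon = True      # don't keep the program alive just for the timer
    t.start()

poll_sensors(0.05)       # in the real program this would be 30 seconds
time.sleep(0.3)          # let a few polling rounds happen
stop.set()               # time-driven updates stop here
print(len(readings))
```

An event-driven update (the sensor crossing a threshold) would just be
another callback alongside this one, which is part of why it starts to feel
like a hand-rolled event loop.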

Might you explain a little more in depth on the pitfalls of threading, and
also the event loop you mentioned, or point me towards some resources? As I
mentioned earlier, I did read the threading docs but was pretty lost. I'm no
idiot, but I'm a public school math teacher trying to learn this stuff to
help engage my students, not a degreed computer science major or full-time
programmer.

As always I appreciate the responses and rich variety of help therein.

regards, Richard



> I'm only guessing about your background, so please don't take offense at
> the simple level of the following.  You see, before you can really
> understand how the language features work, and what the various terms
> mean, you need to understand the processor and the OS.
>
> A decade or so ago, things were a bit simpler -- if we wanted a faster
> machine, Intel would crank up the processor clock rate, and things were
> faster.  But eventually, it reached the point where increased clock rate
> became VERY expensive, and Intel (and others) came up with a different
> strategy.
>
> I'm going to guess you're running on some variant of the Pentium
> processor.  The processor (cpu) has a feature called hyperthreading,
> meaning that for most operations, it can do two things at once.  So it
> has two copies of the instruction pointer, and two copies of most
> registers.  As long as neither program uses the features that aren't
> replicated, you can run two programs completely independently.  The two
> programs share physical memory, hard disk, keyboard and screen, but they
> probably won't slow each other down very much.
>
> You may have a dual-core, or even a quad-core processor.  And you may
> have more than one of those, if you're on a high-end server.  So, as
> long as the processes are separate, you could run many of them at a time.
>
> The other thing that affects all of this is the operating system you're
> running.  It has to manage these multiple processes, and make sure that
> things that can't be shared are correctly serialized;  one task grabs a
> resource   and others block waiting for that resource.  The most visible
> (but not the most important) way this occurs is that separate
> applications draw in different windows.  They share the screen, but none
> of them writes to the raw device, all of them go through a window manager.
>
> This is multiprocessing.  And since one program can launch others, it's
> one way that a single "task" can be split up to use these multiple
> cores/cpus.  The operating system deliberately keeps the separate
> processes very isolated, but provides a few ways for them to talk to
> each other:  one program can launch another, passing it arguments and
> observing the return code; it can also use pipes to connect to the stdin
> and stdout of the other program; they can open up queues or shared
> memory; or they can each read & write to a common file.  Such processes
> do NOT normally share variables, and function calls in one do not easily
> end up invoking code in the other.
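A minimal sketch of that separation, using nothing beyond the standard
library: two worker processes that share no variables and report back only
through a Queue. The worker function and its toy computation are invented
for illustration:

```python
from multiprocessing import Process, Queue

def worker(q, n):
    # Runs in a separate process with its own memory; the Queue is the
    # only channel back to the parent.
    q.put(sum(range(n)))

def run_workers():
    q = Queue()
    procs = [Process(target=worker, args=(q, n)) for n in (10, 100)]
    for p in procs:
        p.start()
    results = sorted(q.get() for _ in procs)  # one result per worker
    for p in procs:
        p.join()
    return results

if __name__ == "__main__":
    print(run_workers())  # [45, 4950]
```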
>
> But there is a second way that two cpus can work on the same "task."  If
> a single process is multi-THREADED, then the threads do share variables
> and other resources, and communication between them is easy (so easy
> it's difficult to get right, actually).  This is theoretically much
> gentler on system resources, but at the cost of lots more bugs likely.
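By contrast, threads in one process really do share variables, which is
exactly where the extra bugs come from. A minimal sketch (the counter and
the bump function are invented for illustration) — without the lock,
concurrent `counter += 1` updates can silently be lost:

```python
import threading

counter = 0
lock = threading.Lock()

def bump(times):
    # Every thread reads and writes the SAME module-level counter.
    global counter
    for _ in range(times):
        with lock:        # serialize the read-modify-write
            counter += 1

threads = [threading.Thread(target=bump, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000
```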
>
> Some operating systems have a feature called forking, which can
> theoretically give you the best of both worlds.  But I'm not going to
> even try to explain that unless you tell me you're on a Linux or Unix
> type operating system.  Besides, I don't know offhand how Python uses
> such a fork; it hasn't turned out to be necessary information for me yet.
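Since Richard is on Ubuntu, a bare-bones sketch of fork itself may help.
This is POSIX-only (it will not run on Windows), and the exit code 7 is an
arbitrary value chosen for the example:

```python
import os

pid = os.fork()              # POSIX only: the process is duplicated here
if pid == 0:
    # Child: starts life as an exact copy of the parent, then goes its
    # own way.
    os._exit(7)              # exit code the parent can observe
else:
    # Parent: wait for the child and inspect how it exited.
    _, status = os.waitpid(pid, 0)
    print(os.WEXITSTATUS(status))  # 7
```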
>
> Now, with CPython in particular, multithreading has a serious problem,
> the global interpreter lock (GIL).  Since so much happens behind the scenes inside
> the interpreter and low-level library routines, and perhaps since most
> of that was written before multithreading was supported, there's a
> single lock that permits only one thread of a process to be working at a
> time.  So if you break up a CPU-bound task into multiple threads, only
> one will run at a time, and chances are it'll run slower than if it only
> had one thread.
>
> Two things happen to make the GIL less painful (it's really just two
> manifestations of the same thing).  Many times when a thread is in C
> code, or when it is calling some system function that blocks (eg.
> waiting for a network message), the GIL is deliberately released, and
> other threads CAN run.  So writing a server that waits on many sockets,
> one per thread, can make good sense, both from code simplicity and from
> performance considerations.
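That release is easy to observe with a blocking call like time.sleep
(standing in here for, say, a blocking socket read): five threads each
sleeping 0.2 s finish in roughly 0.2 s of wall time, not 1.0 s, because the
GIL is released while each one waits:

```python
import threading
import time

def blocked_worker():
    time.sleep(0.2)   # blocking call: the GIL is released while waiting

start = time.perf_counter()
threads = [threading.Thread(target=blocked_worker) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
print("elapsed: %.2f s" % elapsed)   # roughly 0.2, not 5 * 0.2
```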
>
> One other thing that's related.  Most GUI programs run with an event
> loop, which is another way of interleaving work that does NOT use any
> special cpu or OS features.  With an event loop, it's your job to make
> sure all transactions are reasonably small, and that each is triggered
> by some event.  Once you understand event loops, it's simpler than
> either of the other approaches.   Note that sometimes two or three of
> these approaches are combined in one system.
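A hand-rolled event loop can be as small as the standard sched module plus
a few callbacks. In this sketch, both a time-driven event (re-armed
polling, matching the "poll sensors every 30 seconds" idea) and a one-off
event live in the same queue; poll_sensors, the "threshold" event, and the
tiny intervals are all invented for illustration:

```python
import sched
import time

events = []
loop = sched.scheduler(time.monotonic, time.sleep)

def poll_sensors():
    # Time-driven transaction: do a small amount of work, then re-arm.
    events.append("poll")
    if events.count("poll") < 3:
        loop.enter(0.05, 1, poll_sensors)   # run again in 0.05 s

loop.enter(0.0, 1, poll_sensors)                         # recurring event
loop.enter(0.01, 1, lambda: events.append("threshold"))  # one-off event
loop.run()   # one thread; each small handler runs when its time comes
print(events)  # ['poll', 'threshold', 'poll', 'poll']
```

The point about keeping transactions small applies directly: if
poll_sensors ever blocked for a long time, every other event would be
delayed behind it.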
>
> Hope this helps, and that some of it was useful.  I know that in places
> I oversimplified, but I think I caught the spirit of the tradeoffs.
>
>
> --
>
> DaveA
>
>


-- 

sic gorgiamus allos subjectatos nunc