threading.RLock not subclassable?

David Bolen db3l at fitlinxx.com
Wed Feb 7 22:37:54 EST 2001


Rick Lee <rwklee at home.com> writes:

> "A multithreaded program executes by dividing its processing time
> between all active threads.  For example, a program with 10 active
> threads of execution would allow approximately 1/10 of its CPU time
> to each thread and cycle between threads in rapid succession."
> 
> I certainly was not seeing this behaviour on some platforms, even
> for threads that take a long time to perform computations.

Well, that may be a different issue than your posted point about the
active thread count.  Depending on the local thread implementation
(process-level versus OS kernel-level, pre-emptive versus
cooperative, etc.), you may get better or worse sharing.  Really the
only bad case is a cooperative implementation (that is, one where a
thread has to yield before the scheduler moves to another thread).
I'm not sure which platforms, if any, fall into that category
(perhaps someone else can answer for the typical Unix pthreads
implementation).

How were you measuring the division of labor amongst the threads?
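
For instance, one rough way to measure it is to have each thread bump
its own counter in a tight loop for a fixed interval and then compare
the totals.  This is just a sketch I'm making up for illustration
(`measure_split` and its parameters aren't from your code):

```python
import threading
import time

def measure_split(num_threads=4, duration=0.3):
    """Run CPU-bound counting loops in several threads and report how
    many iterations each managed -- a rough proxy for how the
    scheduler divided time among them."""
    counts = [0] * num_threads
    stop = time.monotonic() + duration

    def spin(i):
        # Each thread writes only its own slot, so no lock is needed.
        while time.monotonic() < stop:
            counts[i] += 1

    threads = [threading.Thread(target=spin, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts

if __name__ == "__main__":
    counts = measure_split()
    total = sum(counts)
    for i, c in enumerate(counts):
        print("thread %d: %5.1f%% of iterations" % (i, 100.0 * c / total))
```

If the split is wildly uneven on one platform but not another, that
points at the thread implementation rather than at your code.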

> - what is the thread switching performance penalty, and how sensitive is the
> penalty to the number of threads?

There's a hit to the Python interpreter to support threading (I think
I recall seeing something like 10%) in general, but actually switching
between the threads should be quite low overhead.  Certainly when
threads are native to the platform's OS (such as with Windows NT), the
system scheduler really only thinks in terms of threads anyway, so
thread context switches are just the normal mode of dispatching.

It should also be fairly insensitive to the number of threads,
provided overall system resources are not exceeded.

> - is there an upper limit to the number of threads?

That depends on the platform - you'll definitely run out of resources
eventually, but it's sort of like asking what's the upper limit on the
number of processes on the system, which can vary.

It also depends on what they are doing, since you can likely support
far more threads if the majority are blocked than if you are trying
to share the CPU among them all as active threads.

Certainly I would expect that hundreds should not be out of the
question, and probably thousands.  My desktop NT machine just with a
normal mix of stuff active sits with 42 processes and about 250
threads, and I expect it could easily handle an order of magnitude
higher - as long as they weren't all trying to run simultaneously.

But I don't have a hard limit for you, nor do I personally know of
such values across all platforms.  It's a per-platform resource issue
and is related to the activity of the code as well, so I'm not sure
there's a fixed answer.  Certainly if you're thinking in terms of
thousands you may want to find a way to share them - but if you're
thinking 10s or hundreds, you're probably ok.
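
One common way to share a bounded set of threads is a small worker
pool fed from a Queue.  A rough sketch (the function name, worker
count, and jobs here are all invented for illustration):

```python
import queue
import threading

def run_with_pool(jobs, num_workers=8):
    """Process `jobs` (a list of zero-argument callables) with a fixed
    pool of worker threads instead of one thread per job."""
    q = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = q.get()
            if job is None:          # sentinel: time to shut down
                return
            result = job()
            with results_lock:       # results list is shared
                results.append(result)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for job in jobs:
        q.put(job)
    for _ in workers:
        q.put(None)                  # one sentinel per worker
    for w in workers:
        w.join()
    return results

if __name__ == "__main__":
    jobs = [lambda i=i: i * i for i in range(100)]
    print(sorted(run_with_pool(jobs)))
```

With this shape, thousands of work items only ever cost you
`num_workers` threads.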

> Regarding what happens on the Mac: I have a multi-threaded program
> with several threads, each of which blocks without timeout on
> sockets.accept, sockets.recv, and queues.get respectively.  This
> program runs perfectly on NT and Linux, but the blocked threads on
> the Mac don't seem to execute when the blocking condition is removed
> (I am 99% sure that's what's happening).  If anyone can shed light
> on this, I will be very grateful.

By blocking condition removed, do you mean that data shows up on the
socket or queues?  I really can't think why if such data were to
arrive that the system calls wouldn't be satisfied.  I don't see the
Mac implementation in the CVS tree I've currently got checked out, so
I'm not sure how the Mac threading is implemented - perhaps someone
else could chime in with some info.

> Another weirdness includes a bit of code like this, which is a
> recursive call on itself to create threads:

Well, my first inclination is that the code itself appears dangerous -
and unpredictable - because you don't have any protection against
simultaneous access to the shared objects (e.g., runners).  It's hard
to say without seeing more of the surrounding code, but for example,
it would appear that by the time you get around to appending "me" to
your completedRunObj, it has probably already been changed by a new
thread.
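
A minimal sketch of guarding such a shared list with a lock (the
`SharedList` wrapper and the names below are my assumptions, not from
your posted code):

```python
import threading

class SharedList:
    """Serialize access to a list shared by several threads -- e.g.
    something like the `runners` / `completedRunObj` lists from the
    quoted code (names assumed from the post)."""
    def __init__(self):
        self._items = []
        self._lock = threading.Lock()

    def append(self, item):
        with self._lock:
            self._items.append(item)

    def snapshot(self):
        with self._lock:             # copy under the lock
            return list(self._items)

if __name__ == "__main__":
    completed = SharedList()

    def run(name):
        completed.append(name)       # safe even if many finish at once

    threads = [threading.Thread(target=run, args=("runner-%d" % i,))
               for i in range(20)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(len(completed.snapshot()))
```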

Personally, if you're trying to observe the behavior of threads, I
find that a few judicious print statements can help out a lot.  Yes,
if your problems have a time component, the additional time to do the
output can interfere, but for basic scheduling (what thread ran when)
it can be enlightening to just print the names of threads and the
points of their execution in sequence.  If you build each line as a
single string before handing it to "print", the output won't get
interleaved between threads either, which can otherwise be a problem
with this approach.
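
The single-string trick can be sketched like this (`trace` is a
made-up helper name; the point is one write() call per line):

```python
import sys
import threading
import time

def trace(*parts):
    """Format the whole line first and emit it with a single write()
    call, so output from different threads isn't interleaved
    mid-line."""
    sys.stdout.write(" ".join(str(p) for p in parts) + "\n")

def worker(name, steps=3):
    for i in range(steps):
        trace(name, "step", i)       # e.g.  worker-1 step 0
        time.sleep(0.001)

if __name__ == "__main__":
    threads = [threading.Thread(target=worker, args=("worker-%d" % i,))
               for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Reading the resulting log top to bottom gives you the actual
interleaving of the threads' execution.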

--
-- David
-- 
/-----------------------------------------------------------------------\
 \               David Bolen            \   E-mail: db3l at fitlinxx.com  /
  |             FitLinxx, Inc.            \  Phone: (203) 708-5192    |
 /  860 Canal Street, Stamford, CT  06902   \  Fax: (203) 316-5150     \
\-----------------------------------------------------------------------/


