Python threading (was: Re: global interpreter lock not working as it should)

Mon Aug 5 09:05:38 EDT 2002

On 5/8/2002 10:57, in article B9740AA3.F1B9%jonathan at onegoodidea.com,
"Jonathan Hogg" <jonathan at onegoodidea.com> wrote:

> The pthread_cond_signal is done by the code that releases the GIL. This code
> (which isn't quoted) grabs 'thelock->mutex', unsets 'thelock->locked',
> releases the mutex and then signals 'thelock->lock_released' allowing
> another thread to be unblocked (and possibly switched to depending on the
> scheduler).

Something else occurred to me while thinking about this. Having said earlier
that Python is unlikely to suffer from priority inversion, I've realised
that the GIL actually puts threaded programs at real risk of priority
inversion if one writes them according to the usual advice.

Consider that you have some CPU-bound task that you write in C in order to
run as fast as possible. Before getting underway, the code releases the GIL
so as to allow other Python threads to run. Assume that we're running Python
with realtime scheduling and have assigned fixed priorities to the different
tasks. The crunching thread has some medium priority, we have an I/O-bound
thread that is high priority, and an uninteresting background thread that
is low priority.

Let's say that the medium-priority thread is pounding away when it blocks
momentarily to save some results or some-such. At that moment with nothing
else to do, the scheduler invokes the low-priority thread to do some
cleaning up or something. The low-priority thread munches a bit but is then
pre-empted when the medium-priority thread completes its I/O. The
medium-priority thread doesn't require the GIL so it gets straight back to
work. The low-priority thread is still holding the GIL.

Now imagine that something important comes in and the high-priority thread
is awoken to deal with it. This thread, returning from its blocking I/O
operation, attempts to grab the GIL. The GIL is held already by the
low-priority thread so the high-priority thread immediately blocks.

Now here's the rub: the normal way of avoiding priority inversion here is to
raise the priority of the thread holding the lock; but the GIL isn't a lock,
at least not in the pthreads sense of the word. The GIL is a mutex-protected
variable and a condition. The high-priority thread is blocked on a
condition, not on a lock. No-one "holds" a condition, so the scheduler has
no way of knowing which thread is going to signal it in the future, and
therefore no way of knowing which thread to raise in priority. The scheduler
is simply going to re-schedule the medium-priority thread.
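
To make the shape of the thing concrete, here is a minimal Python sketch of
a lock built the same way: a flag guarded by a mutex, plus a condition to
wait on. The class and attribute names (FlagLock, _locked, _released) are
mine and purely illustrative; this is not the interpreter's actual C code.
One difference worth noting: Python's threading.Condition requires notify()
to be called with the mutex held, whereas the C code described above
signals after releasing the mutex.

```python
import threading


class FlagLock:
    """Illustrative lock: a mutex-protected flag plus a condition."""

    def __init__(self):
        self._mutex = threading.Lock()
        # The condition shares the mutex that protects the flag.
        self._released = threading.Condition(self._mutex)
        # Nothing here records *which* thread set the flag, so nobody
        # outside can tell who will eventually clear it.
        self._locked = False

    def acquire(self):
        with self._mutex:
            while self._locked:
                # Waiters block on the condition, not on a held mutex.
                self._released.wait()
            self._locked = True

    def release(self):
        with self._mutex:
            self._locked = False
            # Wake one waiter; it re-checks the flag after waking.
            self._released.notify()
```

The mutex itself is only ever held for a few instructions, so any priority
boosting the scheduler applies to mutex holders would not help a thread
stuck in wait().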

Until the medium-priority thread is finished, or blocks again, the
high-priority thread is going to remain blocked on the low-priority thread,
unable to deal with the important I/O - a classic case of unbounded priority
inversion.
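
The scenario can be played out with a toy model. The following is only a
sketch under strong assumptions: one CPU, strict fixed-priority scheduling,
time counted in abstract ticks, and no priority boosting of any kind. The
function name simulate and its parameters are hypothetical, invented for
this illustration.

```python
def simulate(medium_work, low_work_in_gil):
    """Return the tick at which the high-priority thread gets the GIL.

    medium_work:      ticks of CPU the medium thread still needs
    low_work_in_gil:  ticks the low thread needs before it releases the GIL
    """
    if low_work_in_gil < 1:
        raise ValueError("the low-priority thread must be holding the GIL")

    remaining = {"medium": medium_work, "low": low_work_in_gil}
    tick = 0
    gil_held_by_low = True
    while gil_held_by_low:
        # The high thread is waiting on the condition, so it is not
        # runnable. Listed in priority order: medium first, then low.
        runnable = [name for name in ("medium", "low") if remaining[name] > 0]
        running = runnable[0]          # strict priority: highest one wins
        remaining[running] -= 1
        tick += 1
        if running == "low" and remaining["low"] == 0:
            gil_held_by_low = False    # low releases the GIL and signals
    return tick


print(simulate(10, 3))      # 13
print(simulate(10000, 3))   # 10003
```

In this model the wait is always medium_work + low_work_in_gil, so the
high-priority thread's latency grows with however much work the
medium-priority thread happens to have, which is what makes the inversion
unbounded.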

I would imagine that even with dynamic scheduling there would still be a
risk of very poor I/O latency (though not full inversion blocking) as,
depending on the relative priorities, there may be an unpredictable amount
of switching and processing before a thread that has awoken because of some
I/O actually manages to obtain the GIL and deal with it.

I don't have any solution or comment on this I'm afraid, other than it being
another reason why Python is ill-suited to realtime work ;-)

Jonathan