global interpreter lock not working as it should

anton wilson anton.wilson at camotion.com
Mon Aug 5 19:45:18 EDT 2002


On Monday 05 August 2002 06:10 pm, anton wilson wrote:
> > >This is true. Therefore, the only time another thread WILL grab the GIL
> > > under Linux is if
> > >
> > >1) the GIL is released by the currently running thread
> > >2) the thread that just released the GIL depletes its timeslice before
> > > it can grab the lock again
> > >3) the OS notices the process has depleted its timeslice and then yanks
> > >   it from the CPU (this check happens 100 times per second by default
> > > on an i386)
> > >4) the waiting thread that received the GIL release signal is chosen to
> > > run
> > >
> > >Therefore, we now have a large set of coincidences for CPU-bound python
> > >threads. The only reason it works at all is because it happens 100 times
> > > per second and the GIL is released frequently by default. So there is a
> > > sufficient probability that these 4 cases will happen simultaneously.
> >
> > Hm. If that's an accurate description (and I am skeptical ;-)
>
> I came to this conclusion because of several things.
>
> 1) a thread waiting on the wake-up condition from within the GIL acquire
> function really just calls a sys_sigsuspend system call.
>
>    What then happens is that it sets its status to TASK_INTERRUPTIBLE and
>   calls the OS schedule function. This in turn notices that the current
> task is TASK_INTERRUPTIBLE. It checks to see if any signals are pending. If
> not, the OS removes the task from the run-queue.
>
> 2) Once the thread is removed from the run-queue, only a signal delivery
> will push it back onto the run-queue. Sounds good, but a schedule call is
> only triggered under these circumstances (Linux 2.4.19 + O(1)):
>
>        if (p->prio < rq->curr->prio || rq->curr->policy == SCHED_BATCH)
>                         resched_task(rq->curr);
>

I should also clarify that in vanilla Linux this comparison is not even made.
On a UP (uniprocessor) system, there will be no schedule call.
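
To make the condition concrete, here is a toy model of the check quoted
above. It is only a sketch, not kernel code: the function name, its
parameters, and the priority values are made up for illustration. The one
thing it relies on is that a numerically lower prio means a higher priority.

```python
# Toy model of the wake-up check quoted above. Not kernel code;
# all names and the sample priority values are hypothetical.

def wakeup_forces_resched(woken_prio, curr_prio, curr_is_batch=False,
                          has_check=True):
    """Return True if waking a task would flag the running task for resched.

    A numerically lower prio means a higher priority.
    has_check=False models the vanilla uniprocessor case described in this
    post, where the comparison is not made at all.
    """
    if not has_check:
        return False
    return woken_prio < curr_prio or curr_is_batch


# Two CPU-bound threads at the same priority: no reschedule on wake-up.
print(wakeup_forces_resched(woken_prio=120, curr_prio=120))
# The woken thread has earned a better (lower) prio: reschedule.
print(wakeup_forces_resched(woken_prio=118, curr_prio=120))
```

So a thread woken by the GIL release signal only preempts the releasing
thread if it has already earned a better priority, or if the running task
is a batch task.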


> That is, a reschedule is forced only if the current process is a batch
> process or has a lower priority than the woken-up process. So, in our
> case, the current thread's priority would need to be lower to force a
> reschedule.
>
> Normal SCHED_OTHER processes get a boost for sleeping when they are put
> back on the run-queue with this code:
>
>         unsigned long sleep_time = jiffies - p->sleep_timestamp;
>         p->sleep_avg += sleep_time;
>
> The timer tick happens at a frequency of 100 Hz and increases jiffies by
> one. Therefore, sleep_time (and with it sleep_avg) can only grow once every
> 10 millisecs. Python tries to give up the interpreter every 10 bytecodes,
> which is a much higher frequency: every 7 - 15 microseconds in the regular
> case from my tests (could be higher). Therefore, you see that python will
> release and re-grab the GIL on the order of 1000 times in the worst case
> before a reschedule is triggered by a signal delivery, allowing the woken
> up thread to grab the GIL.
>
>
> Anton
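
For anyone who wants to check the arithmetic behind the "1000 times"
figure, here is a rough back-of-the-envelope sketch. It only uses the
numbers given above (100 Hz tick, 7 - 15 microseconds between GIL
releases); the names in it are made up for illustration.

```python
# Back-of-the-envelope check of the figures in this post.
# All names are hypothetical; the inputs are the numbers quoted above.

HZ = 100                       # timer ticks per second on i386
TICK_US = 1_000_000 / HZ       # microseconds between two timer ticks


def gil_cycles_per_tick(release_interval_us):
    """Number of GIL release/re-acquire cycles that fit into one tick."""
    return int(TICK_US // release_interval_us)


for interval_us in (7, 10, 15):
    print(interval_us, "us per release ->",
          gil_cycles_per_tick(interval_us), "cycles per tick")
```

With these inputs the result ranges from 666 to 1428 cycles per tick, which
is where the estimate of about 1000 comes from.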



