Python threading (was: Re: global interpreter lock not working as it should)

Jonathan Hogg jonathan at onegoodidea.com
Mon Aug 5 12:16:19 EDT 2002


On 5/8/2002 16:43, in article
ddc19db7.0208050743.590e56bc at posting.google.com, "Armin Steinhoff"
<a-steinhoff at web.de> wrote:

> I have build three versions of python by inserting a sched_yield and a
> delay of 1ms in the code of ceval.c below ... and did run Jonathans
> testcode.
[...]
> Here are the results:

Is this running on QNX with SCHED_RR realtime scheduling and all-the-same
static priorities?

> case ceval.c unmodified:
> 
>>>> execfile('/root/threads.py')
> Counts:
> [207251, 189529, 228940, 203701, 216320, 169515, 218877, 223871,
> 185256, 212550]
> Total = 2055810
> 
> Switches:
> [85, 85, 83, 84, 84, 84, 84, 83, 84, 86]
> Total = 842
>>>> 
> 
> That means the timeslice exhausted 85 times (or jumped to a higher
> priority) at 'other threads may run now' during 207251 loops! (More
> often than I could imagine ... )
> 
> This leads to an awful bad thread switching performance!

If this is with SCHED_RR, it is indeed more thread switching than I would
have expected. Do you know what the scheduler timeslice is?
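
For anyone following along without the test script to hand, here is a rough
sketch of the kind of measurement being discussed. It is not the original
threads.py: the function name measure, the thread count and the duration are
all made up for illustration, and the switch counter is only approximate
because the shared "last runner" slot is read and written without a lock.

```python
import threading
import time


def measure(num_threads=4, duration=0.2):
    """Run busy-loop threads; return per-thread (counts, switches).

    Hypothetical reconstruction, not the original test script.
    """
    counts = [0] * num_threads     # loop iterations completed by each thread
    switches = [0] * num_threads   # times each thread saw another thread ran last
    last = [None]                  # index of the thread that ran most recently
    stop = threading.Event()

    def worker(i):
        while not stop.is_set():
            counts[i] += 1
            if last[0] != i:
                # Someone else ran since our previous iteration: count a switch.
                switches[i] += 1
                last[0] = i

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    time.sleep(duration)
    stop.set()
    for t in threads:
        t.join()
    return counts, switches


if __name__ == "__main__":
    counts, switches = measure()
    print("Counts:", counts, "Total =", sum(counts))
    print("Switches:", switches, "Total =", sum(switches))
```

A low ratio of switches to counts means each thread runs many iterations
before another thread gets in, which is the pattern in the unmodified case.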

> case sched_yield:
> 
> Counts:
> [110635, 110589, 110598, 110597, 110617, 110585, 110600, 110597,
> 110604, 110587]
> Total = 1106009
> 
> Switches:
> [110584, 110549, 110559, 110558, 110589, 110549, 110561, 110553,
> 110562, 110548]
> Total = 1105612
>>>> 
> 
> This says it all ....

This is fairly obvious, but note that the overall performance (number of
iterations in total) has nearly halved. This is the cost of doing thread
switching so often.
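
To put numbers on that, here is a quick calculation using only the figures
quoted above. The variable names are mine; nothing beyond the posted counts
and switch totals is assumed.

```python
# Per-thread figures copied from the two result listings above.
unmodified_counts = [207251, 189529, 228940, 203701, 216320,
                     169515, 218877, 223871, 185256, 212550]
unmodified_switches = [85, 85, 83, 84, 84, 84, 84, 83, 84, 86]

yield_counts = [110635, 110589, 110598, 110597, 110617,
                110585, 110600, 110597, 110604, 110587]
yield_switches = [110584, 110549, 110559, 110558, 110589,
                  110549, 110561, 110553, 110562, 110548]

# Total throughput with sched_yield relative to the unmodified interpreter.
throughput_ratio = sum(yield_counts) / sum(unmodified_counts)

# Average number of loop iterations a thread completes between switches.
iters_per_switch_unmodified = sum(unmodified_counts) / sum(unmodified_switches)
iters_per_switch_yield = sum(yield_counts) / sum(yield_switches)

print("throughput ratio: %.3f" % throughput_ratio)
print("iterations per switch, unmodified: %.1f" % iters_per_switch_unmodified)
print("iterations per switch, sched_yield: %.4f" % iters_per_switch_yield)
```

So the yielding build does a little over half the work, and switches on
practically every iteration instead of once every couple of thousand.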

> case delay:
> 
> Counts:
> [1438, 1434, 1431, 1428, 1425, 1422, 1419, 1416, 1412, 1409]
> Total = 14234
> 
> Switches:
> [1434, 1434, 1431, 1428, 1425, 1422, 1419, 1416, 1412, 1409]
> Total = 14230
> 
> We have a slow interpreter ... but a good thread switching performance
> :-)

This isn't surprising. A delay will have the same effect as an explicit
yield. However, since the delay is so (relatively) long, for much of the
time all 10 threads are asleep.

> My conclusion: insert a sched_yield if the ceval.c code is called from
> a thread.

As I noted, this will trash performance - unless you increase the check
interval substantially.
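
As a sketch of what adjusting that knob looks like from Python: in the
interpreter under discussion the call is sys.setcheckinterval(n), counted in
bytecode instructions. Later interpreters (Python 3.2 onwards) replaced it
with a time-based sys.setswitchinterval(seconds), which is what the snippet
below uses so that it runs on a current Python. The helper name and the
factor of 10 are arbitrary choices for illustration, not a recommendation.

```python
import sys


def widen_switch_interval(factor=10.0):
    """Multiply the thread switch interval by `factor`.

    Returns (old, new) in seconds. Hypothetical helper for illustration.
    """
    old = sys.getswitchinterval()
    sys.setswitchinterval(old * factor)
    return old, sys.getswitchinterval()


if __name__ == "__main__":
    old, new = widen_switch_interval()
    print("switch interval: %g s -> %g s" % (old, new))
    sys.setswitchinterval(old)  # restore the previous setting
```

A longer interval means fewer forced switches per unit of work, at the price
of each thread holding the interpreter for longer before others get a turn.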

It's not really an ideal solution all things considered. If you have any
higher-priority I/O going on in other threads then increasing the check
interval will introduce long latencies.

Jonathan



