Please help with Threading

Dave Angel davea at davea.name
Sun May 19 21:04:42 EDT 2013


On 05/19/2013 05:46 PM, Dennis Lee Bieber wrote:
> On Sun, 19 May 2013 10:38:14 +1000, Chris Angelico <rosuav at gmail.com>
> declaimed the following in gmane.comp.python.general:
>
>> On Sun, May 19, 2013 at 10:02 AM, Carlos Nepomuceno
>> <carlosnepomuceno at outlook.com> wrote:
>>> I didn't know Python threads aren't preemptive. That seems rather dated, considering the state of the art in parallel execution on multi-core processors.
>>>
>>> What's the catch in making Python threads preemptive? Are there any ongoing projects to do that?
>>
> 	<snip>
>
>> With interpreted code, e.g. in CPython, it's easy to implement preemption
>> in the interpreter. I don't know how it's actually done, but one easy
>> implementation would be "every N bytecode instructions, context
>> switch". It's still done at a lower level than user code (N bytecode
>
> 	Which IS how the common Python interpreter does it -- barring the
> thread making some system call that triggers a preemption ahead of time
> (even time.sleep(0.0) triggers scheduling). I forget whether the default is 20
> or 100 byte-code instructions -- as I recall, it DID change a few
> versions back.
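
For reference, the interval can be inspected from the interpreter itself.
A minimal sketch, assuming CPython (3.2+ replaced the bytecode-count "check
interval" with a time-based switch interval, 5 ms by default; 2.x still
counts bytecodes, 100 by default):

import sys
import time

# CPython 3.2+: the GIL is handed off based on elapsed time, not bytecodes.
print(sys.getswitchinterval())    # 0.005 by default
sys.setswitchinterval(0.001)      # request more frequent GIL handoffs

# CPython 2.x equivalent (counts bytecode instructions, default 100):
#   sys.getcheckinterval()
#   sys.setcheckinterval(20)

# A zero-length sleep still goes through the interpreter's blocking-call
# path, so it releases the GIL and gives another ready thread a chance.
time.sleep(0)

Lowering the interval just trades throughput for responsiveness; it doesn't
change the fact that only one thread runs bytecode at a time.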
>
> 	Part of the context switch is to transfer the GIL from the preempted
> thread to the new thread.
>
> 	So, overall, on a SINGLE CORE processor, running multiple CPU-bound
> threads takes a bit longer, just due to the overhead of thread swapping.
>
> 	On a multi-core processor, the effect is the same, since -- even
> though one may have a thread running on each core -- the GIL is only
> assigned to one thread, and other threads get blocked when trying to
> access runtime data structures. And you may have even more overhead from
> processor cache misses if a thread gets assigned to a different
> core.
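
To make that concrete, here's a rough timing sketch (the spin() helper and
the loop count are just mine): a pure-Python CPU-bound function run twice
sequentially and then in two threads. On CPython the threaded run is
typically no faster, and often a little slower, regardless of core count,
because only the thread holding the GIL executes bytecode:

import threading
import time

def spin(n=5000000):
    # Pure-Python busy loop; CPU bound and never blocks, so it only gives
    # up the GIL when the interpreter forces a switch.
    while n:
        n -= 1

start = time.time()
spin()
spin()
print('sequential: %.2fs' % (time.time() - start))

start = time.time()
t1 = threading.Thread(target=spin)
t2 = threading.Thread(target=spin)
t1.start()
t2.start()
t1.join()
t2.join()
print('two threads: %.2fs' % (time.time() - start))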
>
> 	(yes -- I'm restating the same thing as I had just trimmed below
> this point... but the target is really the OP, where repetition may be
> helpful in understanding)
>

So what's the mapping between real (OS) threads and the fake ones
Python uses?  The OS keeps track of a separate stack and context for
each thread it knows about; are they one-to-one with the ones you're
describing here?  If so, then any OS thread that gets scheduled will
almost always find it can't get the GIL, and spend its time thrashing.  But
the switch that CPython does intentionally would be equivalent to a
sleep(0).
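
One quick way to poke at the first possibility (just a sketch;
threading.get_ident() is available as of 3.3, thread.get_ident() in 2.x) is
to print each thread's identifier from inside the thread -- distinct values
per Thread object are what you'd expect if each Python thread really is its
own OS thread:

import threading

def report():
    # On CPython this identifier comes from the underlying thread library,
    # so a different value for each Thread object is consistent with a
    # one-to-one mapping onto OS threads.
    print(threading.current_thread().name, threading.get_ident())

threads = [threading.Thread(target=report) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()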

On the other hand, if these threads are distinct from the OS threads, is 
it done with some sort of thread pool, where CPython has its own stack, 
and doesn't really use the one managed by the OS?

Understand that the only OS threading I really understand is the one in
Windows (which I no longer use).  So, assuming Linux has some form of
lightweight threading, the distinction above may not map very well.



-- 
DaveA


