Knowing which thread had the GIL before

Julien Kauffmann julien.kauffmann at freelan.org
Sat Aug 20 11:50:42 EDT 2016


Hi,

I've spent the past few days working on a statistical sampling profiler 
for an online Python 2.7 service that runs on Windows.

My approach so far is:
- I have a dedicated thread that sleeps most of the time and wakes up 
every n milliseconds.
- Whenever the thread wakes up, it gets the current frame/call stack for 
all the active threads (using threading.enumerate()) and increments an 
occurrence counter for each currently executing code path.
- When enough time has passed, all the collected samples are saved for 
analysis.
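
A minimal sketch of the steps above, not the actual profiler: it assumes 
`sys._current_frames()` is used to get each thread's topmost frame, and 
the names `Sampler`, `interval` and `counts` are hypothetical.

```python
import collections
import sys
import threading


class Sampler(threading.Thread):
    """Hypothetical sampling thread that wakes up every `interval` seconds."""

    def __init__(self, interval=0.01):
        super(Sampler, self).__init__()
        self.daemon = True
        self.interval = interval
        self.counts = collections.Counter()
        self._stop_event = threading.Event()

    def run(self):
        # Event.wait() returns False on timeout, True once stop() was called.
        while not self._stop_event.wait(self.interval):
            self.sample()

    def sample(self):
        # Maps thread id -> topmost Python frame of that thread.
        frames = sys._current_frames()
        for thread in threading.enumerate():
            if thread is self:
                continue  # do not profile the profiler itself
            frame = frames.get(thread.ident)
            path = []
            # Walk from the innermost frame out to the thread's entry point.
            while frame is not None:
                code = frame.f_code
                path.append((code.co_filename, frame.f_lineno, code.co_name))
                frame = frame.f_back
            if path:
                self.counts[tuple(path)] += 1

    def stop(self):
        self._stop_event.set()
        self.join()
```

Calling `Sampler().start()` early in the process and `stop()` later 
leaves the per-path occurrence counts in `counts`. Note that every live 
thread is counted on every wake-up, whether or not it was running.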

This works to some extent, but I fear this technique isn't aligned with 
how threading works in Python, where only one thread can run Python code 
at any given time (GIL).

Consider the following case:

- An application has 2 threads running.
- Thread A sleeps for 30 seconds.
- Thread B does some heavy computation (only Python code).

In this case, it is fairly obvious that most of the process execution 
time will be spent in thread B, which will likely have the GIL most of 
the time (since thread A is sleeping). However, when measuring the 
activity with the technique mentioned above, the sleep instruction in 
thread A appears as the most costly code location, as it is present in 
every profiling sample.
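
The effect can be reproduced with a small sketch like the one below. It 
is only an illustration: the names `sleeper`, `compute`, `run_demo` and 
`flags` are hypothetical, the sleep is shortened to 2 seconds, and 
samples are keyed by (function name, line number) of the topmost frame.

```python
import collections
import sys
import threading
import time


def sleeper(flags):
    # Thread A: idle for the whole measurement (2 s here instead of 30 s).
    flags["a_started"] = True
    time.sleep(2.0)


def compute(flags):
    # Thread B: pure-Python busy loop, so it holds the GIL most of the time.
    flags["b_started"] = True
    total = 0
    while not flags["stop"]:
        total += 1


def run_demo(samples=20, interval=0.01):
    flags = {"a_started": False, "b_started": False, "stop": False}
    thread_a = threading.Thread(target=sleeper, args=(flags,))
    thread_b = threading.Thread(target=compute, args=(flags,))
    thread_a.start()
    thread_b.start()
    # Wait until both threads are executing their target function.
    while not (flags["a_started"] and flags["b_started"]):
        time.sleep(0.001)

    counts = collections.Counter()
    for _ in range(samples):
        time.sleep(interval)
        frames = sys._current_frames()
        for thread in (thread_a, thread_b):
            frame = frames.get(thread.ident)
            if frame is not None:
                counts[(frame.f_code.co_name, frame.f_lineno)] += 1

    flags["stop"] = True
    thread_a.join()
    thread_b.join()
    return counts
```

In the counts returned by `run_demo()`, the `time.sleep` line of 
`sleeper` is hit in every sample, while the samples of `compute` are 
spread over the lines of its loop, so no line of the busy thread can 
rank above the sleeping one.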

I'd like to modify my approach above by only measuring the activity of 
the last active thread. Obviously, this is not as simple as calling 
`threading.current_thread()` because it will always give me the 
profiling thread which is currently executing.

Is there a way to know which thread had the GIL right before my 
profiling thread acquired it? Perhaps through a C extension? I've seen 
such profilers for Linux that take advantage of an ITIMER signal to do 
that. Sadly, this is not an option on Windows.

Any feedback or remarks concerning my technique are also welcome.

Thanks,

Julien.


