[issue40522] [subinterpreters] Get the current Python interpreter state from Thread Local Storage (autoTSSkey)

STINNER Victor report at bugs.python.org
Wed Dec 30 06:38:22 EST 2020


STINNER Victor <vstinner at python.org> added the comment:

One GIL per interpreter requires storing the tstate per thread. I don't see any other option. We need to replace the global _PyRuntime atomic variable with a TLS variable. I'm trying to reduce the overhead, but it's hard to beat the performance of an atomic variable.

That's also why I modified many internal C functions to pass tstate explicitly to subfunctions, to avoid any possible overhead of getting tstate.

https://vstinner.github.io/cpython-pass-tstate.html
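
A minimal sketch of the "pass tstate explicitly" pattern described above. The `thread_state` struct and every function name here are hypothetical stand-ins, not CPython's actual API; the only point is that the thread-local lookup happens once at the entry point and internal helpers receive the pointer as an argument.

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-thread state struct (not PyThreadState). */
typedef struct {
    int id;
    int recursion_depth;
} thread_state;

/* C11 thread-local pointer to the state of the current thread. */
static _Thread_local thread_state *current_tstate = NULL;

/* The only two functions that touch the thread-local variable. */
thread_state *get_tstate(void) { return current_tstate; }
void set_tstate(thread_state *ts) { current_tstate = ts; }

/* Internal helpers take tstate as a parameter: no lookup needed. */
int enter_call(thread_state *tstate) { return ++tstate->recursion_depth; }
int leave_call(thread_state *tstate) { return --tstate->recursion_depth; }

/* Entry point: fetch tstate once, then pass it down to the helpers. */
int call_twice(void)
{
    thread_state *tstate = get_tstate();
    enter_call(tstate);
    int depth = enter_call(tstate);   /* depth is 2 if it started at 0 */
    leave_call(tstate);
    leave_call(tstate);
    return depth;
}
```

With this shape, the cost of reading the TLS variable is paid once per entry point rather than once per helper call.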


Pablo:
> In MacOS is quite challenging to activate LTO, so normally optimized builds are only done with PGO.

Oh right, I forgot macOS. I should check how TLS is compiled on macOS. IMO two MOV instructions instead of one is not a major performance bottleneck.

The best would be to avoid the pthread_getspecific() function, which is less efficient than a TLS variable. The glibc implementation uses an array for a few variables (first 32 variables?) and then a slower hash table.
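
To illustrate the two access paths being compared, here is a rough sketch assuming a POSIX platform with a C11 compiler. All function names are hypothetical. The first pair goes through pthread_getspecific(), which is a library call; the second pair reads a `_Thread_local` variable, which the compiler can typically lower to a few instructions.

```c
#include <pthread.h>
#include <stddef.h>

/* Path 1: POSIX thread-specific data, accessed through library calls. */
static pthread_key_t state_key;
static pthread_once_t state_key_once = PTHREAD_ONCE_INIT;

/* Create the key exactly once; error handling is omitted in this sketch. */
static void make_state_key(void)
{
    (void)pthread_key_create(&state_key, NULL);
}

void set_state_via_key(void *state)
{
    pthread_once(&state_key_once, make_state_key);
    (void)pthread_setspecific(state_key, state);
}

void *get_state_via_key(void)
{
    pthread_once(&state_key_once, make_state_key);
    return pthread_getspecific(state_key);   /* function call on every read */
}

/* Path 2: a C11 thread-local variable, accessed like a normal variable. */
static _Thread_local void *state_tls = NULL;

void set_state_via_tls(void *state) { state_tls = state; }
void *get_state_via_tls(void) { return state_tls; }
```

Both paths return the same per-thread pointer; they differ only in how much work each read requires.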


Pablo:
> Also in Windows I am not sure is possible to use LTO. Same for many other platforms.

I will check how it's implemented on Windows.

We cannot use TLS on all platforms, since it requires C11 features which are not available on all platforms. Also, the implementation depends on the architecture.
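
One possible way to cope with this is compile-time feature detection with a fallback. The sketch below is an assumption about how that could look, not what CPython does: the macro names `HAVE_THREAD_LOCAL` and `THREAD_LOCAL` and the functions `state_get()` / `state_set()` are hypothetical, and the fallback branch assumes pthreads is available.

```c
#include <stddef.h>

/* Pick a thread-local storage class if the compiler offers one. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#  define HAVE_THREAD_LOCAL 1
#  define THREAD_LOCAL _Thread_local          /* C11 keyword */
#elif defined(__GNUC__)
#  define HAVE_THREAD_LOCAL 1
#  define THREAD_LOCAL __thread               /* GCC/Clang extension */
#elif defined(_MSC_VER)
#  define HAVE_THREAD_LOCAL 1
#  define THREAD_LOCAL __declspec(thread)     /* MSVC extension */
#else
#  define HAVE_THREAD_LOCAL 0
#endif

#if HAVE_THREAD_LOCAL

/* Fast path: plain thread-local variable. */
static THREAD_LOCAL void *current_state = NULL;

void *state_get(void) { return current_state; }
void state_set(void *state) { current_state = state; }

#else

/* Slow path (assumes pthreads): thread-specific data via library calls. */
#include <pthread.h>

static pthread_key_t state_key;
static pthread_once_t state_once = PTHREAD_ONCE_INIT;

static void state_make_key(void) { (void)pthread_key_create(&state_key, NULL); }

void *state_get(void)
{
    pthread_once(&state_once, state_make_key);
    return pthread_getspecific(state_key);
}

void state_set(void *state)
{
    pthread_once(&state_once, state_make_key);
    (void)pthread_setspecific(state_key, state);
}

#endif
```

Callers use state_get() and state_set() the same way on every platform; only the cost of each call changes.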

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue40522>
_______________________________________

