[Python-Dev] baby steps for free-threading

Guido van Rossum guido@python.org
Tue, 18 Apr 2000 14:25:11 -0400


> A couple months ago, I exchanged a few emails with Guido about doing the
> free-threading work. In particular, for the 1.6 release. At that point
> (and now), I said that I wouldn't be starting on it until this summer,
> which means it would miss the 1.6 release. However, there are some items
> that could go into 1.6 *today* that would make it easier down the road to
> add free-threading to Python. I said that I'd post those in the hope that
> somebody might want to look at developing the necessary patches. It fell
> off my plate, so I'm getting back to that now...
> 
> Python needs a number of basic things to support free threading. None of
> these should impact its performance or reliability. For the most part,
> they just provide a platform for the later addition.

I agree with the general design sketched below.

> 1) Create a portable abstraction for using the platform's per-thread state
>    mechanism. On Win32, this is TLS. On pthreads, this is pthread_key_*.

There are at least 7 other platform specific thread implementations --
probably an 8th for the Mac.  These all need to support this.  (One
solution would be to have a portable implementation that uses the
thread-ID to index an array.)
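
For illustration only, here is a minimal sketch of that array-based
fallback. Every name in it (tss_set, tss_get, tss_delete, TSS_SLOTS)
is hypothetical and not part of any existing Python header. It
assumes thread IDs are nonzero longs, and it leaves out the locking
that a real version would need around the table.

```c
#include <assert.h>
#include <stddef.h>

#define TSS_SLOTS 64            /* assumed upper bound on live threads */

typedef struct {
    long  thread_id;            /* 0 marks a free slot */
    void *value;                /* e.g. a PyThreadState pointer */
} tss_slot;

static tss_slot tss_table[TSS_SLOTS];

/* Return the slot owned by thread_id.  If there is none and create is
   nonzero, claim the first free slot.  Returns NULL otherwise.
   NOTE: a real version must serialize access to tss_table. */
static tss_slot *tss_find(long thread_id, int create)
{
    tss_slot *free_slot = NULL;
    size_t i;

    assert(thread_id != 0);
    for (i = 0; i < TSS_SLOTS; i++) {
        if (tss_table[i].thread_id == thread_id)
            return &tss_table[i];
        if (free_slot == NULL && tss_table[i].thread_id == 0)
            free_slot = &tss_table[i];
    }
    if (create && free_slot != NULL) {
        free_slot->thread_id = thread_id;
        free_slot->value = NULL;
        return free_slot;
    }
    return NULL;
}

/* Store value for thread_id.  Returns 0 on success, -1 if the table is full. */
int tss_set(long thread_id, void *value)
{
    tss_slot *s = tss_find(thread_id, 1);
    if (s == NULL)
        return -1;
    s->value = value;
    return 0;
}

/* Fetch the value stored for thread_id, or NULL if there is none. */
void *tss_get(long thread_id)
{
    tss_slot *s = tss_find(thread_id, 0);
    return s != NULL ? s->value : NULL;
}

/* Release the slot when the thread exits. */
void tss_delete(long thread_id)
{
    tss_slot *s = tss_find(thread_id, 0);
    if (s != NULL) {
        s->thread_id = 0;
        s->value = NULL;
    }
}
```

The linear scan keeps the sketch short. A platform with small, dense
thread IDs could index the array directly instead.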

>    This mechanism will be used to store PyThreadState structure pointers,
>    rather than _PyThreadState_Current. The latter variable must go away.
> 
>    Rationale: two threads will be operating simultaneously. An inherent
>    conflict arises if _PyThreadState_Current is used. The TLS-like
>    mechanism is used by the threads to look up "their" state.
> 
>    There will be a ripple effect on PyThreadState_Swap(); dunno offhand
>    what. It may become empty.

Cool.

> 2) Python needs a lightweight, short-duration, internally-used critical
>    section type. The current lock type is used at the Python level and
>    internally. For internal operations, it is rather heavyweight, has
>    unnecessary semantics, and is slower than a plain crit section.
> 
>    Specifically, I'm looking at Win32's CRITICAL_SECTION and pthread's
>    mutex type. A spinlock mechanism would be coolness.
> 
>    Rationale: Python needs critical sections to protect data from being
>    trashed by multiple, simultaneous access. These crit sections need to
>    be as fast as possible since they'll execute at all key points where
>    data is manipulated.

Agreed.
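
For illustration only, a minimal sketch of the spinlock flavor of
such a critical section. It assumes a compiler that provides C11
<stdatomic.h>, which postdates this thread. The names crit_section,
crit_enter, crit_try_enter and crit_leave are hypothetical. A port
would more likely wrap CRITICAL_SECTION or a pthread mutex behind
the same three calls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    atomic_flag held;           /* set while some thread is inside */
} crit_section;

#define CRIT_SECTION_INIT { ATOMIC_FLAG_INIT }

/* Spin until the section is free, then take it. */
void crit_enter(crit_section *cs)
{
    assert(cs != NULL);
    while (atomic_flag_test_and_set_explicit(&cs->held,
                                             memory_order_acquire))
        ;                       /* busy-wait; fine for very short sections */
}

/* Take the section only if it is free.  Returns 1 on success, 0 if busy. */
int crit_try_enter(crit_section *cs)
{
    assert(cs != NULL);
    return !atomic_flag_test_and_set_explicit(&cs->held,
                                              memory_order_acquire);
}

/* Leave the section so another thread can enter. */
void crit_leave(crit_section *cs)
{
    assert(cs != NULL);
    atomic_flag_clear_explicit(&cs->held, memory_order_release);
}
```

A spinlock like this is not recursive and wastes CPU if the section
is held for long, so it only suits the short internal sections
described above.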

> 3) Python needs an atomic increment/decrement (internal) operation.
> 
>    Rationale: these are used in INCREF/DECREF to correctly increment or
>    decrement the refcount in the face of multiple threads trying to do
>    this.
> 
>    Win32: InterlockedIncrement/Decrement. pthreads would use the
>    lightweight crit section above (on every INC/DEC!!). Some other
>    platforms may have specific capabilities to keep this fast. Note that
>    platforms (outside of their threading libraries) may have functions to
>    do this.

I'm worried that, since INCREF/DECREF are used so heavily, this will
slow things down significantly, especially on platforms that don't
have safe hardware instructions for this.  So it should only be
enabled when free threading is turned on.
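
For illustration only, a minimal sketch of refcount operations that
pay for atomicity only when a build flag is set. The flag name
WITH_FREE_THREAD, the REF_INC/REF_DEC macros and the toy_object type
are all hypothetical stand-ins, not the real Py_INCREF/Py_DECREF.
The atomic branch assumes C11 <stdatomic.h>.

```c
#include <assert.h>

#ifdef WITH_FREE_THREAD         /* hypothetical configure-time flag */
#include <stdatomic.h>
typedef atomic_long refcnt_t;
/* Atomic read-modify-write; REF_DEC yields the new count. */
#define REF_INC(op) ((void)atomic_fetch_add(&(op)->refcnt, 1))
#define REF_DEC(op) (atomic_fetch_sub(&(op)->refcnt, 1) - 1)
#else
typedef long refcnt_t;
/* Plain increments, as cheap as today's single-threaded path. */
#define REF_INC(op) ((void)++(op)->refcnt)
#define REF_DEC(op) (--(op)->refcnt)
#endif

typedef struct {
    refcnt_t refcnt;
    int freed;                  /* stands in for calling the deallocator */
} toy_object;

/* A new object starts with one reference. */
void toy_init(toy_object *op)
{
    assert(op != 0);
    op->refcnt = 1;
    op->freed = 0;
}

void toy_incref(toy_object *op)
{
    REF_INC(op);
}

/* Drop one reference; mark the object freed when the count hits zero. */
void toy_decref(toy_object *op)
{
    if (REF_DEC(op) == 0)
        op->freed = 1;
}

long toy_refcount(toy_object *op)
{
    return (long)op->refcnt;
}
```

Without the flag the macros expand to the plain increments, so the
default build carries no extra cost.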

> 4) Python's configuration system needs to be updated to include a
>    --with-free-thread option since this will not be enabled by default.
>    Related changes to acconfig.h would be needed. Compiling in the above
>    pieces based on the flag would be nice (although Python could switch to
>    the crit section in some cases where it uses the heavy lock today)
> 
>    Rationale: duh

Maybe there should be more fine-grained choices?  As you say, some
stuff could be used without this flag.  But in any case this is
trivial to add.

> 5) An analysis of Python's globals needs to be performed. Any global that
>    can safely be made "const" should. If a global is write-once (such as
>    classobject.c::getattrstr), then these are marginally okay (there is a 
>    race condition, with an acceptable outcome, but a mem leak occurs).
>    Personally, I would prefer a general mechanism in Python for creating
>    "constants" which can be tracked by the runtime and freed.

They are almost all string constants, right?  How about a macro
Py_CONSTSTROBJ("value", variable)?
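
For illustration only, a minimal sketch of how such a macro could
work. Plain C strings stand in for Python string objects so that the
example compiles on its own. CONSTSTROBJ, const_create,
const_free_all and const_registered are hypothetical names, and the
"__getattr__" value is just an example. The lazy initialization is
not thread-safe as written.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CONSTS 128          /* assumed cap on tracked constants */

/* Addresses of the variables that cache each constant. */
static char **const_registry[MAX_CONSTS];
static int const_count;

/* Create the constant, store it in *var and record var for cleanup.
   NOTE: under free threading this needs a critical section. */
char *const_create(const char *value, char **var)
{
    size_t n;
    char *copy;

    assert(value != NULL && var != NULL);
    if (const_count >= MAX_CONSTS)
        return NULL;
    n = strlen(value) + 1;
    copy = malloc(n);
    if (copy == NULL)
        return NULL;
    memcpy(copy, value, n);
    *var = copy;
    const_registry[const_count++] = var;
    return copy;
}

/* Yield the cached constant, creating it on first use. */
#define CONSTSTROBJ(value, variable) \
    ((variable) != NULL ? (variable) : const_create((value), &(variable)))

/* Free every tracked constant and reset its caching variable. */
void const_free_all(void)
{
    while (const_count > 0) {
        char **var = const_registry[--const_count];
        free(*var);
        *var = NULL;
    }
}

int const_registered(void)
{
    return const_count;
}

/* Example write-once global, in the style of classobject.c. */
static char *getattrstr;

const char *lookup_name(void)
{
    return CONSTSTROBJ("__getattr__", getattrstr);
}
```

Because the registry knows every caching variable, the runtime can
release them all at finalization, which removes the leak.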

>    I would also like to see a generalized "object pool" mechanism be built
>    and used for tuples, ints, floats, frames, etc.

Careful though -- generalizing this will slow it down.  (Here I find
myself almost wishing for C++ templates :-)
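
For illustration only, a minimal sketch of a generalized free-list
pool. The names object_pool, pool_init, pool_alloc, pool_free and
pool_clear are hypothetical. The item size and the cap are runtime
fields here, which is the kind of overhead a type-specific free list
with compile-time constants avoids.

```c
#include <assert.h>
#include <stdlib.h>

/* A freed item is reused as a link in the free list. */
typedef struct pool_node {
    struct pool_node *next;
} pool_node;

typedef struct {
    size_t     item_size;       /* bytes per item */
    pool_node *free_list;       /* cached items ready for reuse */
    int        n_free;          /* current length of free_list */
    int        max_free;        /* cap on cached items */
} object_pool;

void pool_init(object_pool *p, size_t item_size, int max_free)
{
    assert(p != NULL);
    /* Each item must be big enough to hold the free-list link. */
    p->item_size = item_size < sizeof(pool_node) ? sizeof(pool_node)
                                                 : item_size;
    p->free_list = NULL;
    p->n_free = 0;
    p->max_free = max_free;
}

/* Reuse a cached item if there is one, else fall back to malloc. */
void *pool_alloc(object_pool *p)
{
    if (p->free_list != NULL) {
        pool_node *n = p->free_list;
        p->free_list = n->next;
        p->n_free--;
        return n;
    }
    return malloc(p->item_size);
}

/* Cache the item for reuse, or free it if the pool is at its cap. */
void pool_free(object_pool *p, void *item)
{
    if (item == NULL)
        return;
    if (p->n_free < p->max_free) {
        pool_node *n = item;
        n->next = p->free_list;
        p->free_list = n;
        p->n_free++;
    }
    else
        free(item);
}

/* Return every cached item to the system allocator. */
void pool_clear(object_pool *p)
{
    while (p->free_list != NULL) {
        pool_node *n = p->free_list;
        p->free_list = n->next;
        free(n);
    }
    p->n_free = 0;
}
```

Under free threading each pool would also need its own critical
section, or one pool per thread.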

>    Rationale: any globals which are mutable must be made thread-safe. The
>    fewer non-const globals to examine, the fewer to analyze for race
>    conditions and thread-safety requirements.
> 
>    Note: making some globals "const" has a ripple effect through Python.
>    This is sometimes known as "const poisoning". Guido has stated an
>    acceptance to adding "const" throughout the interpreter, but would
>    prefer a complete (rather than ripple-based, partial) overhaul.

Actually, it's okay to do this on an "as-needed" basis.  I'm also in
favor of changing all the K&R code to ANSI, and getting rid of
Py_PROTO and friends.  Cleaner code!

> I think that is all for now. Achieving these five steps within the 1.6
> timeframe means that the free-threading patches will be *much* smaller. It
> also creates much more visibility and testing for these sections.

Alas.  Given the timeframe for 1.6 (6 weeks!), the need for thorough
testing of some of these changes, the extensive nature of some of the
changes, and my other obligations during those 6 weeks, I don't see
how it can be done for 1.6.  I would prefer to do an accelerated 1.7
or 1.6.1 release that incorporates all this.  (It could be called
1.6.1 only if it's nearly identical to 1.6 for the Python user and not
too different for the extension writer.)

> Post 1.6, a patch set to add critical sections to lists and dicts would be
> built. In addition, a new analysis would be done to examine the globals
> that are available along with possible race conditions in other mutable
> types and structures. Not all structures will be made thread-safe; for
> example, frame objects are used by a single thread at a time (I'm sure
> somebody could find a way to have multiple threads use or look at them,
> but that person can take a leap, too :-)

It is unacceptable to have thread-unsafe structures that can be
accessed in a thread-unsafe way using pure Python code only.

> Depending upon Guido's desire, the various schedules, and how well the
> development goes, Python 1.6.1 could incorporate the free-threading option
> in the base distribution.

Indeed.

--Guido van Rossum (home page: http://www.python.org/~guido/)