Why GIL? (was Re: what's the point of rpython?)

Fri Jan 23 00:38:37 EST 2009

On Jan 22, 9:38 pm, Carl Banks <pavlovevide... at gmail.com> wrote:
> On Jan 22, 6:00 am, a... at pythoncraft.com (Aahz) wrote:
>
>
>
> > In article <7xd4ele060.... at ruckus.brouhaha.com>,
> > Paul Rubin  <http://phr...@NOSPAM.invalid> wrote:
>
> > >alex23 <wuwe... at gmail.com> writes:
>
> > >> Here's an article by Guido talking about the last attempt to remove
> > >> the GIL and the performance issues that arose:
>
> > >> "I'd welcome a set of patches into Py3k *only if* the performance for
> > >> a single-threaded program (and for a multi-threaded but I/O-bound
> > >> program) *does not decrease*."
>
> > >The performance decrease is an artifact of CPython's rather primitive
> > >storage management (reference counts in every object).  This is
> > >pervasive and can't really be removed.  But a new implementation
> > >(e.g. PyPy) can and should have a real garbage collector that doesn't
> > >suffer from such effects.
>
> > CPython's "primitive" storage management has a lot to do with the
> > simplicity of interfacing CPython with external libraries.  Any solution
> > that proposes to get rid of the GIL needs to address that.
>
> I recently was on a long road trip, and was not the driver, and with
> nothing better to do thought quite a bit about this.
>
> I concluded that, aside from one major trap, it wouldn't really be
> more difficult to interface Python to external libraries, just
> differently difficult.  Here is briefly what I came up with:
>
> 1. Change the singular Python type into three metatypes:
> immutable_type, mutable_type, and mutable_dict_type.  (In the latter
> case, the object itself is immutable but the dict can be modified.
> This, of course, would be the default metaclass in Python.)  Only
> mutable_types would require a mutex when accessing.
>
> 2. API wouldn't have to change much.  All regular API would assume
> that objects are unlocked (if mutable) and in a consistent state.
> It'll lock any mutable objects it needs to access.  There would also
> be a low-level API that assumes the objects are locked (if mutable)
> and does not require objects to be consistent.  I imagine most
> extensions would call the standard API most of the time.
>
> 3. If you are going to use the low-level API on a mutable object, or
> are going to access the object structure directly, you need to acquire
> the object's mutex. Macros such as Py_LOCK(), Py_LOCK2(), Py_UNLOCK()
> would be provided.
>
> 4. Objects would have to define a method, to be called by the GC, that
> marks every object it references.  This would be a lot like the
> current tp_traverse, except it has to be defined for any object that
> references another object, not just objects that can participate in
> cycles.  (A conservative garbage collector wouldn't suffice for Python
> because Python quite often allocates blocks but sets the pointer to an
> offset within the block.  In fact, that's true of almost any Python-
> defined type.)  Unfortunately, references on the stack would need to
> be registered as well, so "PyObject* p;" might have to be replaced
> with something like "Py_DECLARE_REF(PyObject,p);" which magically
> registers it.  Ugly.
>
> 5. Py_INCREF and Py_DECREF are gone.
>
> 6. GIL is gone.
>
> So, you gain the complexity of a two-level API, having to lock mutable
> objects sometimes, and defining more visitor methods than before, but
> you don't have to keep INCREFs and DECREFs straight, which is no small
> thing.
>
> The major trap is the possibility of deadlock.  To help minimize the
> risk there would be macros to lock multiple objects at once, such as
> Py_LOCK2(a,b), which guarantees that if another thread is calling
> Py_LOCK2(b,a) at the same time, it won't result in a deadlock.  What's
> disappointing is that the deadlocking possibility is always with you,
> much like the reference counts are.
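
For reference, the ordering trick that a macro like the proposed
Py_LOCK2 would presumably rely on can be modelled in a few lines.  This
is only a rough Python sketch, not C API code: the Mutable class, lock2(),
swap() and run_demo() are all hypothetical names invented for the
illustration, and ordering by id() is just one possible choice of a
global lock order.

```python
import threading
from contextlib import contextmanager


class Mutable:
    """Hypothetical stand-in for an object of a mutable type with its own mutex."""

    def __init__(self, value):
        self.value = value
        self.mutex = threading.Lock()


@contextmanager
def lock2(a, b):
    """Hold both mutexes, always acquiring them in the same global order."""
    if a is b:
        # Same object passed twice: take its (non-reentrant) lock only once.
        with a.mutex:
            yield
        return
    # Sorting by id() means lock2(a, b) and lock2(b, a) agree on the order,
    # so two threads can never each hold one lock while waiting for the other.
    first, second = sorted((a, b), key=id)
    with first.mutex:
        with second.mutex:
            yield


def swap(a, b):
    """Exchange the values of two objects while holding both mutexes."""
    with lock2(a, b):
        a.value, b.value = b.value, a.value


def run_demo(rounds=2000):
    """Run swap(x, y) and swap(y, x) concurrently; report whether both finished."""
    x, y = Mutable(1), Mutable(2)
    t1 = threading.Thread(target=lambda: [swap(x, y) for _ in range(rounds)],
                          daemon=True)
    t2 = threading.Thread(target=lambda: [swap(y, x) for _ in range(rounds)],
                          daemon=True)
    t1.start()
    t2.start()
    t1.join(timeout=10)
    t2.join(timeout=10)
    finished = not t1.is_alive() and not t2.is_alive()
    return finished, sorted((x.value, y.value))
```

Note that this only covers the case where both objects are known up
front.  It does nothing for code that already holds one lock and then
discovers it needs a second, which is where the deadlock risk really
lives.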

IMO, locking of the object is a secondary problem.  Python-safethread
provides one solution, but it's not the only conceivable one.  For the
sake of discussion it's easier to assume somebody else is solving it
for you.

Instead, focus on just the garbage collection.  What are the practical
issues of modifying CPython to use a tracing GC throughout?  It
certainly is possible to write an exact GC in C, but the stack
manipulation would be hideous.  It'd also require significant rewrites
of the entire code base.  On top of that, the performance is unclear (it
could be far worse for a single-threaded program), and there is no
straightforward way to make it a compile-time option.
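
To make the "exact GC" requirement concrete, here is a toy
mark-and-sweep model.  It is a sketch in Python rather than C, and every
name in it (Obj, Heap, traverse, push_root, pop_root, demo) is made up
for the illustration.  The explicit root stack stands in for what
something like Py_DECLARE_REF would have to maintain for every local
reference in C, and traverse() stands in for the per-type marking
method from point 4 above.

```python
class Obj:
    """Toy heap object; 'refs' plays the role of its outgoing pointers."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.marked = False

    def traverse(self):
        # Stand-in for the per-type marking method: report every object
        # this one references, whether or not it can be part of a cycle.
        return list(self.refs)


class Heap:
    """Toy heap with an explicit root stack instead of reference counts."""

    def __init__(self):
        self.objects = []  # every allocated object not yet swept
        self.roots = []    # registered local references (shadow stack)

    def alloc(self, name):
        obj = Obj(name)
        self.objects.append(obj)
        return obj

    def push_root(self, obj):
        self.roots.append(obj)

    def pop_root(self):
        return self.roots.pop()

    def collect(self):
        """Mark everything reachable from the roots, then sweep the rest."""
        for obj in self.objects:
            obj.marked = False
        pending = list(self.roots)
        while pending:
            obj = pending.pop()
            if not obj.marked:
                obj.marked = True
                pending.extend(obj.traverse())
        freed = [o.name for o in self.objects if not o.marked]
        self.objects = [o for o in self.objects if o.marked]
        return freed


def demo():
    heap = Heap()
    a = heap.alloc("a")
    heap.push_root(a)      # registered, so it survives a collection
    b = heap.alloc("b")
    a.refs.append(b)       # reachable through a
    heap.alloc("c")        # local reference that was never registered
    d = heap.alloc("d")
    e = heap.alloc("e")
    d.refs.append(e)       # unrooted cycle d <-> e
    e.refs.append(d)
    first = heap.collect()
    heap.pop_root()        # a goes out of scope
    second = heap.collect()
    return first, second
```

In this model the unrooted cycle is reclaimed with no special handling,
but so is "c", whose only fault was that nobody registered it.  In C
that would be a dangling pointer, which is why every single local
reference would need the registration treatment.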

Got any ideas for that?