[C++-sig] GIL problems during destruction of a boost::python::wrapper

Mon May 7 20:28:23 CEST 2012

On 7 May 2012 at 10:30, Adam Preble wrote:

> I want to make sure I understand the repercussions.
> 
> I understand if I were introducing C++ objects into the Python runtime as
> internal references that I would be inviting disaster if I delete them on
> the C++ side, but continue to use them on the Python side.  I don't think
> either of us were talking about that but I wanted to make sure I understood
> the boundaries here.

What I'm saying is that the wrapper is one object, and the object
being wrapped is another, totally separate object. Python can delete
the wrapper at any time, and C++ can delete the object being wrapped
at any time. That keeps both happy. For obvious reasons, if you
delete the object being wrapped you want to reset the smart pointer
in the wrapper to null, i.e. zombify it. I believe BPL knows to throw
a python exception if the smart pointer in the wrapper is null.
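
The wrapper/wrapped split can be sketched in plain C++, without
Boost.Python. This is only an illustration under assumptions: Widget
and WidgetHandle are hypothetical names, and a std::weak_ptr stands in
for "reset the smart pointer to null", since it goes null by itself
once the C++ side deletes the object. The std::runtime_error stands in
for the python exception.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical wrapped object, owned by the C++ side.
struct Widget {
    std::string name;
};

// Hypothetical stand-in for the wrapper: it observes the Widget but
// does not keep it alive.
class WidgetHandle {
public:
    explicit WidgetHandle(const std::shared_ptr<Widget>& target)
        : target_(target) {}

    // False once the wrapped object has been deleted (a "zombie").
    bool alive() const { return !target_.expired(); }

    // Returns a temporary owning reference, or throws if the wrapped
    // object is gone.
    std::shared_ptr<Widget> get() const {
        std::shared_ptr<Widget> strong = target_.lock();
        if (!strong) {
            throw std::runtime_error("wrapped object no longer exists");
        }
        return strong;
    }

private:
    std::weak_ptr<Widget> target_;
};
```

Either side can now go away first: destroying the handle never touches
the Widget, and destroying the Widget turns later get() calls into an
error instead of a dangling access.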

> Now on to shared pointer references.  I am utilizing shared pointers for
> this particular circumstance.  The object was created in Python and went
> out of scope.  If it weren't for the shared pointer, that would be the end
> of it, but the pointer was passed to my C++ runtime and retained in a data
> structure for future work.  This kept it alive--as I had desired.  When the
> object eventually leaves that structure and the refcount drops to zero, it
> moves on to destruction.

If you want C++ code to have a say in the lifetime of a python
object, simply hold a reference to it (a boost::python::object does
this for you). BPL will decrement the refcount when the reference is
destroyed, thus destroying the object if that's the right thing to
do.
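
A rough model of that in standard C++, assuming nothing about BPL's
internals: Task and make_tracked are hypothetical names, and the
on_release callback stands in for "decrement the python refcount". The
thing to notice is that the release runs at the moment the last
reference is dropped, in whatever thread drops it.

```cpp
#include <functional>
#include <memory>

// Hypothetical payload kept alive by the C++ side.
struct Task {
    int id;
};

// Creates a Task whose deleter first calls `on_release` (a stand-in
// for dropping the python reference) and then frees the Task.
std::shared_ptr<Task> make_tracked(int id,
                                   std::function<void(int)> on_release) {
    return std::shared_ptr<Task>(
        new Task{id},
        [on_release](Task* task) {
            on_release(task->id);  // runs in the thread that drops the last ref
            delete task;
        });
}
```

With real python objects the release has to happen while the GIL is
held, so it matters which thread removes the last reference from your
data structure.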

Generally, though, you DON'T want to keep python wrappers of C++ 
objects around manually [1]. You keep the C++ object around only. If 
python needs to pythonify it, you should set that up to happen on 
demand.

[1]: The obvious exception is when a refcount toggles between zero 
and one through a call stack, thus causing lots of constructions and 
destructions of python wrapper objects. Here it's wise to manually 
hack the refcount.

> It does look to me like Python is trying to take care of it since I
> immediately pile up through a bunch of Python runtime functions before I
> eventually hit my favorite "no such thread" GIL error in the runtime.
> 
> I'm not so sure what to do but I can try to search the distribution based
> on what you said in hopes of getting some specifics.  For the time being, I
> thought it was a deterministic problem, but like most asynchronous stuff,
> it went away the next day on a fresh boot with the code slightly altered.
>  What I had done was write an empty destructor for the wrapper, just in
> anticipation of filling it in with something here.  I can't imagine that
> fixing the problem--I wouldn't know why it would fix it.  I think I'll try
> to strip that code out and make it come back.

An empty user-written destructor replaces the compiler-generated
default destructor; it does not affect the default constructor. If
you're on an older MSVC, I vaguely remember a bug where MSVC failed
to call a default destructor in some circumstances, and a quick way
of fixing it was to write out the destructor manually.
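
For reference, what an empty user-written destructor does and does not
change can be checked with type traits. A minimal sketch, with
hypothetical class names; it says nothing about the MSVC bug itself:

```cpp
#include <type_traits>

// Hypothetical wrapper with no user-written special members.
struct PlainWrapper {};

// Same class, with an empty user-written destructor added.
struct WrapperWithDtor {
    ~WrapperWithDtor() {}
};

// The default constructor is still generated in both cases.
static_assert(std::is_default_constructible<PlainWrapper>::value,
              "implicit default constructor expected");
static_assert(std::is_default_constructible<WrapperWithDtor>::value,
              "a user-written destructor does not remove it");

// What changes is that the destructor is no longer trivial, so the
// compiler has to emit a real call to it.
static_assert(std::is_trivially_destructible<PlainWrapper>::value,
              "implicit destructor is trivial");
static_assert(!std::is_trivially_destructible<WrapperWithDtor>::value,
              "user-written destructor is never trivial");
```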

> Meanwhile, I'm trying to work on farming all Python work out to a dedicated
> thread, and have all these wrappers just inject commands into a stack on
> it.  It looks like if I can keep everything bound to there I won't have
> issues like this <knocks on wood>.

As BPL is currently designed, you generally only want to use it from
a single thread, and indeed just once in any python interpreter. If
your C++ is inherently multi-threaded though (e.g. it uses threads to
implement async i/o), serialising everything can be painful.
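
The dedicated-thread idea can be sketched with the standard library
alone. SerialWorker is a hypothetical helper, not part of BPL: every
job posted to it runs on one thread, in order, so all python work
would go through post() and nothing else would touch the interpreter.
It uses a FIFO queue rather than a stack so commands keep their order.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper: runs every posted job on one dedicated thread.
class SerialWorker {
public:
    SerialWorker() : thread_([this] { run(); }) {}

    ~SerialWorker() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            stopping_ = true;
        }
        wake_.notify_one();
        thread_.join();  // the worker drains queued jobs before exiting
    }

    SerialWorker(const SerialWorker&) = delete;
    SerialWorker& operator=(const SerialWorker&) = delete;

    // Queues a job. The future becomes ready once the job has run and
    // rethrows from get() if the job threw.
    std::future<void> post(std::function<void()> job) {
        auto task =
            std::make_shared<std::packaged_task<void()>>(std::move(job));
        std::future<void> done = task->get_future();
        {
            std::lock_guard<std::mutex> guard(mutex_);
            jobs_.push_back([task] { (*task)(); });
        }
        wake_.notify_one();
        return done;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock,
                           [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) {
                    return;  // only reached once stopping_ is set
                }
                job = std::move(jobs_.front());
                jobs_.pop_front();
            }
            job();  // run outside the lock so post() never blocks on a job
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::deque<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::thread thread_;  // declared last: starts after the members above exist
};
```

Other threads call worker.post(...) and either ignore the future or
call get() on it when they need the result of the job to be visible.
The final release of any python-owning reference would need to be
posted the same way.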

The GIL isn't really just a lock; it also involves the per-thread
setting of what the current interpreter is. What confuses people is
that sometimes you must set the current interpreter but leave the
lock unlocked for certain functionality to work right. Then people
overdo it and take the lock too frequently, thus introducing
deadlocks, or they underdo it and end up with the GIL not set. You
can have the "GIL" set in the sense of the current interpreter being
set, while the lock itself is held by a different thread. If you ever
try interworking when each thread has its own interpreter you'll
become a dab hand at this sort of stuff.

I know it's frustrating, and there is very little documentation on 
this. I can promise you that if you keep at it, one day it clicks and 
it all starts to work very nicely. Hopefully, sometime this year I'll 
get the funding in place to implement a proper Boost generic type 
registry and things like manual GIL and interpreter management can go 
the way of the dodo.

Niall

-- 
Technology & Consulting Services - ned Productions Limited.
http://www.nedproductions.biz/. VAT reg: IE 9708311Q.
Work Portfolio: http://careers.stackoverflow.com/nialldouglas/