[C++-sig] Mysterious triggerings of "UNREF invalid object" in _Py_ForgetReference

Niall Douglas s_sourceforge at nedprod.com
Sun May 27 16:05:44 CEST 2012


Try pinning everything to a single CPU, see what happens.

Try pinning the CPU clock speed to its minimum. If it doesn't trip, 
you have a timing race.
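
To make "timing race" concrete, here is a minimal sketch of one way an
unsynchronized update could leave the debug build's object list in a
state that the check quoted below rejects. Everything in it is an
assumption for illustration: Node, make_empty, link_after,
links_consistent and racy_double_insert are hypothetical stand-ins, not
CPython code, and the interleaving is replayed by hand rather than with
real threads. It is not a diagnosis of this particular crash.

```cpp
#include <cassert>

// ASSUMPTION: simplified stand-in for the debug-build object header, where
// every live object sits on a circular doubly-linked list behind a sentinel.
struct Node {
    Node* next = nullptr;
    Node* prev = nullptr;
};

// An empty list is a sentinel that points at itself in both directions.
void make_empty(Node& head) { head.next = head.prev = &head; }

// Correct, single-threaded insert of `op` right after the sentinel.
void link_after(Node& head, Node& op) {
    assert(head.next != nullptr && head.prev != nullptr);
    op.next = head.next;
    op.prev = &head;
    head.next->prev = &op;
    head.next = &op;
}

// The same three clauses as the quoted check, inverted: true means the
// neighbours on both sides still point back at `op`.
bool links_consistent(const Node& head, const Node& op) {
    return &op != &head && op.prev->next == &op && op.next->prev == &op;
}

// Hand-interleaved replay of two inserts (of x and y) that run without a
// lock. Both "threads" read the same first node, then overwrite each
// other's stores, which is the classic lost-update pattern.
void racy_double_insert(Node& head, Node& x, Node& y) {
    Node* old_first = head.next;  // A and B both read the same value
    x.next = old_first; x.prev = &head;  // thread A sets up x
    y.next = old_first; y.prev = &head;  // thread B sets up y
    old_first->prev = &y;                // B stores
    old_first->prev = &x;                // A overwrites B's store
    head.next = &x;                      // A stores
    head.next = &y;                      // B overwrites A's store
}
```

After racy_double_insert, x's previous node no longer points back at x
while x's next node still does, so only the middle clause of the check
fires for x. Holding one lock around every insert and unlink rules this
interleaving out.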

If you can build on Linux, try valgrind.

Oh, you mentioned bits you weren't locking when destroying. I'd lock 
those and see what happens.
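
A minimal sketch of what "lock those" could look like: drop the last
reference inside a scope that holds the interpreter lock. To keep it
self-contained, a std::recursive_mutex stands in for the GIL and
std::shared_ptr stands in for boost::shared_ptr; ScopedInterpreterLock,
Message and release_under_lock are hypothetical names. In real code the
guard would wrap PyGILState_Ensure() and PyGILState_Release() instead of
the mutex.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// ASSUMPTION: stand-in for the GIL so the sketch builds without Python.
std::recursive_mutex g_interpreter_lock;
int g_lock_depth = 0;  // written only while g_interpreter_lock is held

// RAII guard: take the lock on construction, drop it on destruction.
// A real guard would call PyGILState_Ensure() / PyGILState_Release() here.
class ScopedInterpreterLock {
public:
    ScopedInterpreterLock() { g_interpreter_lock.lock(); ++g_lock_depth; }
    ~ScopedInterpreterLock() { --g_lock_depth; g_interpreter_lock.unlock(); }
    ScopedInterpreterLock(const ScopedInterpreterLock&) = delete;
    ScopedInterpreterLock& operator=(const ScopedInterpreterLock&) = delete;
};

// Hypothetical message whose destructor would end up touching Python
// objects. It records whether the lock was held when it was destroyed.
struct Message {
    explicit Message(bool* flag) : destroyed_under_lock(flag) {}
    ~Message() { *destroyed_under_lock = (g_lock_depth > 0); }
    bool* destroyed_under_lock;
};

// Give up this owner's reference while holding the lock. If it was the
// last reference, the deleter (and anything it decrefs) runs locked.
template <typename T>
void release_under_lock(std::shared_ptr<T>& p) {
    ScopedInterpreterLock guard;
    assert(g_lock_depth > 0);
    p.reset();
}
```

The point of the design is that the shared_ptr is reset explicitly inside
the guarded scope, instead of being left to go out of scope at the end of
the C++ function after the lock has already been released.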

Niall

On 26 May 2012 at 13:19, Adam Preble wrote:

> This might be one for the main Python lists but since I have a whole lot of
> stuff wrapped in Boost floating about this problem, I wanted to try the
> C++-sig for starters.  I run my little game experiment for anywhere between
> 15 and 60 seconds, where it's sending a lot of events and messages around
> between my C++ runtime and the Python runtime.  The code that's failing
> often completes hundreds of times without fault.  I don't know when this
> started to happen, but it's something of a recent phenomenon.  Given the
> asynchronous stuff in my code, this could have been latent in it the whole
> time.
> 
> I'm specifically using Stackless Python 2.6.5 with Boost 1.49, with debug
> symbols, so I'll paste _Py_ForgetReference so it's in front of you:
> 
> void
> _Py_ForgetReference(register PyObject *op)
> {
> #ifdef SLOW_UNREF_CHECK
>     register PyObject *p;
> #endif
>     if (op->ob_refcnt < 0)
>         Py_FatalError("UNREF negative refcnt");
>     if (op == &refchain ||
>         op->_ob_prev->_ob_next != op || op->_ob_next->_ob_prev != op)
>         Py_FatalError("UNREF invalid object");
> #ifdef SLOW_UNREF_CHECK
>     for (p = refchain._ob_next; p != &refchain; p = p->_ob_next) {
>         if (p == op)
>             break;
>     }
>     if (p == &refchain) /* Not found */
>         Py_FatalError("UNREF unknown object");
> #endif
>     op->_ob_next->_ob_prev = op->_ob_prev;
>     op->_ob_prev->_ob_next = op->_ob_next;
>     op->_ob_next = op->_ob_prev = NULL;
>     _Py_INC_TPFREES(op);
> }
> 
> Here's where I get bit:
> 
>     if (op == &refchain ||
>         op->_ob_prev->_ob_next != op || op->_ob_next->_ob_prev != op)
>         Py_FatalError("UNREF invalid object");
> 
> I am developing in Visual Studio 2010, and I use the immediate window to
> test those logic clauses.
> 
> There are two general situations where it happens:
> 
> 1. A shared pointer to a message I created in the C++ runtime was passed to
> an object in the Python runtime, processed, and control was returned to
> C++.  The GIL was acquired going into Python and released coming back out.
> On the way out of the original C++ function it naturally decrements the
> shared_ptr use count and starts to destroy it.  That is what I want to
> happen.  Call stack of relevant bits:
> 
>   python26_d.dll!Py_FatalError(const char * msg)  Line 1679 C
>   python26_d.dll!_Py_ForgetReference(_object * op)  Line 2178 + 0xa bytes C
>   python26_d.dll!_Py_Dealloc(_object * op)  Line 2197 + 0x9 bytes C
>   wva.exe!boost::python::xdecref<_object>(_object * p)  Line 36 + 0xb3 bytes
> C++
>   wva.exe!boost::python::handle<_object>::reset()  Line 249 + 0xb bytes C++
>   wva.exe!boost::python::converter::shared_ptr_deleter::operator()(const
> void * __formal)  Line 36 C++
>   wva.exe!boost::detail::sp_counted_impl_pd<void
> *,boost::python::converter::shared_ptr_deleter>::dispose()  Line 149 C++
>   wva.exe!boost::detail::sp_counted_base::release()  Line 102 + 0xf bytes
> C++
>   wva.exe!boost::detail::shared_count::~shared_count()  Line 309 C++
> 
> If I probe that if condition I see this:
> op == &refchain
> 0
> op->_ob_prev->_ob_next != op
> 0
> op->_ob_next->_ob_prev != op
> 0
> 
> Nothing was true!  How could that conditional trigger?  All I can suppose
> is a gremlin came in and changed a condition on me.  Something that has
> concerned me is I don't grab the GIL when I deallocate these objects.  I
> don't know how I'd do that.  I have suspected that was a liability for
> a while, but I'm not entirely sure how.
> 
> 2. Within the same block of C++ code, at the point that I'm trying to
> transmit the message to the Python-derived object, it'll puke too.  So here
> it has already created a shared_ptr for the message and is triggering a
> callback into the Python code to handle it.  The Python-derived object is
> being called through a wrapper, and that call has the GIL.  This stack
> trace is much more obnoxious and it's difficult for me to make any sense of
> it.  Note there's some Stackless stuff in there.  I think the interpreter
> is at least starting to call some of the code in the Python derivation, but
> I haven't been able to figure out how far it gets.  I'll take any advice on
> how to probe this stuff since I feel I am too vague here--look for square
> brackets on a few lines for some things I figured out:
> 
>   python26_d.dll!Py_FatalError(const char * msg)  Line 1679 C
>   python26_d.dll!_Py_ForgetReference(_object * op)  Line 2178 C
>   python26_d.dll!_Py_Dealloc(_object * op)  Line 2197 + 0x9 bytes C
>   python26_d.dll!tupledealloc(PyTupleObject * op)  Line 170 + 0x86 bytes C
>   python26_d.dll!_Py_Dealloc(_object * op)  Line 2198 + 0x7 bytes C
>   python26_d.dll!PyObject_CallFunctionObjArgs(_object * callable, ...)
>  Line 2751 + 0x54 bytes C
>   python26_d.dll!handle_callback(_PyWeakReference * ref, _object *
> callback)  Line 881 + 0xf bytes C
>   python26_d.dll!PyObject_ClearWeakRefs(_object * object)  Line 928 + 0xd
> bytes C
>   wva.exe!instance_dealloc(_object * inst)  Line 344 + 0xa bytes C++
>  [This is Boost.Python class.cpp, statically linked]
>   python26_d.dll!subtype_dealloc(_object * self)  Line 1020 + 0x7 bytes C
>   python26_d.dll!_Py_Dealloc(_object * op)  Line 2198 + 0x7 bytes C
>        [I know here it's deallocating a wrapped type for a 3d vector I was
> passing around]
>   python26_d.dll!insertdict(_dictobject * mp, _object * key, long hash,
> _object * value)  Line 459 + 0x54 bytes C      [It has replaced an existing
> 3d vector with the passed one, and trying to nuke the old one]
>   python26_d.dll!PyDict_SetItem(_object * op, _object * key, _object *
> value)  Line 701 + 0x15 bytes C
>   python26_d.dll!PyObject_GenericSetAttr(_object * obj, _object * name,
> _object * value)  Line 1504 + 0x11 bytes C
>   python26_d.dll!PyObject_SetAttr(_object * v, _object * name, _object *
> value)  Line 1247 + 0x14 bytes C      [value is my 3d vector I am passing
> around]
>   python26_d.dll!PyEval_EvalFrame_value(_frame * f, int throwflag, _object
> * retval)  Line 2063 C
>   python26_d.dll!PyEval_EvalFrameEx_slp(_frame * f, int throwflag, _object
> * retval)  Line 836 + 0x15 bytes C
>   python26_d.dll!slp_frame_dispatch_top(_object * retval)  Line 719 + 0x12
> bytes C
>   python26_d.dll!slp_run_tasklet()  Line 1204 + 0x9 bytes C
>   python26_d.dll!slp_eval_frame(_frame * f)  Line 299 + 0x5 bytes C
>   python26_d.dll!climb_stack_and_eval_frame(_frame * f)  Line 266 + 0x9
> bytes C
>   python26_d.dll!slp_eval_frame(_frame * f)  Line 294 + 0x9 bytes C
>   python26_d.dll!PyEval_EvalCodeEx(PyCodeObject * co, _object * globals,
> _object * locals, _object * * args, int argcount, _object * * kws, int
> kwcount, _object * * defs, int defcount, _object * closure)  Line 3294 +
> 0x6 bytes C
>   python26_d.dll!function_call(_object * func, _object * arg, _object * kw)
>  Line 540 + 0x40 bytes C
>   python26_d.dll!PyObject_Call(_object * func, _object * arg, _object * kw)
>  Line 2502 + 0x3c bytes C
>   python26_d.dll!instancemethod_call(_object * func, _object * arg, _object
> * kw)  Line 2586 + 0x11 bytes C
>   python26_d.dll!PyObject_Call(_object * func, _object * arg, _object * kw)
>  Line 2502 + 0x3c bytes C
>   python26_d.dll!PyEval_CallObjectWithKeywords(_object * func, _object *
> arg, _object * kw)  Line 3931 + 0x11 bytes C
>   python26_d.dll!PyEval_CallFunction(_object * obj, const char * format,
> ...)  Line 556 + 0xf bytes C
>   wva.exe!boost::python::override::operator()<boost::shared_ptr<game::IComponentCommunicatable>,unsigned
> int,boost::shared_ptr<game::ComponentMessage> >(const
> boost::shared_ptr<game::IComponentCommunicatable> & a0, const unsigned int
> & a1, const boost::shared_ptr<game::ComponentMessage> & a2)  Line 138 +
> 0xac bytes C++
>   wva.exe!game::ComponentWrapper::IncomingSignalEvent(boost::shared_ptr<game::IComponentCommunicatable>
> source, unsigned int id, boost::shared_ptr<game::ComponentMessage> message)
>  Line 234 + 0x4b bytes C++
> 
> At least this time the logic condition is true...
> op == &refchain
> 0
> op->_ob_prev->_ob_next != op
> 1
> op->_ob_next->_ob_prev != op
> 0
> 
> What it's trying to free is of type __PyWeakref_RefType.
> 
> The 3d vector wrapper has a lot of methods, but I think of particular
> interest would be the class block.  It looks like this:
> 
>     class_<vector3df>("vector3df", init<float, float, float>())
> 
> Stuff in Python can create them and they can also get passed around from
> C++ to Python and back.
> 
> The impression I get is that something is getting freed before its time.  I
> couldn't tell if it's the shared_ptr self-destructing or the Python GC
> jumping the gun.
> 
> I don't have a good impression of what is wrong so I'm finding it hard to
> write a simplified, self-contained example for the list.  I didn't expect a
> silver bullet with this first message, but I figured somebody had enough
> experience that I could start isolating things and pare it down.
> 


-- 
Technology & Consulting Services - ned Productions Limited.
http://www.nedproductions.biz/. VAT reg: IE 9708311Q.
Work Portfolio: http://careers.stackoverflow.com/nialldouglas/




