[Python-Dev] Py_CLEAR to avoid crashes
Daniel Stutzbach
daniel at stutzbachenterprises.com
Tue Feb 19 01:13:24 CET 2008
On Feb 18, 2008 4:52 PM, Neil Schemenauer <nas at arctrix.com> wrote:
> That sucks. Most Py_DECREF calls are probably okay but it's going
> to be hard to find the ones that are not. I can't think of anything
> we can do to make this trap harder to fall into. Even using
> Py_CLEAR as a blunt tool is not a total solution. You could still
> end up with a null pointer dereference if the code is not written
> carefully.
>
Container types (particularly lists) go to great lengths to postpone
object deletion. For example, to delete a slice from a list, the items
being removed are first copied to a temporary array, then the list object's
pointers are modified, and only then is Py_DECREF called on each removed
item, just before returning.
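To make the ordering concrete, here is a rough sketch of that pattern on a toy model. It does not use the real CPython structures: Obj, ToyList, obj_decref and toylist_delete_slice are hypothetical names invented for illustration, and the real code in listobject.c is more involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for PyObject: just a reference count (an assumption). */
typedef struct Obj { long refcnt; } Obj;

static long destroyed = 0;   /* counts destroyed objects, for the demo only */

static void obj_decref(Obj *o) {
    if (--o->refcnt == 0) { destroyed++; free(o); }
}

/* Toy list: an array of owned references and its length. */
typedef struct { Obj **items; size_t len; } ToyList;

/* Delete items[lo:hi]. Returns 0 on success, -1 on error. */
int toylist_delete_slice(ToyList *l, size_t lo, size_t hi) {
    assert(l != NULL);
    if (lo > hi || hi > l->len) return -1;
    size_t n = hi - lo;
    if (n == 0) return 0;

    /* 1. Copy the doomed references to a temporary array. */
    Obj **tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL) return -1;
    memcpy(tmp, l->items + lo, n * sizeof *tmp);

    /* 2. Put the list into its final, consistent state. */
    memmove(l->items + lo, l->items + hi, (l->len - hi) * sizeof *l->items);
    l->len -= n;

    /* 3. Only now drop the references. Anything a destructor does
          will see a list that no longer contains these items. */
    for (size_t i = 0; i < n; i++) obj_decref(tmp[i]);
    free(tmp);
    return 0;
}

/* Small demo: returns 1 if the slice deletion behaved as expected. */
int demo_delete_slice(void) {
    Obj *objs[5];
    ToyList l = { objs, 5 };
    for (size_t i = 0; i < 5; i++) {
        objs[i] = malloc(sizeof *objs[i]);
        if (objs[i] == NULL) return 0;
        objs[i]->refcnt = 1;
    }
    Obj *first = objs[0], *fourth = objs[3], *fifth = objs[4];
    long before = destroyed;

    int rc = toylist_delete_slice(&l, 1, 3);
    int ok = rc == 0 && l.len == 3 && destroyed - before == 2 &&
             l.items[0] == first && l.items[1] == fourth &&
             l.items[2] == fifth;

    for (size_t i = 0; i < l.len; i++) obj_decref(l.items[i]);
    return ok;
}
```

The point of step 3 coming last is that the list is never observable in a half-updated state while arbitrary destructor code runs.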
I have always seen this as a robustness versus efficiency issue. It's
theoretically possible to set things up so that reference counter decrements
are actually postponed until after the C method/slot returns, but it's
slower than doing it immediately. I wonder if adding support for postponed
decrements (without making it mandatory) would at least make the trap harder
to fall into.
For example:
- maintain a global array of pending decrefs
- before calling into any C method/slot, save the index of the current
end-of-array (in a local C variable on the stack)
- call the C method, which may call Py_DECREF_LATER(x) to append x to the
global array
- when the C method returns, decref anything newly appended to the array
The array would grow and shrink just as a list does (O(1) amortized time to
add/remove a pointer).
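A minimal sketch of the scheme above, again on a toy object model rather than the real interpreter. Only the name Py_DECREF_LATER comes from the proposal; decref_later, call_with_deferred_decrefs, Obj and the demo functions are hypothetical stand-ins, and the doubling growth policy is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for PyObject (an assumption, not the real layout). */
typedef struct Obj {
    long refcnt;
    int *freed_flag;   /* set to 1 on destruction, for the demo only */
} Obj;

static void obj_decref(Obj *o) {
    if (--o->refcnt == 0) {
        if (o->freed_flag) *o->freed_flag = 1;
        free(o);
    }
}

/* Global array of pending decrefs; grows by doubling. */
static Obj **pending = NULL;
static size_t pending_len = 0, pending_cap = 0;

/* Plays the role of Py_DECREF_LATER(x). Returns 0, or -1 if out of memory. */
static int decref_later(Obj *o) {
    if (pending_len == pending_cap) {
        size_t cap = pending_cap ? pending_cap * 2 : 16;
        Obj **p = realloc(pending, cap * sizeof *p);
        if (p == NULL) return -1;
        pending = p;
        pending_cap = cap;
    }
    pending[pending_len++] = o;
    return 0;
}

typedef void (*cmethod)(void *arg);

/* Wrapper the interpreter would use around every C method/slot call. */
static void call_with_deferred_decrefs(cmethod fn, void *arg) {
    size_t mark = pending_len;       /* saved end-of-array index */
    fn(arg);
    assert(pending_len >= mark);
    /* Decref everything appended during the call. A destructor may
       append more entries; the loop drains those too. */
    while (pending_len > mark)
        obj_decref(pending[--pending_len]);
}

static int alive_during_call = 0;

static void demo_method(void *arg) {
    Obj *o = arg;
    if (decref_later(o) != 0) return;        /* demo simply gives up */
    alive_during_call = (o->refcnt == 1);    /* o is still valid here */
}

/* Returns 1 if the object survived the call and died right after it. */
int demo_deferred_decref(void) {
    int freed = 0;
    Obj *o = malloc(sizeof *o);
    if (o == NULL) return 0;
    o->refcnt = 1;
    o->freed_flag = &freed;
    call_with_deferred_decrefs(demo_method, o);
    return alive_during_call && freed && pending_len == 0;
}
```

The per-call overhead in this sketch is saving mark before the call and comparing it with pending_len afterwards; everything else only costs something when a method actually defers a decref.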
This would simplify a number of places in listobject.c as well as remove the
need for Py_TRASHCAN_*. It would be entirely optional, so anyone who is
very careful and wants the speed of Py_DECREF can have it. Also, the
deferment is very brief, since the decrefs occur right after the C method
returns.
The downside is having to store and check the global array length on every C
method call (basically 3 machine instructions). The machine instructions
aren't so bad, but I'm not sure about the effects on the CPU cache.
So, like I said, a robustness versus performance trade-off. :-(
--
Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises LLC