[Python-Dev] PyWeakref_GetObject() borrows its reference from... whom?

Tue Oct 11 07:17:28 EDT 2016

> Huh?  In all other circumstances, a "borrowed" reference is exactly that:
X has a reference, and you are relying on X's reference to keep the object
alive.  Borrowing from a borrowed reference is simply a chain of these; Z
borrows from Y, Y borrows from X, and X is the original person who did the
incref.  But you're borrowing from something specific, somebody who the API
guarantees has a legitimate reference on the object and won't drop it while
you're using it.  I bet for every other spot in the API I can tell you from
whom you're borrowing the reference.
>
> In contrast, the "borrowed" reference returned by PyWeakref_GetObject()
seems to be "borrowed" from some unspecified entity.  The fact that the
object is live in Python directly implies that, yes, *somebody* must have a
reference, somewhere.  But ISTM (and apparently you) that this is relying
on the GIL preventing that unknown other actor from dropping their
reference while you've borrowed it.  A guarantee that the post-Gilectomy
Python interpreter can no longer make!

Let me try again. The only rule for borrowing: x (borrowed reference) is
only guaranteed to be alive for as long as y (source) is guaranteed to be
alive. At least if you phrase it carefully enough, weakrefs still fit the
bill -- your borrowed reference is alive for as long as the weakref is
alive. Of course, the lifetime for a weakref is completely undefined in
GILectomized Python, and barely controllable even in regular CPython. So
this is not a great API.

At the very least we can contort definitions into making it plausible that
we borrowed from the weakref.  If we can only analyze code through the lens
of borrowing, we have no choice except to look at it this way.
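
A minimal sketch of that rule, using a toy reference-counting model in plain
C rather than the real CPython API. Every name here (ToyObject, ToyWeakRef,
toy_weak_get, toy_weak_upgrade) is hypothetical and only loosely mirrors
PyObject, PyWeakref_GetObject and Py_INCREF. The model is single-threaded, so
nothing can run between the "get" and the incref; that is the guarantee the
GIL is assumed to provide here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model, NOT the CPython API: all names are hypothetical. */
typedef struct ToyWeakRef ToyWeakRef;

typedef struct ToyObject {
    int refcount;      /* number of owned (strong) references */
    ToyWeakRef *weak;  /* at most one weak reference, for simplicity */
} ToyObject;

struct ToyWeakRef {
    ToyObject *target; /* reset to NULL when the target is freed */
};

/* Returns a new owned reference. */
ToyObject *toy_new(void) {
    ToyObject *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refcount = 1;
    o->weak = NULL;
    return o;
}

void toy_incref(ToyObject *o) {
    assert(o->refcount > 0);
    o->refcount++;
}

/* Drops one owned reference; frees the object when none remain. */
void toy_decref(ToyObject *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        if (o->weak != NULL) {
            o->weak->target = NULL; /* the weak reference now reads as dead */
        }
        free(o);
    }
}

void toy_weak_init(ToyWeakRef *w, ToyObject *o) {
    w->target = o; /* does not touch the refcount */
    o->weak = w;
}

/* Returns an unowned ("borrowed") pointer, or NULL if the target is gone.
 * The pointer is valid only while somebody else keeps the refcount > 0. */
ToyObject *toy_weak_get(const ToyWeakRef *w) {
    return w->target;
}

/* Upgrades at once, so the caller no longer depends on that unknown
 * "somebody". Returns a new owned reference, or NULL if the target is gone. */
ToyObject *toy_weak_upgrade(const ToyWeakRef *w) {
    ToyObject *o = toy_weak_get(w);
    if (o != NULL) {
        toy_incref(o);
    }
    return o;
}
```

In this model the pointer returned by toy_weak_get is backed by whoever still
owns the object, not by the weak reference itself; toy_weak_upgrade removes
that dependency by taking its own reference before anything else can run.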

> In any case, I see nothing in the documentation that suggests "borrowed
only means unowned" as you suggest.  In contrast, the documentation seems
to suggest that the metaphor is how I understood it; that when you "borrow"
a reference, there is another object who has a reference and you're relying
on their reference to keep the object alive.

This is its own big tangent. Sorry.

Your link is all I have too. It doesn't spell it out. AFAIK there are
exactly two kinds of references discussed anywhere in the docs: owned
references, where you are obligated to call Py_DECREF, and everything else.
Python exclusively uses the term "borrowed references" for that "everything
else". I don't know why. It's a good way to encourage reasonable practices,
I guess.

As the docs themselves note, this metaphor is useful but flawed: e.g. you
can "borrow" a reference and the original is still usable. But I'd go in
another direction -- the metaphor is sufficient to show the safety of some
code, but other safe code exists where "borrowing" doesn't easily explain
why it's safe, and we might need to introduce new concepts or switch
tactics. For example:

PyObject* x = PyList_New(0); // x is a new owned reference.
PyObject* y = x;  // y is a borrowed reference.
PyObject* z = x;  // z is also a borrowed reference.
Py_INCREF(y);  // y has been upgraded to an owned reference.
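Py_CLEAR(x);  // the original reference that z borrowed from has disappeared.
// The call below is provably safe (y still owns a reference), but you can't
// use the "borrowing" metaphor for that proof without introducing new
// concepts.
do_stuff(z);
Py_DECREF(y);  // release the owned reference once z is no longer in use.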

// You might wonder, "why would anyone ever do that?!?"
// One reason is that some functions "steal" references, so you need to
// borrow before handing off ownership:

// y is an owned reference.
my_struct->foo = y;  // borrowed a reference to y.
PyTuple_SetItem(my_struct->some_tuple, 0, y);  // stole y.
// Now, in a super-strict interpretation, y is no longer "valid", and the
// original borrowing relationship has been broken.
// We should ideally reason "as if" my_struct->foo borrowed from
// my_struct->some_tuple, even though it didn't.
// (Obviously, this example is still a little overcomplicated, but you get
// how this might happen IRL, yeah?
//  e.g. maybe z already existed and the API stealing a reference doesn't
// have a GET_ITEM macro.)
// [Note that this has a similar problem to weakref: another thread could
// mutate the tuple and delete the object. Yikes!]

I hope those examples made sense.

weakref is playing with fire in a similar way: PyWeakref_GetObject is safe
because someone still holds a reference, and there is a lock (the GIL) that
guarantees they keep it as long as you don't release that lock and don't
run any code that might delete a reference.  (In particular, it's guaranteed
safe to immediately upgrade the borrowed reference -- or at least it is
without the Gilectomy.)  Unlike the stealing tuple example, it isn't clear
where the owned reference is.

So what I mean by saying it's a red herring is that the correctness of
code doesn't hinge on how easy it is to apply the concept of borrowing, but
exclusively on lifetimes and whether your borrowed reference can be proven
to lie within the object's lifetime. Even if the code could not be explained
sensibly as a "borrow" from anyone, it can still be right -- that would
only make it confusing and dangerous.  (But that's a foregone conclusion
at this point.)
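
To make the lifetime argument concrete, here is a rough sketch that replays
the x/y/z snippet from earlier against a toy counter and checks whether the
use of z falls inside the object's lifetime. It is plain C, not the CPython
API; Traced, traced_incref, traced_decref, traced_use and both replay
functions are hypothetical names, and the object is only flagged as freed
(never actually released) so that the check itself is well defined.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, NOT the CPython API: all names are hypothetical. */
typedef struct {
    int refcount; /* number of owned references */
    int freed;    /* set to 1 once the last owned reference is dropped */
} Traced;

void traced_incref(Traced *o) {
    assert(!o->freed);
    o->refcount++;
}

void traced_decref(Traced *o) {
    assert(!o->freed);
    if (--o->refcount == 0) {
        o->freed = 1; /* end of the object's lifetime */
    }
}

/* Returns 1 if a use at this point lies within the object's lifetime. */
int traced_use(const Traced *o) {
    return !o->freed;
}

/* Replays the x/y/z example: y is upgraded before x is cleared. */
int replay_xyz(Traced *obj) {
    obj->refcount = 1;      /* x is a new owned reference */
    obj->freed = 0;
    Traced *x = obj;
    Traced *y = x;          /* unowned copy */
    Traced *z = x;          /* another unowned copy */
    traced_incref(y);       /* y becomes an owned reference */
    traced_decref(x);       /* the reference z was copied from goes away */
    x = NULL;
    int ok = traced_use(z); /* still alive, because y owns a reference */
    traced_decref(y);       /* lifetime ends here, after the use of z */
    return ok;
}

/* The same sequence without upgrading y: the use of z comes too late. */
int replay_without_upgrade(Traced *obj) {
    obj->refcount = 1;
    obj->freed = 0;
    Traced *x = obj;
    Traced *z = x;
    traced_decref(x);       /* last owned reference dropped */
    x = NULL;
    return traced_use(z);   /* outside the object's lifetime */
}
```

Neither function asks whom z "borrowed" from; the only question is whether
the use happens before the count reaches zero.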

-- Devin