[Python-Dev] Is there any remaining reason why weakref callbacks shouldn't be able to access the referenced object?

Nathaniel Smith njs at pobox.com
Sat Oct 22 14:22:12 EDT 2016


On Sat, Oct 22, 2016 at 3:01 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 22 October 2016 at 16:05, Nathaniel Smith <njs at pobox.com> wrote:
>> On Fri, Oct 21, 2016 at 8:32 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> But PEP 442 already broke all that :-). Now weakref callbacks can
>> happen before __del__, and they can happen on objects that are about
>> to be resurrected.
>
> Right, but the resurrection can still only happen *in* __del__, so the
> interpreter doesn't need to deal with the case where it happens in a
> weakref callback instead - that's where the freedom to do the
> callbacks and the __del__ in either order comes from.

I think we're probably on the same page here, but to be clear, my
point is that right now the resurrection logic seems to be (a) run
some arbitrary Python code (__del__), then (b) check whether a
resurrection occurred (the details of that check depend on whether
the object is part of a cyclic isolate). Since these two phases are
already decoupled from each other, it shouldn't cause any particular
difficulty for the interpreter if we added weakref callbacks to the
"run arbitrary code" phase. If we wanted to.
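
To make the two phases concrete, here is a minimal sketch of what they
look like from Python code on CPython 3.4+ (i.e. with PEP 442). The
names Phoenix, survivors and del_calls are made up for illustration;
the interpreter-side check itself is not visible from Python, only its
effect is.

```python
survivors = []   # strong references created from inside __del__
del_calls = []   # one entry per __del__ invocation

class Phoenix:
    def __del__(self):
        # Phase (a): arbitrary Python code runs. Here it resurrects
        # the object by storing a new strong reference to it.
        del_calls.append(id(self))
        survivors.append(self)

p = Phoenix()
ident = id(p)
del p
# Phase (b) has already happened inside the interpreter at this point:
# after __del__ returned it noticed the new reference and kept the
# object alive instead of freeing it.
resurrected = survivors[0]

# Dropping the last reference again frees the object for real. Under
# PEP 442 the finalizer of an ordinary instance is not run a second time.
survivors.clear()
del resurrected
```

Nothing in phase (b) depends on the code in phase (a) having been a
__del__ method specifically, which is the point of the paragraph above.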

>> There remains one obscure corner case where multiple resurrection is
>> possible, because the resurrection-prevention flag doesn't exist on
>> non-GC objects, so you'd still be able to take new weakrefs to those.
>> But in that case __del__ can already do multiple resurrections, and
>> some fellow named Nick Coghlan seemed to think that was okay back in
>> 2013 [1], so probably it's not too bad ;-).
>>
>> [1] https://mail.python.org/pipermail/python-dev/2013-June/126850.html
>
> Right, that still doesn't bother me.
>
>>> Changing that to support resurrecting the object so it can be passed
>>> into the callback without the callback itself holding a strong
>>> reference means losing the main "reasoning about software" benefit
>>> that weakref callbacks offer: they currently can't resurrect the
>>> object they relate to (since they never receive a strong reference to
>>> it), so it nominally doesn't matter if the interpreter calls them
>>> before or after that object has been entirely cleaned up.
>>
>> I guess I'm missing the importance of this -- does the interpreter
>> gain some particular benefit from having flexibility about when to
>> fire weakref callbacks? Obviously it has to pick one in practice.
>
> Sorry, my attempted clarification of one practical implication made it
> look like I was defining the phrase I had in quotes. However, the
> "reasoning about software" benefit I see is "If you don't define
> __del__, you don't need to worry about object resurrection, as it's
> categorically impossible when only using weakref callbacks".
> Interpreter implementors are just one set of beneficiaries of that
> simplification - everyone writing weakref callbacks qualifies as well.

I do like invariants, but I'm having trouble seeing why this one is
super valuable. I mean, if your object doesn't define __del__, then
it's also impossible to distinguish between a weakref callback
resurrecting the object and a strong reference that prevents it from
being collected in the first place. And certainly it's harmless in the
use case I have in mind, where normally the weakref would be created
in the object's __init__ anyway :-).
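
For reference, a small sketch of the current behaviour that this
invariant rests on: the callback is handed the weakref object, not the
referent, and by then the weakref is already dead. Target matches the
class in the quoted example below; seen and on_dead are illustrative
names only.

```python
import gc
import weakref

class Target:
    pass

seen = []  # what ref() returned when called inside the callback

def on_dead(ref):
    # The callback receives the weakref itself. It has already been
    # cleared, so there is no way to reach the referent from here.
    seen.append(ref())

obj = Target()
r = weakref.ref(obj, on_dead)   # keep r alive, or the callback never fires
assert r() is obj               # while obj is alive, the weakref resolves

del obj
gc.collect()  # not needed on CPython (refcounting already ran the callback)
```

After this runs, seen holds a single None: the callback fired once and
could not get at the object.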

> However, if you're happy defining __del__ methods, then PEP 442 means
> you can already inject lazy cyclic cleanup that supports resurrection:
>
>     >>> class Target:
>     ...     pass
>     ...
>     >>> class Resurrector:
>     ...     def __init__(self, target):
>     ...         _self_ref = "_resurrector_{:d}".format(id(self))
>     ...         self.target = target
>     ...         setattr(target, _self_ref, self)
>     ...     def __del__(self):
>     ...         globals()["resurrected"] = self.target
>     ...
>     >>> obj = Target()
>     >>> Resurrector(obj)
>     <__main__.Resurrector object at 0x7f42f8ae34e0>
>     >>> del obj
>     >>> resurrected
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     NameError: name 'resurrected' is not defined
>     >>> import gc
>     >>> gc.collect(); gc.collect(); gc.collect()
>     6
>     4
>     0
>     >>> resurrected
>     <__main__.Target object at 0x7f42f8ae3438>
>
> Given that, I don't see a lot of benefit in making weakref callbacks
> harder to reason about when __del__ + attribute injection already
> makes this possible.

That's a cute trick :-). But it does have one major downside compared
to allowing weakref callbacks to access the object normally. With
weakrefs you don't interfere with when the object is normally
collected, and in particular objects that aren't part of cycles are
still collected promptly (on CPython). With the attribute-injection
approach every object becomes part of a cycle, so objects that would
otherwise be collected promptly won't be.
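
A rough sketch of the difference, assuming CPython's reference
counting. Finalizer is a hypothetical stand-in for the Resurrector
class above, except that its __del__ only records that it ran instead
of resurrecting anything; events and the other names are likewise just
for illustration.

```python
import gc
import weakref

events = []

class Target:
    pass

class Finalizer:
    def __init__(self, target):
        self.target = target
        # Attribute injection: target -> finalizer -> target is a cycle.
        setattr(target, "_finalizer", self)

    def __del__(self):
        events.append("del")

gc.disable()  # keep the automatic cyclic collector out of the picture

# Case 1: plain weakref callback. The object is not in a cycle, so on
# CPython it is freed as soon as the last strong reference goes away.
plain = Target()
plain_ref = weakref.ref(plain, lambda r: events.append("callback"))
del plain
prompt_after_del = plain_ref() is None

# Case 2: attribute injection. The cycle keeps the object alive until
# the cyclic garbage collector gets around to it.
cyclic = Target()
Finalizer(cyclic)
cyclic_ref = weakref.ref(cyclic)
del cyclic
alive_after_del = cyclic_ref() is not None

gc.collect()  # only now is the cycle found and the finalizer run
gc.enable()
```

On CPython the first object is gone immediately after the del, while
the second one lingers until gc.collect().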

(Remember that the reason I started thinking about this was that I was
wondering if we could have a nice API for the asyncio event loop to
"take over" the job of finalizing an object -- so ideally you'd want
this finalizer to act as much like a regular __del__ method as
possible.)

Anyway I doubt we'll see any changes to this in the immediate future,
but it's nice to get a sense of what the possible design landscape
looks like...

-n

-- 
Nathaniel J. Smith -- https://vorpus.org

