Can someone explain this weakref behavior?
David MacQuigg
dmq at gain.com
Mon Jun 14 19:25:57 EDT 2004
On Fri, 11 Jun 2004 18:09:32 -0400, "Tim Peters" <tim.one at comcast.net>
wrote:
>[David MacQuigg, trying to keep track of how many instances of a class
> currently exist]
>
>...
>
>> Seems like we could do this more easily with a function that lists
>> instances, like __subclasses__() does with subclasses. This doesn't have
>> to be efficient, just reliable. So when I call cls.__instances__(), I
>> get a current list of all instances in the class.
>>
>> Maybe we could implement this function using weak references. If I
>> understand the problem with weak references, we could have a
>> WeakValueDictionary with references to objects that actually have a
>> refcount of zero.
>
>Not in CPython today (and in the presence of cycles, the refcount on an
>object isn't related to whether it's garbage).
>
>> There may be too many entries in the dictionary, but never too few.
>
>Right!
>
>> In that case, maybe I could just loop over every item in
>> my WeakValueDictionary, and ignore any with a refcount of zero.
>>
>> def _getInstances(cls):
>>     d1 = cls.__dict__.get('_instances' , {})
>>     d2 = {}
>>     for key in d1:
>>         if sys.getrefcount(d1[key]) > 0:
>>             d2[key] = d1[key]
>>     return d2
>> _getInstances = staticmethod(_getInstances)
>>
>> I'm making some assumptions here that may not be valid, like
>> sys.getrefcount() for a particular object really will be zero immediately
>> after all normal references to it are gone. i.e. we don't have any
>> temporary "out-of-sync" problems like with the weak references
>> themselves.
>>
>> Does this seem like a safe strategy?
>
>An implementation of Python that doesn't base its garbage collection
>strategy on reference counting won't *have* a getrefcount() function, so if
>you're trying to guard against Python switching gc strategies, this is a
>non-starter (it solves the problem for, and only for, implementations of
>Python that don't have the problem to begin with <wink>).
>
>Note that CPython's getrefcount() can't return 0 (see the docs). Maybe
>comparing against 1 would capture your intent.
>
>Note this part of the weakref docs:
>
> NOTE: Caution: Because a WeakValueDictionary is built on top of a Python
> dictionary, it must not change size when iterating over it. This can be
> difficult to ensure for a WeakValueDictionary because actions performed by
> the program during iteration may cause items in the dictionary to vanish
> "by magic" (as a side effect of garbage collection).
>
>If you have threads too, it can be worse than just that.
>
>Bottom line: if you want semantics that depend on the implementation using
>refcounts, you can't worm around that. Refcounts are the only way to know
>"right away" when an object has become trash, and even that doesn't work in
>the presence of cycles. Short of that, you can settle for an upper bound on
>the # of objects "really still alive" across implementations by using weak
>dicts, and you can increase the likely precision of that upper bound by
>forcing a run of garbage collection immediately before asking for the
>number. In the absence of cycles, none of that is necessary in CPython
>today (or likely ever).
>
>Using a "decrement count in a __del__" approach isn't better: only a
>reference-counting based implementation can guarantee to trigger __del__
>methods as soon as an object (not involved in a cycle) becomes unreachable.
>Under any other implementation, you'll still just get an upper bound.
>
>Note that all garbage collection methods are approximations to true
>lifetimes anyway. Even refcounting in the absence of cycles: just because
>the refcount on an object is 10 doesn't mean that any of the 10 ways to
>reach the object *will* get used again. An object may in reality be dead as
>a doorknob no matter how high its refcount. Refcounting is a conservative
>approximation too (it can call things "live" that will in fact never be used
>again, but won't call things "dead" that will in fact be used again).
Thank you for this very thorough answer to my questions. I have a
much better understanding of the limitations of weakrefs now. I also
see my suggestion of using sys.getrefcount() suffers from the same
limitations. I've decided to leave my code as is, but put some
prominent warnings in the documentation:
'''
[Note 1] As always, there are limitations. Nothing is ever absolute
when it comes to reliability. In this case we are depending on the
Python interpreter to clear a weak reference immediately when the
normal reference count goes to zero. This depends on the
implementation details of the interpreter, and is *not* guaranteed by
the language. Currently, it works in CPython, but not in Jython.
For further discussion, see the post under {"... weakref behavior" by
Tim Peters in comp.lang.python, 6/11/04}.
-- One other limitation - if there is any possibility of an instance
you are tracking with a weakref being included in a cycle (a group of
objects that reference each other, but have no references from
anything outside the group), then this scheme will count too many
instances. Cyclic garbage remains in memory until a special garbage
collector gets around to sniffing it out, so the count is only an
upper bound until that collector has run.
'''
You might want to put something similar in the Library Reference.
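
For anyone reading this later, here is a rough sketch of the "upper bound" approach Tim describes: keep the instances in a WeakValueDictionary and force a garbage collection just before asking for the list. The names Tracked, live_instances and demo are made up for illustration and are not from my actual code, and the counts mentioned in the comments assume CPython's refcounting behavior.

```python
import gc
import weakref


class Tracked:
    """Hypothetical class that records its instances in a weak dictionary."""

    # Weak values: an entry disappears once its instance has been collected.
    _instances = weakref.WeakValueDictionary()

    def __init__(self):
        Tracked._instances[id(self)] = self

    @classmethod
    def live_instances(cls, collect=True):
        """Return a list of instances that are still alive (an upper bound)."""
        if collect:
            # Forcing a collection tightens the bound by clearing cyclic trash.
            gc.collect()
        # Copy into a list so the caller never iterates the weak dict itself.
        return list(cls._instances.values())


def demo():
    """Return (count before collection, count after collection)."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cyclic collector from running on its own here
    try:
        a = Tracked()      # stays alive for the duration of demo()
        b = Tracked()
        b.me = b           # b now refers to itself: a one-object cycle
        del b              # unreachable, but its refcount is still nonzero
        before = len(Tracked.live_instances(collect=False))
        after = len(Tracked.live_instances(collect=True))
        return before, after
    finally:
        if was_enabled:
            gc.enable()
```

On CPython I would expect demo() to report one instance too many before the forced collection and the right number after it. On an implementation without refcounting, even the "after" number is only an upper bound, as Tim explains above.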
-- Dave