[Python-Dev] Quick-and-dirty weak references

M.-A. Lemburg mal@lemburg.com
Tue, 17 Aug 1999 10:50:01 +0200


Tim Peters wrote:
> 
> [about weakdicts and the possibility of building them on weak
>  references; the obvious way doesn't clean up the dict itself by
>  magic; maybe a weak object should be notified when its referent
>  goes away
> ]
> 
> [M.-A. Lemburg]
> > Perhaps one could fiddle something out of the Proxy objects
> > in mxProxy (you know where...). These support a special __cleanup__
> > protocol that I use a lot to work around circular garbage:
> > the __cleanup__ method of the referenced object is called prior
> > to destroying the proxy; even if the reference count on the
> > object has not yet gone down to 0.
> >
> > This makes direct circles possible without problems: the parent
> > can reference a child through the proxy and the child can reference the
> > parent directly.
> 
> What you just wrote is:
> 
>     parent --> proxy --> child -->+
>     ^                             v
>     +<----------------------------+
> 
> Looks like a plain old cycle to me!

Sure :-) That was the intention. I'm using this to implement
acquisition without turning to ExtensionClasses. [Nice picture, BTW]
 
> > As soon as the parent is cleaned up, the reference to
> > the proxy is deleted which then automagically makes the
> > back reference in the child disappear, allowing the parent
> > to be deallocated after cleanup without leaving a circular
> > reference around.
> 
> M-A, this is making less sense by the paragraph <wink>:  skipping the
> middle, this says "as soon as the parent is cleaned up ... allowing the
> parent to be deallocated after cleanup".  If we presume that the parent gets
> cleaned up explicitly (since the reference from the child is keeping it
> alive, it's not going to get cleaned up by magic, right?), then the parent
> could just as well call the __cleanup__ methods of the things it references
> directly without bothering with a proxy.  For that matter, if it's the
> straightforward
> 
>     parent <-> child
> 
> kind of cycle, the parent's cleanup method can just do
> 
>     self.__dict__.clear()
> 
> and the cycle is broken without writing a __cleanup__ method anywhere
> (that's what I usually do, and in this kind of cycle that clears the last
> reference to the child, which then goes away, which in turn automagically
> clears its back reference to the parent).
> 
> So, offhand, I don't see that the proxy protocol could help here.  In a
> sense, what's really needed is the opposite:  notifying the *proxy* when the
> *real* object goes away (which makes no sense in the context of what your
> proxy objects were designed to do).

All true :-). The nice thing about the proxy is that it takes
care of the process automagically. And yes, the parent is used
via a proxy too. So the picture looks like this:

--> proxy --> parent --> proxy --> child -->+
              ^                             v
              +<----------------------------+

Since the proxy isn't noticed by the referencing objects (well, at
least if they don't fiddle with internals), the picture for the
objects looks like this:

--> parent --> child -->+
    ^                   v
    +<------------------+

You could of course do the same via explicit invocation of
the __cleanup__ method, but the object references involved could be
hidden in some other structure, so they might be hard to find.
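
A minimal sketch of the cleanup protocol described above, in plain Python. It is not the mxProxy API: `CleanupProxy`, `Node` and `demo` are hypothetical names, and the sketch relies on CPython's reference counting running `__del__` as soon as the last reference to a proxy disappears. The modern `weakref` module is used only as a probe to observe whether the parent is still alive.

```python
import weakref


class CleanupProxy:
    """Hypothetical stand-in for an mxProxy-style proxy (not the real API)."""

    def __init__(self, target):
        self.__dict__["_target"] = target

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        return getattr(self.__dict__["_target"], name)

    def __del__(self):
        # Call the referent's __cleanup__ before the proxy goes away,
        # even though the referent's refcount is not yet zero.
        target = self.__dict__.get("_target")
        cleanup = getattr(target, "__cleanup__", None)
        if cleanup is not None:
            cleanup()


class Node:
    def __init__(self, name):
        self.name = name
        self.children = []

    def add_child(self, child):
        child.parent = self                        # child -> parent, direct
        self.children.append(CleanupProxy(child))  # parent -> proxy -> child

    def __cleanup__(self):
        # Drop every reference this object holds, which breaks the cycle.
        self.__dict__.clear()


def demo():
    parent = Node("parent")
    parent.add_child(Node("child"))
    probe = weakref.ref(parent)       # observation only
    handle = CleanupProxy(parent)     # the outside world uses the proxy
    del parent
    alive_before = probe() is not None
    del handle                        # triggers the cleanup cascade
    return alive_before, probe() is None
```

Dropping `handle` runs the parent's `__cleanup__`, which releases the child's proxy, which in turn runs the child's `__cleanup__` and removes the back reference, so no cycle collector is needed.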

And there's another feature about Proxies (as defined in mxProxy):
they allow you to control access in a much more strict way than
Python does. You can actually hide attributes and methods you
don't want exposed in a way that doesn't even let you access them
via some dict or the "pass me the frame object" trick. This is very useful
when you program multi-user application host servers where you don't
want users to access internal structures of the server.
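
The access-control idea can be illustrated with a rough pure-Python sketch. The names `make_restricted_proxy` and `Account` are hypothetical, and this is much weaker than a proxy implemented in C: the closure can still be inspected, and a returned bound method exposes the real object through `__self__`. It only shows the shape of an interface whitelist.

```python
def make_restricted_proxy(target, interface):
    """Return an object exposing only the names listed in `interface`."""
    allowed = frozenset(interface)

    class RestrictedProxy:
        __slots__ = ()  # no per-instance __dict__ to poke around in

        def __getattr__(self, name):
            # Reached only for names the proxy class does not define itself.
            if name not in allowed:
                raise AttributeError(name)
            return getattr(target, name)

    return RestrictedProxy()


class Account:
    def __init__(self):
        self.balance = 10
        self._pin = "1234"

    def deposit(self, amount):
        self.balance += amount
        return self.balance


public_view = make_restricted_proxy(Account(), ["deposit"])
```

Here `public_view.deposit(5)` goes through, while `public_view._pin` and `public_view.__dict__` raise `AttributeError`.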

> [about Java and its four reference strengths]
> 
> Found a good introductory writeup at (sorry, my mailer will break this URL,
> so I'll break it myself at a sensible place):
> 
> http://developer.java.sun.com/developer/
>     technicalArticles//ALT/RefObj/index.html

Thanks for the reference... and for the summary ;-)
 
> They have a class for each of the three "not strong" flavors of references.
> For all three you pass the referenced object to the constructor, and all
> three accept (optional in two of the flavors) a second ReferenceQueue
> argument.  In the latter case, when the referenced object goes away the
> weak/soft/phantom-ref proxy object is placed on the queue.  Which, in turn,
> is a thread-safe queue with various put, get, and timeout-limited polling
> functions.  So you have to write code to look at the queue from time to
> time, to find the proxies whose referents have gone away.
> 
> The three flavors may (or may not ...) have these motivations:
> 
> soft:  an object reachable at strongest by soft references can go away at
> any time, but the garbage collector strives to keep it intact until it can't
> find any other way to get enough memory

So there is a possibility of reviving these objects, right ? 

I've just recently added a hackish function to my mxTools which allows
me to regain access to objects via their address (no, not thread safe,
not even necessarily correct). 
 
sys.makeref(id) 
         Provided that id is a valid address of a Python object (id(object) returns this address),
         this function returns a new reference to it. Only objects that are "alive" can be referenced
         this way; ones with a zero reference count cause an exception to be raised. 

         You can use this function to reaccess objects lost during garbage collection.

         USE WITH CARE: this is an expert-only function since it can cause instant core dumps and
         many other strange things -- even ruin your system if you don't know what you're doing ! 

         SECURITY WARNING: This function can provide you with access to objects that are
         otherwise not visible, e.g. in restricted mode, and thus be a potential security hole. 

I use it for tracking objects via an id-keyed dictionary and
hooks in the create/del mechanisms of Python instances. It helps
in finding those memory-eating cycles. 
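
A small sketch of that kind of tracking, without `sys.makeref`: the registry stores a label per `id()` rather than dereferencing addresses. `live_objects`, `Tracked` and `make_cycle` are hypothetical names, and the sketch assumes CPython 3.4 or later, where the cycle collector can finalize objects that define `__del__`.

```python
import gc

live_objects = {}  # id(obj) -> label; holds no reference to the object


class Tracked:
    def __init__(self, label):
        live_objects[id(self)] = label      # hook on creation

    def __del__(self):
        live_objects.pop(id(self), None)    # hook on destruction


def make_cycle():
    a, b = Tracked("a"), Tracked("b")
    a.other, b.other = b, a                 # a <-> b keeps both alive
```

With the cycle collector switched off (`gc.disable()`), calling `make_cycle()` leaves both labels in `live_objects` after the function returns, which is exactly the kind of leftover the registry is meant to reveal; a later `gc.collect()` removes them.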

> weak:  an object reachable at strongest by weak references can go away at
> any time, and the collector makes no attempt to delay its death
> 
> phantom:  an object reachable at strongest by phantom references can get
> *finalized* at any time, but won't get *deallocated* before its phantom
> proxy does something or other (goes away? wasn't clear).  This is the flavor
> that requires passing a queue argument to the constructor.  Seems to be a
> major hack to worm around Java's notorious problems with order of
> finalization -- along the lines that you give phantom referents trivial
> finalizers, and put the real cleanup logic in the phantom proxy.  This lets
> your program take responsibility for running the real cleanup code in the
> order-- and in the thread! --where it makes sense.

Wouldn't these flavors be possible using the following setup ? Note
that it's quite similar to your _Weak class except that I use a
proxy without the need to first get a strong reference for the
object and that it doesn't use a weak bit.

--> proxy --> object
                ^
                |
         all_managed_objects

all_managed_objects is a dictionary indexed by address (its id)
and keeps a strong reference to the objects. The proxy does
not keep a strong reference to the object, but only the address
as an integer, and checks the ref-count on the object in the
all_managed_objects dictionary prior to every dereferencing
action. In case this refcount falls down to 1 (only the
all_managed_objects dict references it), the proxy takes
appropriate action, e.g. raises an exception and deletes
the reference in all_managed_objects to mimic a weak reference.
The same check is done prior to garbage collection of the
proxy.

Add to this some queues, pepper and salt and place it in an
oven at 220° for 20 minutes... plus take a look every 10 seconds
or so...

The downside is obvious: the zombified object will not get inspected
(and then GCed) until the next time a weak reference to it is used.
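
The setup above can be sketched roughly as follows. This is an illustration, not mxProxy code: `WeakProxy`, `deref`, `_refs_via` and `_BASELINE` are hypothetical names. Because the exact value reported by `sys.getrefcount` depends on the CPython version, the sketch measures the "only the dict holds it" count once instead of hard-coding 1. It also ignores id reuse and thread safety.

```python
import sys

all_managed_objects = {}  # id(obj) -> obj, the only strong reference kept


def _refs_via(table, key):
    # Refcount as seen through one fixed code path.
    return sys.getrefcount(table[key])


# What _refs_via reports when the table is the sole owner of an object.
_BASELINE = _refs_via({0: object()}, 0)


class WeakProxy:
    def __init__(self, obj):
        self._addr = id(obj)              # keep the address, not the object
        all_managed_objects[self._addr] = obj

    def _expired(self):
        table = all_managed_objects
        if self._addr in table and _refs_via(table, self._addr) <= _BASELINE:
            del table[self._addr]         # nobody else cares: let it die
        return self._addr not in table

    def deref(self):
        # Check before every dereferencing action.
        if self._expired():
            raise ReferenceError("referent has gone away")
        return all_managed_objects[self._addr]

    def __del__(self):
        # Same check before the proxy itself is collected.
        self._expired()
```

As noted, the referent lingers in `all_managed_objects` until some proxy for it is next used or destroyed; a periodic sweep over the dictionary would be one way to shorten that delay.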

> Java 1.2 *also* tosses in a WeakHashMap class, which is a dict with
> under-the-cover weak keys (unlike Dieter's flavor with weak values), and
> where the key+value pairs vanish by magic when the key object goes away.
> The details and the implementation of these guys weren't clear to me, but
> then I didn't download the code, just scanned the online docs.

Would the above help in creating such beasts ?
 
> Ah, a correction to my last post:
> 
> class _Weak:
>     ...
>     def __del__(self):
>         # this is purely an optimization:  if self gets nuked,
>         # exempt its referent from greater expense when *it*
>         # dies
>         if self.id is not None:
>             __clear_weak_bit(__id2obj(self.id))
>             del id2weak[self.id]
> 
> Root of all evil:  this method is useless, since the id2weak dict keeps each
> _Weak object alive until its referent goes away (at which time self.id gets
> set to None, so _Weak.__del__ doesn't do anything).  Even if it did do
> something, it's no cheaper to do it here than in the system cleanup code
> ("greater expense" was wrong).
> 
> weakly y'rs  - tim
> 
> PS:  Ooh!  Ooh!  Fellow at work today was whining about weakdicts, and
> called them "limp dicts".  I'm not entirely sure it was an innocent Freudian
> slut, but it's a funny pun even if it wasn't (for you foreigners, it sounds
> like American slang for "flaccid one-eyed trouser snake" ...).

:-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                   136 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/