Reference count access or better method?

Alex Martelli aleax at aleax.it
Mon Jan 14 10:21:21 EST 2002


"Philip Swartzleonard" <starx at pacbell.net> wrote in message
news:Xns9195AC442CC11RASXnewsDFE1 at 130.133.1.4...
    ...
> > Almost.  weakref.ref and .proxy do let you specify a callback, but that
> > happens when the referent is no longer available.  You could wrap the
> > referent into a highly-transparent-proxy, have all client code refer to
> > the proxy, and when the proxy goes away and your weakref.ref to it
> > triggers your callback, you still have the real object in hand.  I.e.,
> > "there's no ill that can't be cured by another level of
> > indirection":-).
> >
> > The highly transparent proxy can be built by automatic delegation, cfr
> > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/52295
>
> But what if it is then referenced again, how do i determine that change in
> state?

It's not hard to keep track of that, since other code is asking your
system ("the registry") for references to "the referent" -- and you
hand out very-thin-proxies instead.
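To make that concrete, here's a minimal self-contained sketch (names
like Texture and ThinProxy are made up for illustration, and it relies
on CPython's refcounting collecting the proxy the moment the last
client reference dies): the registry keeps a strong reference to the
real object, hands out a proxy, and a weakref callback on the proxy
tells it when client code has let go.

```python
import weakref

class Texture(object):          # hypothetical stand-in for the costly object
    def __init__(self, path):
        self.path = path

class ThinProxy(object):        # minimal automatic-delegation proxy
    def __init__(self, referent):
        self.__dict__['_referent'] = referent
    def __getattr__(self, name):
        return getattr(self._referent, name)

dropped = []                    # records keys whose last client ref went away
real = Texture('wood.png')      # the registry's own strong reference
proxy = ThinProxy(real)         # what client code actually gets
wr = weakref.ref(proxy, lambda ref: dropped.append(real.path))

seen = proxy.path               # attribute access falls through to the referent
del proxy                       # last client reference dies -> callback fires,
                                # yet `real` is still safely in hand
```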

> If it helps, the intent is to sort of 'de-cache' things, so that if
> an object isn't used for some amount of time, its memory (gl display list)
> is destroyed, and then it's recreated if it's needed again. But i need the

The "amount of time" is another issue; you'll need to have other kinds
of "events" that cause you to check what's too old in your (registry)
cache -- nothing in this scheme generates such events, it just lets you
USE such events to clean up whatever parts of your cache you want.
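For instance (a sketch, with assumed names like MAX_IDLE -- the event
source is up to you), the "event" might just be a timer tick or a
once-per-frame hook from which you call a purge function over the
drop-times the registry has recorded:

```python
import time

MAX_IDLE = 60.0   # seconds an entry may sit unreferenced (assumed policy)

def purge(cache, last_dropped, cutoff):
    """Drop cache entries whose last client reference died before cutoff."""
    purged = 0
    for key, when in list(last_dropped.items()):  # list(): we delete as we go
        if when is not None and when < cutoff:
            del cache[key]
            del last_dropped[key]
            purged += 1
    return purged

cache = {'wood.png': object(), 'stone.png': object()}
last_dropped = {'wood.png': time.time() - 120.0,  # unreferenced for 2 minutes
                'stone.png': None}                # still referenced
# e.g. called from your event loop, once per frame or on a timer:
n = purge(cache, last_dropped, time.time() - MAX_IDLE)
```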

> object--- for the textures example, their identity is determined by their
> pathname, so i'm going to make a customized list that attempts to construct
> on first mention, but i can't create these objects from such simple data.
> Or... maybe i can... hm... i already store the parameters for the object, i
> could store them in a meta-object of some sort...

If you can hold other pieces of data, rather than the true object, that
makes your task simpler.  But let's assume that "building the object" is
a VERY time-consuming task.  If you have determined that, very often,
the last client-code reference to a registered object is dropped and then
soon after a reference is requested again, then giving some hysteresis to
the drop-object/rebuild-object pair can substantially increase your
overall performance.

Let's build a sample (and simple) scenario.

Say that you have a function loadit(S) in module M that takes a string
S, presumably a pathname, does a lot of work, and then returns object x
of type X.  Leaving loadit bare produces lots of duplicates of object x
as different parts of the client code load the same S, and thus,
performance (as measured by profiling, of course) is unacceptable.

Memoizing in the simplest way, e.g. (2.2 syntax for concision &c):

memoX = {}
def loadit(S):
    if S not in memoX:
        import M
        memoX[S] = M.loadit(S)
    return memoX[S]
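To see the effect of the memoizing (with a tiny stub standing in for
module M, which is hypothetical here): repeated loads of the same S
return the very same object, and the expensive loader runs only once
per distinct S.

```python
class M(object):            # stub for the expensive loader module in the text
    calls = 0
    @staticmethod
    def loadit(S):
        M.calls += 1
        return {'path': S}  # pretend this is a costly-to-build object

memoX = {}
def loadit(S):
    if S not in memoX:
        memoX[S] = M.loadit(S)
    return memoX[S]

a = loadit('wood.png')
b = loadit('wood.png')      # cache hit: same object back, no second build
c = loadit('stone.png')     # different S: the loader runs again
```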

results in all x's being held "forever", and this eats up too much
memory.  So, we're looking for something much more clever, on the
lines sketched above.  Warning, untested code -- perhaps more
complex than needed (I'd have to think in more detail about it)...:

import weakref, time

class registry:

    class ThinProxy:
        def __init__(self, referent):
            self.__dict__['_thin_proxy_referent'] = referent
        def __getattr__(self, name):
            return getattr(self._thin_proxy_referent, name)
        def __setattr__(self, name, value):
            return setattr(self._thin_proxy_referent, name, value)
        def __delattr__(self, name):
            return delattr(self._thin_proxy_referent, name)

    def __init__(self):
        # *strong* references to memoized objects, keyed by load-strings
        self.memoX = {}
        # *weak* refs to thin-proxies to memoized objects, keyed by
        # load-strings
        self.proxies = {}
        # reverse mapping to load strings, keyed by id of the *weakref*
        # (the dropped callback receives the weakref, not the proxy)
        self.load_strings = {}
        # times of last dropping of memoized objects (None if currently
        # being held), keyed by load-strings
        self.last_dropped = {}

    def loadit(self, S):
        if S not in self.memoX:
            import M
            self.memoX[S] = M.loadit(S)
        if S not in self.proxies:
            thin_proxy = self.ThinProxy(self.memoX[S])
            self.proxies[S] = weakref.ref(thin_proxy, self.dropped)
            self.load_strings[id(self.proxies[S])] = S
        self.last_dropped[S] = None
        return self.proxies[S]()

    def dropped(self, weakref2thinproxy):
        key = id(weakref2thinproxy)
        S = self.load_strings[key]
        del self.load_strings[key]
        del self.proxies[S]
        self.last_dropped[S] = time.time()

    def purge(self, last_dropped_before):
        purged = 0
        # list(...) so we may delete entries while iterating
        for S, T in list(self.last_dropped.items()):
            if T is not None and T < last_dropped_before:
                del self.memoX[S]
                del self.last_dropped[S]
                purged += 1
        return purged


I hope the basic idea, at least, is clear, even though the code and
design of this registry are likely to be substantially improvable.
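As a sanity check on the design, here is a condensed, runnable variant
of the same registry -- Resource, FakeM, and the reworked Registry are
my stand-ins, not the code above verbatim -- walking the full lifecycle:
load, re-load (cache hit), drop the last client reference (the weakref
callback fires, in CPython, as soon as the proxy's refcount hits zero),
then purge.

```python
import time, weakref

class Resource(object):             # stand-in for the expensive real object
    def __init__(self, path):
        self.path = path

class FakeM(object):                # stub for the loader module M
    calls = 0
    @staticmethod
    def loadit(S):
        FakeM.calls += 1
        return Resource(S)

class Registry(object):
    class ThinProxy(object):
        def __init__(self, referent):
            self.__dict__['_referent'] = referent
        def __getattr__(self, name):
            return getattr(self._referent, name)

    def __init__(self):
        self.memoX = {}         # strong refs to real objects, by load-string
        self.proxies = {}       # weakrefs to handed-out proxies, by load-string
        self.load_strings = {}  # id(weakref) -> load-string, for the callback
        self.last_dropped = {}  # load-string -> drop time, or None if held

    def loadit(self, S):
        if S not in self.memoX:
            self.memoX[S] = FakeM.loadit(S)
        if S not in self.proxies:
            proxy = self.ThinProxy(self.memoX[S])
            wr = weakref.ref(proxy, self.dropped)
            self.proxies[S] = wr
            self.load_strings[id(wr)] = S
        self.last_dropped[S] = None
        return self.proxies[S]()

    def dropped(self, wr):          # fires when the last proxy ref goes away
        S = self.load_strings.pop(id(wr))
        del self.proxies[S]
        self.last_dropped[S] = time.time()

    def purge(self, cutoff):        # forget objects dropped before `cutoff`
        purged = 0
        for S, T in list(self.last_dropped.items()):
            if T is not None and T < cutoff:
                del self.memoX[S]
                del self.last_dropped[S]
                purged += 1
        return purged

reg = Registry()
p = reg.loadit('wood.png')
q = reg.loadit('wood.png')           # second request: same proxy, no reload
same, seen = (p is q), p.path
del p; del q                         # last client refs gone -> dropped() runs
purged = reg.purge(time.time() + 1)  # purge all dropped before "now + 1s"
```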

Is this worth it?  Only under truly extreme needs of performance
according to the above-sketched scenario, of course.  But, if one
really must...


Alex





