Overriding a method at the instance level on a subclass of a builtin type

Aaron Brady castironpi at gmail.com
Thu Dec 4 05:38:36 EST 2008


On Dec 3, 1:25 pm, Jason Scheirer <jason.schei... at gmail.com> wrote:
> On Dec 2, 6:13 pm, Aaron Brady <castiro... at gmail.com> wrote:
> > >>> class A:
>
> > ...     def methA( self ):
> > ...             print 'methA'
> > ...             self.meth= self.methB
> > ...     meth= methA
> > ...     def methB( self ):
> > ...             print 'methB'
> > ...>>> a= A()
> > >>> a.meth()
> > methA
> > >>> a.meth()
>
> > methB
>
> The problem with using this this pattern in the way that you've
> specified is that you have a potential memory leak/object lifetime
> issue. Assigning a bound method of an instance (which itself holds a
> reference to self) to another attribute in that same instance creates
> a kind of circular dependency that I have discovered can trip up the
> GC more often than not.
>
> You can subclass it as easily:
>
> class dictsubclass(dict):
>     def __getitem__(self, keyname):
>         if not hasattr(self, '_run_once'):
>             self.special_code_to_run_once()
>             self._run_once = True
>         return super(self, dict).__getitem__(keyname)
>
> If that extra ~16 bytes associated with the subclass is really a
> problem:
>
> class dictsubclass(dict):
>     def __getitem__(self, keyname):
>         self.special_code_to_run_once()
>         self.__class__ = dict
>         return super(self, dict).__getitem__(keyname)
>
> But I don't think that's a good idea at all.

Interesting.  The following code ran, and process memory usage rose to
150MB.  It failed to return to normal afterward.

>>> for x in range( 10000000 ):
...     a= []
...     a.append(a)
...

However, the following code succeeded in returning usage to normal.

>>> import gc
>>> gc.collect()

It was in version 2.6.  So, the GC succeeded in collecting circularly
linked garbage when invoked manually.  That might have implications in
the OP's use case.

In another language, it would work differently, if it lacked unbound
method descriptors.  C++ for example, untested:

class C {
public:
  func_t meth;
  C( ) { meth= methA; }
  void methA( ) { meth= methB; }
  void methB( ) { }
};

It has no problems with memory consumption (an extra pointer per
object), or circular references; functions are not first-class
objects.  However they are in Python, which creates an entire bound
method object per instance.

The OP stated:

> run some code and then patch in the original dict
> method for the instance to avoid even the check to see if the init
> code has been run.

So your, Arnaud's, and Bryan's '.__class__' solution is probably best,
and possibly even truer to the intent of the State Pattern.

It is too bad that you can't assign an unbound method to the member,
and derive the bound method on the fly.  That might provide a middle-
ground solution.





More information about the Python-list mailing list