[Numpy-discussion] ndarray: subclassing a subclass loses custom attribute

Fri Oct 1 08:20:15 EDT 2010

On Oct 1, 2010, at 1:03 PM, Sebastian Haase wrote:

>>> However, I had done this before for some specific image-file-types:
>>> those would add their own attribute to the ndarray (e.g. arr.Mrc)
>>> Now if I call the new ndarray_meta on my ndarray_with_mrc I lose the
>>> `Mrc` attribute, leaving only the new `meta` attribute.
>>> My code is essentially a verbatim copy of
>>> http://docs.scipy.org/doc/numpy/user/basics.subclassing.html#simple-example-adding-an-extra-attribute-to-ndarray
>>> 
>>> What can I do ?
> 
> class ndarray_inMrcFile(ndarray):
>    def __new__(cls, input_array, mrcInfo=None):
>        obj = asanyarray(input_array).view(cls)
>        obj.Mrc = mrcInfo
>        return obj
> 
>    def __array_finalize__(self, obj):
>        if obj is None: return
>        self.Mrc = getattr(obj, 'Mrc', None)
> 
> class ndarray_meta(ndarray):
>    def __new__(cls, input_array, meta=None):
>        obj = asanyarray(input_array).view(cls)
>        obj.meta = nd_meta_attribute( meta )
>        return obj
> 
>    def __array_finalize__(self, obj):
>        if obj is None: return
>        self.meta = getattr(obj, 'meta', nd_meta_attribute())

Ah, OK, now I understand the problem.
When you create a ndarray_meta from an object, it is transformed into a ndarray_meta at the `asanyarray(...).view(cls)` line, which calls ndarray_meta.__array_finalize__. In that method, you don't keep track of what `obj` is, you just check whether it has a 'meta' attribute: any other attributes it has are simply not copied to the new view, so they are lost. If you want to keep them, you have to define them explicitly in your __array_finalize__... That can be a bit tricky if you want to avoid having one of your classes subclass the other.
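
To make the mechanism concrete, here is a minimal sketch of the loss. The class names `WithMrc` and `WithMeta` are made up for the illustration, and a plain value stands in for `nd_meta_attribute`, which isn't shown in this thread:

```python
import numpy as np

class WithMrc(np.ndarray):  # hypothetical stand-in for ndarray_inMrcFile
    def __new__(cls, input_array, mrcInfo=None):
        obj = np.asanyarray(input_array).view(cls)
        obj.Mrc = mrcInfo
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.Mrc = getattr(obj, "Mrc", None)

class WithMeta(np.ndarray):  # hypothetical stand-in for ndarray_meta
    def __new__(cls, input_array, meta=None):
        obj = np.asanyarray(input_array).view(cls)
        obj.meta = meta
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        # Only 'meta' is looked up on obj; nothing else is carried over.
        self.meta = getattr(obj, "meta", None)

a = WithMrc([1, 2, 3], mrcInfo="header")
b = WithMeta(a, meta={"k": 1})   # view-casts `a` to WithMeta

print(type(b).__name__, hasattr(b, "Mrc"))
```

The view created in `WithMeta.__new__` is a brand-new Python object, so it only carries what `WithMeta.__array_finalize__` puts on it.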

However, these extra attributes should live in the __dict__ of each instance, right? So, if you update the __dict__ of self with the __dict__ of obj, that should do the trick:

from numpy import ndarray, asanyarray

class ndarray_inMrcFile(ndarray):
   def __new__(cls, input_array, mrcInfo=None):
       obj = asanyarray(input_array).view(cls)
       obj.Mrc = mrcInfo
       return obj

   def __array_finalize__(self, obj):
       if obj is None: return
       self.Mrc = getattr(obj, 'Mrc', None)
       self.__dict__.update(getattr(obj, "__dict__", {}))

class ndarray_meta(ndarray):
   def __new__(cls, input_array, meta=None):
       obj = asanyarray(input_array).view(cls)
       obj.meta = nd_meta_attribute( meta )
       return obj

   def __array_finalize__(self, obj):
       if obj is None: return
       self.meta = getattr(obj, 'meta', nd_meta_attribute(None))
       self.__dict__.update(getattr(obj, "__dict__", {}))

That works, but modifying the __dict__ that way can have some nasty side effects (well, I wouldn't be surprised if it had some, more experienced users will comment on that).
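
For what it's worth, here is a small sketch of the `__dict__` approach in action, together with one side effect I am sure of: the update is shallow, so mutable attribute values end up shared between the original array and the new view. Again, `WithMrc` and `WithMeta` are made-up names and a plain string stands in for `nd_meta_attribute`:

```python
import numpy as np

class WithMrc(np.ndarray):  # hypothetical stand-in for ndarray_inMrcFile
    def __new__(cls, input_array, mrcInfo=None):
        obj = np.asanyarray(input_array).view(cls)
        obj.Mrc = mrcInfo
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.Mrc = getattr(obj, "Mrc", None)
        # A plain ndarray has no __dict__, hence the {} default.
        self.__dict__.update(getattr(obj, "__dict__", {}))

class WithMeta(np.ndarray):  # hypothetical stand-in for ndarray_meta
    def __new__(cls, input_array, meta=None):
        obj = np.asanyarray(input_array).view(cls)
        obj.meta = meta
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.meta = getattr(obj, "meta", None)
        self.__dict__.update(getattr(obj, "__dict__", {}))

a = WithMrc([1, 2, 3], mrcInfo={"nx": 3})
b = WithMeta(a, meta="notes")

print(b.Mrc, b.meta)      # both attributes survive the view-cast
b.Mrc["nx"] = 99          # mutate through the new view...
print(a.Mrc)              # ...and the original sees the change (same dict object)
```

If sharing is not what you want, copying the values (for instance with `copy.deepcopy`) inside `__array_finalize__` would be one option, at the cost of doing that copy on every slice and view.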

A cleaner, albeit slightly more cumbersome and less directly extendable approach, would be to define a generic ndarray subclass, where you define a `_addattr` attribute as a dictionary. Store the attributes specific to your subclasses in that dictionary, and update it in the __array_finalize__: basically, your `_addattr` plays the role of __dict__, but you're not messing w/ __dict__ itself. 
You can access your attributes through this `_addattr` dictionary. As you'll probably need direct attribute access, you can define specific methods, or define the attribute as a property:

    # inside the body of your ndarray_inMrcFile class:
    @property
    def MRC(self):
        return self._addattr['MRC']

or with a combo _get_MRC/_set_MRC/MRC=property(_get_MRC,_set_MRC)... You get the idea.
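
Here is a rough sketch of how that generic-subclass idea could look. Everything except `_addattr` is a made-up name (`AttrArray`, `MrcArray`, `MetaArray`), and passing the attributes as keyword arguments is just one possible design, not something tested beyond this toy case:

```python
import numpy as np

class AttrArray(np.ndarray):
    """Hypothetical generic base: extra attributes live in `_addattr`."""

    def __new__(cls, input_array, **attrs):
        obj = np.asanyarray(input_array).view(cls)
        obj._addattr.update(attrs)
        return obj

    def __array_finalize__(self, obj):
        # Shallow-copy the parent's dictionary, or start with an empty one
        # when obj is None or a plain ndarray.
        self._addattr = dict(getattr(obj, "_addattr", {}))

class MrcArray(AttrArray):
    @property
    def Mrc(self):
        return self._addattr.get("Mrc")

    @Mrc.setter
    def Mrc(self, value):
        self._addattr["Mrc"] = value

class MetaArray(AttrArray):
    @property
    def meta(self):
        return self._addattr.get("meta")

a = MrcArray([1, 2, 3], Mrc="header")
b = MetaArray(a, meta="notes")      # Mrc entry travels along in _addattr

print(b.meta, b._addattr["Mrc"])
print(MrcArray(b).Mrc)              # casting back exposes it as a property again
```

The two specific classes don't need to know about each other: they only share the base class and the dictionary.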

Let me know how it goes. If it works, please consider writing something to add to the doc or the wiki, so that we can keep track of it.

Cheers
P.