Python Descriptor as Instance Attribute

Wed Feb 1 01:41:27 EST 2012

Thanks Ian for the explanation.
Please see my comments below:

> The behavior is by design.  First, keeping object behavior in the
> class definition simplifies the implementation and also makes instance
> checks more meaningful.  To borrow your Register example, if the "M"
> descriptor is defined by some instances rather than by the class, then
> knowing that the object "reg" is an instance of Register does not tell
> me anything about whether "reg.M" is a valid attribute or an error.
> As a result, I'll need to guard virtually every access of "reg.M" with
> a try-except construct just in case "reg" is the wrong kind of
> register.
I don't quite follow this explanation. Sorry, I'm not very familiar with the
low-level details, but from a user's point of view, if I define reg.M, then
accessing it later should be valid... somehow. :-)

> Second, the separation of class from instance also helps you keep
> object behavior separate from object data.  Consider the following
> class:
>
> class ObjectHolder(object):
>    def __init__(self, obj):
>        self.obj = obj
>
> Don't worry about what this class might be useful for.  Just know that
> it's meant to hold and provide unrestricted access to arbitrary Python
> objects:
>
>>>> holder = ObjectHolder(42)
>>>> print(holder.obj)
> 42
>>>> holder.obj = range(5)
>>>> print(holder.obj)
> [0, 1, 2, 3, 4]
>
> Since the class is meant to hold arbitrary objects, it's even valid
> that somebody might want to store a descriptor object there:
>
>>>> holder.obj = property(lambda x: x.foo)
>>>> print(holder.obj)
> <property object at 0x02415AE0>
>
> Now suppose that Python invoked the descriptor protocol for
> descriptors stored in instance attributes:
>
>>>> holder = ObjectHolder(None)
>>>> holder.obj = property(lambda x: x.foo)
>>>> print(holder.obj)
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
> AttributeError: 'ObjectHolder' object has no attribute 'foo'
>
> In this case, the ObjectHolder would fail to simply hold the property
> object as data.  The mere act of assigning the property object, a
> descriptor, to an instance attribute would *change the behavior* of
> the ObjectHolder.  Instead of treating "holder.obj" as a simple data
> attribute, it would start invoking the descriptor protocol on accesses
> to "holder.obj" and ultimately redirect them to the non-existent and
> meaningless "holder.foo" attribute, which is certainly not what the
> author of the class intended.
OK, I see some fundamental problems here now. And I think this is actually
one of the limitations of descriptors: a descriptor only works when it is
defined as a class attribute (and typically accessed through an instance).
It could be much more powerful if there were a general way to define an
attribute on either a class or an instance such that accessing it (directly
from the class or from an instance) actually calls a function.
That would make some kinds of abstraction much cleaner and conceptually
simpler. In my example above, I have one class called Register, and each of
its instances represents a different register with different fields.
Assigning to a field automatically validates the value, and reading a field
automatically fetches the value from the hardware.
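
To make the limitation concrete, here is a minimal sketch of what I mean.
The Checked descriptor and the attribute names are made up for illustration;
they are not from my real code.

```python
class Checked(object):
    """Hypothetical descriptor that only reports that it was invoked."""
    def __get__(self, instance, owner):
        return "descriptor called"


class Register(object):
    M = Checked()            # class attribute: the descriptor protocol runs


reg = Register()
reg.N = Checked()            # instance attribute: stored as plain data

print(reg.M)                 # descriptor called
print(type(reg.N).__name__)  # Checked
```

So reg.M goes through __get__, while reg.N just hands back the descriptor
object itself.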

> For the above reasons, I would probably implement your Register class
> as a set of related classes sharing a common metaclass.  The solution
> you came up with is probably fine to solve your specific problem,
> though.
As I said before, this is not conceptually simple enough, and it can confuse
end users who are not Python experts. I have loved Python for years because I
could always find a good way to abstract a problem and make the end-user
interface as simple as possible. I have never failed at that with Python
before, but this time it seems Python really does have a limitation here. Or
is there a better solution?

To help explain the problem I'm facing, let me elaborate a bit more.
Registers in SoC peripherals have different fields, and each field has a
different number of bits and different access capabilities (read-only,
write-only, read-write, ...). But all registers share some common attributes;
for example, they are all 32 bits long. Some common operations are shared as
well, such as distributing a value to each bit field: setting the value of a
register as a whole automatically updates each field.
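
As a rough illustration of "distributing a value to each bit field", the
sketch below splits a 32-bit word into fields and joins it back. The LAYOUT
table (field name mapped to least significant bit and width) and the field
names are assumptions made up for this example, not a real register.

```python
# Hypothetical layout: field name -> (least significant bit, width in bits)
LAYOUT = {"M": (0, 3), "N": (3, 5), "EN": (31, 1)}


def split_fields(value, layout):
    """Distribute a 32-bit register value to its bit fields."""
    value &= 0xFFFFFFFF                      # registers are 32 bits long
    return dict((name, (value >> lsb) & ((1 << width) - 1))
                for name, (lsb, width) in layout.items())


def join_fields(fields, layout):
    """Rebuild the 32-bit register value from its bit fields."""
    value = 0
    for name, (lsb, width) in layout.items():
        value |= (fields[name] & ((1 << width) - 1)) << lsb
    return value


fields = split_fields(0x8000002D, LAYOUT)
print(fields["M"], fields["N"], fields["EN"])   # 5 5 1
print(hex(join_fields(fields, LAYOUT)))         # 0x8000002d
```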

The definition of each register is in an XML file, with all the attributes
for each field. The preferred way to generate a representation of a register
is to instantiate the Register class with its definition read from the XML
file. This is the natural design: every register representation is an
instance of the Register class.

So now the problem is: how can I have such a simple class whose instances
all have different fields, which can be written and read directly (like
reg.M = '101', or x = reg.M), with automatic validation and value fetching?
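
One possible direction, sketched below, is to skip descriptors and intercept
attribute access with __getattr__ and __setattr__, keeping the field table on
each instance. This is only a sketch under assumptions: the field tuple
format (lsb, width, access), the choice to return fields as integers, and the
_value attribute standing in for the hardware are all made up here.

```python
class Register(object):
    def __init__(self, name, fields):
        # Use object.__setattr__ for internal state so that our own
        # __setattr__ (which only accepts field names) is bypassed.
        object.__setattr__(self, "name", name)
        object.__setattr__(self, "_fields", dict(fields))
        object.__setattr__(self, "_value", 0)   # stand-in for the hardware

    def __getattr__(self, attr):
        # Only called when normal attribute lookup fails.
        fields = object.__getattribute__(self, "_fields")
        if attr not in fields:
            raise AttributeError("no field %r" % attr)
        lsb, width, access = fields[attr]
        if "r" not in access:
            raise AttributeError("field %r is write-only" % attr)
        return (self._value >> lsb) & ((1 << width) - 1)

    def __setattr__(self, attr, value):
        if attr not in self._fields:
            raise AttributeError("no field %r" % attr)
        lsb, width, access = self._fields[attr]
        if "w" not in access:
            raise AttributeError("field %r is read-only" % attr)
        if isinstance(value, str):
            value = int(value, 2)        # accept bit strings such as '101'
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit in %d bits" % width)
        mask = ((1 << width) - 1) << lsb
        new_value = (self._value & ~mask) | (value << lsb)
        object.__setattr__(self, "_value", new_value)


ctrl = Register("CTRL", {"M": (0, 3, "rw"), "STATUS": (3, 1, "r")})
ctrl.M = "101"
print(ctrl.M)        # 5
```

With this, every register is a plain instance of Register, and an invalid
access (unknown field, read-only field, value too wide) raises an exception
at the point of use.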

Defining a separate class for each register doesn't sound feasible, because
there are hundreds of registers. Using a metaclass to generate a class for
each register also doesn't feel good, because you still need to instantiate
each class *once again* to get the instance that actually invokes the
descriptor protocol ...
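
For comparison, the class-per-register approach could hide the second
instantiation step behind a factory function, roughly as sketched below. The
names register(), make_field() and RegisterBase, and the layout format, are
hypothetical; the sketch uses the built-in type() and property() rather than
a custom metaclass.

```python
def make_field(lsb, width):
    """Build a property that reads and writes one bit field of _value."""
    mask = (1 << width) - 1

    def getter(self):
        return (self._value >> lsb) & mask

    def setter(self, value):
        if not 0 <= value <= mask:
            raise ValueError("value does not fit in %d bits" % width)
        self._value = (self._value & ~(mask << lsb)) | (value << lsb)

    return property(getter, setter)


class RegisterBase(object):
    def __init__(self):
        self._value = 0          # stand-in for the hardware register


def register(name, layout):
    """Create a one-off subclass for this layout and return an instance."""
    attrs = dict((field, make_field(lsb, width))
                 for field, (lsb, width) in layout.items())
    cls = type(name, (RegisterBase,), attrs)
    return cls()


ctrl = register("CTRL", {"M": (0, 3), "N": (3, 5)})
ctrl.M = 5
ctrl.N = 2
print(ctrl.M, ctrl.N, ctrl._value)   # 5 2 21
```

The caller only sees one call per register, but under the hood there is
still one generated class per register, which is the part I'm not happy with.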

Your input is highly appreciated.

Best Regards,
Yanghao