Bypassing __getattribute__ for attribute access

Bruno Desthuilliers bruno.42.desthuilliers at wtf.websiteburo.oops.com
Thu Oct 25 12:35:22 EDT 2007


Adam Donahue wrote:
> As an exercise I'm attempting to write a metaclass that causes an
> exception to be thrown whenever a user tries to access
> 'attributes' (in the traditional sense) via a direct reference.

I guess you're new to Python, and coming from either C++ or Java. Am I 
wrong ?-)

And without even reading further, I can tell you're doing something
pretty complicated that just doesn't work.

(Ok, I cheated - I did read further !-)

> Consider:
> 
>     class X( object ):
>         y = 'private value'
>         def get_y( self ): return self.y
> 
> Normally one can access y here via:
> 
>     X().y
> 
> or
> 
>     X().get_y()
> 
> I want the former case, however, to throw an exception.

So-called "private" or "protected" attributes (IOW, implementation stuff
the client code should not mess with) are denoted by a single leading
underscore. IOW, 'y' should be '_y'. It won't of course prevent anyone
from accessing the attribute, but then it's not your responsibility anymore.

I know this sounds surprising to C++/Java programmers, but experience
proves that it just works.


Now if all you want is to reproduce the Java systematic-getter-setter 
dance - that is, use getters/setters 'just in case' you'd want to 
refactor (which, FWIW, is the only rationale behind accessors), you just 
don't need this with Python. We do have computed attributes here, so the 
simplest thing is to start with a plain attribute, then refactor it into 
a computed one if and when the need arises. This is *totally* 
transparent to client code.
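A minimal sketch of that refactoring, written for a current Python 3 interpreter. The class names Before and After and the upper-casing logic are made up for illustration; the point is only that client code reads obj.y in both versions.

```python
class Before(object):
    """First version: y is a plain attribute."""

    def __init__(self):
        self.y = 'some value'


class After(object):
    """Refactored version: y is computed, but is still read as obj.y."""

    def __init__(self):
        self._y = 'some value'

    @property
    def y(self):
        # Any extra logic (validation, logging, lazy computation) goes here.
        return self._y.upper()

    @y.setter
    def y(self, value):
        self._y = value


# Client code is written the same way for both versions.
before_value = Before().y
after_value = After().y
```

No caller had to switch from obj.y to obj.get_y() when the attribute became computed.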




> I figured the way to do this would be to introduce a metaclass that
> overrides the default __getattribute__ call and throws an exception.
> So my first attempt was something like:
> 
>     class XType( type ):
>         def __my_getattribute__( self, name ):
>              raise AttributeError()
>         def __init__( klass, name, bases, dict ):
>             super( XType, klass ).__init__( name, bases, dict )
>             setattr( klass, '__getattribute__',
> klass.__my_getattribute__ )
> 
> But whereas the X().y attribute behaves as I intend, the X().get_y()
> call raises that exception as well:

Indeed. __getattribute__ is invoked for *each and every* attribute
lookup - including methods, since methods are attributes too. FWIW,
__getattribute__ is something you should not mess with unless you know
what you're doing and why you're doing it.

> 
> So it looks as if 'attribute' here means any key in self.__dict__,

The '.' is the lookup operator. As soon as you have obj.anyname, you do 
an attribute lookup (whether it fails or succeeds is another question).
And __getattribute__ is the implementation for this operator. So given 
how you wrote your custom __getattribute__, you just made attribute 
lookup impossible.

And FWIW, attribute lookup is much more complex than just looking up the
instance's __dict__ - it also looks up the class's __dict__, then the
parent classes' __dict__, then looks for a custom __getattr__ method
(which is used when the attribute has not been found so far). And if the
attribute found is a class attribute that implements the descriptor
protocol, then __getattribute__ is responsible for invoking this
protocol. IOW, __getattribute__ is one of the most critical magic methods.
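A rough illustration of those lookup sources, with made-up class and attribute names. It only shows where each value comes from; it doesn't cover every precedence rule of the real algorithm.

```python
class Base(object):
    inherited = 'from Base.__dict__'


class Child(Base):
    shared = 'from Child.__dict__'

    @property
    def computed(self):
        # property implements the descriptor protocol, so the lookup
        # calls its __get__ instead of returning the property object.
        return 'from a descriptor'

    def __getattr__(self, name):
        # Only called when the normal lookup has failed.
        return 'fallback for %s' % name


obj = Child()
obj.own = 'from the instance __dict__'

results = [obj.own, obj.shared, obj.inherited, obj.computed, obj.missing]
```

Each element of results names the place its value was found, and obj.missing ends up in __getattr__ because nothing else provides it.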

> whether referenced via self.var, self.__dict__['var'] (because this
> references __dict__), or getattr( self, 'var' ) (which is the same as
> a direct self.var access, I believe).

Practically, yes.

> 
> So I tried:
> 
>     class XType( type ):
>         def __my_getattribute__( self, name ):
>             if name != '__dict__':
>                 raise AttributeError()
>             return super( self.__class__,
> self ).__getattribute__( name )
>         def __init__( klass, name, bases, dict ):
>             super( XType, klass ).__init__( name, bases, dict )
>             setattr( klass, '__getattribute__',
> klass.__my_getattribute__ )
> 
> This allows me to access X().__dict__ directly (and then
> X().__dict__['y']), but it still limits caller access to the get_y()
> method.

cf above.

> It sounds then like the "solution" will be to check whether the name
> referenced is called __dict__ or is a method or function type,
> otherwise throw the exception, and to ensure all internal calls are
> handled via self.__dict__[name] not self.name.

My my my. Trouble ahead...

> Something like:
> 
>     import types
>     class XType( type ):
>         def __my_getattribute__( self, name ):
>             if name != '__dict__' and not
> isinstance( self.__dict__[name], types.FunctionType ):
>                 raise AttributeError()
>             return super( self.__class__,

*Never* use self.__class__ (or type(self), etc.) when calling super(). You
*really* want to pass the exact class here - else, as soon as someone
subclasses your class, super() starts from the wrong class and you
typically end up with infinite recursion.
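A short sketch of what goes wrong, using hypothetical classes A, B and C. It works as long as nobody subclasses B, which is exactly why the bug tends to go unnoticed:

```python
class A(object):
    def hello(self):
        return 'A'


class B(A):
    def hello(self):
        # Wrong: self.__class__ is the runtime class, not necessarily B.
        return 'B' + super(self.__class__, self).hello()


class C(B):
    pass


class GoodB(A):
    def hello(self):
        # Right: name the class the method is defined in.
        return 'B' + super(GoodB, self).hello()


class GoodC(GoodB):
    pass


try:
    C().hello()
    outcome = 'no error'
except RuntimeError:  # RecursionError (a RuntimeError subclass) in Python 3
    outcome = 'infinite recursion'
```

With a C instance, super(self.__class__, self) is super(C, self), which resolves hello to B.hello again, and so on until the recursion limit is hit. GoodC().hello() returns 'BA' as expected.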

> self ).__getattribute__( name )
>         def __init__( klass, name, bases, dict ):
>             super( XType, klass ).__init__( name, bases, dict )
>             setattr( klass, '__getattribute__',
> klass.__my_getattribute__ )

My my my...

> Of course this is imperfect as a user can simply bypass the
> __getattribute__ call too and access __dict__ directly,

Indeed. The fact is that there's just no way to prevent client code from
accessing your implementation. Period. So relax, stop fighting against the
language, and learn to use it how it is.

> but it's
> closer to what I was thinking.  The problem is the return value for
> functions is not bound - how do I bind these to the associated
> instance?

func.__get__(obj, obj.__class__)

But that should not be done when the function is an instance attribute - 
only when it's a class one. And any class attribute implementing the 
descriptor protocol should be treated that way.
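For instance (greet and X are made-up names), binding a plain function by hand and comparing with what happens when the same function is stored on the instance:

```python
def greet(self):
    return 'hello from %s' % type(self).__name__


class X(object):
    pass


obj = X()

# Manual binding: what the lookup does for a function found in a class __dict__.
bound = greet.__get__(obj, type(obj))
result = bound()

# A function stored on the instance is returned as-is, with no binding.
obj.greet = greet
is_plain_function = obj.greet is greet
```

Here result is 'hello from X', while obj.greet is the bare function, so calling obj.greet() without arguments fails with a TypeError because nothing supplies self.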

> (Caveat - I am not sure whether using __get__ itself in lieu of
> __getattribute__ would be a better solution; but I would like to see
> how binding would be done here for general knowledge.)

(simplified) In the normal case, when the attribute looked up happens to
be a class attribute and implements the descriptor protocol,
__getattribute__ returns attr.__get__(obj, type(obj)). What attr.__get__
returns is up to whoever implemented type(attr). In the case of
functions, anyway, __get__ returns a method object, which is a callable
object wrapping the function, the target object and the class. When
called, this method object inserts the target object (or class if it's a
classmethod) in front of the args list, and invokes the function with
this new args list. Which is why you need to declare self (or cls) as
first argument of a 'method' but not to explicitly pass it at call time.
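You can inspect such a method object directly. The sketch below uses the __func__ and __self__ attribute names of current Python versions; X, get_y and make are only examples.

```python
class X(object):
    def get_y(self):
        return 'y of %s' % type(self).__name__

    @classmethod
    def make(cls):
        return cls()


obj = X()
method = obj.get_y                    # lookup goes through function.__get__
raw_function = X.__dict__['get_y']    # the plain function, no binding

wraps_function = method.__func__ is raw_function
wraps_instance = method.__self__ is obj

# Calling the method is the same as calling the function with obj first.
same_result = method() == raw_function(obj)

# For a classmethod, the object inserted in front is the class.
class_is_target = X.make.__self__ is X
```

All four flags come out True: the method object is just the function plus the object to pass as first argument.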

Anyway : forget about "real" privacy in Python (FWIW, neither Java nor 
C++ have "real" privacy - there are ways to bypass access restrictors in 
both languages), just use the single leading underscore convention and 
you'll be fine. And don't write explicit accessors - in fact, don't 
write accessors at all until you need them, and when you need them, use 
a property or a custom descriptor object, so it's transparent to client
code.
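As an illustration of the custom descriptor option, here is a sketch with a hypothetical Upper descriptor; the upper-casing just stands in for whatever logic you end up needing.

```python
class Upper(object):
    """Hypothetical descriptor: exposes a stored string in upper case."""

    def __init__(self, storage_name):
        self.storage_name = storage_name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # looked up on the class itself
        return getattr(obj, self.storage_name).upper()

    def __set__(self, obj, value):
        setattr(obj, self.storage_name, value)


class X(object):
    y = Upper('_y')

    def __init__(self):
        self._y = 'private value'


x = X()
first = x.y       # 'PRIVATE VALUE', computed by Upper.__get__
x.y = 'changed'   # stored in x._y by Upper.__set__
second = x.y      # 'CHANGED'
```

Client code still writes x.y, and the implementation detail lives in _y, marked as such by the leading underscore.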

HTH


