what's unsafe to do in a __getattr__?

Sun Oct 24 21:10:57 EDT 2021

I have run into a problem on a "mature" project I work on (there are many 
years of history before I joined): a combination of factors triggers a 
KeyError when using copy.copy().

I don't want to write a massive essay here, but I hope to give enough to 
set the context.

There's a class that's a kind of proxy, so there's some "magic" that 
could be present. The magic is detected by looking for a kind of memo 
annotation, so the __getattr__ starts with this:

     # Methods that make this class act like a proxy.
     def __getattr__(self, name):
         attr = getattr(self.__dict__['__subject'], name)

and that's what blows up. It happens for a user doing something we... 
ahem... don't expect.  They just picked up the Py3-only version of the 
project and now they're getting the issue.

Nothing in the project defines a __reduce_ex__ method, but one is 
picked up from the base "object" type, so copy.copy generates some 
pickle information and passes it to copy._reconstruct as the state 
parameter. That reaches this stanza:

     if state is not None:
         ...
         if hasattr(y, '__setstate__'):
             y.__setstate__(state)

So our class's __getattr__ is called to look for __setstate__.  But at 
this stage the copy's instance has only been created; the operations 
that will fill in the details haven't happened yet, so the '__subject' 
key isn't in its __dict__ and we take a KeyError.

So apparently the attempt in the __getattr__ to go fishing in our own 
__dict__ for something we set ourselves is unsafe.  Is there a guideline 
for what you can / cannot expect to be safe to do?  My naive expectation 
would be that when __getattr__ is called, you can expect the instance to 
have already been initialized, but if I'm not reading the copy module 
wrong, that's not always true.

Is a better answer for this class to provide a __copy__ method to more 
precisely control how copying happens?
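
For reference, this is roughly what I have in mind for __copy__. It is 
only a sketch under my own assumptions: Proxy and Subject are again 
made-up stand-ins, and I'm assuming a shallow copy should share the same 
subject as the original.

```python
import copy


class Subject:
    """Hypothetical wrapped object."""
    value = 42


class Proxy:
    """Hypothetical stand-in for the project's proxy class."""

    def __init__(self, subject):
        self.__dict__['__subject'] = subject

    def __getattr__(self, name):
        return getattr(self.__dict__['__subject'], name)

    def __copy__(self):
        # copy.copy() looks up __copy__ on the class, so it finds this
        # before ever reaching the __reduce_ex__ / _reconstruct path.
        cls = type(self)
        new = cls.__new__(cls)
        # Fill in the instance state directly; __dict__ is found by normal
        # lookup, so __getattr__ is never asked about the empty instance.
        new.__dict__.update(self.__dict__)
        return new


p = Proxy(Subject())
q = copy.copy(p)
```

This only covers copy.copy(); the sketch does nothing for copy.deepcopy() 
or pickling.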
