Implied instance attribute creation when referencing a class attribute

Mon Jan 16 17:11:25 EST 2006

I just ran across a case which seems like an odd exception to either
what I understand as the "normal" variable lookup scheme in an
instance/object heirarchy, or to the rules regarding variable usage
before creation.  Check this out:

>>> class foo(object):
...   I = 1
...   def __init__(self):
...     print self.__dict__
...     self.I += 1
...     print self.__dict__
...
>>> a=foo()
{}
{'I': 2}
>>> foo.I
1
>>> a.I
2
>>> del a.I
>>> a.I
1
>>> del a.I
Traceback (most recent call last):
  File "<string>", line 1, in <string>
AttributeError: I
>>> non_existent_var += 1
Traceback (most recent call last):
  File "<string>", line 1, in <string>
NameError: name 'non_existent_var' is not defined

In this case, 'self.I += 1' clearly has inserted a surprise
behind-the-scenes step of 'self.I = foo.I', and it is this which I find
interesting.

As I understand it, asking for self.I at this point should check
self.__dict__ for an 'I' entry, and if it doesn't find it, head on up
to foo.__dict__ and look for it.

So... I initially *thought* there were two possibilities for what would
happen with the 'self.I += 1':
  1. 'self.I += 1' would get a hold of 'foo.I' and increment it
  2. I'd get an AttributeError

Both were wrong.  I thought maybe an AttributeError because trying to
modify 'self.I' at that point in the code is a bit fuzzy... ie: am I
really trying to deal with foo.I (in which case, I should properly use
foo.I) or am I trying to reference an instance attribute named I (in
which case I should really create it explicitly first or get an error
as with the non_existent_var example above... maybe with 'self.I =
foo.I').

Python is obviously assuming the latter and is "helping" you by
automatically doing the 'self.I = foo.I' for you.  Now that I know this
I (hopefully) won't make this mistake again, but doing this seems
equivalent to taking my 'non_existent_var += 1' example above and
having the interpreter interpret as "oh, you must want to deal with an
integer, so I'll just create one for you with 'non_existent_var = 0'
first".  Fortunately this is not done, so why do it with the instance
attribute reference?

Does anyone have any solid reasoning behind the Python behavior?  It
might help drive it home more so than just taking it as "that's the way
it is" and remembering it.

It gets even more confusing for me because the behaviour could be
viewed as being opposite when dealing with mutable class members.  eg:

>>> class foo(object):
...   M = [1,2,3]
...   def __init__(self):
...     self.M.append(len(self.M) + 1)
...     print self.M
...
>>> a=foo()
[1, 2, 3, 4]
>>> foo.M
[1, 2, 3, 4]
>>> del a.M
Traceback (most recent call last):
  File "<string>", line 1, in <string>
AttributeError: 'foo' object attribute 'M' is read-only

By opposite I mean that with immutable objects, a sloppy self.I
reference doesn't get you to the base class object, whereas with a
mutable one you do get to the base object (although I do recognize that
in both cases if you just remember that the interpreter will always
stuff in a 'self.x = BaseClass.x' it works as expected in both the
immutable and mutable case).

After all that, I guess it boils down to me thinking that the code
*should* interpret the attempted instance modification with one of the
two possibilities I mentioned above (although after typing this I'm now
leaning more towards an AttributeError rather than allowing 'self.I' to
be synonymous with 'foo.I' if no local override).

Russ

PS: Apologies if I mangled the "proper" terminology for talking about
this... hopefully it makes sense.