Assigning to __class__ attribute

Mark Wooding mdw at distorted.org.uk
Fri Dec 3 17:15:54 EST 2010


kj <no.email at please.post> writes:

> >>> class Spam(object): pass
>
> Now I define an instance of Spam and an instance of Spam's superclass:
> >>> x = Spam()
> >>> y = Spam.__mro__[1]() # (btw, is there a less uncouth way to do this???)

There's the `__bases__' attribute, which is simply a tuple of the
class's direct superclasses in order.  Spam.__bases__[0] will always be
equal to Spam.__mro__[1] because of the way the linearization works.

There's also the `__base__' attribute, which seems to correspond to a
PyTypeObject's `tp_base' slot; this /isn't/ always the first direct
superclass.  I'm not quite sure what the rules are, and it doesn't seem
to be documented anywhere.
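
Here's a small sketch of the difference.  The classes `Mixin' and `Both'
are invented for illustration, and the `__base__' result is simply what
CPython gives me, not a documented guarantee:

```python
class Spam(object):
    pass

class Mixin(object):      # hypothetical pure-Python mixin
    pass

class Both(Mixin, dict):  # hypothetical class with a C-implemented base
    pass

# The first direct superclass is the second entry in the MRO.
spam_first_base_matches = Spam.__bases__[0] is Spam.__mro__[1]
both_first_base_matches = Both.__bases__[0] is Both.__mro__[1]

# In CPython, Both.__base__ is dict, even though Mixin is listed first.
base_is_first_superclass = Both.__base__ is Both.__bases__[0]
```

So `__bases__[0]' is the less uncouth spelling of `__mro__[1]', and
`__base__' is something else again.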

> >>> [z.__class__.__name__ for z in x, y]
> ['Spam', 'object']
>
> >>> class Ham(object): pass
> ... 
>
> >>> x.__class__ = Ham
> >>> [isinstance(x, z) for z in Spam, Ham]
> [False, True]
>
> First question: how kosher is this sort of class transmutation
> through assignment to __class__?  

Yep.  That's allowed, and useful.  Consider something like a red/black
tree: red and black nodes behave differently from one another, and it
would be convenient to make use of method dispatch rather than writing a
bunch of conditional code; unfortunately, nodes change between being red
and black occasionally.  Swizzling classes lets you do this.
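
A minimal sketch of the idea, assuming hypothetical `Node', `RedNode' and
`BlackNode' classes (the actual tree balancing is left out; only the
colour change is shown):

```python
class Node(object):
    """Hypothetical tree node; only the colour-dependent part is shown."""
    def __init__(self, key):
        self.key = key

class RedNode(Node):
    def is_red(self):
        return True
    def recolour(self):
        # Become a black node in place; no new object is created.
        self.__class__ = BlackNode

class BlackNode(Node):
    def is_red(self):
        return False
    def recolour(self):
        self.__class__ = RedNode

n = RedNode(42)
n.recolour()   # n is now a BlackNode; its `key' attribute is untouched
```

Every existing reference to the node sees the new behaviour, which is
the point: nothing else in the tree has to be told about the change.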

Various other languages have similar features.  The scariest is probably
Smalltalk's `become: anObject' method, which actually swaps two objects.
Python's `__class__' assignment is similarly low-level: all it does is
tweak a pointer, and it's entirely up to the program to make sure that
the object's attributes are valid according to the rules for the new
class.  The Common Lisp Object System has CHANGE-CLASS, which is a
rather more heavyweight and hairy procedure which tries to understand
and cope with the differences between the two classes.
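
To illustrate what `entirely up to the program' means in the Python
case, here's a sketch with two made-up classes, `Circle' and `Square'.
The new class's `__init__' isn't run, so the object keeps its old
attributes and lacks the new ones until you fix it by hand:

```python
class Circle(object):
    def __init__(self, radius):
        self.radius = radius
    def area(self):
        return 3.14159 * self.radius ** 2

class Square(object):
    def __init__(self, side):
        self.side = side
    def area(self):
        return self.side ** 2

shape = Circle(2)
shape.__class__ = Square      # Square.__init__ is NOT run

try:
    shape.area()              # Square.area wants `side', which was never set
    failed = False
except AttributeError:
    failed = True

shape.side = 3                # patch the object up by hand
fixed_area = shape.area()
```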

> I've never seen it done.  Is this because it is considered something to
> do only as a last resort, or is it simply because the technique is not
> needed often, but it is otherwise perfectly ok?

It's not needed very often, and can be surprising to readers who aren't
familiar with other highly dynamic object systems, so I don't think it's
particularly encouraged; but it seems like it might be the best approach
in some cases.  I think it's one of those things where you'll just
/know/ when it's the right answer, and if you don't know that it's the
right answer, it isn't.

> >>> y.__class__ = Ham
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: __class__ assignment: only for heap types

`Heap types' are types which are allocated dynamically, such as those
made by a `class' statement.  Non-heap types are ones which are dreamt
up by C extensions or built into Python.

I suspect that the logic here is that non-heap types in general are
magical and weird, and their instances might have special C stuff
attached to them, so changing their classes is a Bad Idea, though
there's an extra check which catches most problems:

In [1]: class foo (str): pass
In [2]: class bar (object): pass
In [3]: x = foo()
In [4]: x.__class__ = bar
TypeError: __class__ assignment: 'foo' object layout differs from 'bar'

Anyway, I'd guess it's just a bug that `object' is caught this way, but
it doesn't seem an especially important one.
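
The cases above can be collected into one sketch.  The helper
`try_assign' is invented for illustration, and I only check that a
TypeError is raised, since the exact wording of the message varies
between Python versions:

```python
class Spam(object):
    pass

class Ham(object):
    pass

class Foo(str):
    pass

def try_assign(obj, new_class):
    """Return True if the assignment worked, False if it raised TypeError."""
    try:
        obj.__class__ = new_class
    except TypeError:
        return False
    return True

spam_ok = try_assign(Spam(), Ham)             # two ordinary heap types
plain_object_ok = try_assign(object(), Ham)   # instance of a non-heap type
str_subclass_ok = try_assign(Foo(), Ham)      # heap types, different layout
```

Only the first assignment succeeds.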

> This definitely rattles my notions of inheritance: since the
> definition of Spam was empty, I didn't expect it to have any
> significant properties that are not already present in its superclass.

Yes, sorry.  I think this is a bit poor, really, but it's hard to do a
much better job.

Pure Python objects are pretty simple things, really: they have a
pointer to their class, and a bunch of attributes stored in a
dictionary.  What other languages call instance variables or methods are
found using attribute lookup, which just searches the object, and then
its class and its superclasses in the method resolution order.  If you
fiddle with `__class__', then attribute lookup searches a different
bunch of classes.  Nothing else needs to change -- or, at least, if it
does, then you'd better do it yourself.
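
A short sketch of this, reusing the names `Spam' and `Ham' from above
(the `speak' method and `colour' attribute are made up): the instance
dictionary is left alone, and only the classes searched by attribute
lookup change.

```python
class Spam(object):
    def speak(self):
        return 'spam'

class Ham(object):
    def speak(self):
        return 'ham'

x = Spam()
x.colour = 'pink'              # stored in the instance dictionary
before = dict(vars(x))
said_before = x.speak()        # found on Spam

x.__class__ = Ham
after = dict(vars(x))
said_after = x.speak()         # now found on Ham
```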

Types implemented in C work by extending the underlying Python object
structure.  The magical type-specific stuff is stored in the extra
space.  Unfortunately, changing classes is now hard, because Python code
can't even see the magic C stuff -- and, besides, a different special C
type may require a different amount of space, and the CPython
implementation can't cope with the idea that objects might move around
in memory.

Similar complications occur if one of the classes has a `__slots__'
attribute: in this case, both the original and new class must have a
`__slots__' attribute and they must be the same length and have the same
names.
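
For example, with three hypothetical classes `A', `B' and `C' (a sketch
of the behaviour I see in CPython):

```python
class A(object):
    __slots__ = ('left', 'right')

class B(object):
    __slots__ = ('left', 'right')

class C(object):
    __slots__ = ('left',)

x = A()
x.left = 1
x.__class__ = B            # accepted: same slot names, same layout

try:
    x.__class__ = C        # rejected: C has one slot fewer
    swapped_to_c = True
except TypeError:
    swapped_to_c = False
```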

> What is the most complete, definitive, excruciatingly detailed
> exposition of Python's class and inheritance model?

It's kind of scattered.  The language reference sections 3.3 and 3.4
contain some useful information, but they're rather detailed: they give
a (mostly) comprehensive list of what a bunch of strangely shaped things
do rather than presenting a coherent model.

Guido's essay `Unifying types and classes in Python 2.2' is pretty good
(http://www.python.org/download/releases/2.2.3/descrintro/) and provides
some of the background.

Unsurprisingly, Python's object system takes (mostly good) ideas from
other languages, particularly dynamic ones.  Python's object system is
/very/ different from what you might expect from C++, Ada and Eiffel,
for example; but coming from Smalltalk, Flavors or Dylan, you might not
be particularly surprised.  Some knowledge of these other languages will
help fill in the gaps.

> I'm expressly avoiding Google to answer this question,

I'd say this was sensible, except for the fact that you seem to expect a
more reliable answer from Usenet. ;-)

-- [mdw]


