[Python-Dev] Re: Of slots and metaclasses...

Kevin Jacobs jacobs@penguin.theopalgroup.com
Thu, 28 Feb 2002 15:48:05 -0500 (EST)


On Thu, 28 Feb 2002, Guido van Rossum wrote:
> [Kevin Jacobs wrote me in private to ask my position on __slots__.
> I'm posting my reply here, quoting his full message -- I see no reason
> to carry this on as a private conversation.  Sorry, Kevin, if this
> wasn't your intention.]

No problem -- I sent it privately only to spare python-dev if you happened
to be too busy for a coherent reply.

> Hi Kevin, you got me to finally browse the thread "Meta-reflections".
> My first response was: "you've got it all wrong."  My second response
> was a bit more nuanced: "that's not how I intended it to be at all!"
> OK, let me elaborate. :-)

Yes -- I can see why my initial effort to make slots work "just like
__dict__ attributes" was a bad idea.  However, it took reading 'Putting
Metaclasses to Work' for me to realize that.

> You want to be able to find out which instance attributes are defined
> by __slots__, so that (by combining this with the instance's __dict__)
> you can obtain the full set of attribute values.  But this defeats the
> purpose of unifying built-in types and user-defined classes.

I suppose the purpose of unifying built-in types and user-defined classes is
rather subjective.  There are many roads that will get us there, and I
happened to fixate on another one...

> A new-style class, with or without __slots__, should be considered no
> different from a new-style built-in type, except that all of the
> methods happen to be defined in Python (except maybe for inherited
> methods).

Sure.  Except that I also want to be able to extend existing new-style
classes/types in C, as well as Python.  Here is how I do it now (minus error
checking and ref-counting):

static PyMethodDef PyRow_methods[] = {
        {"__init__",      (PyCFunction)rowinit,       METH_VARARGS},
        {"__repr__",      (PyCFunction)rowstrrepr,    METH_NOARGS },
        {"__getitem__",   (PyCFunction)rowgetitem,    METH_VARARGS},
        /* etc... */
        {NULL,            NULL,                       0}  /* sentinel: ends the loop below */
};

  PyRow_Type = (PyTypeObject*)PyType_Type.tp_call((PyObject*)&PyType_Type, args, NULL);

  /* Methods must be added _after_ PyRow_Type has been created
    since the type is an argument to PyDescr_NewMethod */
  dict = PyRow_Type->tp_dict;
  meth = PyRow_methods;
  for (; meth->ml_name != NULL; meth++)
  {
      PyObject* method = PyDescr_NewMethod(PyRow_Type, meth);
      PyDict_SetItemString(dict,meth->ml_name,method);
  }

Though this doesn't look nearly as ugly as it did when I first wrote it,
before I read 'Putting Metaclasses to Work'; strangely enough it ends up
looking a lot like their metaclass interface.

> In order to find all attributes, you should *never* look at __slots__.
> You should search the __dict__ of the class and its base classes, in
> MRO order, looking for descriptors, and *then* add the keys of the
> __dict__ as a special case.  This is how PEP 252 wants it to be.

Sure.  I was just hoping to have that list of descriptors pre-computed and
stored in the class (like __mro__).  I suppose the question is why even
expose __slots__ if it is so worthless?
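
For concreteness, the lookup described in the quoted paragraph might be
sketched as below.  This is only an illustration in modern Python 3 syntax,
not 2.2 code: the name instance_attributes is made up, and restricting the
search to data descriptors with non-dunder names is my assumption about
which descriptors count as instance state.

```python
import inspect

def instance_attributes(obj):
    """Collect attribute values by walking the MRO for descriptors."""
    values = {}
    seen = set()
    for klass in type(obj).__mro__:
        for name, attr in vars(klass).items():
            if name in seen:
                continue  # an earlier class in the MRO shadows this name
            seen.add(name)
            # Assumption: only data descriptors (slots, properties) hold
            # per-instance state; dunder names such as __dict__ are skipped.
            if name.startswith('__') or not inspect.isdatadescriptor(attr):
                continue
            try:
                values[name] = getattr(obj, name)
            except AttributeError:
                pass  # e.g. a slot that was never assigned
    # Special case: the keys of the instance __dict__, if there is one.
    values.update(getattr(obj, '__dict__', {}))
    return values


class Point(object):
    __slots__ = ('x', 'y')

class LabeledPoint(Point):
    pass  # no __slots__, so instances also get a __dict__

p = LabeledPoint()
p.x = 1
p.label = 'origin'
attrs = instance_attributes(p)
```

Here attrs holds the assigned slot 'x' and the __dict__ entry 'label'; the
unassigned slot 'y' is left out.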

> If the descriptors don't tell you everything you need, too bad -- some
> types just are like that.

This has _never_ been a concern of mine --  I don't mind if the C
implementation chooses to hide things.

> Why do I reject your suggestion of making __slots__ (more) usable for
> introspection?  Because it would create another split between built-in
> types and user-defined classes: built-in types don't have __slots__,
> so any strategy based on __slots__ will only work for user-defined
> types.  And that's exactly what I'm trying to avoid!

Well, I'm busy creating C extension types that *do* have slots!  One of my
many current projects is to create a better type to store the results of
relational database queries.  I want the memory efficiency of tuples and the
ability to query by name (via __getitem__ or __getattr__).  So I basically
need to re-invent a magic tuple type that adds descriptors for every named
field.  Strangely enough, this is basically what the slots mechanism does.
I do realize that I could accomplish the same end by sub-classing tuple and
adding a bunch of descriptors.
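
A rough sketch of that last alternative, in modern Python 3 syntax: a tuple
subclass with one read-only descriptor per named field.  The helper name
make_row_type and the field names are hypothetical, and this is not the C
implementation described above.

```python
from operator import itemgetter

def make_row_type(name, fields):
    """Build a tuple subclass with a read-only property per field."""
    namespace = {'__slots__': ()}  # no per-instance __dict__
    for index, field in enumerate(fields):
        # Each property simply indexes into the underlying tuple.
        namespace[field] = property(itemgetter(index))
    return type(name, (tuple,), namespace)


Row = make_row_type('Row', ['id', 'name'])
row = Row((42, 'widget'))
```

With this, row.id and row[0] read the same tuple item, and the instance
carries no __dict__ of its own.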

> Given this viewpoint, you won't be surprised that I have little desire
> to implement your other proposals, in particular, I reject all these:
>
> - Proxy the instance __dict__ with something that makes the slots
>   visible

I wasn't real thrilled with this idea myself.  Among all the other reasons
why not to do this, it has some terrible performance implications.

> - Flatten slot lists and make them immutable

Again, why even have __slots__ if they are so useless?  Assuming that there
is a legitimate reason to peek at __slots__, why not at least make them
immutable?  Or, even better, why not use __slots__ to expose the etype slot
tuple instead?

> - Alter vars(obj) to return a dict of all attrs

Ok, I'm a little baffled by this.  Why not?

> I'll be the first to admit that some details are broken in 2.2.
>
> In particular, the fact that instances of classes with __slots__
> appear picklable but lose all their slot values is a bug -- these
> should either not be picklable unless you add a __reduce__ method, or
> they should be pickled properly.

My vote is that they should be pickled properly by default.  In my mind,
slots are a more static type of attribute.  Since they are more static, my
feeling is that they should be as or more accessible than dict attributes.
Descriptors are fine for handling the black magic of making them addressable
by name, but it just feels wrong to hide them from access by other means.
Of course, I am really talking about slots defined at the Python level --
not necessarily all storage allocated in the 'members' array.

> I'm not so sure that the fact that you can "override" or "hide" slots
> defined in a base class should be classified as a bug.  I see it more
> as a "don't do that" issue: If you're deriving a class that overrides
> a base class slot, you haven't done your homework.  PyChecker could
> warn about this though.

Unless attribute access becomes scoped based on the static type of the
method, then I think it is a bug.  Re-declared slots become effectively
orphaned and just waste memory.  Coalescing them or raising an exception
when they are re-declared seem much better alternatives.
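
To illustrate the orphaning, here is a small sketch (modern Python 3
syntax; the class names are made up, and the size comparison relies on
CPython's __basicsize__, so treat that part as implementation-specific):

```python
class Base(object):
    __slots__ = ('x',)

class Derived(Base):
    __slots__ = ('x',)  # re-declares the slot from Base

d = Derived()
d.x = 'derived'  # stored in the slot that Derived allocated

# The slot allocated by Base still exists in the instance, but it can
# only be reached through the base class descriptor.
base_descr = Base.__dict__['x']
base_descr.__set__(d, 'base')
base_value = base_descr.__get__(d, Derived)
derived_value = d.x
```

The two values are stored independently, and Derived instances are larger
than Base instances even though only one name is visible.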

> I think you're mostly right with your proposal "Update standard
> library to use new reflection API".  Insofar as there are standard
> support classes that use introspection to provide generic services for
> classic classes, it would be nice if these could work correctly for
> new-style classes even if they use slots or are derived from
> non-trivial built-in types like dict or list.  This is a big job, and
> I'd love some help.  Adding the right things to the inspect module
> (without breaking pydoc :-) would probably be a first priority.

Well, I'm happy to contribute, though my primary concern (other than
correctness and completeness) is efficiency.  The whole reason I'm using
slots is to save space when allocating huge numbers of fairly small objects.
I believe that there is a big performance difference between being able to
pickle based on arbitrary descriptors and pickling just slots.  Slots are
already nicely laid out in rows, just waiting to be plucked out and stuffed
into a pickle.  Even without flattened __slots__ lists, it is a fast and
trivial operation to iterate over a class and all its bases and extract
slots.  Doing so over dictionaries is not nearly so trivial.
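
As an illustration of how little is involved, a sketch in modern Python 3
syntax follows.  The helper names all_slots and slot_state are hypothetical,
and the sketch ignores name mangling of private (double-underscore) slots.

```python
def all_slots(cls):
    """Return slot names declared by cls and its bases, base classes first."""
    names = []
    for klass in reversed(cls.__mro__):
        declared = klass.__dict__.get('__slots__', ())
        if isinstance(declared, str):
            declared = (declared,)  # a bare string declares a single slot
        for name in declared:
            if name not in ('__dict__', '__weakref__') and name not in names:
                names.append(name)
    return names

def slot_state(obj):
    """Snapshot the assigned slots of obj, e.g. as state for a pickle."""
    return {name: getattr(obj, name)
            for name in all_slots(type(obj)) if hasattr(obj, name)}


class Record(object):
    __slots__ = ('id', 'name')

class PricedRecord(Record):
    __slots__ = ('price',)

rec = PricedRecord()
rec.id, rec.price = 7, 9.5
state = slot_state(rec)
```

The resulting dict could be returned from a __getstate__ method, with a
matching __setstate__ that assigns each entry back via setattr.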

> Maybe you can formulate it as a set of tentative clarifying patches to
> PEPs 252, 253, and 254?

To be honest, I forgot that those PEPs existed!  I've been working off of
the Python 2.2 source and the tutorials.  I'll read them over tonight and
see.

> >   2) In Python 2.2, what intentional deviations have you chosen from the
> >      SOMMCP and what differences are incidental or accidental?
>
> Hard to say, unless you specifically list all the things that you
> consider part of the SOMMCP.

When I say SOMMCP, I really mean the "metaclass protocol" defined by the
various postulates and theorems in the first few chapters of the book.

> - I currently don't complain when there are serious order
>   disagreements.  I haven't decided yet whether to make these an error
>   (then I'd have to implement an overridable way of defining
>   "serious") or whether it's more Pythonic to leave this up to the
>   user.

Sure -- I noticed this.  Maybe you should store the order-safety in the
metaclass?  That way, the user can inspect it when they decide it is
important.

> - I don't enforce any of their rules about cooperative methods.  This
>   is Pythonic: you can be cooperative but you don't have to be.  It
>   would also be too incompatible with current practice (I expect few
>   people will adopt super().)

I agree with most of that, except that I expect that MANY people will start
using 'super'.  I've trained an office full of Java programmers to
program in Python and they are always complaining about the lack of super
calls.  Also, I've _always_ considered this idiom ugly and hackish:

  class Foo(Bar,Baz):
    def __init__(self):
      Bar.__init__(self)
      Baz.__init__(self)

It's so much better as:

  class Foo(Bar,Baz):
    def __init__(self):
      # when super becomes a keyword and we write nice cooperative __init__
      # methods
      super.__init__(self)
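
Until then, the same cooperative pattern can be spelled with the super()
built-in as it exists in 2.2.  A minimal sketch, where the common Base
class and the log attribute are invented purely to show the call order:

```python
class Base(object):
    def __init__(self):
        self.log = ['Base']

class Bar(Base):
    def __init__(self):
        super(Bar, self).__init__()  # next class in the MRO, not always Base
        self.log.append('Bar')

class Baz(Base):
    def __init__(self):
        super(Baz, self).__init__()
        self.log.append('Baz')

class Foo(Bar, Baz):
    def __init__(self):
        super(Foo, self).__init__()  # one call reaches Bar, Baz and Base
        self.log.append('Foo')

foo = Foo()
```

Each __init__ runs exactly once, in reverse MRO order, so foo.log ends up
as ['Base', 'Baz', 'Bar', 'Foo'].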

> - I don't automatically derive a new metaclass if multiple base
>   classes have different metaclasses.

I have my own ideas about this, but like you, don't have enough experience
with them in practice to do anything about it.

>   Since I expect that non-trivial metaclasses are
>   often implemented in C, I'm not so comfortable with automatically
>   merging multiple metaclasses -- I can't prove to myself that it's
>   always safe.

It is always safe when the assumption of monotonicity is not violated.
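
A rough sketch of what automatic derivation could look like, written in
modern Python 3 syntax.  The function derive_metaclass and the class names
are hypothetical, and the sketch makes no attempt to check monotonicity; it
only builds a metaclass that inherits from the most-derived metaclasses of
the bases.

```python
def derive_metaclass(*bases):
    """Return a metaclass compatible with the metaclasses of all bases."""
    metas = []
    for base in bases:
        meta = type(base)
        if any(issubclass(m, meta) for m in metas):
            continue  # already covered by a more derived metaclass
        # Drop metaclasses that the new one makes redundant.
        metas = [m for m in metas if not issubclass(meta, m)]
        metas.append(meta)
    if len(metas) == 1:
        return metas[0]
    merged_name = '_'.join(m.__name__ for m in metas)
    return type(merged_name, tuple(metas), {})


class MetaA(type):
    pass

class MetaB(type):
    pass

class A(metaclass=MetaA):
    pass

class B(metaclass=MetaB):
    pass

Merged = derive_metaclass(A, B)

class C(A, B, metaclass=Merged):
    pass
```

Without the derived metaclass, the definition of C is rejected with a
metaclass conflict.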

> - I don't check that a base class doesn't override instance
>   variables.  As I stated above, I don't think I should, but I'm not
>   100% sure.

Do you mean slots or all Python instance attributes in this statement?

> >   3) Do you intend to enforce monotonicity for all methods and slots?
> >      (Clearly, this is not desirable for instance __dict__ attributes.)
>
> If I understand the concept of monotonicity, no.  Python traditionally
> allows you to override methods in ways that are incompatible with the
> contract of the base class method, and I don't intend to forbid this.

For Python, monotonicity means that the instance attributes and instance
methods of a class are a superset of those of all its ancestors.  This is
not the way that normal __dict__ attributes work in Python, so let's talk
only about slots when discussing monotonic properties.  In other words, it
means that the metaclass interface does not provide a way to delete a slot
or a method, only ways to add and override them.  Combined with some static
type information, the assumption of monotonicity will be very helpful when
we can eventually compile Python.

> It would be good if PyChecker checked for accidental mistakes in this
> area, and maybe there should be a way to declare that you do want this
> enforced; I don't know how though.

I have a pretty good idea how.  It's essentially a proof-based method that
works by solving metatype constraints.

> There's also the issue that (again, if I remember the concepts right)
> there are some semantic requirements that would be really hard to
> check at compile time for Python.

True for __dict__ instance attributes, not for slots!

> >   4) Should descriptors work cooperatively?  i.e., allowing a
> >      'super' call within __get__ and __set__.
>
> I don't think so, but I haven't thought through all the consequences
> (I'm not sure why you're asking this, and whether it's still a
> relevant question after my responses above).  You can do this for
> properties though.

  class Foo(object):
    __slots__=()
    a = 1

  class Bar(Foo):
    __slots__ = ('a',)

  bar = Bar()
  print dir(bar)
  print bar.a      # AttributeError: the slot 'a' was never assigned

The resolution rule for descriptors could work cooperatively to find Foo's
class attribute 'a' instead of giving up with an AttributeError.
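
One way such a cooperative rule could be emulated today is a wrapper
descriptor that defers to later classes in the MRO when the slot is unset.
The sketch below uses modern Python 3 syntax; the CooperativeSlot class is
entirely hypothetical, and the fallback returns plain class attributes
without invoking descriptors found further along the MRO.

```python
class CooperativeSlot(object):
    """Hypothetical wrapper: read a slot, else defer to later MRO classes."""

    def __init__(self, owner, name):
        self.owner = owner
        self.name = name
        self.member = owner.__dict__[name]  # the real slot descriptor

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        try:
            return self.member.__get__(obj, objtype)
        except AttributeError:
            # Slot is unset: continue the search past the owning class.
            mro = type(obj).__mro__
            for klass in mro[mro.index(self.owner) + 1:]:
                if self.name in klass.__dict__:
                    return klass.__dict__[self.name]  # plain values only
            raise

    def __set__(self, obj, value):
        self.member.__set__(obj, value)

    def __delete__(self, obj):
        self.member.__delete__(obj)


class Foo(object):
    __slots__ = ()
    a = 1

class Bar(Foo):
    __slots__ = ('a',)

# Replace the plain slot descriptor with the cooperative wrapper.
Bar.a = CooperativeSlot(Bar, 'a')

bar = Bar()
fallback = bar.a   # slot unset, so Foo's class attribute is found
bar.a = 2
assigned = bar.a   # slot set, so the slot value wins
```

Deleting bar.a afterwards makes the lookup fall back to Foo's class
attribute again.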

Thanks for the very useful answers,
-Kevin

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19         E-mail: jacobs@theopalgroup.com
Fax:   (216) 986-0714              WWW:    http://www.theopalgroup.com