[Python-3000] Generic functions vs. OO

Guido van Rossum guido at python.org
Fri Nov 24 06:34:49 CET 2006


Phillip J. Eby wrote:
> At 08:29 PM 11/22/2006 -0800, Guido van Rossum wrote:
> >One thing that rubs me the wrong way about generic functions is that
> >it appears to go against OO. Now I'm not someone to take OO as
> >religion, but there's something uncomfortable (for me) about how, in
> >Phillip's world, many things become functions instead of methods,
> >which brings along concerns about the global namespace filling up, and
> >also about functionality being spread randomly across too many
> >modules. I fear I will miss the class as a convenient focus for
> >related functionality.
>
> I originally proposed a solution for this back in January '05, but it was
> too premature.  But since you have now stated the problem that the proposal
> was intended to solve, perhaps the solution has a chance now.  :)
>
> I will try to be as concrete as possible.  Let's start with an actual,
> hopefully non-exploding 'Interface' implementation, based on an assumption
> that we have generic functions available:

Could you point out where the example uses generic functions?

>      class InterfaceClass(type):
>          def __init__(cls, name, bases, cdict):
>              for k,v in cdict.items():
>                  # XXX this should probably skip at least __slots__,
>                  #     __metaclass__, and __module__, but oh well
>                  setattr(cls, k, AdaptingDescriptor(v))
>
>      class Interface:
>          __metaclass__ = InterfaceClass
>          __slots__ = '__self__'
>
>          def __init__(self, subject):
>              # this isinstance() check should be replaced by an
>              # 'unwrap()' generic function, so other adapter types
>              # will work, but this is just an example, so...
>              if isinstance(subject, Interface):
>                  subject = subject.__self__
>              self.__self__ = subject
>
>      class AdaptingDescriptor:
>          def __init__(self, descriptor):
>              self.wrapped = descriptor
>          def __get__(self, ob, typ=None):
>              if ob is None:
>                  return self
>              return self.wrapped.__get__(ob.__self__, typ)

OK, that didn't explode my head, but it went straight over it. It may
only be 20 or so lines of code, but it's so concentrated that I can't
really understand it. All I get is that there's a class named
Interface that does some magic. (I'm sure I could understand it, but
it would take me a whiteboard and half an hour close reading of the
code, and I just don't want to spend that time right now.)

> Now, using this new "interface framework", let's implement a small
> "mapping" typeclas... er, interface.
>
>      class Mapping(Interface):
>          def keys(self):
>              return [k for k,v in self.items()]
>          def items(self):
>              return [(k,self[k]) for k in self.keys()]
>          # ... other self-recursive definitions
>
> What does this do?  Well, we can now call Mapping(foo) to turn an arbitrary
> object into something that has Mapping's generic functions as its methods,

What are Mapping's generic functions? keys and items? How can I tell
they are generic? What made them generic?

> and invokes them on foo!  (I am assuming here that normal functions are
> implicitly overloadable, even if that means they change type at runtime to
> do so.)

I read your explanation of that assumption before but I still don't
like it. I'd much rather have a slot where the overloading gets stored
that's NULL if there's no overloading. We can optimize the snot out of
it some other way. (I guess my problem with objects that change type
is that it's not a common thing to happen in Python, even though it's
explicitly possible, and I expect that it will cause all sorts of
bizarre surprises for code that thinks it understands Python's object
model. Anyway, it's a distraction. Maybe we can put the class-changing
idea aside as a premature optimization.)
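
Something like this is what I'm picturing, as a rough sketch only (the
'_overloads' attribute and the helper names are invented for
illustration; a real implementation would presumably live at the C
level):

def add_overload(func, typ, impl):
  # the registry plays the role of the proposed slot: it simply doesn't
  # exist (is "NULL") until the first overload is registered
  if getattr(func, '_overloads', None) is None:
    func._overloads = {}
  func._overloads[typ] = impl

def call_overloaded(func, *args):
  table = getattr(func, '_overloads', None)
  if table is None:
    return func(*args)  # never overloaded: just a plain call
  for cls in type(args[0]).__mro__:
    if cls in table:
      return table[cls](*args)  # most specific registered overload wins
  return func(*args)  # otherwise fall back to the default implementation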

> We could even use interfaces for argument type declarations, to
> automatically put things in the "right namespace" for what the code expects
> to use.

What do you mean by "the right namespace"? What things are put there?

> That is, if you declare an argument to be a Mapping, then that's
> what you get.

Sounds like implied adaptation to me, which was already explicitly
rejected long ago.

> If you call .keys() on the resulting adapted object and the
> type doesn't support the operation, you get an error.

How could it not support it if Mapping(x) succeeded?

How has keys() suddenly turned from a method into an operation
(assuming "operation" is a technical term and not just a different
word for method)?

> Too late a form of error checking you say?  Well, make a more sophisticated
> factory mechanism in Interface.__new__ that actually creates (and caches)
> different adapter types based on the type of object being adapted, so that
> hasattr() tests will work on the wrapped type, or so that you can get an
> early error if none of the wrapped generic functions has a method defined
> for the target type.
>
> A few important points here:
>
> 1. A basic interface mechanism is extremely simple to implement, given
> generic functions

Given my inability to understand it I can't yet agree with this claim.

> 2. It is highly customizable with respect to error checking and other
> features, even on a per-user basis, because there doesn't have to be only
> one "true" Interface type to rule them all (or one true generic function
> type either, but that's a separate discussion).

I would think there are few approaches (in Python) that would require
"one true interface type". Maybe Zope/Twisted do, but probably more
for expedience than because it's the only way it could be done.

> 3. It allows interfaces to include partial implementations, ala Ping and
> Alex's past proposals, thus allowing you to implement partial mapping or
> "file" objects and have the rest of the interface's implementation filled
> in for you

I'm not even sure I like that. It also seems to hinge on adaptation.
I'd rather *not* make adaptation a key concept; to me adaptation and
interfaces are completely orthogonal.

I think of interfaces as ways to *talk* about sets of methods or
operations or abilities, and I like that interfaces are objects so you
can inspect what you're talking about, but I don't like the interface
to play a role in the actual *use* of an object. Perhaps as analogy,
the use of hasattr in current Python will help: I can test whether an
object has an attribute, and if it does, use the attribute. But the
attribute test is not used as an intermediary for using the attribute.
Similar with isinstance -- I can test whether an object is a
basestring, and if it is, use its lower() method. But the basestring
object isn't involved in using the lower() method. This is somewhat in
contrast to Java and other statically typed languages with dynamic
typecasts -- there you might have an Object x that you believe is
really a String, so you write "String s = (String)x" (I believe that's
the syntax -- I'm a bit rusty) and then you can use String methods on
s -- but not on x. I find that aspect unPythonic.
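
A trivial sketch of the pattern I mean:

def normalized(x):
  # test first ...
  if isinstance(x, basestring):
    # ... then use the method on x itself; the isinstance() result is
    # not an intermediary object the way the Java cast is
    return x.lower()
  raise TypeError("expected a string")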

> 4. It allows you to hide the very existence of the notion of a "generic
> function", if you prefer not to think about such things

You've hidden them so well that I can't find them in your example. :-)

> 5. It even supports interface inheritance and interface algebra:
> subclassing an interface allows adding new operations, and simple
> assignment suffices to compose new interfaces, e.g.:
>
>      class MappingItems(Interface):
>          items = Mapping.items
>
> Notice that nothing special is required, this "just works" as a natural
> consequence of the rest of the implementation shown.

This I believe I actually understand.

> Okay, so now you want to know how to *implement* a "Mapping".  Well,
> simplest but most tedious, you can just register operations directly, e.g.:
>
>       class MyMapping:
>           def __init__(self, data):
>               self.data = dict(data)
>           defop operator.getitem(self, key):
>               return self.data[key]
>           defop Mapping.items(self):
>               return self.data.items()

Looks like you're being inconsistent with your example above, which
has keys() and items() but not __getitem__(). This confuses me.

I'm also concerned that this looks to me like a big step back from
simply writing

class MyMapping:
  def __init__(self, data): self.data = dict(data)
  def __getitem__(self, key): return self.data[key]
  def items(self): return self.data.items()

and then (either inside the class or external to it) adding some
statement that claims that MyMapping implements Mapping.

> But as you can imagine, this would probably get a bit tedious if you're
> implementing lots of methods.  So, we can add metaclasses or class
> decorators here to say, "I implement these interfaces, so any methods I
> have whose names match the method names in the interfaces, please hook 'em
> up for me."  I'm going to leave out the implementation, as it should be a
> straightforward exercise for the reader to come up with many ways by which
> it can be accomplished.  The spelling might be something like:
>
>      class MyMapping:
>          implements(Mapping)
>
>          def items(self):
>              ...
>
>          #etc.

Sure, that's what I was after above. But how would you do this after
the fact, when you find that some 3rd party module already has the
right methods but didn't bother to add the correct implements()
clause?
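
What I'd want is something I could run after the fact from my own
module, roughly like this (a sketch; declare_implements() and the
per-type register() call on generic functions are both made up):

def declare_implements(cls, interface):
  # walk the interface's public names and hook up the class's existing
  # methods as implementations of the corresponding generic functions,
  # without touching the 3rd party source
  for name in dir(interface):
    if name.startswith('_'):
      continue
    gf = getattr(interface, name)
    method = getattr(cls, name, None)
    if method is not None and hasattr(gf, 'register'):
      gf.register(cls, method)  # hypothetical registration call

# e.g. declare_implements(their_module.TheirMapping, Mapping)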

> At which point, we have now come full circle to being able to provide all
> of the features of interfaces, adaptation, and generic functions, without
> forcing anyone to give up the tasty OO flavor of method calls.  Heck, they
> can even keep the way they spell existing adaptation calls (e.g. IFoo(bar)
> to adapt bar to IFoo) in PEAK, Twisted, and Zope!
>
> And finally, note that if you only want to perform one method call on a
> given object, you can also use the generics directly, e.g.
> Mapping.items(foo) instead of Mapping(foo).items().

But I want to write foo.items()!!!

> Voila -- generic goodness and classic OO method-calling simplicity, all in
> one simple to implement package.  It should now be apparent why I said that
> interfaces are trivial to implement if you define them as namespaces for
> generic functions, rather than as namespaces for methods.
>
> There are many spinoffs possible, too.  For example, you could have a
> factory function that turns an existing class's public operations into an
> interface object.  There are probably also some dark corners of the
> idea that haven't been explored, because when I first proposed basically
> this idea in '05, nobody was ready for it.  Now maybe we can actually talk
> about the implications.

Well, okay, I have to admit that I only half "get" what that was all about.

But let me end on a more positive note. I find the idea very
attractive that many builtin operations (from len to iter to + to <=
to __getitem__) can be considered to be generic functions. In my head
I worked out how you could take the built-in set type and the
(ancient) set-of-integers implementation hiding in mhlib.py
(mhlib.IntSet), and add the various set-theoretical operations (union,
intersection, differences, and containment relationships) on mixed
operands without modifying the code of the IntSet class. This is
pretty neat. You could even add those operations for two IntSet
instances without modifying that class. I am envisioning that most of
these operations would have a default implementation that does what
those operations currently do; e.g. the generic function __add__ would
be something like

@generic
def __add__(a, b):  # default implementation
  if hasattr(a, "__add__"):
    r = a.__add__(b)
    if r is not NotImplemented: return r
  if hasattr(b, "__radd__"):
    r = b.__radd__(a)
    if r is not NotImplemented: return r
  raise TypeError(...)
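
For the IntSet case, the registrations might look roughly like this (a
sketch with a hand-rolled two-argument dispatch table standing in for
the real GF machinery; I'm also assuming IntSet.tolist() as the way to
get at its members):

import mhlib

_union_registry = {}

def register_union(ta, tb, impl):
  _union_registry[ta, tb] = impl

def union(a, b):
  impl = _union_registry.get((a.__class__, b.__class__))
  if impl is None:
    raise TypeError("union not defined for %s, %s" %
                    (a.__class__, b.__class__))
  return impl(a, b)

# cross-type implementations, added without editing set or mhlib.IntSet
register_union(set, mhlib.IntSet, lambda s, i: s | set(i.tolist()))
register_union(mhlib.IntSet, set, lambda i, s: set(i.tolist()) | s)
register_union(mhlib.IntSet, mhlib.IntSet,
               lambda a, b: set(a.tolist()) | set(b.tolist()))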

I also like the idea that we can associate an "atomic" interface with
a generic function, so that an object for which that GF has an
implementation is automatically said to implement that interface. And
I like the idea that we can compose new interfaces out of such atomic
interfaces, such that testing for whether an object implements such a
composite interface is automatically turned into testing whether it
implements each of the atomic interfaces. And I also like being able
to take arbitrary subsets of composite interfaces as the logical
consequence of this.

(I don't think the name of the interface should be the same as the
name of the generic function. This has caused me enough confusion
already. I'd rather be able to say e.g. that there's an interface
"Sizeable" which means that the operation "len" is provided.)

I think there are still some important holes in this idea (at least in
my understanding). For example, if all generic functions have a
default implementation, then essentially all generic functions are
implemented for all objects, and all objects have all interfaces!
That's not right. Of course, the default implementation may well raise
TypeError for some arguments (like my __add__ example). But that's not
helpful in deciding whether an object has an interface -- invoking the
operation is way too heavy-handed. Maybe we should provide a separate
testing function. (Although that doesn't seem quite sufficient for the
__add__ example, since it really *does* need to execute the operation
in order to determine whether it is implemented. I'll have to come
back to this problem later.)
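
The testing function would consult the GF's registry instead of
invoking it, roughly (the '_overloads' registry attribute is again
invented for illustration):

def implemented_for(gf, typ):
  # true if a non-default implementation is registered for typ (or one
  # of its bases), determined without actually calling the operation
  table = getattr(gf, '_overloads', {})
  return any(cls in table for cls in typ.__mro__)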

Another issue is GFs taking multiple arguments. At least for binary
operators (like __add__ above) it may make sense to say that a class C
implements the operator's interface (e.g. Addable in this case) if the
operator is defined for operands that are both C instances. Or maybe
that interface should be called SelfAddable, and we should reserve
Addable for anything that can occur on the lhs of a + operator? I
think we can probably come up with a reasonable rule for this; right
now I'm too tired to think of one.

But the biggest hole (to me) is being able to talk about regular
methods in an interface. I would like to be able to say that e.g. the
interface StandardMapping implements the operations getitem, len, iter
and contains, and the methods get, keys, values and items. I have no
problem with the operations, but I don't quite get how we can add the
methods to the interface. I understand that your explanation above
provides ways to talk about those, I just don't quite see the light
yet.

I would like to end up in a world where, in order to claim that a
class is a standard mapping class, the class definition would
(normally) explicitly claim to implement StandardMapping (using as yet
unspecified syntax) and that there would be a way (again I'm not yet
specifying syntax for this) to efficiently test any given class or
instance for that interface, returning True or False. The standard
dict implementation would claim to implement StandardMapping (and
probably also something that we could call StandardMutableMapping).
Someone could define an interface MinimalMapping as the interface
formed of StandardMapping.getitem, StandardMapping.contains and
StandardMapping.keys, and testing a dict for MinimalMapping would
return True. But if someone created ExtendedMapping as StandardMapping
plus the copy method and the eq operation, testing a dict for
ExtendedMapping would return False; however there should be something
one could execute to explicitly mark dict as implementing
ExtendedMapping, using some kind of registry.
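
Spelled out very naively, with plain sets of operation names standing
in for real interface objects and a made-up claims registry (a sketch
only):

StandardMapping = frozenset(['getitem', 'len', 'iter', 'contains',
                             'get', 'keys', 'values', 'items'])
MinimalMapping = frozenset(['getitem', 'contains', 'keys'])
ExtendedMapping = StandardMapping | frozenset(['copy', 'eq'])

_claims = {}  # class -> set of operations it has been declared to support

def claim(cls, interface):
  _claims.setdefault(cls, set()).update(interface)

def provides(cls, interface):
  return interface <= _claims.get(cls, set())

claim(dict, StandardMapping)
assert provides(dict, MinimalMapping)       # subsets come for free
assert not provides(dict, ExtendedMapping)  # until an explicit claim:
claim(dict, ExtendedMapping)
assert provides(dict, ExtendedMapping)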

Reiterating, I wonder if perhaps a major disconnect has to do with the
difference between "testing" for an interface vs. "casting" to an
interface. I don't like casting, since it smells like adaptation.

(There's also an extension of these ideas I'd like to explore where
full method signatures are possible in an interface, in an
introspectable way. This would probably require parameterizable types,
so that one could talk about StandardMapping[int, str] (a standard
mapping from ints to strings), and one could derive that the argument
to getitem is an int and its return value is a str, and so on; keys()
would return an Iterable of ints (or an IterableSet of ints, or
whatever), for example. But that's also for a later post.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

