[Python-Dev] PEP 246, redux

Phillip J. Eby pje at telecommunity.com
Mon Jan 10 22:38:55 CET 2005


At 07:42 PM 1/10/05 +0100, Alex Martelli wrote:

>On 2005 Jan 10, at 18:43, Phillip J. Eby wrote:
>    ...
>>At 03:42 PM 1/10/05 +0100, Alex Martelli wrote:
>>>     The fourth case above is subtle.  A break of substitutability can
>>>     occur when a subclass changes a method's signature, or restricts
>>>     the domains accepted for a method's argument ("co-variance" on
>>>     arguments types), or extends the co-domain to include return
>>>     values which the base class may never produce ("contra-variance"
>>>     on return types).  While compliance based on class inheritance
>>>     _should_ be automatic, this proposal allows an object to signal
>>>     that it is not compliant with a base class protocol.
>>
>>-1 if this introduces a performance penalty to a wide range of 
>>adaptations (i.e. those using abstract base classes), just to support 
>>people who want to create deliberate Liskov violations.  I personally 
>>don't think that we should pander to Liskov violators, especially since 
>>Guido seems to be saying that there will be some kind of interface 
>>objects available in future Pythons.
>
>If interfaces can ensure against Liskov violations in instances of their 
>subclasses, then they can follow the "case (a)" fast path, sure.
>Inheriting from an interface (in Guido's current proposal, as per his 
>Artima blog) is a serious commitment from the inheritor's part; inheriting 
>from an ordinary type, in real-world current practice, need not be -- too 
>many cases of assumed covariance, for example, are around in the wild, to 
>leave NO recourse in such cases and just assume compliance.

I understand that, sure.  But I don't understand why we should add 
complexity to PEP 246 to support not one but *two* bad practices: 1) 
implementing Liskov violations and 2) adapting to concrete classes.  It is 
only if you are doing *both* of these that this extra feature is needed.

If it were to support some kind of backward compatibility, that would be 
understandable.  However, in practice, I don't know of anybody using 
adapt(x,ConcreteClass), and even if they did, the person subclassing 
ConcreteClass will need to change their subclass to raise LiskovViolation, 
so why not just switch to delegation?

Anyway, it seems to me a bad idea to add complexity to support this 
case.  Do you have a more specific example of a situation in which a Liskov 
violation coupled to concrete class adaptation is a good idea?  Or am I 
missing something here?
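To pin down the kind of violation I mean, here's a minimal sketch (all names made up): a subclass that narrows its method's argument domain ("co-variance" on argument types), so clients written against the base class break.

```python
class Shape:
    def combine(self, other):
        # Base-class contract: accepts any Shape.
        return (self, other)

class Square(Shape):
    def combine(self, other):
        # Narrowed ("co-variant") argument domain: a deliberate Liskov
        # violation, since code written against Shape may pass any Shape.
        if not isinstance(other, Square):
            raise TypeError("Square.combine only accepts Squares")
        return (self, other)

def client(shape):
    # Written against the Shape contract.
    return shape.combine(Shape())

client(Shape())        # fine
try:
    client(Square())   # the violation surfaces here
    broke = False
except TypeError:
    broke = True
```

This is the situation a LiskovViolation signal is meant to flag; my point is that the subclass author, who must edit the class anyway to raise it, could just switch to delegation instead.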



>>I am not saying we shouldn't have a tp_conform; just suggesting that it 
>>may be appropriate for functions and modules (as well as classic classes) 
>>to have their tp_conform delegate back to self.__dict__['__conform__'] 
>>instead of a null implementation.
>
>I have not considered conformance of such objects as functions or modules; 
>if that is important,

It's used in at least Zope and PEAK; I don't know if it's in use in Twisted.


>  I need to add it to the reference implementation in the PEP.  I'm 
> reluctant to just get __conform__ from the object, though; it leads to 
> all sort of issues with a *class* conforming vs its *instances*, 
> etc.  Maybe Guido can Pronounce a little on this sub-issue...

Actually, if you look at the field-tested implementations of the old PEP 
246, they have code that deals with this issue effectively, by 
recognizing TypeError when raised by attempting to invoke __adapt__ or 
__conform__ with the wrong number of arguments or argument types.  (The 
traceback for such errors does not include a frame for the called method, 
versus a TypeError raised *within* the function, which does have such a 
frame.  AFAIK, this technique should be compatible with any Python 
implementation that has traceback objects and does signature validation in 
its "native" code rather than in a new Python frame.)
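For illustration, here's a stripped-down sketch of that frame-inspection technique (the helper name is made up; in the real implementations this logic is inlined in adapt() itself):

```python
import sys

def safe_call(func, *args):
    # A TypeError whose traceback has no callee frame (tb_next is None)
    # was raised by the call machinery itself -- e.g. a wrong argument
    # count -- so we treat the hook as "not applicable".  A TypeError
    # raised *inside* the function has a deeper frame and is a real
    # error, so it propagates.
    try:
        return func(*args)
    except TypeError:
        if sys.exc_info()[2].tb_next is not None:
            raise
        return None

def no_args():
    return "ok"

def buggy(x):
    raise TypeError("genuine bug inside the hook")

safe_call(no_args)       # "ok"
safe_call(no_args, 1)    # arity mismatch: swallowed, returns None
try:
    safe_call(buggy, 1)  # real TypeError: propagated
    propagated = False
except TypeError:
    propagated = True
```

As noted, this relies on the implementation doing signature validation in native code rather than in a new Python frame, which holds for CPython.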


>>I don't see the benefit of LiskovViolation, or of doing the exact type 
>>check vs. the loose check.  What is the use case for these?  Is it to 
>>allow subclasses to say, "Hey I'm not my superclass?"  It's also a bit 
>>confusing to say that if the routines "raise any other exceptions" 
>>they're propagated.  Are you saying that LiskovViolation is *not* propagated?
>
>Indeed I am -- I thought that was very clearly expressed!

The PEP just said that it would be raised by __conform__ or __adapt__, not 
that it would be caught by adapt() or that it would be used to control the 
behavior in that way.  Re-reading, I see that you do mention it much 
farther down.  But at the point where __conform__ and __adapt__ are 
explained, it has not been explained that adapt() should catch the error or 
do anything special with it.  It is simply implied by the "to prevent this 
default behavior" at the end of the section.  If this approach is accepted, 
the description should be made explicit, because for me at least it 
required a retroactive re-interpretation of the earlier part of the spec.


>The previous version treated TypeError specially, but I think (on the 
>basis of just playing around a bit, admittedly) that offers no real added 
>value and sometimes will hide bugs.

See http://peak.telecommunity.com/protocol_ref/node9.html for an analysis 
of the old PEP 246 TypeError behavior, and the changes made by 
PyProtocols and Zope to deal with the situation better, while still 
respecting the fact that __conform__ and __adapt__ may be retrieved from 
the wrong "meta level" of descriptor.

Your new proposal does not actually fix this problem in the absence of 
tp_conform/tp_adapt slots; it merely substitutes possible confusion at the 
metaclass/class level for confusion at the class/instance level.  The only 
way to actually fix this is to detect when you have called the wrong level, 
and that is what the PyProtocols and Zope implementations of "old PEP 246" 
do.  (PyProtocols also introduces a special descriptor for methods defined 
on metaclasses, to help avoid creating this possible confusion in the first 
place, but that is a separate issue.)
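A minimal sketch of the wrong-level problem (names made up): a __conform__ written for *instances* is also reachable through the class object, where a naive lookup mis-binds the arguments:

```python
class Conformable:
    def __conform__(self, protocol):
        # Written for instances: return self if we conform.
        return self

obj = Conformable()
assert obj.__conform__(object) is obj   # right level: a bound method

# Wrong level: adapting the *class* finds the plain function, so the
# protocol argument gets bound to 'self' and the real 'protocol'
# parameter goes unfilled -- a spurious signature TypeError.
try:
    Conformable.__conform__(object)
    confused = False
except TypeError:
    confused = True
```

Detecting that spurious TypeError (via the traceback technique above) is exactly what the PyProtocols and Zope implementations do.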


>>>     If none of the first four mechanisms worked, as a last-ditch
>>>     attempt, 'adapt' falls back to checking a registry of adapter
>>>     factories, indexed by the protocol and the type of `obj', to meet
>>>     the fifth case.  Adapter factories may be dynamically registered
>>>     and removed from that registry to provide "third party adaptation"
>>>     of objects and protocols that have no knowledge of each other, in
>>>     a way that is not invasive to either the object or the protocols.
>>
>>This should either be fleshed out to a concrete proposal, or dropped.
>>There are many details that would need to be answered, such as whether 
>>"type" includes subtypes and whether it really means type or 
>>__class__.  (Note that isinstance() now uses __class__, allowing proxy 
>>objects to lie about their class; the adaptation system should support 
>>this too, and both the Zope and PyProtocols interface systems and 
>>PyProtocols' generic functions support it.)
>
>I disagree: I think the strawman-level proposal as fleshed out in the 
>pep's reference implementation is far better than nothing.

I'm not proposing to flesh out the functionality, just the specification; 
it should not be necessary to read the reference implementation and try to 
infer intent from it.  What part is implementation accident, and what is 
supposed to be the specification?  That's all I'm talking about here.  As 
currently written, the proposal is just, "we should have a registry", and 
is not precise enough to allow someone to implement it based strictly on 
the specification.


>   I mention the issue of subtypes explicitly later, including why the pep 
> does NOT do anything special with them -- the reference implementation 
> deals with specific types.  And I use type(X) consistently, explicitly 
> mentioning in the reference implementation that old-style classes are not 
> covered.

As a practical matter, classic classes exist and are useful, and PEP 246 
implementations already exist that work with them.  Dropping that 
functionality is a major step backward for PEP 246, IMO.


>I didn't know about the "let the object lie" quirk in isinstance.  If that 
>quirk is indeed an intended design feature,

It is; it's in one of the "what's new" feature highlights for either 2.3 or 
2.4, I forget which.  It was intended to allow proxy objects (like security 
proxies in Zope 3) to pretend to be an instance of the class they are proxying.
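A minimal sketch of the quirk (names made up), using a property so the proxy reports the wrapped object's class:

```python
class Target:
    pass

class Proxy:
    # A toy proxy that "lies" about its class, in the spirit of Zope 3
    # security proxies; isinstance() consults __class__, so the lie works.
    def __init__(self, wrapped):
        self._wrapped = wrapped

    @property
    def __class__(self):
        return type(self._wrapped)

p = Proxy(Target())
assert isinstance(p, Target)   # the proxy passes the instance check
assert type(p) is Proxy        # but type() still sees the truth
```

This is why an adaptation registry keyed on type(obj) rather than obj.__class__ would not see through such proxies.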


>>One other issue: it's not possible to have standalone interoperable PEP 
>>246 implementations using a registry, unless there's a standardized place 
>>to put it, and a specification for how it gets there.  Otherwise, if 
>>someone is using both say Zope and PEAK in the same application, they 
>>would have to take care to register adaptations in both places.  This is 
>>actually a pretty minor issue since in practice both frameworks' 
>>interfaces handle adaptation, so there is no *need* for this extra 
>>registry in such cases.
>
>I'm not sure I understand this issue, so I'm sure glad it's "pretty minor".

All I was saying is that if you have two 'adapt()' implementations, each 
using its own registry, you have a possible interoperability problem.  Two 
'adapt()' implementations that conform strictly to PEP 246 *without* a 
registry are interoperable because their behavior is the same.

As a practical matter, all this means is that standalone PEP 246 
implementations for older versions of Python either shouldn't implement a 
registry, or they need to have a standard place to put it that they can 
share with each other, and it needs to be implemented the same way.  (This 
is one reason I think the registry specification should be more formal; it 
may be necessary for existing PEP 246 implementations to be 
forward-compatible with the spec as implemented in later Python versions.)


>>The issue isn't that adaptation isn't casting; why would casting a string 
>>to a file mean that you should open that filename?
>
>Because, in most contexts, "casting" object X to type Y means calling Y(X).

Ah; I had not seen that called "casting" in Python, at least not to my 
immediate recollection.  However, if that is what you mean, then why not 
say it?  :)


>Maybe we're using different definitions of "casting"?

I'm most accustomed to the C and Java definitions of casting, so that's 
probably why I can't see how it relates at all.  :)


>>If I were going to say anything about that case, I'd say that adaptation 
>>should not be "lossy"; adapting from a designator to a file loses 
>>information like what mode the file should be opened in.
>>(Similarly, I don't see adapting from float to int; if you want a cast to 
>>int, cast it.)  Or to put it another way, adaptability should imply 
>>substitutability: a string may be used as a filename, a filename may be 
>>used to designate a file.  But a filename cannot be used as a file; that 
>>makes no sense.
>
>I don't understand this "other way" -- nor, to be honest, what you "would 
>say" earlier, either.  I think it's pretty normal for adaptation to be 
>"lossy" -- to rely on some but not all of the information in the original 
>object: that's the "facade" design pattern, after all.  It doesn't mean 
>that some info in the original object is lost forever, since the original 
>object need not be altered; it just means that not ALL of the info that's 
>in the original object is used in the adapter -- and, what's wrong with that?!

I think we're using different definitions of "lossy", too.  I mean that 
defining an adaptation relationship between two types when there is more 
than one "sensible" way to get from one to the other is "lossy" of 
semantics/user choice.  If I have a file designator (such as a filename), I 
can choose how to open it.  If I adapt directly from string to file by way 
of filename, I lose this choice (it is "lossy" adaptation).

Here's a better way of phrasing it (I hope): adaptation should be 
unambiguous.  There should only be one sensible way to interpret a thing as 
implementing a particular interface, otherwise, adaptation itself has no 
meaning.  Whether an adaptation adds or subtracts behavior, it does not 
really change the underlying *intended* meaning of a thing, or else it is 
not really adaptation.  Adapting 12.0 to 12 does not change the meaning of 
the value, but adapting from 12.1 to 12 does.

Does that make more sense?  I think that some people start using adaptation 
and want to use it for all kinds of crazy things because it seems 
cool.  However, it takes a while to see that adaptation is just about 
removing unnecessary accidents-of-incompatibility; it's not a license to 
transform arbitrary things into arbitrary things.  There has to be some 
*meaning* to a particular adaptation, or the whole concept rapidly 
degenerates into an undifferentiated mess.

(Or else, you decide to "fix" it by disallowing transitive adaptation, 
which IMO is like cutting off your hand because it hurts when you punch a 
brick wall.  Stop punching brick walls (i.e. using semantic-lossy 
adaptations), and the problem goes away.  But I realize that I'm in the 
minority here with regards to this opinion.)


>For example, say that I have some immutable "record" types.  One, type 
>Person, defined in some framework X, has a huge lot of immutable data 
>fields, including firstName, middleName, lastName, and many, many 
>others.  Another, type Employee, defined in some separate framework Y 
>(that has no knowledge of X, and vice versa), has fewer data fields, and in 
>particular one called 'fullName' which is supposed to be a string such as 
>'Firstname M. Lastname'.  I would like to register an adapter factory from 
>type Person to protocol Employee.  Since we said Person has many more 
>data fields, adaptation will be "lossy" -- it will look upon Employee 
>essentially as a "facade" (a simplified-interface) for Person.

But it doesn't change the *meaning*.  I realize that "meaning" is not an 
easy concept to pin down into a nice formal definition.  I'm just saying 
that adaptation is about semantics-preserving transformations, otherwise 
you could just tack an arbitrary object on to something and call it an 
adapter.  Adapters should be about exposing an object's *existing 
semantics* in terms of a different interface, whether the interface is a 
subset or superset of the original object's interface.  However, they 
should not add or remove arbitrary semantics that are not part of the 
difference in interfaces.

For example, adding a "current position" to a string to get a StringIO is a 
difference that is reflected in the difference in interface: a StringIO 
*is* just a string of characters with a current position that can be used 
in place of slicing.

But interpreting a string as a *file* doesn't make sense because of added 
semantics that have to be "made up", and are not merely interpreting the 
string's semantics "as a" file.  I suppose you could say that this is 
"noisy" adaptation rather than "lossy".  That is, to treat a string as a 
file by using it as a filename, you have to make up things that aren't 
present in the string.  (Versus the StringIO, where there's a sensible 
interpretation of a string "as a" StringIO.)
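Concretely (using the modern io.StringIO; at the time this was the StringIO module):

```python
from io import StringIO

s = "hello world"
f = StringIO(s)

# The adapter adds only a current position; every read just exposes the
# string's own characters -- reading is slicing at the position.
assert f.read(5) == s[:5]
assert f.read() == s[5:]

f.seek(0)
assert f.read() == s   # nothing invented, nothing lost
```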

IOW, adaptation is all about "as a" relationships from concrete objects to 
abstract roles, and between abstract roles.  Although one may colloquially 
speak of using a screwdriver "as a" hammer, this is not the case in 
adaptation.  One may use a screwdriver "as a" pounder-of-nails.  The 
difference is that a hammer might also be usable "as a" 
remover-of-nails.  Therefore, there is no general "as a" relationship 
between pounder-of-nails and remover-of-nails, even though a hammer is 
usable "as" either one.  Thus, it does not make sense to say that a 
screwdriver is usable "as a" hammer, because this would imply it's also 
usable to remove nails.

This is why I don't believe it makes sense in the general case to adapt to 
concrete classes; such classes usually have many roles where they are 
usable.  I think the main difference in your position and mine is that I 
think one should adapt primarily to interfaces, and interface-to-interface 
adaptation should be reserved for non-lossy, non-noisy adapters.  Where if 
I understand the opposing position correctly, it is instead that one should 
avoid transitivity so that loss and noise do not accumulate too badly.


>So, can you please explain your objections to what I said about adapting 
>vs casting in terms of this example?  Do you think the example, or some 
>variation thereof, should go in the PEP?

I'm not sure I see how that helps.  I think it might be more useful to say 
that adaptation is not *conversion*, which is not the same thing (IME) as 
casting.  Casting in C and Java does not actually "convert" anything; it 
simply treats a value or object as if it were of a different type.  ISTM 
that bringing casting into the terminology just complicates the picture, 
because e.g. casting in Java actually corresponds to the subset of PEP 246 
adaptation for cases where adapt() returns the original object or raises an 
error.  (That is, if adapt() could only ever return the original object or 
raise an error, it would be precisely equivalent to Java casting, if I 
understand it correctly.)  Thus, at least with regard to object casting in 
Java, adaptation is a superset, and saying that it's not casting is just 
confusing.
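In other words, the Java-cast subset of adaptation is just this (a sketch; the helper name is made up):

```python
def cast_like_adapt(obj, protocol):
    # The degenerate adapt(): either the object already satisfies the
    # protocol and is returned unchanged, or the "cast" fails.  No
    # adapter object is ever manufactured.
    if isinstance(obj, protocol):
        return obj
    raise TypeError("cannot cast %r to %s" % (obj, protocol.__name__))

cast_like_adapt("abc", str)      # returns "abc" unchanged
try:
    cast_like_adapt("abc", int)  # fails, as a Java cast would
    failed = False
except TypeError:
    failed = True
```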


>>>Reference Implementation and Test Cases
>>>
>>>     The following reference implementation does not deal with classic
>>>     classes: it considers only new-style classes.  If classic classes
>>>     need to be supported, the additions should be pretty clear, though
>>>     a bit messy (x.__class__ vs type(x), getting boundmethods directly
>>>     from the object rather than from the type, and so on).
>>
>>Please base a reference implementation off of either Zope or PyProtocols' 
>>field-tested implementations which deal correctly with __class__ vs. 
>>type(), and can detect whether they're calling a __conform__ or __adapt__ 
>>at the wrong metaclass level, etc.  Then, if there is a reasonable use 
>>case for LiskovViolation and the new type checking rules that justifies 
>>adding them, let's do so.
>
>I think that if a PEP includes a reference implementation, it should be 
>self-contained rather than require some other huge package.  If you can 
>critique specific problems in the reference implementation, I'll be very 
>grateful and eager to correct them.

Sure, I've got some above (e.g. your implementation will raise a spurious 
TypeError if it calls an __adapt__ or __conform__ at the wrong metaclass 
level, and getting them from the type does *not* fix this issue, it just 
bumps it up by one metalevel).  I wasn't proposing you pull in either whole 
package, though; just adapt() itself.  Here's the existing Python one from 
PyProtocols (there's also a more low-level one using the Python/C API, but 
it's probably not appropriate for the spec):

from sys import exc_info
from types import ClassType
ClassTypes = type, ClassType

# (Defined elsewhere in PyProtocols; included here so the excerpt is
# self-contained.)
_marker = object()

class AdaptationFailure(NotImplementedError, TypeError):
     pass

def adapt(obj, protocol, default=_marker):

     """PEP 246-alike: Adapt 'obj' to 'protocol', return 'default'

     If 'default' is not supplied and no implementation is found,
     'AdaptationFailure' is raised."""

     if isinstance(protocol,ClassTypes) and isinstance(obj,protocol):
         return obj

     try:
         _conform = obj.__conform__
     except AttributeError:
         pass
     else:
         try:
             result = _conform(protocol)
             if result is not None:
                 return result
         except TypeError:
             if exc_info()[2].tb_next is not None:
                 raise
     try:
         _adapt = protocol.__adapt__
     except AttributeError:
         pass
     else:
         try:
             result = _adapt(obj)
             if result is not None:
                 return result
         except TypeError:
             if exc_info()[2].tb_next is not None:
                 raise

     if default is _marker:
         raise AdaptationFailure("Can't adapt", obj, protocol)

     return default

Obviously, some changes would need to be made to implement your newly 
proposed functionality, but this one does support classic classes, modules, 
and functions, and it has neither the TypeError-hiding problem of the 
original PEP 246 nor the TypeError-raising problem of your new version.
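For completeness, here's a made-up sketch of the two hooks that adapt() consults, showing the object-side __conform__ winning first (FileLike and Buffer are illustrative names, not part of any real framework):

```python
from io import StringIO  # the StringIO module, in the Python of the day

class FileLike:
    """A protocol object: vouches for candidates via __adapt__."""
    @classmethod
    def __adapt__(cls, obj):
        # Protocol-side adaptation: accept anything with a read() method.
        return obj if hasattr(obj, "read") else None

class Buffer:
    """An object that adapts *itself* to FileLike via __conform__."""
    def __init__(self, data):
        self.data = data

    def __conform__(self, protocol):
        if protocol is FileLike:
            return StringIO(self.data)
        return None

# With the adapt() above, adapt(Buffer("hi"), FileLike) would try
# Buffer.__conform__ first and get back this StringIO:
f = Buffer("hi").__conform__(FileLike)
assert f.read() == "hi"
```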


>>>     Transitivity of adaptation is in fact somewhat controversial, as
>>>     is the relationship (if any) between adaptation and inheritance.
>>
>>The issue is simply this: what is substitutability?  If you say that 
>>interface B is substitutable for A, and C is substitutable for B, then C 
>>*must* be substitutable for A, or we have inadequately defined 
>>"substitutability".
>
>Not necessarily, depending on the pragmatics involved.

In that case, I generally prefer to be explicit and use conversion rather 
than using adaptation.  For example, if I really mean to truncate the 
fractional part of a number, I believe it's then appropriate to use 
'int(someNumber)' and make it clear that I'm intentionally using a lossy 
conversion rather than simply treating a number "as an" integer without 
changing its meaning.


>Thanks, BTW, for your highly detailed feedback.

No problem; talking this out helps me clarify my own thoughts on these 
matters.  I haven't had much occasion to clarify these matters, and when 
they come up, it's usually in the context of arguing some specific 
inappropriate use of adaptation, so I can easily present an alternative 
that makes sense in that context.  This discussion is helping me clarify 
the general principle, since I have to try to argue the general case, not 
just N specific cases.  :)


