[Tutor] class decorator question

Sun Oct 6 18:28:52 CEST 2013

----- Original Message -----

> From: Steven D'Aprano <steve at pearwood.info>
> To: tutor at python.org
> Cc: 
> Sent: Sunday, October 6, 2013 4:52 AM
> Subject: Re: [Tutor] class decorator question
> 
> On Sat, Oct 05, 2013 at 12:26:14PM -0700, Albert-Jan Roskam wrote:
> 
>>  >> On http://lucumr.pocoo.org/2013/5/21/porting-to-python-3-redux/ I saw
>>  >> a very cool and useful example of a class decorator. It (re)implements
>>  >> __str__ and __unicode__ in case Python 2 is used. For Python 3, the
>>  >> decorator does nothing. I wanted to generalize this decorator so the
>>  >> __str__ method under Python 2 encodes the string to an arbitrary
>>  >> encoding. This is what I've created: http://pastebin.com/vghD1bVJ.
>>  >> 
>>  >> It works, but the code is not very easy to understand, I am afraid.
>>  >
>>  >It's easy to understand, it's just doing it the wrong way. It creates
>>  >a subclass of your class, which it shouldn't do.
>> 
>>  Why not? Because it's an unusual coding pattern? Or is it inefficient?
> 
> It is both of those things. (Well, the inefficiency is minor.) My 
> main objection is that it is inelegant, like using a screwdriver as 
> a chisel instead of using a chisel -- even when it's "good enough",
> it's not something you want other people to see you doing if you 
> care about looking like a craftsman :-)

Or like using a shoe to hammer a nail into the wall... ;-)

> Another issue is to do with naming. In your example, you decorate Test. 
> What that means in practice is that you create a new class, Klass(Test), 
> throw away Test, and bind Klass to the top-level name Test. So in effect 
> you're doing this:
> 
> class Test: pass  # The undecorated version.
> 
> class Klass(Test): pass  # Subclass it inside the decorator.
> 
> Test = Klass  # throw away the original and re-use the variable name.
> 
> But classes, like functions, have *two* names. They have the name they 
> are bound to, the variable name (*usually* one of these, but sometimes 
> zero or two or more). And they have their own internal name:
> 
> Test.__name__
> => returns "Klass"
> 
> 
> This will make debugging unnecessarily confusing. If you use your 
> decorator three times:
> 
> @implements_to_string
> class Spam: pass
> 
> @implements_to_string
> class Eggs: pass
> 
> @implements_to_string
> class Cheese: pass
> 
> 
> instances of all three of Spam, Eggs and Cheese will claim to be 
> instances of "Klass".

That would indeed be *very* confusing. 
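
A minimal sketch of the naming problem described above. The decorator here is only a hypothetical stand-in for one that returns a subclass (it is not the pastebin code), and the names `subclassing_decorator` and `Klass` are made up for illustration.

```python
def subclassing_decorator(cls):
    """Hypothetical stand-in for a decorator that returns a subclass."""
    class Klass(cls):
        pass
    return Klass


@subclassing_decorator
class Spam(object):
    pass


@subclassing_decorator
class Eggs(object):
    pass


# The variable names differ, but both classes carry the same internal name.
names = [Spam.__name__, Eggs.__name__]
print(names)
print(type(Spam()).__name__)
```

Both classes report the internal name of the subclass created inside the decorator, which is what makes tracebacks and debugging output hard to read.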

> Now there is a simple work-around for this: inside the decorator, call
> 
> Klass.__name__ = cls.__name__ 
> 
> before returning. But that leads to another issue, where instances of 
> the parent, undecorated, class (if any!) and instances of the child, 
> decorated, class both claim to be from the same "Test" class. This is 
> more of a theoretical concern, since you're unlikely to be instantiating 
> the undecorated parent class.
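
The work-around and its side effect can be sketched roughly as follows. Again, `subclassing_decorator` is a hypothetical stand-in, and `Undecorated` is an extra name introduced here only so the parent class stays reachable for the comparison.

```python
def subclassing_decorator(cls):
    """Hypothetical decorator that subclasses, then copies the name over."""
    class Klass(cls):
        pass
    Klass.__name__ = cls.__name__  # the work-around from the text above
    return Klass


class Test(object):
    pass


Undecorated = Test                   # keep a reference to the parent class
Test = subclassing_decorator(Test)   # this is what the @ syntax does

same_name = Undecorated.__name__ == Test.__name__
different_class = Undecorated is not Test
print(same_name, different_class)
```

Two distinct classes now share one `__name__`, so instances of either would claim to be a "Test".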
> 
> 
>>  I subclassed because I needed the encoding value in the decorator.
>>  But subclassing may indeed have been overkill.
> 
> Yes :-)
> 
> The encoding value isn't actually defined until long after the decorator 
> has finished doing its work, after the class is decorated, and an 
> instance is defined. So there is no encoding value used in the decorator 
> itself. The decorator can trivially refer to the encoding value, so long 
> as that doesn't actually get executed until after an instance is 
> created:
> 
> def decorate(cls):
>     def spam(self):
>         print(self.encoding)
>     cls.spam = spam
>     return cls
> 
> works fine without subclassing.

Waah, why didn't I think of this? I've been making this way more complicated than needed. self.__dict__["encoding"] = self.encoding (see also below) was another way I considered to pass the encoding value from the class to its decorator. I even considered making a class decorator with arguments. All unnecessary. 
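
As a small illustration of the point, here is a sketch in which the injected method reads `self.encoding` only when it is called. The method name `encoded_text` and the sample string are assumptions made up for this example; the `encoding` property is a placeholder for the real lookup, which according to the thread reads the value from a file.

```python
def decorate(cls):
    """Inject a method that looks up self.encoding only when it is called."""
    def encoded_text(self):
        # self.encoding is resolved here, per instance, at call time.
        return u"caf\u00e9".encode(self.encoding)
    cls.encoded_text = encoded_text
    return cls


@decorate
class Test(object):
    @property
    def encoding(self):
        # Placeholder value; the real class would read this from a file.
        return "utf-8"


result = Test().encoded_text()
print(repr(result))
print(Test.__name__)
```

No subclass is created, so the class keeps its own name, and nothing has to be copied into `__dict__` by hand.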

> 
>>  >Here's a better 
>>  >approach: inject the appropriate methods into the class directly. Here's
>>  >a version for Python 3:
> [...]
>>  >This avoids overwriting __str__ if it is already defined, and likewise 
>>  >for __bytes__.
>> 
>>  Doesn't a class always have __str__ implementation?
> 
> No. Where is the __str__ implementation here?
> 
> class X:
>     pass
> 
> This class defines no methods at all. Its *superclass*, object in Python 
> 3, defines methods such as __str__. But you'll notice that I didn't call 
> 
> 
>     hasattr(cls, '__str__') 
> 
> since that will return True, due to object having a __str__ method. I 
> called
> 
>     '__str__' in cls.__dict__
> 
> which only returns True if cls explicitly defines a __str__ method.

Aaah, yes, of course these are not the same: 'method_name' in cls.__dict__ tests whether method_name is *implemented* in that class itself. In many/most cases hasattr is all you need, because you only want to know whether method_name can be *called* on that class.
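
The difference between the two checks can be seen with two throwaway classes (`X` as in the quoted example, plus a hypothetical `Y` that does define `__str__`):

```python
class X(object):
    pass


class Y(object):
    def __str__(self):
        return "Y instance"


inherited = hasattr(X, "__str__")        # found via object
defined_by_x = "__str__" in X.__dict__   # X itself defines nothing
defined_by_y = "__str__" in Y.__dict__   # Y defines its own __str__
print(inherited, defined_by_x, defined_by_y)
```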

>>  Nice, thanks Steven. I made a couple of versions after reading your 
>>  advice. The main change is that I still had to somehow retrieve the 
>>  encoding value from the class to be decorated (decoratee?). I simply 
>>  stored it in __dict__. Here is the second version that I created: 
>>  http://pastebin.com/te3Ap50C. I tested it in Python 2 and 3. 
> 
> Not sufficiently :-) Your test class has problems. See below.
> 
> 
> 
>>  The Test 
>>  class contains __str__ and __unicode__ which are renamed and redefined 
>>  by the decorator if Python 3 (or 4, or..) is used.
>> 
>> 
>>  General question: I am using pastebin now. Is that okay, given that 
>>  this is not part of the "memory" of the Python Tutor archive? It might
>>  be annoying if people search the archives and get 404s if they try to 
>>  follow these links. Just in case I am also pasting the code below:
> 
> In my opinion, no it's not okay, particularly if your code is short 
> enough to be posted here.
> 
> Just because a person has access to this mailing list doesn't 
> necessarily mean they have access to pastebin. It might be blocked. The 
> site might be down. They might object to websites that require 
> Javascript (pastebin doesn't *require* it, but it's only a matter of 
> time...). Or they may simply be too busy/lazy to follow the link.

It's also easy to do both. I always hope code in mails does not get mangled (even if it's plain text). The colour coding of pastebin and similar sites helps other readers understand code more easily. And I agree posting long code is a no-no. 

>>  from __future__ import print_function
>>  import sys
>>      
>>  def decorate(cls):
>>      print("decorate called")
>>      if sys.version_info[0] > 2:
>>          cls.__dict__["__str__"].__name__ = '__bytes__'
>>          cls.__dict__["__unicode__"].__name__ = '__str__'
>>          cls.__bytes__ = cls.__dict__["__str__"]
>>          cls.__str__ = cls.__dict__["__unicode__"]  
>>      return cls
> 
> I thought your aim was to write something that was cross-version and 
> that added default __str__ and __unicode__ methods to the class if they 
> didn't already exist? [looks back at the original code...] Ah no, my 
> mistake, I misunderstood.
> 
> The above requires the caller to write their classes using the Python 2 
> style __str__ and __unicode__ methods. __unicode__ isn't even mandatory 
> in Python 2, but your decorate won't work without it!
> 
> As given, your decorator:
> - does nothing in Python 2, even if the caller didn't define __str__ 
>   or __unicode__ methods;

I *know* that I defined three classes that each contain __str__ and __unicode__, so is it still a good idea to test for their existence?
So a meta question: How generally applicable should code, in this case a 
decorator, be? Should one always strive for code that could readily be 
re-used in other places? It is cool (and efficient, and intellectually 
gratifying) if code can be re-used, but isn't a downside that the code 
is more sophisticated/longer than required for a given context? At what point does refined code turn into "bloated software"? 
http://c2.com/cgi/wiki?PrematureGeneralization . 

> - fails in Python 3 if the class doesn't define a  __unicode__ method;
> 
> - does the wrong thing in Python 3 if the class already has correctly 
>   working __str__ and __bytes__ methods;
> 
> - doesn't help you if you have a Python 3 style class and want to use
>   it in Python 2;

A Python 3 style class is a class that inherits from object, right (class Foo(object):...)? I indeed had not considered the possibility that the decorator might fail when used for old-style classes. 

> - doesn't work well if the decorated class inherits its __str__ and 
>   __unicode__ methods from a parent class.
> 
> 
> Admittedly, that last one is tricky, thanks to everything inheriting 
> from object.
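
One of the failure modes listed above can be reproduced with a short sketch. The decorator body repeats the logic posted in this thread (without the print call); `Py3Style` is a hypothetical class written directly for Python 3, with no `__unicode__` method.

```python
import sys


def decorate(cls):
    # Same renaming logic as the decorator posted in this thread.
    if sys.version_info[0] > 2:
        cls.__dict__["__str__"].__name__ = "__bytes__"
        cls.__dict__["__unicode__"].__name__ = "__str__"
        cls.__bytes__ = cls.__dict__["__str__"]
        cls.__str__ = cls.__dict__["__unicode__"]
    return cls


class Py3Style(object):
    """Hypothetical class that already follows the Python 3 conventions."""
    def __str__(self):
        return "text"

    def __bytes__(self):
        return b"text"


try:
    decorate(Py3Style)
    outcome = "decorated"
except KeyError as exc:
    outcome = "KeyError: %s" % exc

print(outcome)
print(Py3Style.__dict__["__str__"].__name__)
```

Under Python 3 the decorator stops with a KeyError on the missing `__unicode__` entry, and by then it has already renamed the working `__str__` function, so the class is left half modified. Under Python 2 the decorator does nothing at all.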
> 
> 
>>  @decorate
>>  class Test(object):
>> 
>>      def __init__(self):
>>          self.__dict__["encoding"] = self.encoding
> 
> Why are you doing that? What is the outcome you are hoping for, and why 
> do you think it is necessary?

See also above. I should have deleted that.

>>      def __str__(self):
>>          return "str called".encode(self.encoding)
>> 
>>      def __unicode__(self):
>>          return "unicode called"
> 
> These are wrong! Worse, you have multiple errors that cancel each 
> other out -- sometimes, two wrongs do make a right.

aargh, of course. I should have done (methinks):

    def __str__(self):
        return self.__unicode__().encode(self.encoding)

    def __unicode__(self):
        return u"unicode called"

> In Python 2: calling encode on a byte-string is permitted, but is the 
> wrong thing to do. By accident, it (usually?) works, but you shouldn't 
> do it. So there's your first wrong.
> 
> When converted to Python 3, the __str__ method becomes __bytes__, and is 
> supposed to return bytes. Now the "str called" literal is Unicode, and 
> encode will work, returning bytes. But it only works because of the 
> first wrong -- if you re-write __str__ to use b"str called", or to call 
> "str called".decode, your Python 3 __bytes__ method will fail.
> 
> In Python 2, __unicode__ ought to return a unicode string, u"unicode 
> called". By accident, if you return a byte string, Python will decode it 
> using ASCII, and it seems to work. But it's still wrong, and it's 
> particularly likely to go wrong if the __unicode__ method does any, 
> well, Unicode stuff. 
> 
> When converted to __str__ by the decorator, the ex-__unicode__ method 
> will work, but only because you used a (Python2) byte-string literal 
> "..." inside it. If you wrote a u"Unicode string", it would fail in 
> Python 3.1 or 3.2 (but work in 3.3 and better).
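
The Python 3 half of this explanation can be checked with a few lines. This is only a sketch: it uses the u"..." prefix, so it assumes Python 2 or Python 3.3 and later, and the variable names are made up for the example.

```python
text = u"str called"
data = text.encode("utf-8")   # text -> bytes is the supported direction

# On Python 3 both checks are False: bytes has no encode() and str has
# no decode(). On Python 2 both are True, which is why the mistakes in
# the Test class go unnoticed there.
bytes_can_encode = hasattr(data, "encode")
text_can_decode = hasattr(text, "decode")
round_trip_ok = data.decode("utf-8") == text

print(bytes_can_encode, text_can_decode, round_trip_ok)
```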
> 
> 
>>      @property
>>      def encoding(self):
>>          """In reality this method extracts the encoding from a file"""
>>          return "utf-8" # rot13 no longer exists in Python3
> 
> Why would you do that?
> 
> Why not just supply the encoding when you initialise the instance?

Counter question: why would I ask the caller for information if that information can automatically be retrieved?

>     def __init__(self, encoding):
>         self.encoding = encoding
> 
> 
>>  if __name__ == "__main__":
>>      t = Test()
>>      if sys.version_info[0] == 2:
>>          print(unicode(t))
>>      print(str(t))
> 
> This is insufficient testing. In Python 2, you need to test both 
> unicode(t) and str(t). In Python 3, you need to test both str(t) and 
> bytes(t).
> 
> It may turn out that, by accident, all four tests work for the given 
> Test class. But that's not going to apply to everything.
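
A rough sketch of a check that exercises both conversions on either version. Everything here is hypothetical scaffolding rather than code from the thread: `PY2`, `text_type`, `check_conversions` and the hand-written `Sample` class, which defines its methods per version without any decorator.

```python
import sys

PY2 = sys.version_info[0] == 2
text_type = unicode if PY2 else str  # 'unicode' is only evaluated on Python 2


def check_conversions(obj):
    """Run the text and the bytes conversion and verify the result types."""
    text = text_type(obj)   # unicode(obj) on Python 2, str(obj) on Python 3
    data = bytes(obj)       # str(obj) on Python 2, bytes(obj) on Python 3
    if not isinstance(text, text_type):
        raise TypeError("text conversion returned %r" % type(text))
    if not isinstance(data, bytes):
        raise TypeError("bytes conversion returned %r" % type(data))
    return text, data


class Sample(object):
    """Hypothetical class with methods written by hand for both versions."""
    encoding = "utf-8"

    def _text(self):
        return u"unicode called"

    def _data(self):
        return self._text().encode(self.encoding)

    if PY2:
        __unicode__, __str__ = _text, _data
    else:
        __str__, __bytes__ = _text, _data


text, data = check_conversions(Sample())
print(repr(text), repr(data))
```

Running the same helper against a decorated class on both interpreters would cover the four cases mentioned above.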
> 
> 
> 
> 
> -- 
> Steven