Subclassing str object

Ian Kelly ian.g.kelly at gmail.com
Wed Aug 31 18:16:54 EDT 2011


2011/8/31 Yaşar Arabacı <yasar11732 at gmail.com>:
> @Ian: Thanks for your comments. I indeed didn't need the _sozcuk attribute at
> all, so I deleted it. My class's addition and multiplication work without
> overriding __add__ and __mul__ because this class uses unicode's __add__
> and __mul__, then creates a new kelime instance with the return value of
> those methods in __getattribute__.

I think if you try it, you'll find that the result is an ordinary
unicode object, not a kelime instance, because __getattribute__ is
*not* invoked when Python looks up special method names on the class
object.
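A minimal sketch of that behaviour follows. It is written for Python 3,
where str stands in for Python 2's unicode, and the Kelime class is only a
guess at what the original wrapper looks like, not the poster's actual
code.

```python
class Kelime(str):
    """Hypothetical reconstruction: rewrap str results in __getattribute__."""

    def __getattribute__(self, name):
        attr = super().__getattribute__(name)
        if callable(attr):
            def wrapper(*args, **kwargs):
                result = attr(*args, **kwargs)
                # Rewrap plain str results as Kelime instances.
                if type(result) is str:
                    return Kelime(result)
                return result
            return wrapper
        return attr


a = Kelime("abc")

# Ordinary attribute access goes through __getattribute__, so it is wrapped.
explicit = a.upper()

# The + operator looks __add__ up on the type, skipping __getattribute__.
implicit = a + "def"

print(type(explicit).__name__)
print(type(implicit).__name__)
```

Calling a.__add__("def") by name is an ordinary attribute lookup and would
be wrapped; only the implicit lookup done by the operator is not.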

> I didn't get a good grasp on how using basestring there might break
> encoding. Could you explain a little more, or point me to some reading
> material?

The unicode.encode method takes a unicode object and encodes it into a
byte string (a str object).  If you then wrap that up in a kelime
object, which is a unicode subclass, it has to decode the string back
to unicode (using the default ascii codec, since it isn't specified).
Thus the result of the call is no longer an encoded byte string as
would be expected.  If you're lucky, you'll get a UnicodeDecodeError
since it's just using the ascii codec.  If you're unlucky, it will
silently return a result of the wrong type.
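The two outcomes can be sketched as below. Python 3 never decodes bytes
implicitly, so the helper wrap_like_python2_unicode is a made-up stand-in
that performs explicitly the ASCII decode that Python 2's
unicode(byte_string) does behind the scenes.

```python
def wrap_like_python2_unicode(value):
    """Hypothetical helper: mimic Python 2's implicit ASCII decode."""
    if isinstance(value, bytes):
        return value.decode("ascii")
    return value


# "Lucky" case: non-ASCII bytes make the hidden decode fail loudly.
lucky = None
try:
    wrap_like_python2_unicode("Ya\u015far".encode("utf-8"))
except UnicodeDecodeError as exc:
    lucky = exc

# "Unlucky" case: pure-ASCII bytes decode silently, so the caller gets
# text back where an encoded byte string was expected.
unlucky = wrap_like_python2_unicode("kelime".encode("utf-8"))

print(type(lucky).__name__)
print(type(unlucky).__name__)
```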

> So what I wonder is: when creating a new instance in, for example, the
> capitalize() method, does str use something like self.__new__() or
> unicode.__new__()? Because in the latter case, I could override the
> __new__ method on my class, so that every method would create an instance
> of my class instead of unicode.

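No, that doesn't work.  The built-in methods construct their result
directly as a plain unicode object; they never call your class's
__new__.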
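A quick way to check this is to count calls to an overridden __new__. The
sketch below uses Python 3's str in place of Python 2's unicode; the Kelime
class and its new_calls counter are illustrative names only.

```python
class Kelime(str):
    new_calls = 0  # illustrative counter, incremented on each __new__ call

    def __new__(cls, value=""):
        cls.new_calls += 1
        return super().__new__(cls, value)


word = Kelime("merhaba")
calls_before = Kelime.new_calls

# capitalize() builds its result without consulting Kelime.__new__.
result = word.capitalize()

print(type(result).__name__)
print(Kelime.new_calls - calls_before)
```

To get Kelime instances back, each method of interest has to be overridden
(or wrapped) so that it rewraps the result explicitly.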


