[Python-Dev] __str__ vs. __unicode__

Walter Dörwald walter at livinglogic.de
Wed Jan 19 12:19:14 CET 2005


Bob Ippolito wrote:
> On Jan 19, 2005, at 4:40, Walter Dörwald wrote:
> 
>> [...]
>> That's cheating! ;)
>>
>> My use case is an XML DOM API: __unicode__() should extract the
>> character data from the DOM. For Text nodes this is the text,
>> for comments and processing instructions this is u"" etc. To
>> reduce memory footprint and to inherit all the unicode methods,
>> it would be good if Text, Comment and ProcessingInstruction could
>> be subclasses of unicode.
> 
> It sounds like a really bad idea to have a class that supports both of 
> these properties:
> - unicode as a base class
> - non-trivial result from unicode(foo)
> 
> Do you REALLY think this should be True?!
>     isinstance(foo, unicode) and foo != unicode(foo)
> 
> Why don't you just call this "extract character data" method something 
> other than __unicode__?

IMHO __unicode__ is the most natural and logical choice.
isinstance(foo, unicode) is just an implementation detail.

But you're right: the consequences of this can be a bit scary.

> That way, you get the reduced memory footprint 
> and convenience methods of unicode, with none of the craziness.

Without this craziness we wouldn't have discovered the problem. ;)
Whether this craziness gets implemented depends on the solution
to this problem.
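
For reference, here is a minimal sketch of the combination being
debated: a string subclass whose conversion result differs from its
own value. It is written in Python 3 terms, where str stands in for
unicode and __str__ for __unicode__ (an assumption made so that the
example runs today; the thread itself is about Python 2). The class
names Text and Comment are hypothetical and not taken from any real
DOM package.

```python
# Python 3 analogue of the design under discussion.
# `str` plays the role of `unicode`, `__str__` the role of `__unicode__`.
# Text and Comment are hypothetical DOM node classes.


class Text(str):
    """A text node: its character data is simply its own content."""


class Comment(str):
    """A comment node: stores its content, but has no character data."""

    def __str__(self):
        # Comments contribute nothing to the document's character data.
        return ""


text = Text("hello")
comment = Comment("a remark")

# Both inherit the string methods from the base class.
print(text.upper())
print(comment.upper())

# The "scary" consequence: an instance of the string type whose
# conversion to that same type yields a different value.
print(isinstance(comment, str) and comment != str(comment))
```

The last line is the expression questioned above, and with these
classes it evaluates to True. Giving the extraction method a different
name would keep the inherited methods while avoiding that result.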

Bye,
    Walter Dörwald

