[Python-Dev] __str__ vs. __unicode__

Bob Ippolito bob at redivi.com
Wed Jan 19 11:10:36 CET 2005


On Jan 19, 2005, at 4:40, Walter Dörwald wrote:

> M.-A. Lemburg wrote:
>
>> Walter Dörwald wrote:
>>> __str__ and __unicode__ seem to behave differently. A __str__
>>> override in a str subclass is used when calling str(), a __unicode__
>>> override in a unicode subclass is *not* used when calling unicode():
>>>
>>> [...]
>> If you drop the base class for unicode, this already works.
>
> That's cheating! ;)
>
> My use case is an XML DOM API: __unicode__() should extract the
> character data from the DOM. For Text nodes this is the text,
> for comments and processing instructions this is u"" etc. To
> reduce memory footprint and to inherit all the unicode methods,
> it would be good if Text, Comment and ProcessingInstruction could
> be subclasses of unicode.

It sounds like a really bad idea to have a class that supports both of 
these properties:
- unicode as a base class
- non-trivial result from unicode(foo)

Do you REALLY think this should be True?!
     isinstance(foo, unicode) and foo != unicode(foo)

Why don't you just call this "extract character data" method something 
other than __unicode__?  That way, you get the reduced memory footprint 
and convenience methods of unicode, with none of the craziness.
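
A rough sketch of that suggestion follows. The method name chardata() is
made up for illustration and does not come from any real DOM API, and the
text_type shim is only there so the sketch also runs on interpreters
without a separate unicode type:

```python
# Shim: use unicode where it exists, otherwise fall back to str.
try:
    text_type = unicode
except NameError:
    text_type = str


class Text(text_type):
    """Text node: the node itself is its character data."""

    def chardata(self):  # hypothetical name, chosen for this sketch
        return text_type(self)


class Comment(text_type):
    """Comment node: stores the comment text, contributes no character data."""

    def chardata(self):  # hypothetical name, chosen for this sketch
        return text_type()


t = Text("hello")
c = Comment("a note")

# Both nodes still behave like ordinary unicode objects ...
assert isinstance(c, text_type) and c == text_type(c)
# ... while the DOM-specific extraction lives in its own method.
assert t.chardata() == "hello"
assert c.chardata() == ""
```

With this layout, isinstance(foo, unicode) and foo == unicode(foo) hold
together, and the "character data" semantics stay out of the conversion
protocol.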

-bob


