[Python-Dev] unicode and __str__

Neil Schemenauer nas at arctrix.com
Mon Aug 30 22:18:16 CEST 2004


With Python 2.4:

    >>> u = u'\N{WHITE SMILING FACE}'
    >>> class A:
    ...   def __str__(self):
    ...     return u
    ... 
    >>> class B:
    ...   def __unicode__(self):
    ...     return u
    ... 
    >>> u'%s' % A()
    u'\u263a'
    >>> u'%s' % B()
    u'\u263a'

With Python 2.3:

    >>> u'%s' % A()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u263a' in position 0: ordinal not in range(128)
    >>> u'%s' % B()
    u'<__main__.B instance at 0x401f910c>'

The only thing I found in the NEWS file that seemed relevant is
this note:

  u'%s' % obj will now try obj.__unicode__() first and fallback to
  obj.__str__() if no __unicode__ method can be found.

I don't think that describes the behavior difference.  Allowing
__str__ to return unicode strings seems like a pretty noteworthy
change (assuming that's what actually happened).
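
For reference, here is a rough sketch of the lookup order as the NEWS
note words it, written as plain Python.  It is only an illustration
under my reading of the note, not the interpreter's actual code, and
the helper name text_for_percent_s is made up.  It covers only which
method gets tried first; it says nothing about whether __str__ is
allowed to return unicode, which is the part the note leaves out.

```python
# Hypothetical helper: emulates the lookup order described in the NEWS
# note.  This is an assumption-based sketch, not CPython's implementation.
def text_for_percent_s(obj):
    # Try __unicode__ first, as the note says.
    to_unicode = getattr(obj, '__unicode__', None)
    if to_unicode is not None:
        return to_unicode()
    # Fall back to __str__ when no __unicode__ method can be found.
    return obj.__str__()


class A(object):
    def __str__(self):
        return u'\u263a'


class B(object):
    def __unicode__(self):
        return u'\u263a'


result_a = text_for_percent_s(A())  # reached via the __str__ fallback
result_b = text_for_percent_s(B())  # reached via __unicode__
```

Under this reading both A and B end up producing the same text, which
matches the 2.4 session above but does not explain why 2.3 raised
UnicodeEncodeError for A.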

Also, I'm a little unclear on the purpose of the __unicode__ method.
If __str__ can return unicode, why would I want to provide a
__unicode__ method as well?  Perhaps it is meant for objects that
can return either a unicode or a str representation depending on
what the caller prefers.  I have a hard time imagining a use for
that.

  Neil

