unicode(obj, errors='foo') raises TypeError - bug?

Steven Bethard steven.bethard at gmail.com
Wed Feb 23 02:28:09 EST 2005


Mike Brown wrote:
> >>> class C:
> ...   def __str__(self):
> ...      return 'asdf\xff'
> ...
> >>> o = C()
> >>> unicode(o, errors='replace')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: coercing to Unicode: need string or buffer, instance found
>
[snip]
> 
> What am I doing wrong? Is this a bug in Python?

No, this is documented behavior[1]:

"""
unicode([object[, encoding [, errors]]])
     ...
     For objects which provide a __unicode__() method, it will call this 
method without arguments to create a Unicode string. For all other 
objects, the 8-bit string version or representation is requested and 
then converted to a Unicode string using the codec for the default 
encoding in 'strict' mode.
"""

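Note that the documentation basically says that unicode() will call 
str() on your object and then decode the result in 'strict' mode.  The 
errors argument is only accepted when the first argument is already a 
string or buffer, which is why you get the TypeError.  You should 
either define __unicode__ or call str() on the object yourself and 
pass the result to unicode().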
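
As a rough illustration of the two workarounds, here is a minimal sketch. 
It is written for Python 3, not the Python 2 interpreter used in this 
thread, on the assumption that str(obj, errors=...) there has the same 
restriction as unicode(obj, errors=...) here. The __bytes__ method is my 
stand-in for the 8-bit __str__ in the original class, and the variable 
names are made up for the example. In Python 2 the equivalent of the 
first workaround would be unicode(str(o), errors='replace').

```python
class C:
    """Python 3 stand-in for the class in the original post.

    Assumption: __bytes__ plays the role that an 8-bit __str__ played
    in Python 2, and __str__ plays the role of __unicode__.
    """

    def __bytes__(self):
        # Same 8-bit data as in the original example
        return b'asdf\xff'

    def __str__(self):
        # Counterpart of defining __unicode__: the object decodes itself
        return bytes(self).decode('ascii', errors='replace')


o = C()

# Passing errors= makes str() insist on a bytes-like object,
# so handing it an arbitrary instance raises TypeError.
try:
    str(o, errors='replace')
    raised = False
except TypeError:
    raised = True

# Workaround 1: convert to an 8-bit string first, then decode it
# with the error handler you want.
via_bytes = str(bytes(o), encoding='ascii', errors='replace')

# Workaround 2: let the object's own text conversion do the decoding.
via_method = str(o)
```

Either way the undecodable byte ends up as U+FFFD (the replacement 
character) instead of raising an error.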

STeVe

[1] http://docs.python.org/lib/built-in-funcs.html


