Unicode strings as arguments to exceptions

Terry Reedy tjreedy at udel.edu
Thu Jan 16 18:48:00 EST 2014


On 1/16/2014 9:16 AM, Steven D'Aprano wrote:
> On Thu, 16 Jan 2014 13:34:08 +0100, Ernest Adrogué wrote:
>
>> Hi,
>>
>> There seems to be some inconsistency in the way exceptions handle
>> Unicode strings.
>
> Yes. I believe the problem lies in the __str__ method. For example,
> KeyError manages to handle Unicode, although in an ugly way:
>
> py> str(KeyError(u'ä'))
> "u'\\xe4'"
>
> Hence:
>
> py> raise KeyError(u'ä')
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> KeyError: u'\xe4'
>
>
> While ValueError assumes ASCII and fails:
>
> py> str(ValueError(u'ä'))
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in
> position 0: ordinal not in range(128)
>
> When displaying the traceback, the error is suppressed, hence:
>
> py> raise ValueError(u'ä')
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> ValueError
>
> I believe this might be accepted as a bug report on ValueError.

Or the change might be rejected, either as a feature change or as a
bugfix that might break existing code. We do change exception messages
in new versions, but we do not normally do so in bugfix releases.

http://bugs.python.org/issue1012952 is related but different. The issue 
there was that unicode(ValueError(u'ä')) gave the same 
UnicodeEncodeError as str(ValueError(u'ä')). That was fixed by giving 
exceptions a __unicode__ method, but that did not fix the traceback 
display issue above.
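
As a rough illustration of the __str__ versus __unicode__ split, here is
a minimal sketch of a subclass whose str() escapes non-ASCII text instead
of raising. SafeValueError, PY2 and text_type are names I made up for
this example; nothing like this is in the stdlib, and it is not what the
tracker issue implemented. It assumes the first argument is text (or
something that converts to text cleanly).

```python
# -*- coding: utf-8 -*-
import sys

PY2 = sys.version_info[0] == 2
text_type = type(u"")  # unicode on Python 2, str on Python 3


class SafeValueError(ValueError):
    """Hypothetical subclass; not part of the standard library."""

    def __unicode__(self):
        # Return the first argument as text, without touching the
        # ASCII codec. On Python 3 this is just an ordinary method.
        arg = self.args[0] if self.args else u""
        return arg if isinstance(arg, text_type) else text_type(arg)

    def __str__(self):
        text = self.__unicode__()
        if PY2:
            # Python 2 str() must return bytes, so escape anything
            # non-ASCII rather than raise UnicodeEncodeError.
            return text.encode("ascii", "backslashreplace")
        return text
```

With this, str(SafeValueError(u'ä')) returns an escaped byte string on
Python 2 rather than failing, so the traceback display has something to
show. Whether escaping is the right choice is a separate question.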

http://bugs.python.org/issue6108
"unicode(exception) and str(exception) should return the same message"
also seems related. That issue raised the question of what str should
do if the unicode message has non-ASCII characters. I did not read
enough to find an answer. The same question would arise here.
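
To make the choices concrete, here is a small sketch of what a str()
that must return bytes could do with a non-ASCII message. The function
name encode_options is made up for this example, and I am not claiming
any of these is what the tracker discussion settled on; it only lists
the obvious candidates using the standard codec error handlers.

```python
# -*- coding: utf-8 -*-

def encode_options(message):
    """Map each candidate strategy to the bytes it would produce.

    A value of None means the strategy raised UnicodeEncodeError.
    """
    results = {}
    for errors in ("strict", "replace", "backslashreplace"):
        try:
            results[errors] = message.encode("ascii", errors)
        except UnicodeEncodeError:
            results[errors] = None
    # Alternative: pick a fixed non-ASCII encoding instead of ASCII.
    results["utf-8"] = message.encode("utf-8")
    return results


options = encode_options(u"\xe4")  # the u'ä' from the examples above
```

Here "strict" is the current failing behavior, "replace" loses the
character, "backslashreplace" keeps it in escaped form, and "utf-8"
assumes the output device can decode UTF-8.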

-- 
Terry Jan Reedy




