error messages containing unicode

Steven D'Aprano steve at REMOVEME.cybersource.com.au
Wed Jan 31 01:39:56 EST 2007


On Tue, 30 Jan 2007 04:34:24 -0800, Jim wrote:

> Thank you for the reply.  It happens that, as I understand it, none of 
> the options that you mentioned is a solution for my situation.
> 
> On Jan 29, 9:48 pm, Steven D'Aprano <s... at REMOVEME.cybersource.com.au> 
> wrote:
>> The easiest ways to fix that are:
>>
>> (1) subclass an exception that already knows about Unicode;
>
> But I often raise one of Python's built-in errors.  And also, is it 
> really true that subclassing one of Python's built-ins gives me 
> something that is unicode deficient?  I assumed that I had missed 
> something (because that's happened so many times before :-) ).

If the built-in isn't Unicode aware, subclassing it won't magically make
it so :-)


> For instance, I write a lot of CGI and I want to wrap everything in a 
> try .. except.
>   try:
>       main()
>   except Exception, err:
>       print "Terrible blunder: ",str(err)
> so that the err can be one of my exceptions, or can be one that came 
> with Python.
> (And, that I can see, err.args can be either the relevant 
> string or a tuple containing the relevant string and the documentation 
> is silent on whether in the built-in exceptions if err.args is a tuple 
> then the string is guaranteed to be first in the tuple.)

Does it matter? Just print the tuple.
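
A minimal sketch of that idea, assuming a hypothetical helper named
describe() that is not part of the original exchange. It reports err.args
with repr(), so the handler never has to guess where the message sits in
the tuple. The IOError case shows why guessing is risky: there the first
item is an error number, not a string.

```python
def describe(err):
    """Summarise an exception without assuming the layout of err.args."""
    # repr() of the whole tuple is safe whatever the tuple contains.
    return "%s: %r" % (type(err).__name__, err.args)


try:
    int('spam')
except ValueError as err:
    value_summary = describe(err)

# For I/O errors the first item is the errno, the second is the message.
io_args = IOError(2, 'No such file or directory').args

print(value_summary)
print(io_args)
```

The "except ... as err" spelling is used here because it runs on Python 2.6
and later as well as Python 3; the comma form quoted above is Python 2 only.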


>> (2) convert the file name to ASCII before you store it; or
>
> I need the non-ascii information, though, which is why I included it 
> in the error message.

If you have the exception captured in "err", then you can grab it with
err.where_i_put_the_unicode.
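
As a rough illustration, assuming a hypothetical exception class
FileProblem and a hypothetical attribute unicode_filename (both names are
invented for this sketch), the original text can ride along on the
exception while the message handed to the base class stays plain ASCII:

```python
class FileProblem(Exception):
    """Hypothetical exception that keeps the original Unicode file name."""

    def __init__(self, filename):
        safe = filename.encode('ascii', 'replace')
        if not isinstance(safe, str):
            # Python 3: encode() returns bytes, so turn it back into text.
            safe = safe.decode('ascii')
        Exception.__init__(self, safe)
        self.unicode_filename = filename  # the untouched original


def main():
    raise FileProblem(u'r\xe9sum\xe9.txt')


try:
    main()
except Exception as err:
    message = str(err)
    # Built-in exceptions will not have the attribute, hence getattr().
    recovered = getattr(err, 'unicode_filename', None)

print(message)
```

Using getattr() with a default lets the same handler cope with exceptions
that came with Python and carry no such attribute.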


>> (3) add a __str__ method to your exception that is Unicode aware.
>
> I have two difficulties with this: (1) as above I often raise Python's
> built-in exceptions and for those __str__() is what it is, and 

Then don't use the built-in exception. If it won't do what you want it to
do, use something else.


> (2) this
> goes against the meaning of __str__() that I find in the documentation
> in ref/customization.html which says that the return value must be a
> string object.

I didn't mean return a unicode object :) 

You're absolutely correct. Your __str__ would need to return a string
object, which means encoding the Unicode correctly to get a string object
without raising an exception.

e.g. something like this maybe (untested, not thought-through, probably
won't work correctly, blah blah blah):

def __str__(self):
    # self.args is a tuple, so encode its first item, not the tuple itself
    s = self.args[0].encode('ascii', 'replace')
    return "Unicode error converted to plain ASCII:\n" + s

or whatever encoding scheme works for your application.
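
A slightly fuller sketch of the same approach, assuming a hypothetical
class name UnicodeAwareError and attribute unicode_message. It is written
so that it runs on both Python 2 and Python 3; on Python 3 the encoded
bytes are decoded again, because __str__ must return text there.

```python
class UnicodeAwareError(Exception):
    """Hypothetical exception whose __str__ never raises on non-ASCII text."""

    def __init__(self, message):
        Exception.__init__(self, message)
        self.unicode_message = message

    def __str__(self):
        # Characters outside ASCII are replaced with '?'.
        encoded = self.unicode_message.encode('ascii', 'replace')
        if not isinstance(encoded, str):
            # Python 3: encode() gave bytes; __str__ must return str.
            encoded = encoded.decode('ascii')
        return "Unicode error converted to plain ASCII:\n" + encoded


rendered = str(UnicodeAwareError(u'cannot open caf\xe9.txt'))
print(rendered)
```

The 'replace' handler loses information, which is why the original text is
also kept on the instance.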

[snip]

>> >>> class MyBetterException(Exception):
>> ...     def __init__(self, arg):
>> ...         self.args = arg.encode('ascii', 'replace')
>> ...         self.unicode_arg = arg  # save the original in case
>
> This is illuminating.  How do you know that for exceptions __init__()
> should take one non-self argument?  I missed finding this information.

It can take whatever you want it to take:

class MyStupidException(Exception):
    def __init__(self, dayofweek, breakfast="spam and baked beans",
        *everythingelse):
        self.day = dayofweek
        self.breakfast = breakfast
        self.args = everythingelse
    def __str__(self):
        s = "On %s I ate %s and then an error '%s' occurred." % \
        (self.day.title(), self.breakfast, self.args)
        return s


>>> raise MyStupidException('monday', 'cheese', 'bad things', 'happened', 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
__main__.MyStupidException: On Monday I ate cheese and then an error
'('bad things', 'happened', 2)' occurred.




-- 
Steven D'Aprano 



