error messages containing unicode

Jim jhefferon at smcvt.edu
Wed Jan 31 07:57:18 EST 2007


Thanks Steve, I appreciate your patience.

On Jan 31, 1:39 am, Steven D'Aprano <s... at REMOVEME.cybersource.com.au>
wrote:
> If the built-in isn't Unicode aware, subclassing it won't magically make
> it so :-)
Oh, I agree.  If I have a string mesg that is unicode-not-ascii and I
say
  try:
      raise Exception, mesg
  except Exception, err:
      print "Trouble"+mesg
then I have problems.  I am, however, under the impression, perhaps
mistaken, that the built-in exceptions in the library return only
ASCII as error strings.  (I take as evidence for my understanding
that the built-in exceptions have a __str__() method but no
explicit __unicode__(), and so rely on a unicode(err) call being
passed on to __str__().)  But as I've said above, I've been wrong so
many times before.  ;-)

My main point about the built-ins is that I want to catch them along
with my own exceptions.  That's what I meant by the next paragraph.
My class myException is a subclass of Exception so I can catch my
stuff and the standard stuff with an all-in-one panic button.
>
> > For instance, I write a lot of CGI and I want to wrap everything in a
> > try .. except.
> >   try:
> >       main()
> >   except Exception, err:
> >       print "Terrible blunder: ",str(err)
> > so that the err can be one of my exceptions, or can be one that came
> > with Python.
> > (And, that I can see, err.args can be either the relevant
> > string or a tuple containing the relevant string and the documentation
> > is silent on whether in the built-in exceptions if err.args is a tuple
> > then the string is guaranteed to be first in the tuple.)
>
> Does it matter? Just print the tuple.
In truth, it does matter.  In that example, for instance, some error
message is passed on to the user and I don't want it to be too bad.
"Database cannot be opened" is better than a "(u'Database cannot be
opened', 1)"-type thing.  Besides which, Python is a nice language, and
I'm certain there is a nice way to do this; it is just that I'm having
trouble making it out.
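
A minimal sketch of one way to get a readable message whatever shape
err.args has.  The helper name first_text_arg is hypothetical, and the
code is written for a current Python 3 interpreter, where str is already
a Unicode type; on the 2.x series discussed in this thread the check
would be against basestring instead.  It assumes that the first text
item in args is the human-readable message, which the documentation
does not guarantee.

```python
def first_text_arg(err):
    """Return the first text item found in err.args, else str(err)."""
    for item in err.args:
        if isinstance(item, str):
            return item
    # No text item at all (empty args, or only numbers): fall back.
    return str(err)


samples = [
    ValueError("Database cannot be opened", 1),   # message first
    OSError(2, "No such file or directory"),      # message second
    KeyError(),                                   # no arguments
]
for exc in samples:
    print("Terrible blunder: " + first_text_arg(exc))
```

The fallback to str(err) keeps the panic button from raising a second
exception when an error carries no string at all.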

> >> (2) convert the file name to ASCII before you store it; or
>
> > I need the non-ascii information, though, which is why I included it
> > in the error message.
>
> If you have the exception captured in "err", then you can grab it with
> err.where_i_put_the_unicode.
I want a method of grabbing it that is the same as the method used by
the built-ins, for the uniformity reasons that I gave above.  As far
as I could make out, the documentation is silent on the approved
way to grab the string.

> >> (3) add a __str__ method to your exception that is Unicode aware.
>
> > I have two difficulties with this: (1) as above I often raise Python's
> > built-in exceptions and for those __str__() is what it is, and
>
> Then don't use the built-in exception. If it won't do what you want it to
> do, use something else.
I use my exceptions for errors in my logic, etc.  But not being
perfect, sometimes I raise exceptions that I had not anticipated;
these are built-ins.

> > (2) this
> > goes against the meaning of __str__() that I find in the documentation
> > in ref/customization.html which says that the return value must be a
> > string object.
>
> I didn't mean return a unicode object :)
>
> You're absolutely correct. Your __str__ would need to return a string
> object, which means encoding the Unicode correctly to get a string object
> without raising an exception.
>
> e.g. something like this maybe (untested, not thought-through, probably
> won't work correctly, blah blah blah):
>
> def __str__(self):
>     s = self.args.encode('ascii', 'replace')
>     return "Unicode error converted to plain ASCII:\n" + s
>
> or whatever encoding scheme works for your application.
I did discuss this suggestion from another person above.  That would
mean either (a) throwing away the unicode-not-ascii parts of the error
message (but I want those parts, which is why I put them in there) or
(b) hard-coding the output encoding for error strings in hundreds of
error cases (yes, I have hundreds) or (c) passing as a parameter the
errorEncoding to each function that I write.  That last case doesn't
seem to me to be a likely best practice for such a nice language as
Python; I want a way to get the unicode object and go forward in the
program with that.
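
A minimal sketch of the "go forward with the unicode object" idea: the
message stays as text everywhere inside the program and is encoded in
exactly one place, where it leaves the program.  The function name
write_text, the UTF-8 default and the in-memory stream are assumptions
made for illustration, and the code targets a current Python 3
interpreter rather than the 2.4 used in this thread.

```python
import io


def write_text(stream, text, encoding="utf-8"):
    """Encode text once, at the output boundary, and write the bytes."""
    # 'replace' substitutes '?' for anything the encoding cannot represent.
    stream.write(text.encode(encoding, "replace"))


out = io.BytesIO()   # stands in for the CGI response stream
try:
    raise ValueError(u"Cannot open caf\u00e9.txt")
except Exception as err:
    message = u"Terrible blunder: " + err.args[0]   # still text here
    write_text(out, message + u"\n")
```

With this arrangement the output encoding is chosen in a single
function instead of in every error case.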

> It can take whatever you want it to take:
>
> class MyStupidException(Exception):
>     def __init__(self, dayofweek, breakfast="spam and baked beans",
>         *everythingelse):
>         self.day = dayofweek
>         self.breakfast = breakfast
>         self.args = everythingelse
>     def __str__(self):
>         s = "On %s I ate %s and then an error '%s' occurred." % \
>         (self.day.title(), self.breakfast, self.args)
>         return s
>
> >>> raise MyStupidException('monday', 'cheese', 'bad things', 'happened', 2)
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> __main__.MyStupidException: On Monday I ate cheese and then an error
> '('bad things', 'happened', 2)' occurred.
Thank you for the example; I learned something from it.  But as I
mentioned above, I need to guard against the system raising built-ins
also and so I am still a bit puzzled by how to get at the error
strings in built-ins.

In case anyone is still reading this :-)  someone else suggested the
err.message attribute.  I had missed that in the documentation
somehow, but on rereading it, I thought he had solved my problem.
However, sadly, I cannot get Python to like a call to err.message:
..........................................................
$ python
Python 2.4.4c1 (#2, Oct 11 2006, 21:51:02)
[GCC 4.1.2 20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> try:
...     raise Exception, 'this is the error message'
... except Exception, err:
...     print "result: ",err.message
...
result:
Traceback (most recent call last):
  File "<stdin>", line 4, in ?
AttributeError: Exception instance has no attribute 'message'
>>>
......................................................................................
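
The session above runs Python 2.4.4, and as far as I know the message
attribute only arrived with Python 2.5 (and is gone again in Python 3),
which would explain the AttributeError.  A minimal sketch that probes
for the attribute instead of assuming it exists; the name message_of is
hypothetical, and the code is written to run on a current Python 3
interpreter.

```python
def message_of(err):
    """Return err.message when present and non-empty, else the first arg."""
    msg = getattr(err, "message", None)   # absent on many Python versions
    if msg:
        return msg
    if err.args:
        return err.args[0]
    return ""


try:
    raise Exception("this is the error message")
except Exception as err:
    print("result: " + message_of(err))
```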

So, in case it helps anyone, I am going with this:
.....................................................................................
def errorValue(err):
    """Return the error message string from an exception instance.
      err  exception instance
    Note: I cannot get err.message to work.  I sent a note to clp on
    Jan 29 2007 with a related query and this is the best that I
    figured out.
    """
    return err[0]

class jhError(StandardError):
    """Subclass this to get exceptions that behave correctly when
    you do this.
      try:
          raise subclassOfJhError, 'some error message with unicode chars'
      except subclassOfJhError, err:
          mesg='the message is '+unicode(err)
    """
    def __unicode__(self):
        return errorValue(self)

class myException(jhError):
    pass
....................................................................................

No doubt I'll discover what is wrong with it today.  :-)

Jim



