Message from exception raised in generator disappears
Scott David Daniels
Scott.Daniels at Acm.Org
Sat Oct 16 21:17:54 EDT 2004
Mike Brown wrote:
> I thought I was being pretty clever with my first attempt at using generators,
> but I seem to be missing some crucial concept, for even though this seems to
> work as intended, the text of the exception message does not bubble up with
> either of the ValueErrors when one of them is raised.
>
>
> # This helps iterate over a unicode string. When python is built with
> # 16-bit chars (as is the default on Windows), it returns surrogate
> # pairs together (unlike 'for c in s'), and detects illegal surrogate
> # pairs. Byte strings are unaffected.
> def chars(s):
>     surrogate = None
>     for c in s:
>         cp = ord(c)
>         if surrogate is not None:
>             if cp > 56319 and cp < 57344:
>                 pair = surrogate + c
>                 surrogate = None
>                 yield pair
>             else:
>                 raise ValueError("Bad surrogate pair in %s" % s)
>         else:
>             if cp > 55295 and cp < 57344:
>                 if cp < 56320:
>                     surrogate = c
>                 else:
>                     raise ValueError("Bad surrogate pair in %s" % s)
>             else:
>                 surrogate = None
>                 yield c
>     if surrogate is not None:
>         raise ValueError("Bad surrogate pair at end of %s" % s)
>
>
> # as expected, returns u'example \xe9...\u2022...\U00010000...\U0010fffd'
> ''.join([c for c in chars(u'example \xe9...\u2022...\ud800\udc00...\U0010fffd')])
>
> # now test the 3 exception conditions. Each produces a ValueError
> ''.join([c for c in chars(u'2nd half bad: \ud800bogus')])
> ''.join([c for c in chars(u'no 1st half: \udc00')])
> ''.join([c for c in chars(u'no 2nd half: \ud800')])
>
>
> All 3 of the exception tests result in a bare ValueError; there's no
> "Bad surrogate pair in" message shown. Why is that? What am I doing wrong?
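The problem is that type('abc%s' % u'\udc00') is unicode, not str.
The exception message is therefore a unicode object containing
non-ASCII characters, and it most likely cannot be converted to str
when the traceback is printed, so the message is dropped.
Change your raises to something like:
    raise ValueError("Bad surrogate pair at end of %r" % s)
and then you can relax.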
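
To illustrate why %r helps, here is a minimal sketch written for
Python 3 (the thread itself is about Python 2, where the traceback
behavior differs in detail). The helper names format_messages and
is_ascii_safe are made up for this example. It only shows that the %s
message carries the raw surrogate along, while the %r message contains
an escaped, ASCII-only representation of it.

```python
def format_messages(s):
    """Return the error message built with %s and with %r (illustrative helper)."""
    with_s = "Bad surrogate pair at end of %s" % s
    with_r = "Bad surrogate pair at end of %r" % s
    return with_s, with_r


def is_ascii_safe(text):
    """Return True if text can be encoded as ASCII (illustrative helper)."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# A string ending in a lone high surrogate, as in the third test case.
bad = "no 2nd half: \ud800"
with_s, with_r = format_messages(bad)

# The %s message still contains the raw surrogate, so it is not ASCII-safe.
print(is_ascii_safe(with_s))
# The %r message contains the escaped text \ud800, which is plain ASCII.
print(is_ascii_safe(with_r))
print(with_r)
```

The %s message is deliberately not printed here, since writing a lone
surrogate to a UTF-8 terminal would itself raise UnicodeEncodeError.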
-Scott David Daniels
Scott.Daniels at Acm.Org