Need debugging knowhow for my creeping Unicodephobia
Duncan Booth
duncan.booth at invalid.invalid
Wed Feb 10 14:56:20 EST 2010
kj <no.email at please.post> wrote:
> But to ground
> the problem a bit I'll say that the exception above happens during
> the execution of a statement of the form:
>
> x = '%s %s' % (y, z)
>
> Also, I found that, with the exact same values y and z as above,
> all of the following statements work perfectly fine:
>
> x = '%s' % y
> x = '%s' % z
> print y
> print z
> print y, z
>
One of y or z is unicode, the other is str. The statement that goes wrong
is combining the unicode string with the other one, so the result has to be
unicode. That means whichever of y or z is str is being decoded to unicode,
and it is being decoded with the default encoding of 'ascii'.
When you format them separately one assignment to x gives a unicode result,
the other gives str.
When you print them you are encoding the unicode value to ascii and that
isn't giving a problem.
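
A minimal sketch of the two conversions involved, written out explicitly so
that it runs on both Python 2 and Python 3. The values of y and z here are
hypothetical, chosen only to match the symptoms described (an ASCII-only
unicode value and a byte string holding non-ASCII data, assumed to be UTF-8):

```python
# Hypothetical values, chosen to match the symptoms described above.
y = u'plain text'      # unicode, ASCII-only
z = b'caf\xc3\xa9'     # byte string (Python 2 str), assumed UTF-8 data

# Printing y to an ASCII output needs this encode, which succeeds
# because y contains only ASCII characters.
encoded_y = y.encode('ascii')

# Combining y and z in one format string makes Python 2 perform this
# decode implicitly, and this is the step that fails.
try:
    z.decode('ascii')
    decode_failed = False
except UnicodeDecodeError as exc:
    decode_failed = True
    message = str(exc)
```

If your real values behave like these, the error message should name the
'ascii' codec and the position of the first non-ASCII byte in the str value.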
1. Print the repr of each value so you can see which is which.
2. Explicitly decode the str to unicode using whatever encoding is correct
(possibly utf-8) when formatting.
3. Explicitly encode back to str when printing using the encoding of your
output device.
4. Know what types you are using: never mix str and unicode without being
explicit.
5. When you post to this newsgroup include the full traceback, not just the
error message.
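
Steps 1 to 3 might look roughly like the sketch below. The values, the
helper name to_text, and the choice of utf-8 are all assumptions made for
illustration; substitute whatever encoding your data really uses. The code
is written to run on both Python 2 and Python 3.

```python
import sys

# Hypothetical values; that z holds UTF-8 data is an assumption.
y = u'plain text'
z = b'caf\xc3\xa9'

# Step 1: repr shows unambiguously which value is unicode and which is str.
print(repr(y))
print(repr(z))

def to_text(value, encoding='utf-8'):
    """Return value as unicode text, decoding byte strings explicitly.

    Hypothetical helper, not part of any library.
    """
    if isinstance(value, bytes):   # bytes is an alias for str on Python 2
        return value.decode(encoding)
    return value

# Step 2: decode before formatting, so both operands are already unicode
# and no implicit 'ascii' decode takes place.
x = u'%s %s' % (to_text(y), to_text(z))

# Step 3: encode explicitly for the output device. Fall back to utf-8
# when the stream does not report an encoding (for example, when piped).
out_encoding = getattr(sys.stdout, 'encoding', None) or 'utf-8'
encoded = x.encode(out_encoding, 'replace')
```

The 'replace' error handler is a design choice for this sketch: it trades
exactness for never raising at output time. Use 'strict' if you would rather
see an error than a '?' in the output.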