Need debugging knowhow for my creeping Unicodephobia

Duncan Booth duncan.booth at invalid.invalid
Wed Feb 10 14:56:20 EST 2010


kj <no.email at please.post> wrote:

> But to ground
> the problem a bit I'll say that the exception above happens during
> the execution of a statement of the form:
> 
>   x = '%s %s' % (y, z)
> 
> Also, I found that, with the exact same values y and z as above,
> all of the following statements work perfectly fine:
> 
>   x = '%s' % y
>   x = '%s' % z
>   print y
>   print z
>   print y, z
> 

One of y or z is unicode, the other is str. The statement that goes wrong 
is combining the unicode string with the other one, so the result has to be 
unicode. That means whichever of y or z is str is being decoded to unicode 
with the default encoding of 'ascii', and that fails if it contains any 
non-ascii bytes.
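
To see that implicit step in isolation, here is a rough sketch. The values 
of y and z are made up (I don't know what yours are), and I'm assuming the 
str holds UTF-8 bytes. Decoding it by hand with 'ascii' is what the 
formatting operator attempts behind the scenes:

```python
# Hypothetical stand-ins for the poster's values.
y = b'caf\xc3\xa9'   # a byte string (Python 2 str), assumed to be UTF-8
z = u'na\xefve'      # a unicode string

# '%s %s' % (y, z) must produce unicode, so y has to be decoded first.
# With no encoding given, the default 'ascii' codec is used, which
# cannot handle the byte 0xc3:
try:
    y.decode('ascii')
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    message = str(exc)
```

The message should name the 'ascii' codec and the offending byte, which is 
the same kind of error the combined format statement produces.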

When you format them separately, one assignment to x gives a unicode result 
and the other gives str, so the non-ascii str never has to be decoded.

When you print them you are encoding the unicode value for output (using 
your terminal's encoding, or ascii if that isn't known) and that isn't 
giving a problem.

1. Print the repr of each value so you can see which is which.
2. Explicitly decode the str to unicode using whatever encoding is correct 
(possibly utf-8) when formatting.
3. Explicitly encode back to str when printing using the encoding of your 
output device.
4. Know what types you are using: never mix str and unicode without being 
explicit.
5. When you post to this newsgroup include the full traceback, not just the 
error message.
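
Steps 1 to 3 might look something like this. Again the values are invented, 
and both the input encoding and the output encoding are assumed to be 
UTF-8; substitute whatever is actually correct for your data and your 
output device:

```python
# Hypothetical stand-ins for the poster's values.
y = b'caf\xc3\xa9'   # byte string, assumed to be UTF-8
z = u'na\xefve'      # unicode string

# 1. The repr and type name show which value is which.
kinds = [(repr(v), type(v).__name__) for v in (y, z)]

# 2. Decode the byte string explicitly, then format everything as unicode.
x = u'%s %s' % (y.decode('utf-8'), z)

# 3. Encode explicitly, at the point of output only.
out = x.encode('utf-8')
```

Here x is unicode throughout and out is the byte string you would actually 
write to the terminal or file.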


