b64encode and unicode problem

Peter Otten __peter__ at web.de
Mon May 26 06:34:33 EDT 2008


Gabriel Rossetti wrote:

> Hello everyone,
> 
> I am trying to encode a string using b64encode and I get the following
> error:
> 
>  >>> b64encode(u"Salut Pierre, comment ça va?")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python2.5/base64.py", line 53, in b64encode
>     encoded = binascii.b2a_base64(s)[:-1]
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in
> position 22: ordinal not in range(128)
> 
> If I remove the "u" in front of the string, it works. The problem is that
> in my program, the string is given to me in unicode/utf-8. I tried
> several things, but I still get the error. How can I get it to work? Does
> anybody have any idea?

>>> import base64
>>> base64.b64encode(u"Salut Pierre, comment ça va?".encode("utf8"))
'U2FsdXQgUGllcnJlLCBjb21tZW50IMOnYSB2YT8='

A unicode object is a sequence of code points with no predetermined
representation as a byte sequence. Therefore, whenever you go from unicode
to str you have to decide on an encoding. If you don't make that decision,
Python assumes ASCII, which will of course fail for non-ASCII characters.
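
For anyone reading this on a newer interpreter, here is a minimal sketch of
the same idea, assuming Python 3 (where str is the text type and bytes is
the byte-string type, roughly Python 2's unicode and str). The variable
names are only illustrative. It shows the explicit encoding step, the round
trip back to text, and what goes wrong when ASCII is used instead:

```python
import base64

text = "Salut Pierre, comment ça va?"

# Choose the encoding explicitly: text -> UTF-8 bytes -> base64 bytes.
encoded = base64.b64encode(text.encode("utf-8"))
print(encoded)  # b'U2FsdXQgUGllcnJlLCBjb21tZW50IMOnYSB2YT8='

# Reverse the steps with the same encoding: base64 -> bytes -> text.
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded == text)  # True

# Encoding as ASCII fails on the non-ASCII character, which is what the
# implicit conversion in Python 2 ran into.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    print(exc)
```

Whoever decodes the base64 data has to know that the bytes are UTF-8, so
the encoding is part of the contract between sender and receiver.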

Peter



