encode() question

7stud bbxx789_05ss at yahoo.com
Tue Jul 31 12:53:11 EDT 2007


s1 = "hello"
s2 = s1.encode("utf-8")

s1 = "an accented 'e': \xc3\xa9"
s2 = s1.encode("utf-8")

The last line produces the error:

---
Traceback (most recent call last):
  File "test1.py", line 6, in ?
    s2 = s1.encode("utf-8")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 17: ordinal not in range(128)
---

The error is a "decode" error, and as far as I can tell, decoding
happens when you convert a regular string to a unicode string.  So, is
there an implicit conversion taking place from s1 to a unicode string
before encode() is called?  By what mechanism?
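
For what it's worth, here is a rough sketch of what seems to be going on. In Python 2, calling encode() on a regular str first decodes it to unicode with the default encoding (ascii, as reported by sys.getdefaultencoding()) and only then encodes the result. The sketch is written for Python 3, where bytes stands in for the Python 2 str; py2_style_encode is a made-up helper name used only for illustration, not a real library function.

```python
# Python 3 model of Python 2's str.encode(); bytes plays the role of str.
raw = b"an accented 'e': \xc3\xa9"


def py2_style_encode(byte_string, encoding, default_encoding="ascii"):
    """Hypothetical helper mimicking the implicit two-step conversion."""
    # Step 1: implicit decode with the default encoding (ascii in Python 2).
    text = byte_string.decode(default_encoding)
    # Step 2: the encode that was actually requested.
    return text.encode(encoding)


# Pure ASCII input survives the implicit decode, so "hello" works.
plain = py2_style_encode(b"hello", "utf-8")

# The byte 0xc3 is not valid ASCII, so step 1 fails before step 2 runs.
try:
    py2_style_encode(raw, "utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc
    print(exc)

# Decoding explicitly with the encoding the bytes are really in avoids
# the implicit ascii step altogether.
text = raw.decode("utf-8")
round_trip = text.encode("utf-8")
```

If that model is right, the Python 2 fix would be to decode explicitly first, e.g. s1.decode("utf-8"), or to start from a unicode literal, instead of calling encode() on a str that already holds UTF-8 bytes.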



