Newbie question about text encoding
Chris Angelico
rosuav at gmail.com
Sun Mar 8 03:37:34 EDT 2015
On Sun, Mar 8, 2015 at 6:20 PM, Marko Rauhamaa <marko at pacujo.net> wrote:
> * it still isn't bijective between str and bytes:
>
> >>> '\udd00'.encode('utf-8', errors='surrogateescape')
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> UnicodeEncodeError: 'utf-8' codec can't encode character
> '\udd00' in position 0: surrogates not allowed
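Once again, you appear to be surprised that invalid data is failing.
Why is this so strange? U+DD00 is a lone surrogate code point, not a
valid character, and it lies outside the U+DC80..U+DCFF range that
surrogateescape maps back to bytes. It is quite correct to raise this
error.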
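
To make the distinction concrete, here is a minimal sketch of what
errors='surrogateescape' does and does not round-trip. The helper names
roundtrip_bytes and can_encode are made up for illustration; only
bytes.decode and str.encode are real API. The direction that is meant to
be lossless is bytes -> str -> bytes; an arbitrary str containing other
lone surrogates is still rejected.

```python
def roundtrip_bytes(raw: bytes) -> bytes:
    """Decode arbitrary bytes to str, then encode them back (hypothetical helper)."""
    text = raw.decode('utf-8', errors='surrogateescape')
    return text.encode('utf-8', errors='surrogateescape')


def can_encode(text: str) -> bool:
    """Report whether text encodes as UTF-8 with surrogateescape (hypothetical helper)."""
    try:
        text.encode('utf-8', errors='surrogateescape')
    except UnicodeEncodeError:
        return False
    return True


# Bytes that are not valid UTF-8 survive the trip unchanged.
raw = b'abc\xff\x80'
assert roundtrip_bytes(raw) == raw

# Each undecodable byte 0xNN is smuggled through as the code point U+DCNN.
assert raw.decode('utf-8', errors='surrogateescape') == 'abc\udcff\udc80'

# Only U+DC80..U+DCFF can be turned back into bytes; U+DD00 cannot.
assert can_encode('\udc80')
assert can_encode('\udcff')
assert not can_encode('\udd00')
```

In other words, the error handler exists so that undecodable bytes can be
carried through a str and written back out, not so that every possible
str has a byte representation.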
ChrisA