Encoding of surrogate code points to UTF-8

Neil Cerutti neilc at norwich.edu
Tue Oct 8 11:54:30 EDT 2013


On 2013-10-08, Neil Cerutti <neilc at norwich.edu> wrote:
> In any case, "\ud800\udc01" isn't a valid unicode string. In a
> perfect world it would automatically get converted to
> '\u00010001' without intervention.

This last paragraph is erroneous. I must have had a typo in my
testing.
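
For reference, here is a minimal sketch of how the literal above behaves, assuming CPython 3.3 or later (the variable names are only illustrative, and this is not necessarily the test I originally ran). It shows that the two escapes stay as two separate surrogate code points, that the strict UTF-8 codec refuses them, and that a round trip through UTF-16 with the "surrogatepass" error handler is one way to combine them explicitly.

```python
# Assumes Python 3.3+; names here are illustrative only.

# Two lone surrogate code points, written as in the quoted message.
s = "\ud800\udc01"
pair_length = len(s)  # the escapes are not merged automatically

# The strict UTF-8 codec rejects surrogate code points.
try:
    s.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Round-tripping through UTF-16 with "surrogatepass" combines the pair
# into the single supplementary code point U+10001.
combined = s.encode("utf-16-le", "surrogatepass").decode("utf-16-le")
combined_utf8 = combined.encode("utf-8")

# Note: "\u" takes exactly four hex digits, so '\u00010001' is U+0001
# followed by the text "0001"; the eight-digit form is '\U00010001'.
short_escape_length = len("\u00010001")

print(pair_length)          # 2
print(strict_ok)            # False
print(hex(ord(combined)))   # 0x10001
print(combined_utf8)        # b'\xf0\x90\x80\x81'
print(short_escape_length)  # 5
```

The explicit round trip is used here only because it relies on documented codec behaviour; nothing in it happens "without intervention".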

-- 
Neil Cerutti


