Newbie question about text encoding

Marko Rauhamaa marko at pacujo.net
Mon Mar 9 02:31:28 EDT 2015


Ben Finney <ben+python at benfinney.id.au>:

> Steven D'Aprano <steve+comp.lang.python at pearwood.info> writes:
>
>> '\udd00' should be a SyntaxError.
>
> I find your argument convincing, that attempting to construct a
> Unicode string of a lone surrogate should be an error.

Then we're back to square one, because Python itself produces lone
surrogates:

   >>> b'\x80'.decode('utf-8', errors='surrogateescape')
   '\udc80'
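
To make the point concrete, here is a minimal sketch, assuming Python 3 (the variable names are only illustrative, not from the thread). It shows what the lone surrogate is for: the surrogateescape error handler uses it to carry an undecodable byte through str and back again, while a strict encode of the same string fails.

```python
# Assumes Python 3. All names below are illustrative.
raw = b'\x80'  # not valid UTF-8 on its own

# Decoding with surrogateescape maps the bad byte 0x80 to the
# lone surrogate U+DC80 instead of raising UnicodeDecodeError.
text = raw.decode('utf-8', errors='surrogateescape')

# Encoding with the same handler restores the original byte.
round_tripped = text.encode('utf-8', errors='surrogateescape')

# A strict encode refuses the lone surrogate.
try:
    text.encode('utf-8')
except UnicodeEncodeError as exc:
    strict_error = exc
else:
    strict_error = None

print(ascii(text), round_tripped, type(strict_error).__name__)
```

If constructing such a string were an error, the decode step above would need some other way to represent the byte.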


Marko


