Unicode literals and byte string interpretation.

David Riley fraveydank at gmail.com
Thu Oct 27 23:37:01 EDT 2011


On Oct 27, 2011, at 11:05 PM, Fletcher Johnson wrote:

> If I create a new Unicode object u'\x82\xb1\x82\xea\x82\xcd' how does
> this creation process interpret the bytes in the byte string? Does it
> assume the string represents a utf-16 encoding, a utf-8 encoding,
> etc...?
> 
> For reference the string is これは in the 'shift-jis' encoding.

Try it and see!  One test case is worth a thousand words.  And Python has an interactive interpreter. :-)
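
For what it's worth, here is roughly what that experiment looks like. This is only a sketch, assuming an interpreter that accepts both u'' and b'' literals (Python 2.6+ or 3.3+); the variable names are made up for illustration.

```python
# -*- coding: utf-8 -*-
# In a Unicode literal, each \xNN escape names the code point U+00NN.
# It is not treated as a byte in some encoding.
s = u'\x82\xb1\x82\xea\x82\xcd'
code_points = [hex(ord(ch)) for ch in s]
print(len(s))        # 6 characters, not 3
print(code_points)   # ['0x82', '0xb1', '0x82', '0xea', '0x82', '0xcd']

# To interpret the same values as Shift-JIS bytes, start from a byte
# string and decode it explicitly with the codec you mean.
raw = b'\x82\xb1\x82\xea\x82\xcd'
decoded = raw.decode('shift_jis')
print(decoded == u'\u3053\u308c\u306f')   # True (the three hiragana from the question)

# If you are already stuck with the mis-built Unicode object, latin-1 maps
# code points 0-255 one-to-one onto bytes, so the round trip recovers the text.
recovered = s.encode('latin-1').decode('shift_jis')
print(recovered == decoded)               # True
```

So no encoding is guessed at all: the literal gives six code points in the U+0080 to U+00FF range, and you only get これは by decoding a byte string as 'shift_jis' yourself.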


- Dave

