What encoding does u'...' syntax use?

Fri Feb 20 17:43:13 EST 2009

Ron Garret <rNOSPAMon at flownet.com> writes:
> Put this another way: I would have thought that when the Python parser
> parses "u'\xb5'" it would produce the same result as calling
> unicode('\xb5'), but it doesn't. Instead it seems to produce the same
> result as calling unicode('\xb5', 'latin-1'). But my default encoding
> is not latin-1, it's ascii. So where is the Python parser getting its
> encoding from? Why does parsing "u'\xb5'" not produce the same error
> as calling unicode('\xb5')?

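There is no encoding involved other than ASCII, only processing of a
backslash escape. The source text u'\xb5' consists entirely of ASCII
characters, so the parser never has to decode a non-ASCII byte.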

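The backslash escape '\xb5' is converted to the Unicode character whose
ordinal number is 0xB5 (U+00B5, MICRO SIGN). This gives the same result
as "\xb5".decode("latin-1") because the first 256 Unicode code points
have the same numbering as 'latin-1'. unicode('\xb5'), by contrast,
starts from a byte string and has to decode it with the default codec,
and ASCII rejects any byte above 0x7F.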
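
A minimal sketch of the distinction, written for Python 3 rather than
the Python 2 interpreter discussed in this thread. That is an
assumption on my part: in Python 3 every '...' literal behaves like
Python 2's u'...', and b'...' stands in for the old byte string. The
variable names are illustrative only.

```python
# Python 3 sketch (assumption: the thread itself is about Python 2).
# '...' here corresponds to Python 2's u'...', and b'...' to Python 2's '...'.

literal = '\xb5'   # the parser turns the escape into code point 0xB5 directly
raw = b'\xb5'      # a byte string holding the single byte 0xB5

# The escape names a code point; no codec is consulted.
escape_is_code_point = (ord(literal) == 0xB5)

# Decoding the byte as Latin-1 happens to give the same character.
matches_latin1 = (raw.decode('latin-1') == literal)

# Decoding the same byte as ASCII fails, which mirrors the error
# from unicode('\xb5') when the default encoding is ASCII.
try:
    raw.decode('ascii')
    ascii_fails = False
except UnicodeDecodeError:
    ascii_fails = True

# Latin-1 maps every byte value n to code point n, which is why the
# two results coincide across the whole 0..255 range.
latin1_is_identity = all(
    bytes([n]).decode('latin-1') == chr(n) for n in range(256)
)

print(escape_is_code_point, matches_latin1, ascii_fails, latin1_is_identity)
```

The agreement with Latin-1 is a coincidence of numbering, not a sign
that the parser used that codec.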

-M-