Python strings outside the 128 range
Diez B. Roggisch
deets at nospam.web.de
Thu Jul 13 06:35:10 EDT 2006
Sébastien Boisgérault wrote:
> Hi,
>
> Could anyone explain to me how the python string "é" is mapped to
> the binary code "\xe9" in my python interpreter ?
>
> "é" is not present in the 7-bit ASCII table that is the default
> encoding, right ? So is the mapping "é" -> "\xe9" portable ?
> (site-)configuration dependent ? Can anyone get something
> different from "é" when 'print "\xe9"' is executed ? If the process
> is config-dependent, what kind of config info is used ?
The default encoding has nothing to do with this. "\xe9" is just a byte.
You can write it to a file (which is basically what the terminal is), and
no default encoding whatsoever is involved.
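To illustrate the point that a byte string goes to a file untouched, here is a minimal sketch. It is written in Python 3 syntax, where the byte string is spelled b"\xe9" (an assumption about the interpreter; in the Python 2 of this thread a plain "\xe9" is already a byte string). The temporary file and the variable names are only for illustration.

```python
import os
import tempfile

raw = b"\xe9"  # one byte with value 0xE9; no character meaning attached

# Write the byte to a temporary file in binary mode and read it back.
# No encoding is consulted at any point.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(raw)
    with open(path, "rb") as f:
        round_trip = f.read()
finally:
    os.remove(path)

print(len(raw), raw[0], round_trip == raw)
```

What comes back is the same single byte that went in; which glyph it turns into is decided later by whoever displays it.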
The default encoding comes into play when you write unicode(!) strings
to a file. Then the unicode string is converted to a byte string using
the default encoding, which will fail miserably if the default encoding
is ascii (as it is supposed to be) and your unicode string contains any
"funny" (non-ASCII) characters.
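The failure can be reproduced with a small sketch. Note the assumption: Python 2 performs this conversion implicitly with the default encoding, while the snippet below does the same conversion explicitly with the "ascii" codec, in Python 3 syntax, so that it runs on a current interpreter.

```python
text = "\u00e9"  # the character é as a unicode string

# Converting to bytes with the ascii codec fails for any code point >= 128.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    print("ascii cannot encode it:", exc.reason)

# A pure-ASCII unicode string converts without trouble.
plain = "e".encode("ascii")
```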
But even if you encode the unicode string explicitly with an encoding
like latin1 or utf-8, the resulting byte strings will just be written to
the file. And it is a totally different question (and actually not
controllable by you/python) whether the terminal will interpret the bytes
correctly or not.
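A rough sketch of both halves of that statement, again in Python 3 syntax. The "terminal" here is only simulated by decoding the bytes with a given codec, which is an assumption made for illustration; a real terminal does this on its own, outside of Python's control.

```python
text = "\u00e9"  # the character é as a unicode string

# Explicit encoding: the same character becomes different byte strings.
latin1_bytes = text.encode("latin-1")  # one byte
utf8_bytes = text.encode("utf-8")      # two bytes

# Simulated terminals: each decodes whatever bytes arrive with its own codec.
latin1_terminal_sees = utf8_bytes.decode("latin-1")
utf8_terminal_sees = latin1_bytes.decode("utf-8", errors="replace")

print(repr(latin1_bytes), repr(utf8_bytes))
print(repr(latin1_terminal_sees), repr(utf8_terminal_sees))
```

The bytes are only displayed as "é" when the terminal's encoding matches the one used for encoding; otherwise you get two unrelated characters or a replacement character.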
Diez