How to get Python to default to UTF8

John Nagle nagle at animats.com
Mon Dec 24 00:30:40 EST 2007


weheh wrote:
> Hi Fredrik,
> 
> Thanks again for your feedback. I am much obliged.
> 

    Bear in mind that in Python, ASCII currently means strict ASCII,
byte values 0..127.  A "str" can hold byte values > 127.  However, the
default conversion from "str" to "unicode" requires true ASCII values,
in 0..127.  So if you take in data from some source that might contain
a byte value > 127, the default conversion to Unicode will fail with
a UnicodeDecodeError.
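
    As a rough illustration of that failure, here is a minimal sketch.
It is written for Python 3, where "bytes" plays the role of the "str"
type described above and "str" plays the role of "unicode"; that mapping
is my assumption, and the helper name try_ascii is made up for this
example.

```python
# Python 3 sketch: bytes stands in for the byte-oriented "str" discussed above.
raw = b"caf\xe9"  # 0xE9 is outside the ASCII range 0..127


def try_ascii(data):
    """Hypothetical helper: decode as strict ASCII, or return None on failure."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError:
        # Raised for any byte value > 127 under the strict ASCII codec.
        return None


print(try_ascii(b"cafe"))  # pure ASCII input decodes without trouble
print(try_ascii(raw))      # the byte 0xE9 makes strict ASCII decoding fail
```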

    There are conversion functions that let you specify the meaning of
values 128..255 (the input might be in "latin1" encoding, for
example), ignore unexpected characters, or replace them with a
placeholder such as "?".
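
    A short sketch of those three options, again in Python 3 terms
(bytes in, str out), which is an assumption on my part. One detail:
when decoding, the "replace" handler substitutes U+FFFD rather than
"?"; the "?" shows up when encoding text back to ASCII.

```python
raw = b"caf\xe9"  # 0xE9 is "e with acute accent" in latin-1

# Option 1: state what the bytes 128..255 mean by naming the encoding.
as_latin1 = raw.decode("latin-1")

# Option 2: silently drop bytes that are not valid ASCII.
ignored = raw.decode("ascii", errors="ignore")

# Option 3: substitute a placeholder for each invalid byte.
# Decoding inserts U+FFFD (the Unicode replacement character).
replaced = raw.decode("ascii", errors="replace")

# Encoding with "replace" is where a literal "?" appears.
question_mark = as_latin1.encode("ascii", errors="replace")

print(repr(as_latin1), repr(ignored), repr(replaced), repr(question_mark))
```

    Naming the real encoding is the only option that preserves the
data; the other two discard information.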

				John Nagle


