[Python-Dev] Encoding of 8-bit strings and Python source code
Guido van Rossum
guido@python.org
Tue, 25 Apr 2000 18:35:30 -0400
[Fredrik]
> -- my proposal: expose both types, but let them contain characters
> from the same character set -- at least when used as strings.
>
> as before, 8-bit strings can be used to store binary data, so we
> don't need a separate ByteArray type. in an 8-bit string, there's
> always one character per byte.
>
> [imho: small changes to the existing code base, about as efficient as
> can be, no attempt to second-guess the user, fully backwards
> compatible, fully compliant with the definition of strings in the
> language reference, patches are available, etc...]
Sorry, all this proposal does is change the default encoding on
conversions from UTF-8 to Latin-1. That's very
western-culture-centric.
You already have control over the encoding: use unicode(s,
"latin-1"). If there are places where you don't have enough control
(e.g. file I/O), let's add control there.
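
A minimal sketch of choosing the encoding explicitly instead of relying
on a default. It is written for Python 3 as an assumption: there,
unicode(s, "latin-1") corresponds to s.decode("latin-1"), and open()
takes an encoding argument, which is one possible form of the file I/O
control mentioned above. The sample bytes and the file name are made up
for illustration.

```python
import os
import tempfile

# Hypothetical 8-bit data: byte 0xE9 is "e with acute accent" in Latin-1,
# but it is not a complete UTF-8 sequence on its own.
raw = b"caf\xe9"

# Explicit decoding: the Python 3 spelling of unicode(s, "latin-1").
text = raw.decode("latin-1")

# Latin-1 maps every byte to the code point with the same number,
# so there is exactly one character per byte.
one_char_per_byte = [ord(c) for c in text] == list(raw)

# The same bytes are rejected when decoded as UTF-8.
try:
    raw.decode("utf-8")
    utf8_accepts = True
except UnicodeDecodeError:
    utf8_accepts = False

# Explicit control at the file I/O boundary: name the encoding on open().
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(raw)
    with open(path, "r", encoding="latin-1") as f:
        text_from_file = f.read()
```

Naming the encoding at each conversion point keeps the choice visible to
the caller, whichever default the interpreter ends up using.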
--Guido van Rossum (home page: http://www.python.org/~guido/)