encoding problem

Joe Strout joe at strout.net
Fri Dec 19 10:20:07 EST 2008


Marc 'BlackJack' Rintsch wrote:

>> The question is why the Python interpreter uses the default encoding
>> instead of "utf-8", which I explicitly declared in the source.
> 
> Because the declaration is only for decoding unicode literals in that 
> very source file.

And because strings in Python, unlike in (say) REALbasic, do not know 
their encoding -- a str is just a sequence of bytes.  If it were a sequence 
of bytes PLUS an encoding, then every string would know how its bytes are 
meant to be read, and things like conversion to another encoding, or 
concatenation of two strings that may differ in encoding, could be handled 
automatically.

I consider this one of the great shortcomings of Python, but it's mostly 
just a temporary inconvenience -- the world is moving to Unicode, and 
with Python 3, we won't have to worry about it so much.
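
A rough sketch of what changes, assuming Python 3 semantics: text (str) is 
always Unicode, raw data is a separate bytes type, and the two cannot be 
mixed without naming an encoding. The sample values are illustrative.

```python
# In Python 3, text (str) and raw bytes (bytes) are distinct types.
text = "caf\u00e9"
raw = text.encode("utf-8")

try:
    text + raw                 # mixing the two is refused, not guessed at
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

# The encoding is named explicitly at the boundary instead.
combined = text + raw.decode("utf-8")
```

Strings still do not carry an encoding around, but the guesswork is confined 
to the points where bytes are decoded or encoded.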

Best,
- Joe






