[Tutor] Encoding

Dave Angel davea at ieee.org
Wed Mar 3 22:23:25 CET 2010


(Don't top-post.  Put your response below whatever you're responding to, 
or at the bottom.)

Giorgio wrote:
> Ok.
>
> So, how do you encode .py files? UTF-8?
>
> 2010/3/3 Dave Angel <davea at ieee.org>
>
>   
I personally use Komodo to edit my Python source files, and tell it to 
use UTF-8 encoding.  Then I add an encoding line as the second line of 
the file (Python only looks for it on the first or second line).  Many 
times I get lazy, because mostly my source doesn't contain non-ASCII 
characters.  But if I'm copying characters from an email or other 
Unicode source, then I make sure both are set up.  The editor will 
actually warn me if I try to save a file as ASCII with any non-ASCII 
characters in it.
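
As a rough illustration of what that encoding line does, here is a small 
sketch.  It assumes Python 3, where the standard tokenize module can report 
the encoding a source file declares.  The file contents are made up for the 
example, and source_bytes, declared and source_text are just names I picked.

```python
import io
import tokenize

# Hypothetical source file, as raw bytes the way the editor saved it.
# The encoding declaration sits on the second line, after the shebang.
source_bytes = (
    b"#!/usr/bin/env python\n"
    b"# -*- coding: utf-8 -*-\n"
    b"name = '\xc3\xa9t\xc3\xa9'\n"  # UTF-8 bytes for e-acute, t, e-acute
)

# detect_encoding() reads at most two lines looking for the declaration.
declared, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)

# Decode the whole file using the encoding it declares for itself.
source_text = source_bytes.decode(declared)

print(declared)
print(source_text)
```

If the editor's save encoding and the declared encoding disagree, the 
decode step is where the mismatch shows up, which is why I set up both.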

Note:  Unicode itself is a character set, which assigns each character a 
number (a code point) up to 0x10FFFF.  CPython's unicode type stores 
those as 16 bit or 32 bit units, depending on how it was built.  UTF-8 
is an 8 bit encoding of that Unicode, where there's a direct algorithm 
to turn each code point into one to four bytes.  They are not the same, 
although some people use the terms interchangeably.
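
To make the difference concrete, here is a minimal sketch, again assuming 
Python 3 (where str is the Unicode type and bytes is the 8 bit type).  The 
sample text and variable names are only for illustration.

```python
# Two characters: e-acute (U+00E9) and the euro sign (U+20AC).
text = "\u00e9\u20ac"

# The Unicode view: one number (code point) per character.
code_points = [hex(ord(ch)) for ch in text]

# The encoded view: the same text turned into bytes by two encodings.
as_utf8 = text.encode("utf-8")       # 2 bytes + 3 bytes
as_utf16 = text.encode("utf-16-le")  # 2 bytes + 2 bytes

print(code_points)
print(as_utf8, len(as_utf8))
print(as_utf16, len(as_utf16))
```

The text is the same in every case; only the byte representation changes 
with the encoding you pick.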

Also note:  An 8 bit string has no inherent meaning until you decide 
how to decode it into Unicode.  Doing explicit decodes is much safer 
than assuming some system default.  And if it happens to contain only 
7 bit (ASCII) characters, it doesn't matter which of the common 
ASCII-compatible encodings (UTF-8, Latin-1, cp1252) you specify when 
you decode it.  Which is why all of us have been so casual about this.
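
Here is a short sketch of both points, assuming Python 3.  The byte values 
are chosen for the example, and the names are mine.

```python
# The same two bytes mean different things under different decodings.
raw = b"\xc3\xa9"
as_utf8 = raw.decode("utf-8")      # one character: U+00E9 (e-acute)
as_latin1 = raw.decode("latin-1")  # two characters: U+00C3 and U+00A9

# Pure 7 bit data decodes identically under ASCII-compatible encodings.
ascii_only = b"hi"
decoded = {
    enc: ascii_only.decode(enc)
    for enc in ("ascii", "utf-8", "latin-1", "cp1252")
}
all_same = len(set(decoded.values())) == 1

# An encoding that is not ASCII-compatible still gives something else.
as_utf16 = ascii_only.decode("utf-16-le")

print(repr(as_utf8), repr(as_latin1))
print(decoded, all_same)
print(repr(as_utf16))
```

So being casual is only safe while the data stays pure ASCII and the 
encoding is one of the ASCII-compatible ones.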



