Any way to designate the encoding of the "source" for compile?

Serge.Orlov at gmail.com
Tue May 17 02:51:51 EDT 2005


janeaustin... at hotmail.com wrote:
> > Okay, I'll use one of the CJK codecs as the example. EUC-KR is the
> > default encoding.
> >
> > >>> import sys;sys.getdefaultencoding()
> > 'euc-kr'
> > >>> '한글'
> > '\xc7\xd1\xb1\xdb'

That is the problem. Non-ASCII characters in byte strings are
deprecated. Here is what I get when I run a deprecated hello
world program in Russian, saved in windows-1251:
------- hello.py ---------
print "Здравствуй, мир!"
--------------------------
C:\py>c:\Python24\python.exe hello.py
sys:1: DeprecationWarning: Non-ASCII character '\xc7' in file
text.py on line 1,  but no encoding declared; see
http://www.python.org/peps/pep-0263.html for details
╟фЁртёЄтєщ, ьшЁ!
--------------------------
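Oops, not only is there a warning, but the output is garbled
on Windows in a Russian locale: the file's bytes are windows-1251,
while the console displays them as cp866. To correct the program
I need to declare the source encoding and switch to unicode strings: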
------- hello.py ---------
# -*- coding: windows-1251 -*-
print u"Здравствуй, мир!"
--------------------------
C:\py>c:\Python24\python.exe hello.py
Здравствуй, мир!
--------------------------
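The garbled line from the first run can be reproduced without a Windows console. The sketch below is written for Python 3 (where str is Unicode text and bytes is raw data), not for the Python 2.4 used above. The helper name garble and the two codec names are assumptions chosen to match the output shown, with cp866 standing in for the Russian console code page.

```python
# A minimal sketch for Python 3, not the Python 2.4 session above.
# Assumption: the file was saved as windows-1251 and the console
# interpreted the raw bytes as cp866.

def garble(text, written_as="windows-1251", shown_as="cp866"):
    """Encode text with one codec, then decode the bytes with another."""
    return text.encode(written_as).decode(shown_as)

greeting = "Здравствуй, мир!"

# What a cp866 console shows when handed windows-1251 bytes unchanged.
garbled = garble(greeting)

# The first byte of the encoded text, the one named in the warning.
first_byte = greeting.encode("windows-1251")[:1]
```

With a unicode string and a declared source encoding, the interpreter knows which characters are meant and can encode them for the console itself, so no such mismatch occurs.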

Since non-ASCII characters are deprecated in byte strings,
any non-ASCII encoding for sys.getdefaultencoding() is
deprecated as well. Don't set it to 'euc-kr'; name the
encoding explicitly where it is needed instead.
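As a rough illustration of naming the encoding explicitly instead of changing the default, the sketch below again assumes Python 3. It decodes the EUC-KR bytes from the quoted session with an explicit codec, and it passes compile() a byte string that carries its own PEP 263 coding declaration. The variable names and the "<example>" file name are hypothetical.

```python
# A minimal sketch for Python 3; all names here are illustrative only.

# The bytes from the quoted session, decoded with an explicit codec
# rather than through the interpreter's default encoding.
raw = b"\xc7\xd1\xb1\xdb"
text = raw.decode("euc-kr")

# Source code held as bytes, with a PEP 263 coding declaration on the
# first line so that compile() knows how to interpret the literal.
source = '# -*- coding: euc-kr -*-\ngreeting = "한글"\n'.encode("euc-kr")

namespace = {}
exec(compile(source, "<example>", "exec"), namespace)
```

Both text and namespace["greeting"] end up as the same two-character Unicode string, and the default encoding is never involved.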


> Any suggestions or any plans in future python versions?

In Python 3.0 byte strings will be gone, so you won't be
able to put non-ASCII characters into them.


  Serge.



