Does IDLE handle unicode?

polux polux2001 at wanadoo.fr
Mon Sep 2 09:52:21 EDT 2002


Edward K. Ream wrote:
> IDLE took an exception writing a .py file containing the copyright
> character.  This happened a while ago; IIRC, I lost the file.
> 
> Is this a known problem?  I see nothing about this in the FAQ.
> 
> Edward
> --------------------------------------------------------------------
> Edward K. Ream   email:  edream at tds.net
> Leo: Literate Editor with Outlines
> Leo: http://personalpages.tds.net/~edream/front.html
> --------------------------------------------------------------------
> 
> 
> 

From the FAQ:
----------------------------
4.102. UnicodeError: ASCII [decoding,encoding] error: ordinal not in range(128)
This error indicates that your Python installation can handle only 7-bit 
ASCII strings. There are a couple of ways to fix or work around the problem.

If your programs must handle data in arbitrary character set encodings, 
the environment the application runs in will generally identify the 
encoding of the data it is handing you. You need to convert the input to 
Unicode data using that encoding. For instance, a program that handles 
email or web input will typically find character set encoding 
information in Content-Type headers. This can then be used to properly 
convert input data to Unicode. Assuming the string referred to by 
"value" is encoded as UTF-8:

     value = unicode(value, "utf-8")

will return a Unicode object. If the data is not correctly encoded as 
UTF-8, the above call will raise a UnicodeError.
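
As a rough sketch of the Content-Type case described above (written for 
current Python 3, where unicode(value, enc) is spelled value.decode(enc)), 
the charset can be pulled out of a header value with the standard email 
package. The helper name decode_with_declared_charset and the "ascii" 
fallback are assumptions made for this illustration, not part of the FAQ:

```python
from email.message import Message


def decode_with_declared_charset(raw, content_type, fallback="ascii"):
    """Decode raw bytes using the charset named in a Content-Type value.

    Hypothetical helper for illustration; `fallback` is an assumed default
    used when the header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present.
    charset = msg.get_content_charset() or fallback
    # Raises UnicodeError if the bytes are not valid in that charset.
    return raw.decode(charset)


text = decode_with_declared_charset(b"caf\xc3\xa9", "text/html; charset=utf-8")
```

A charset name that Python does not know raises LookupError rather than 
UnicodeError, so real code would probably want to catch both.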

If you only want to convert strings that contain non-ASCII data to 
Unicode, you can first try converting them assuming an ASCII encoding, 
and generate Unicode objects only if that fails:

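     try:
         # Succeeds only if value is pure 7-bit ASCII; the result is discarded.
         unicode(value, "ascii")
     except UnicodeError:
         # Non-ASCII data: assume it is UTF-8 and convert it.
         value = unicode(value, "utf-8")
     else:
         # value was valid ASCII data; leave it as a plain string
         pass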
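
The same idea can be wrapped in a small function. This is only a sketch in 
Python 3 spelling (bytes.decode instead of unicode()); the name 
to_text_if_needed and the UTF-8 default are assumptions for illustration:

```python
def to_text_if_needed(value, encoding="utf-8"):
    """Return `value` unchanged if it is pure ASCII, else decode it.

    Hypothetical helper: `encoding` is the assumed encoding of any
    non-ASCII input.
    """
    try:
        # Used only as a check; the decoded result is thrown away.
        value.decode("ascii")
    except UnicodeError:
        # Non-ASCII bytes: convert them with the assumed encoding.
        return value.decode(encoding)
    # Pure ASCII: hand back the original byte string.
    return value


plain = to_text_if_needed(b"hello")
converted = to_text_if_needed(b"\xc2\xa9 2002")
```

Note that the function returns two different types depending on the input, 
which is what the FAQ text asks for but may surprise callers.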

If you normally use a character set encoding other than US-ASCII and 
only need to handle data in that encoding, the simplest fix may be to 
set the default encoding in sitecustomize.py. The following code is a 
modified version of the encoding setup code from site.py with the 
relevant lines uncommented.

     # Set the string encoding used by the Unicode implementation.
     # The default is 'ascii'
     encoding = "ascii" # <= CHANGE THIS if you wish

     # Enable to support locale aware default string encodings.
     import locale
     loc = locale.getdefaultlocale()
     if loc[1]:
         encoding = loc[1]
     if encoding != "ascii":
         import sys
         sys.setdefaultencoding(encoding)

Also note that on Windows, there is an encoding known as "mbcs", which 
uses an encoding specific to your current locale. In many cases, and 
particularly when working with COM, this may be an appropriate default 
encoding to use.
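
Since "mbcs" exists only on Windows, a script that may run elsewhere has to 
check for it before relying on it. A minimal sketch, assuming that falling 
back to the locale's preferred encoding is acceptable (the function name 
pick_default_encoding is made up for this example):

```python
import codecs
import locale


def pick_default_encoding():
    """Return "mbcs" where the codec exists, else the locale's encoding.

    Hypothetical helper; the fallback order is an assumption.
    """
    try:
        # codecs.lookup() raises LookupError for codecs this build lacks.
        codecs.lookup("mbcs")
        return "mbcs"
    except LookupError:
        return locale.getpreferredencoding(False) or "ascii"


chosen = pick_default_encoding()
```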

-------------------------------


Try using mbcs on Windows.
I did, and it works.
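
Independently of the default encoding, a program that saves text can avoid 
the error by naming the encoding when it opens the file. The sketch below 
is not what IDLE does; it assumes Python 2.6 or later (for io.open) and 
uses UTF-8 and the helper names write_text / read_text purely as examples:

```python
import io


def write_text(path, text, encoding="utf-8"):
    """Write a Unicode string to `path` in an explicit encoding."""
    # newline="\n" keeps line endings as written on every platform.
    with io.open(path, "w", encoding=encoding, newline="\n") as f:
        f.write(text)


def read_text(path, encoding="utf-8"):
    """Read `path` back as a Unicode string using the same encoding."""
    with io.open(path, "r", encoding=encoding, newline="\n") as f:
        return f.read()


# A source line containing the copyright character (U+00A9).
SAMPLE = u"# Copyright \u00a9 2002\n"
```

On Windows, passing encoding="mbcs" instead would write the file in the 
current locale's code page.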



