How to use 8bit character sets?

John Machin sjmachin at lexicon.net
Sun Jun 12 23:47:53 EDT 2005


copx wrote:
> For some reason Python (on Windows) doesn't use the system's default 
> character set and that's a serious problem for me.
> I need to process German textfiles (containing umlauts and other > 7bit 
> ASCII characters) and generally work with strings which need to be processed 
> using the local encoding (I need to display the text using a Tk-based GUI 
> for example). The only solution I managed to find was converting between 
> unicode and latin-1 all the time (the textfiles aren't unicode, the output 
> of the program isn't supposed to be unicode either). Everything worked fine 
> until I tried to run the program on a Windows 9x machine. It seems that 
> Python on Win9x doesn't really support unicode (IIRC Win9x doesn't have real 
> unicode support so that's not surprising).
> Is it possible to tell Python to use an 8bit charset (latin-1 in my case) 
> for textfile and string processing by default?
> 
> copx


1. Your description of your problem is extremely vague. If you were to 
supply a minimal script that "works" [on what platform?? what version of 
Python??], with a description of what you understand by "works", and 
what happens differently when you run that script on a Win9x box [for 
what value(s) of x?? what version of Python??], we might be able to help 
you. N.B. somewhere near the top of the script you should have something 
like:

import sys
print "Python version:", sys.version
print "platform:", sys.platform
print "default encoding:", sys.getdefaultencoding()
try:
     print "Windows version:", sys.getwindowsversion()
except AttributeError:
     print "sys.getwindowsversion not available"

2. You should read this:

http://www.catb.org/~esr/faqs/smart-questions.html

3. You should not rely on a crutch like a default encoding, especially 
one obtained by a kludge like sitecustomize.py. If your app expects to 
receive data in encoding x and send data in encoding y, these facts are 
properties of the application and the data, NOT the box you are running 
on. If you had a requirement to read MacCyrillic from a Classic Mac and 
write KOI8 for consumption on a Windows PC, you should be able to do it 
on a SPARC Solaris box in Timbuktu or Walla Walla, Wa., without having 
to fiddle with site-wide configuration.
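
A minimal sketch of point 3, assuming the input really is MacCyrillic and 
the consumer really wants KOI8-R. The names transcode, transcode_file, 
SRC_ENCODING and DST_ENCODING are made up for illustration; "mac_cyrillic" 
and "koi8_r" are codec names that ship with Python. It is written to run on 
a current Python, where io.open is available; on older 2.x releases 
codecs.open plays a similar role.

```python
import io

# The encodings are properties of the data and the application,
# so they are named in the code, not taken from a site-wide default.
SRC_ENCODING = "mac_cyrillic"   # assumption: what the input is encoded in
DST_ENCODING = "koi8_r"         # assumption: what the consumer expects


def transcode(raw, src=SRC_ENCODING, dst=DST_ENCODING):
    """Decode bytes from src, then re-encode them as dst."""
    return raw.decode(src).encode(dst)


def transcode_file(in_path, out_path, src=SRC_ENCODING, dst=DST_ENCODING):
    """Read in_path as src-encoded text and write it to out_path as dst."""
    # newline="" leaves line endings exactly as they are in the input.
    with io.open(in_path, "r", encoding=src, newline="") as fin:
        text = fin.read()
    with io.open(out_path, "w", encoding=dst, newline="") as fout:
        fout.write(text)


# Usage: the same call gives the same bytes on any box.
sample = u"\u041f\u0440\u0438\u0432\u0435\u0442".encode(SRC_ENCODING)
converted = transcode(sample)
```

Nothing here consults sys.getdefaultencoding() or the platform's locale, 
which is the whole point: for the latin-1 case, pass "latin-1" as both src 
and dst where the files are read and written.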

4. AFAIK, support for Unicode is provided by Python with no assistance 
from the operating system. The multitudinous deficiencies in Win9x 
should have no bearing on the problem. Have you tried to run your 
program on a Win2K or WinXP box?
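
As a rough illustration of point 4, here is a sketch of a latin-1 round 
trip that, as far as I know, uses only the codec that ships with Python 
and asks nothing of the operating system. The function name 
process_latin1 and the sample bytes are invented for the example.

```python
def process_latin1(raw):
    """Decode latin-1 bytes, process the text as Unicode, encode it back."""
    text = raw.decode("latin-1")    # bytes -> Unicode via Python's own codec
    text = text.upper()             # case mapping that knows about umlauts
    return text.encode("latin-1")   # Unicode -> bytes for the 8-bit consumer


# "Köln, München" as latin-1 bytes: \xf6 is o-umlaut, \xfc is u-umlaut.
result = process_latin1(b"K\xf6ln, M\xfcnchen")
```

If a script like this behaves differently on Win9x and on Win2K/XP, that 
difference is worth including in the problem report.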

HTH,

John


