locale.CODESET / different in python shell and scripts

"Martin v. Löwis" martin at v.loewis.de
Thu Apr 29 22:30:34 EDT 2004


Nuff Said wrote:
> When I add the following line to the above script
> 
>   print u"schönes Mädchen".encode(encoding)
> 
> the result is:
> 
>   schönes Mädchen    (with my self-compiled Python 2.3)
>   schönes Mädchen  (with Fedora's Python 2.2)
> 
> I observed, that my Python gives me (the correct value) 15 for
> len(u"schönes Mädchen") whereas Fedora's Python says 17 (one more
> for each German umlaut, i.e. the len of the UTF-8 representation of
> the string; observe, that the file uses the coding cookie for UTF-8).
> Maybe Fedora's Python was compiled without Unicode support?

Certainly not: It would not support u"" literals without Unicode.

Please understand that you cannot use non-ASCII characters in source
code unless you also use the facilities described in

http://www.python.org/peps/pep-0263.html

So instead of "ö", you should write "\xf6".
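
A rough sketch of where the 15 versus 17 may come from, written in
Python 3 syntax rather than for the 2.2/2.3 interpreters discussed
here. It assumes the failure mode is that the UTF-8 bytes of the source
file get read as Latin-1, one character per byte; the variable names
are made up for illustration.

```python
# The phrase written with escapes only, so the file's encoding cannot matter.
text = "sch\xf6nes M\xe4dchen"
print(len(text))            # 15 characters

# The bytes a UTF-8 editor would store for the same phrase:
# each umlaut takes two bytes.
utf8_bytes = text.encode("utf-8")
print(len(utf8_bytes))      # 17 bytes

# Assumed failure mode: those bytes are interpreted as Latin-1,
# so every byte becomes its own character.
misread = utf8_bytes.decode("latin-1")
print(len(misread))         # 17 characters

# Encoding the misread string as UTF-8 again doubles up the umlaut bytes,
# which a UTF-8 terminal displays as garbage in place of each umlaut.
double_encoded = misread.encode("utf-8")
print(len(double_encoded))  # 21 bytes
```

With the escaped form the length is 15 no matter how the file is saved,
which is the point of writing "\xf6" instead of "ö".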

> Is there something I do utterly wrong here? 

Yes, you are.

> Python can't be that complicated?

Python is not. Encodings are.

Regards,
Martin



