utf-8 coding sometimes it works, most of the time it don't work.

Ulrich Eckhardt eckhardt at satorlaser.com
Wed Sep 22 03:16:12 EDT 2010


Stef Mientki wrote:
> When running this python application from the command line ( or launched
> from another Python program), the wrong character encoding (probably
> windows-1252) is used.

Rule #1: If you know the correct encoding, set it yourself. This
particularly applies to files you open yourself (use the codecs module). In
the case of your program, I guess the stream with the faulty encoding is
stdin/stdout, whose encoding is guessed by Python but which you can
override. Check sys.stdin.encoding and sys.stdout.encoding.
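
To make Rule #1 concrete, here is a minimal sketch. The file name
example.txt, the helper names and the choice of UTF-8 are assumptions for
illustration only. It uses io.open for the file because that behaves the
same on Python 2.6+ and 3 (codecs.open serves the same purpose), and
codecs.getwriter to force an encoding onto a byte stream, which here is an
in-memory buffer standing in for stdout.

```python
import codecs
import io
import os
import sys
import tempfile


def write_and_read_utf8(path, text):
    """Write text to path and read it back, naming the encoding explicitly."""
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with io.open(path, "r", encoding="utf-8") as f:
        return f.read()


def utf8_writer(byte_stream):
    """Wrap a byte stream so Unicode text written to it is encoded as UTF-8."""
    return codecs.getwriter("utf-8")(byte_stream)


# What Python guessed for stdout; this can be None when output is redirected.
guessed = getattr(sys.stdout, "encoding", None)

# Hypothetical file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
roundtrip = write_and_read_utf8(path, u"Gesch\u00e4ftsf\u00fchrer")

# An in-memory byte buffer stands in for a byte-oriented stdout.
buffer = io.BytesIO()
out = utf8_writer(buffer)
out.write(u"F\u00f6cking")
encoded = buffer.getvalue()
```

The point of the wrapper is that the bytes leaving the program no longer
depend on whatever Python guessed for the stream.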

> When I run this program from PyScripter ( either internal engine or remote
> engine), MSHTML shows the correct character encoding,
> perfect!

Interesting. I would say that PyScripter sets up the environment
differently, so that Python guesses a different encoding. Also make sure
both are calling the same Python: I get 'cp850' or 'US-ASCII' depending on
whether I call the native MS Windows Python or the Cygwin Python.
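
To compare the two launch methods, something like the following could be
run once from the command line and once from PyScripter. This is only a
sketch; describe_interpreter is a hypothetical helper name, and the values
it reports will differ from machine to machine.

```python
import sys


def describe_interpreter():
    """Collect the facts that tend to differ between installs and launchers."""
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "platform": sys.platform,
        # getattr guards against replaced or missing standard streams.
        "stdin_encoding": getattr(sys.stdin, "encoding", None),
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        "default_encoding": sys.getdefaultencoding(),
    }


info = describe_interpreter()
for key in sorted(info):
    print("%s: %s" % (key, info[key]))
```

If the executable or the stream encodings differ between the two runs,
that would explain the different behaviour.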

> In the main file, and in the major files that contain strings I've added
> the following 2 lines:
> # -*- coding: utf-8 -*-
> from __future__ import absolute_import, unicode_literals

This shouldn't matter. The first line just tells Python that the source code
itself is encoded in UTF-8, and the second that you want your string
literals to be Unicode strings; neither affects the encoding of your
program's output.
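
A small sketch of why that is so: a Unicode string has no byte encoding
until it is encoded on output, and the result depends entirely on the codec
chosen at that point. The u"" literal below stands in for what
unicode_literals would give you for a plain literal on Python 2; the name
and the two codecs are picked purely for illustration, not taken from your
program.

```python
# One Unicode string, independent of how the source file is encoded.
text = u"F\u00f6cking"

# The bytes only come into existence when the text is encoded for output.
as_utf8 = text.encode("utf-8")
as_cp1252 = text.encode("cp1252")

# Decoding UTF-8 bytes with the wrong codec yields the typical garbage.
misread = as_utf8.decode("cp1252")
```

Here as_utf8 and as_cp1252 are different byte strings, and misread shows
what a consumer sees when it assumes windows-1252 for UTF-8 data.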

Uli

-- 
Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932



