Unicode in cgi-script with apache2

Denis McMahon denismfmcmahon at gmail.com
Sat Aug 16 22:50:51 EDT 2014


On Sun, 17 Aug 2014 00:36:14 +0200, Dominique Ramaekers wrote:

> What seems to be the problem:
> My Script was ok. I know this because in the terminal I got my expected
> output. Python3 uses UTF-8 coding as a standard. The problem is, when
> python 'prints' to the apache interface, it translates the string to
> ascii. (Why, I never found an answer).

Is the apache server running on a linux or a windows platform?

The problem may not be python, it may be the underlying OS. I wonder if 
apache is spawning a process for python though, and if so whether it is 
in some way constraining the character set available to stdout of the 
spawned process.

From your other message, the error appears to be a python error on 
reading the input file. For some reason python seems to be trying to 
interpret the file it is reading as ascii.

I wonder if specifying either binary mode or an explicit utf-8 encoding 
when opening the file might help. (The two can't be combined: open() 
raises ValueError if encoding= is passed together with a binary mode.)

eg:

f = open( "/var/www/cgi-data/index.html", "rb" )
f = open( "/var/www/cgi-data/index.html", "r", encoding="utf-8" )
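
As a rough sketch of what those open() modes do, here is a small test 
using a temporary file as a stand-in for index.html. The file contents 
and the temporary path are made up for illustration; I'm assuming the 
real file is utf-8 and contains at least one non-ascii character.

```python
import os
import tempfile

# Hypothetical stand-in for /var/www/cgi-data/index.html
text = "<p>caf\u00e9 \u20ac</p>\n"
fd, path = tempfile.mkstemp(suffix=".html")
with os.fdopen(fd, "wb") as f:
    f.write(text.encode("utf-8"))

# Binary mode: the bytes come back undecoded, so no codec is involved.
with open(path, "rb") as f:
    raw = f.read()

# Text mode with an explicit encoding: does not depend on the locale.
with open(path, "r", encoding="utf-8") as f:
    decoded = f.read()

# Forcing ascii reproduces the kind of decode error described above.
try:
    with open(path, "r", encoding="ascii") as f:
        f.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

os.remove(path)
```

If the "rb" form is used, the script then has to decode the bytes itself 
(or write them out unchanged) rather than treating them as a str.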

I've managed to drill down a bit further into the problem:

print() goes to sys.stdout

This is part of what the docs say about sys.stdout:

"""
The character encoding is platform-dependent. Under Windows, if the 
stream is interactive (that is, if its isatty() method returns True), the 
console codepage is used, otherwise the ANSI code page. Under other 
platforms, the locale encoding is used (see 
locale.getpreferredencoding()).

Under all platforms though, you can override this value by setting the 
PYTHONIOENCODING environment variable before starting Python.
"""

At this point, details of the OS become very significant. If your server 
is running on a windows platform you may need to figure out how to make 
apache set the PYTHONIOENCODING environment variable to "utf-8" (or 
whatever else is appropriate) before calling the python script.
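
To see which of those cases applies, something like the following could 
be run both from the terminal and as a cgi script, and the two results 
compared. This is only a sketch: describe_io_encoding is a name I've 
made up, and the dict keys are arbitrary labels.

```python
import locale
import sys

def describe_io_encoding():
    """Report the encodings Python picked for this process."""
    return {
        # Encoding print() will use; may be None for a replaced stdout.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Locale-derived default, as referenced in the sys.stdout docs.
        "preferred": locale.getpreferredencoding(),
        # True for an interactive terminal, False for a pipe to apache.
        "isatty": sys.stdout.isatty(),
    }

info = describe_io_encoding()
# The repr of this dict is plain ascii, so printing it is safe even
# when stdout has been restricted to ascii.
print(info)
```

If "stdout" differs between the terminal run and the cgi run, the 
environment apache hands to the script is the likely culprit.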

I believe that the following line in your httpd.conf may have the 
required effect.

SetEnv PYTHONIOENCODING utf-8
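
The effect of that variable can be checked without touching apache by 
starting a child interpreter with stdout connected to a pipe, which is 
roughly the situation a cgi script is in. A minimal sketch, assuming 
Python 3.5 or later for subprocess.run; child_stdout_encoding is a 
hypothetical helper name.

```python
import os
import subprocess
import sys

def child_stdout_encoding(extra_env):
    """Start a fresh interpreter on a pipe and return its stdout encoding."""
    env = dict(os.environ)
    # Drop any inherited override so only extra_env decides the result.
    env.pop("PYTHONIOENCODING", None)
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.decode("ascii").strip()

# What the child reports with the override that SetEnv would provide.
with_override = child_stdout_encoding({"PYTHONIOENCODING": "utf-8"})
# What the child reports with no override; this depends on OS and locale.
without_override = child_stdout_encoding({})
print(with_override, without_override)
```

The first value should name utf-8 on any platform; the second is the one 
that varies with the OS and the locale the parent process runs under.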

Of course, if the file is not encoded as utf-8, but rather something 
else, then use that as the encoding in the above suggestions. If the 
server is not running windows, then I'm not sure where the problem might 
be.

-- 
Denis McMahon, denismfmcmahon at gmail.com


