Unicode in cgi-script with apache2

Dominique Ramaekers dominique at ramaekers-stassart.be
Sun Aug 17 01:32:07 EDT 2014


* My system is a Linux box.

* I've tried using encoding="utf-8". It didn't fix things.

* That print() writes to sys.stdout would explain it, but writing to
sys.stdout directly doesn't work any better.

* My locale and the system-wide locale are both UTF-8. Using SetEnv
PYTHONIOENCODING utf-8 didn't fix things.

* The file is encoded in UTF-8...
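
One workaround I have not verified under Apache: bypass whatever encoding
Python picked for sys.stdout and wrap the underlying byte stream in a writer
that always encodes to UTF-8. A minimal sketch, assuming Python 3.3 or later;
make_utf8_writer and write_page are names made up for this illustration:

```python
import io


def make_utf8_writer(binary_stream):
    """Wrap a byte stream in a text writer that always encodes to UTF-8.

    Hypothetical helper: it ignores the locale and PYTHONIOENCODING.
    """
    return io.TextIOWrapper(
        binary_stream,
        encoding="utf-8",
        newline="\n",        # do not translate line endings
        write_through=True,  # hand text to the byte stream immediately
    )


def write_page(out, body):
    """Write a minimal CGI response (header, blank line, body) to `out`."""
    out.write("Content-Type: text/html; charset=utf-8\n\n")
    out.write(body)
    out.flush()


# In the CGI script itself one would call, for example:
#     import sys
#     write_page(make_utf8_writer(sys.stdout.buffer), "<p>h\u00e9llo</p>\n")
```

This only controls how the script encodes its output. It does not explain
why the locale is not picked up in the first place.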

I can't speak for anybody else, but in my search I don't believe I read
about anyone who had the problem on a Windows system. They all used Linux
(various flavors) or OS X... This is the first time I've encountered a
situation where Windows handles encoding issues better :P +1 for
Microsoft...

I think that Apache (the *nix versions) doesn't tell Python that it accepts
UTF-8. Or Python doesn't listen properly... Maybe I should file a bug
report with both projects?
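
Before filing anything, it would help to see what Python actually believes
when Apache starts it. A small diagnostic sketch (encoding_report and
format_report are names invented here, not part of any library); the values
are passed through ascii() so the report itself can be printed even when
stdout is ASCII-only:

```python
import locale
import os
import sys


def encoding_report(environ=None):
    """Collect the settings that influence the encoding of sys.stdout."""
    environ = os.environ if environ is None else environ
    return {
        "sys.stdout.encoding": getattr(sys.stdout, "encoding", None),
        "locale.getpreferredencoding": locale.getpreferredencoding(False),
        "sys.getfilesystemencoding": sys.getfilesystemencoding(),
        "LANG": environ.get("LANG"),
        "LC_ALL": environ.get("LC_ALL"),
        "PYTHONIOENCODING": environ.get("PYTHONIOENCODING"),
    }


def format_report(report):
    """Render the report as ASCII-only 'name = value' lines."""
    return "\n".join(
        "%s = %s" % (name, ascii(value))
        for name, value in sorted(report.items())
    )


# In a CGI script one would print a plain-text header first, for example:
#     print("Content-Type: text/plain\n")
#     print(format_report(encoding_report()))
```

Running the same script once from the terminal and once through Apache
should show which of these values differ between the two environments.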


On 17-08-14 at 04:50, Denis McMahon wrote:
> On Sun, 17 Aug 2014 00:36:14 +0200, Dominique Ramaekers wrote:
>
>> What seems to be the problem:
>> My Script was ok. I know this because in the terminal I got my expected
>> output. Python3 uses UTF-8 coding as a standard. The problem is, when
>> python 'prints' to the apache interface, it translates the string to
>> ascii. (Why, I never found an answer).
> Is the apache server running on a linux or a windows platform?
>
> The problem may not be python, it may be the underlying OS. I wonder if
> apache is spawning a process for python though, and if so whether it is
> in some way constraining the character set available to stdout of the
> spawned process.
>
> From your other message, the error appears to be a python error on
> reading the input file. For some reason python seems to be trying to
> interpret the file it is reading as ascii.
>
> I wonder if specifying the binary data parameter and / or utf-8 encoding
> when opening the file might help.
>
> eg:
>
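> # binary mode: read() returns bytes, nothing is decoded
> f = open( "/var/www/cgi-data/index.html", "rb" )
> # text mode, decoded as UTF-8
> f = open( "/var/www/cgi-data/index.html", "r", encoding="utf-8" )
> # note: "rb" together with encoding="utf-8" raises ValueError,
> # because binary mode does not take an encoding argument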
>
> I've managed to drill down a bit further into the problem:
>
> print() goes to sys.stdout
>
> This is part of what the docs say about sys.stdout:
>
> """
> The character encoding is platform-dependent. Under Windows, if the
> stream is interactive (that is, if its isatty() method returns True), the
> console codepage is used, otherwise the ANSI code page. Under other
> platforms, the locale encoding is used (see
> locale.getpreferredencoding()).
>
> Under all platforms though, you can override this value by setting the
> PYTHONIOENCODING environment variable before starting Python.
> """
>
> At this point, details of the OS become very significant. If your server
> is running on a windows platform you may need to figure out how to make
> apache set the PYTHONIOENCODING environment variable to "utf-8" (or
> whatever else is appropriate) before calling the python script.
>
> I believe that the following line in your httpd.conf may have the
> required effect.
>
> SetEnv PYTHONIOENCODING utf-8
>
> Of course, if the file is not encoded as utf-8, but rather something
> else, then use that as the encoding in the above suggestions. If the
> server is not running windows, then I'm not sure where the problem might
> be.
>



