Unicode in cgi-script with apache2

Dominique Ramaekers dominique at ramaekers-stassart.be
Sat Aug 16 18:36:14 EDT 2014


I found my problem; I describe it in more detail at the bottom of this message...

But first...

Thanks Alister for the tips:
1) This evening I researched WSGI. I found that WSGI is more 
advanced than CGI, and I also think WSGI is more the Python way. I'm an 
amateur playing around with my imagination on a small virtual server 
(online cloudserver.ramaekers-stassart.be). I'm trying to build 
something rather specific, and I like to keep things as basic as 
possible. My first thought was not to use a framework, because with 
a framework I wouldn't really know what the code is doing; a 
framework would be a black box to me. But after inspecting WSGI, I 
decided not to make it more difficult for myself than it has to be. I 
will work with a framework, and I think I'll bet on Falcon 
(for its speed, its small size, and because it doesn't seem too 
difficult)... There are a lot of frameworks, so if someone wants to 
point me to another framework, I'm open to suggestions...

2) Your tip to use 'encode' did not solve the problem and created a new 
one. My lines were encapsulated in quotes and I got a lot of \b's and 
\n's... and I still got the same error.

3) I didn't get the message from JMF, so...
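
On point 2, a small sketch of why the 'encode' tip changes the output 
instead of fixing it: in Python 3, encode() returns a bytes object, and 
print() writes the repr of that object (the b'...' wrapper plus escaped 
characters) rather than the raw bytes. The helper name printed_form and 
the sample line are made up for illustration.

```python
import io
from contextlib import redirect_stdout


def printed_form(value):
    """Return exactly what print(value, end='') would write to stdout."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        print(value, end="")
    return buf.getvalue()


line = "<p>ë</p>\n"  # hypothetical line from index.html

as_text = printed_form(line)                   # the line itself
as_bytes = printed_form(line.encode("utf-8"))  # the repr of a bytes object
```

Here as_bytes starts with b' and contains a literal backslash-n instead 
of a line break, which matches the quoted, escaped lines described in 
point 2.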

What seems to be the problem:
My script was OK. I know this because in the terminal I got my expected 
output. Python3 uses UTF-8 encoding as a standard. The problem is that 
when Python 'prints' to the Apache interface, it translates the string 
to ASCII. (Why, I never found an answer.) Somewhere in the middle of my 
index.html file, there are letters like ë and ü. If Python tries to 
translate these, Python throws an error. If I delete these letters from 
the file, the script works perfectly in a browser! In Python2.7 the 
script can easily be tweaked so the translation to ASCII isn't done, but 
in Python3, it's a real pain in the a... I've read about people who 
managed to force Python3 to 'print' to Apache in UTF-8, but none of 
their solutions worked for me.
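
For reference, a minimal sketch of a CGI variant that names the encoding 
explicitly on both sides, so nothing depends on the locale Apache passes 
to the script. One observation about the traceback: a UnicodeDecodeError 
is raised when bytes are decoded into text, which points at reading 
index.html (open() falls back to the locale's preferred encoding, which 
can be ASCII under Apache) rather than at print. The sketch assumes the 
file is stored as UTF-8; the function name write_page is made up, and 
this was not tested on the server in question.

```python
import io
import sys


def write_page(path, binary_out):
    """Write a CGI header and the file at `path` to a binary stream as UTF-8."""
    # Wrap the byte stream so the output encoding is fixed, whatever the locale.
    out = io.TextIOWrapper(binary_out, encoding="utf-8", newline="\n")
    out.write("Content-Type: text/html; charset=utf-8\n")
    out.write("\n")
    # Name the input encoding as well, instead of relying on the locale default.
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            out.write(line)
    out.flush()
    out.detach()  # leave binary_out open for the caller


# In the CGI script itself (path taken from the original script):
# write_page("/var/www/cgi-data/index.html", sys.stdout.buffer)
```
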
I think the programmers of Python don't want to focus on Python + 
Apache + CGI (I think it only happens with Apache and not with another 
HTTP server). I don't think they do this intentionally, but I guess they 
assume that if you use Python to make a web application, you also use 
mod_wsgi or mod_python (in Apache)...
So I'll use WSGI. It's a little more work, but it seems really neat...
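
A minimal sketch of what the WSGI version of the script could look like 
without any framework. It reads the file as raw bytes, so no decoding or 
encoding happens in Python, and it declares the charset in the header. 
The make_app helper is hypothetical; 'application' is the callable name 
mod_wsgi looks for by default, and the path is the one from the original 
script, assumed to hold UTF-8 text.

```python
def make_app(path):
    """Build a WSGI callable that serves the file at `path` as UTF-8 HTML."""

    def app(environ, start_response):
        # Read raw bytes: the body is passed through without any transcoding.
        with open(path, "rb") as f:
            body = f.read()
        headers = [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]  # WSGI response bodies are iterables of bytes

    return app


# The file is only opened when a request comes in, not at import time.
application = make_app("/var/www/cgi-data/index.html")
```
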

grtz


On 15-08-14 at 21:27, alister wrote:
> On Fri, 15 Aug 2014 20:10:25 +0200, Dominique Ramaekers wrote:
>
>> Hi,
>>
>> I've got a little script:
>>
>> #!/usr/bin/env python3
>> print("Content-Type: text/html")
>> print("Cache-Control: no-cache, must-revalidate")    # HTTP/1.1
>> print("Expires: Sat, 26 Jul 1997 05:00:00 GMT") # Date in the past
>> print("")
>> f = open("/var/www/cgi-data/index.html", "r")
>> for line in f:
>>       print(line,end='')
>>
>> If I run the script in the terminal, it nicely prints the webpage
>> 'index.html'.
>>
>> If I access the script through a web browser, Apache gives an error:
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
>> 1791: ordinal not in range(128)
>>
>> I've spent a whole afternoon reading fora and blogs, but I don't have a
>> solution.
>>
>> Can anyone help me?
>>
>> Greetings,
>>
>> Dominique.
> 1) This is not the way to get Python to generate a web page. If you don't
> want to use an existing framework (for example, if you are doing this as
> an educational exercise), I suggest you google WSGI.
>
> 2) You need to encode your output strings into a format Apache/HTML
> protocols can support; UTF-8 is probably best here.
> Change your print function to
> print(line.encode('utf'),end='')
>
>
> 3) Ignore any subsequent advice from JMF; even when he is trying to help,
> he is invariably wrong.
>   
>



