Help needed with python unicode cgi-bin script

weheh weheh at verizon.net
Mon Dec 10 02:29:39 EST 2007


Dear web gods:

After much, much, much struggle with unicode, many an hour reading all the 
examples online, coding them, testing them, ripping them apart and putting 
them back together, I am humbled. Therefore, I humble myself before you to 
seek guidance on a simple python unicode cgi-bin scripting problem.

My problem is more complex than this, but how about I boil down one sticking 
point for starters. I have a file with a Spanish word in it, "año", which I 
wish to read with:


#!C:/Program Files/Python23/python.exe

STARTHTML= u'''Content-Type: text/html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
</head>
<body>
'''
ENDHTML = u'''
</body>
</html>
'''
print STARTHTML
print open('c:/test/spanish.txt','r').read()
print ENDHTML


Instead of seeing "año" I see "a?o". BAD BAD BAD
Yet, if I open the file with the browser (IE/Mozilla), I see "año." THIS IS 
WHAT I WANT

WHAT GIVES?
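
One plausible cause, offered as a guess rather than a diagnosis: the 
Content-Type header in the script names no charset, and the file's bytes are 
sent exactly as stored, so the browser has to guess the encoding. The sketch 
below decodes the file with an explicitly assumed encoding and sends UTF-8 
with a matching charset in the header. It uses Python 3 syntax, not the 
Python 2.3 of the script above. The names build_response, write_response and 
ASSUMED_FILE_ENCODING are hypothetical, and "latin-1" is only an assumption 
about how spanish.txt was saved.

```python
import sys

# Assumption: the post does not say how spanish.txt was saved.
ASSUMED_FILE_ENCODING = "latin-1"


def build_response(raw_bytes, file_encoding=ASSUMED_FILE_ENCODING):
    """Return a complete CGI response (header + page) as UTF-8 bytes."""
    # Decode the file's bytes into text using the assumed source encoding.
    text = raw_bytes.decode(file_encoding)
    # Declare the charset so the browser does not have to guess.
    header = "Content-Type: text/html; charset=utf-8\r\n\r\n"
    body = "<html>\n<head>\n</head>\n<body>\n" + text + "\n</body>\n</html>\n"
    # Encode once, at the output boundary, to the charset named in the header.
    return (header + body).encode("utf-8")


def write_response(path, out=None):
    """Read the file as bytes and write the encoded response to stdout."""
    if out is None:
        out = sys.stdout.buffer  # binary stream, so no implicit re-encoding
    with open(path, "rb") as f:
        out.write(build_response(f.read()))


# Usage in a CGI script (path taken from the post):
# write_response("c:/test/spanish.txt")
```

If the file turns out to be UTF-8 already, passing file_encoding="utf-8" 
would be the corresponding change.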

Next, I'll get into codecs and stuff, but how about starting with this?

The general question is, does anybody have a complete working example of a 
cgi-bin script that does the above properly that they'd be willing to share? 
I've tried various examples online but haven't been able to get any to work. 
I end up seeing escape codes in place of the non-ASCII character, first 
u'a\xf1o' and later on 'a\xc3\xb1o', which are also BAD BAD BAD.
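
For reference, the two escaped forms quoted above are different views of the 
same word: a\xf1o is the text with the code point U+00F1 escaped, and 
a\xc3\xb1o is that text encoded as UTF-8 bytes. A small sketch of the 
relationship, again in Python 3 syntax (where text strings carry no u prefix) 
rather than Python 2.3:

```python
text = "a\u00f1o"                      # the word "año" as text

# Escaped view of the text: the code point U+00F1 appears as \xf1.
escaped_text = ascii(text)

# The same text as bytes in two common encodings.
latin1_bytes = text.encode("latin-1")  # one byte for ñ: 0xF1
utf8_bytes = text.encode("utf-8")      # two bytes for ñ: 0xC3 0xB1

# Decoding with the matching codec recovers the original text.
round_trip = utf8_bytes.decode("utf-8")

# Decoding UTF-8 bytes with the wrong codec gives mojibake instead.
mojibake = utf8_bytes.decode("latin-1")

print(escaped_text)
print(latin1_bytes)
print(utf8_bytes)
print(round_trip == text)
```

Neither form is corrupt data. Both are what Python shows when it prints the 
repr of a value instead of writing the encoded text to the output.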

Thanks -- your humble supplicant. 




