Encoding and Norwegian (non-ASCII) characters.

joakim.hove at gmail.com
Sat Oct 7 16:29:19 EDT 2006


Hello,

I am having great trouble writing the Norwegian characters æøå to a file
from a Python application. My (simplified) scenario is as follows:

1. I have a web form where the user can enter his name.

2. I use the cgi module to get the input from the user:
    ....
    name = form["name"].value

3. The name is stored in a file

    fileH = open(namefile , "a")
    fileH.write("name:%s \n" % name)
    fileH.close()

Now, this works very well indeed as long as the users have ASCII names;
however, when someone enters a name with one of the Norwegian characters
æøå, it breaks at the write() statement.

   UnicodeDecodeError: 'ascii' codec can't decode byte 0x8f in position ....

Now, I understand that the ascii codec can't decode these particular
characters; however, my attempts at specifying an alternative encoding
have all failed.
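
For reference, the error message itself can be reproduced in isolation.
The sketch below is written for a current Python 3 interpreter (an
assumption; the code in this post is Python 2). The byte 0x8f is taken
from the traceback above; the sample value and the helper name
try_ascii_decode are made up for illustration. In Python 2, calling
.encode() on a byte string, or writing a byte string to a codecs.open()
file, first decodes it implicitly with the default ascii codec, which
is probably how a *Decode* error can surface during a write.

```python
# Hypothetical form value containing one non-ASCII byte (0x8f comes from
# the traceback; the remaining bytes are invented for this sketch).
raw_name = b"\x8fse"


def try_ascii_decode(data):
    """Return the decoded text, or the error message if decoding fails."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)


# Pure ASCII input decodes fine; the non-ASCII byte triggers the error.
message = try_ascii_decode(raw_name)
print(message)
```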

I have tried variants along the lines of:

   fileH = codecs.open(namefile , "a" , "latin-1")   /   fileH = open(namefile , "a")
   fileH.write(name)                                 /   fileH.write(name.encode("latin-1"))

It seems that *whatever* I do, the Python interpreter ignores my plea
for an alternative encoding and fails with the dreaded
UnicodeDecodeError.
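
One direction that may be worth checking is to decode the incoming bytes
explicitly, once, and let the file object do the encoding. The sketch
below is only an illustration under stated assumptions: it targets
Python 3, it assumes the form value arrives as bytes, and FORM_ENCODING
is a guess (the encoding the browser actually sends is not known from
this post). The helper append_name and the sample name are hypothetical.

```python
import os
import tempfile

# Assumptions for this sketch: the real form encoding is unknown,
# so UTF-8 is used purely as an example.
FORM_ENCODING = "utf-8"
FILE_ENCODING = "latin-1"


def append_name(path, raw_name):
    """Decode the raw bytes once, then write text with an explicit encoding."""
    name = raw_name.decode(FORM_ENCODING)
    # newline="" keeps "\n" as written on every platform.
    with open(path, "a", encoding=FILE_ENCODING, newline="") as file_h:
        file_h.write("name:%s \n" % name)


# Write one made-up name to a temporary file and read the bytes back.
namefile = os.path.join(tempfile.mkdtemp(), "names.txt")
append_name(namefile, "Bjørn Åse".encode(FORM_ENCODING))

with open(namefile, "rb") as file_h:
    stored = file_h.read()
```

The rough Python 2 counterpart would be to turn the byte string into a
unicode object with name.decode(...) before handing it to a file opened
with codecs.open(namefile, "a", "latin-1").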

Any tips on this would be *highly* appreciated.


Joakim



