Foreign Character Problems In Python 2.5 and Tkinter

Juha S. jusa.sj at gmail.com
Sat Oct 13 09:13:21 EDT 2007


Thanks for the reply. I made changes to my code according to your 
example. Now any Scandinavian characters that the program outputs 
are missing from the Tk text box.

I'm using a function like this to load the data that the program 
outputs:

# Requires: import codecs, tkMessageBox
def loadWords(self, filename):
    ret = []
    f = None  # lets the finally clause run safely if open() fails

    try:
        f = codecs.open(filename, 'r', 'utf-8', 'ignore')
        for line in f:
            # Must skip blank lines (read only lines that contain text).
            if not line.isspace():
                line = line.replace(u'\n', u'')
                ret.append(line)
    except IOError:
        tkMessageBox.showwarning(
            u'Open File',
            u'An error occurred while trying to load "' + filename + u'"',
            parent=self.frame)
    finally:
        if f is not None:
            f.close()

    return ret
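
One thing that might explain the missing characters, assuming the word 
file was saved in a legacy single-byte encoding such as cp1252 
(Notepad's "ANSI") rather than UTF-8: the 'ignore' error handler 
silently drops any bytes that are not valid UTF-8. A minimal sketch 
follows. The sample word, the cp1252 fallback, and the helper name 
decode_with_fallback are all assumptions for illustration, not taken 
from the real program or its data file.

```python
# Sketch: how errors='ignore' can silently drop non-ASCII letters.
# Assumption: the bytes on disk are cp1252 ("ANSI"), not UTF-8.

word = u'p\xe4iv\xe4'          # a-umlaut written as an escape
raw = word.encode('cp1252')    # the bytes a cp1252 file would hold

# Decoding those bytes as UTF-8 with 'ignore' discards the invalid bytes,
# so the a-umlauts vanish without any error being raised.
lossy = raw.decode('utf-8', 'ignore')


def decode_with_fallback(data, fallback='cp1252'):
    """Try strict UTF-8 first, then a single-byte encoding (hypothetical helper)."""
    try:
        return data.decode('utf-8')
    except UnicodeDecodeError:
        return data.decode(fallback)


recovered = decode_with_fallback(raw)
```

Opening the file with strict decoding instead of 'ignore' would at 
least raise a UnicodeDecodeError rather than quietly losing letters.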


Also, the newlines are still lost when saving the text widget contents 
to a file. I'm inserting the program-generated text into the text widget 
with "text.insert(END, txt + u'\n\n')".
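
On the lost newlines, one possible explanation, assuming the saved file 
is being checked in Windows Notepad: codecs.open() always opens the 
underlying file in binary mode, so u'\n' is written as a bare LF rather 
than the CR LF pair that Notepad expects. A minimal sketch of converting 
the line endings before writing (to_crlf is a hypothetical helper name):

```python
def to_crlf(text):
    """Return text with every line ending normalised to CR LF."""
    # Collapse existing CR LF and bare CR to LF first, so nothing is doubled.
    text = text.replace(u'\r\n', u'\n').replace(u'\r', u'\n')
    return text.replace(u'\n', u'\r\n')


sample = u'first\n\nsecond\r\nthird'
converted = to_crlf(sample)
```

The converted string could then be written to the file in place of the 
raw widget text.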


Janne Tuukkanen wrote:
> Juha S. wrote:
>   
>> problem is that when I try to save its contents to a .txt file, any 
>> Scandinavian letters such as "äöå ÄÖÅ" are saved incorrectly and show up 
>> as a mess when I open the .txt file in Windows Notepad.
>>
>> It seems that the characters will only get mixed if the user has typed 
>> them into the widget, but if the program has outputted them, they are 
>> saved correctly.
>>     
>
>  Did you define the encoding for the source file and
> put u (for unicode) in front of your strings? The
> following piece produces proper UTF-8. I couldn't test it with
> Notepad though, no Windows here.
>
>  Note that this message is also encoded in UTF-8, and so should
> your editor be. I can't believe we are still messing with this
> stuff in 2007. In the bad old days it was easy: you only had
> to learn to read { as ä, | as ö etc... and vice versa
> with localized terminals -- C code looked rather exotic
> with a-umlauts everywhere ;)
>
>
> #!/usr/bin/python
> # -*- coding: utf-8 -*-
>
> from Tkinter import *
> import codecs
>
> class Application(Frame):
>     def save(self):
>         FILE = codecs.open("outfile.txt", "w", "utf-8")
>         FILE.write(u"START - åäöÅÄÖ\n")
>         FILE.write(self.text_field.get("1.0", END))
>         FILE.write(u"END - åäöÅÄÖ\n")
>         FILE.close()
>         self.quit()
>         
>     def __init__(self, master=None):
>         Frame.__init__(self, master)
>         self.grid()
>
>         self.text_field = Text(self, width=40, height=10)
>         self.text_field.grid()
>         
>         self.save_button = Button(self, text="save and exit", command=self.save)
>         self.save_button.grid()
>         
> if __name__ == "__main__":
>     app = Application()
>     app.mainloop()
>
>   



