[Pythonmac-SIG] read content from latin-1 file, write it to utf8 file

frank h. frank.hoffsummer at gmail.com
Mon Apr 17 18:49:06 CEST 2006


Hello,
I am using Mac Python 2.4.1 on Mac OS X 10.4, and I cannot read from a
latin-1 file and then write to a UTF-8 file correctly.

Using TextWrangler on OS X, I create a latin-1 file with some special
characters in it and save it as "test.txt".

I read the text file like this:

   import codecs
   f = codecs.open('test.txt', 'r', 'latin-1')
   content = f.read()
   f.close()

   type(content)
   <type 'unicode'>

So far, so good. I can even print it:

   print content.encode('utf8')
   äöåäöäööåäöäöå

(having set the default encoding to 'utf8' with sys.setdefaultencoding() in
sitecustomize.py).
Now I want to create a new utf8 file and write "content" into it. I do the
following:

   f = codecs.open('newtest.txt', 'w', 'utf-8')
   f.write(content)
   f.close()
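
The read and write steps above can be checked end to end without involving an
editor. What follows is a minimal sketch, not the original script: the sample
string, the temporary directory, and the read_bytes helper are assumptions
introduced for illustration. It compares the raw bytes on disk with what the
two encodings should produce.

```python
import codecs
import os
import tempfile

# Hypothetical sample text standing in for the contents of test.txt.
sample = u"\xe4\xf6\xe5"  # the characters a-umlaut, o-umlaut, a-ring

# Work in a temporary directory so no existing file is overwritten.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "test.txt")
dst = os.path.join(workdir, "newtest.txt")


def read_bytes(path):
    """Return the raw, undecoded bytes stored in the file at path."""
    f = open(path, "rb")
    try:
        return f.read()
    finally:
        f.close()


# Create the latin-1 input file (one byte per character).
f = codecs.open(src, "w", "latin-1")
f.write(sample)
f.close()

# Decode the latin-1 bytes into a unicode string.
f = codecs.open(src, "r", "latin-1")
content = f.read()
f.close()

# Encode the unicode string as UTF-8 while writing.
f = codecs.open(dst, "w", "utf-8")
f.write(content)
f.close()

# Compare what is on disk with the expected encodings of the sample.
latin1_ok = read_bytes(src) == sample.encode("latin-1")
utf8_ok = read_bytes(dst) == sample.encode("utf-8")
```

If both latin1_ok and utf8_ok come out true, the files hold the expected bytes
and the Python side of the round trip is behaving as intended.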

My problem is that when I open "newtest.txt" in TextWrangler again,
TextWrangler identifies the file as "MacRoman" encoded and the content is
garbled.
The same thing happens if I write the content to a "latin-1" file instead.
What's happening?
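
Since the garbling shows up for both output encodings, one thing that may be
worth ruling out is the editor guessing the encoding wrongly rather than Python
writing the wrong bytes. The sketch below is only an assumption about a
possible workaround, not a confirmed fix for TextWrangler: it writes a UTF-8
byte order mark (BOM) ahead of the text and then inspects the raw bytes. The
path and the sample string are hypothetical.

```python
import codecs
import os
import tempfile

# Hypothetical output location and stand-in for the decoded text.
path = os.path.join(tempfile.mkdtemp(), "newtest.txt")
content = u"\xe4\xf6\xe5"

# Writing U+FEFF first makes the utf-8 codec emit the BOM bytes EF BB BF.
f = codecs.open(path, "w", "utf-8")
f.write(u"\ufeff" + content)
f.close()

# Read the file back without any decoding to see exactly what was stored.
f = open(path, "rb")
raw = f.read()
f.close()

# True if the file starts with the UTF-8 BOM.
has_bom = raw.startswith(codecs.BOM_UTF8)

# decode() raises UnicodeDecodeError if the remaining bytes are not valid UTF-8.
decoded = raw[len(codecs.BOM_UTF8):].decode("utf-8")
```

Whether an editor uses the BOM to pick the encoding depends on the editor and
its settings, so this is a diagnostic step rather than a guaranteed cure.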

Thanks for any insight you might have.
-frank