how to write a unicode string to a file ?

Mark Tolonen metolone+gmane at gmail.com
Sat Oct 17 02:28:29 EDT 2009


"Kee Nethery" <kee at kagi.com> wrote in message 
news:AAAB63C6-6E44-4C07-B119-972D4F49E511 at kagi.com...
>
> On Oct 16, 2009, at 5:49 PM, Stephen Hansen wrote:
>
>> On Fri, Oct 16, 2009 at 5:07 PM, Stef Mientki  <stef.mientki at gmail.com> 
>> wrote:
>
> snip
>
>> The thing is, I'd be VERY surprised (nay, shocked!) if Excel can't 
>> open a file that is in UTF-8 -- it just might need to be TOLD that it's 
>> UTF-8 when you go and open the file, as UTF-8 looks just like ASCII -- 
>> until it contains characters that can't be expressed in ASCII. But I 
>> don't know what type of file it is you're saving.
>
> We found that UTF-16 was required for Excel. It would not "do the right 
> thing" when presented with UTF-8.

Excel seems to need a UTF-8-encoded BOM (byte order mark) at the start of a 
file before it will recognize the file as UTF-8.  This worked for me:

# -*- coding: utf-8 -*-
import codecs

f = codecs.open('test.csv', 'wb', 'utf-8')
f.write(u'\ufeff')  # write a BOM; UTF-8 encodes it as the bytes EF BB BF
f.write(u'马克,testing,123\r\n')
f.close()

When opened in Excel without the BOM (\ufeff), I got gibberish, but with the 
BOM the Chinese characters were displayed correctly.
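
A possible alternative, as a sketch only: Python also ships a 'utf-8-sig' 
codec that writes the BOM automatically on the first write, so the explicit 
u'\ufeff' is not needed.  The helper name write_excel_csv and the file name 
test_sig.csv below are made up for illustration, and whether Excel accepts 
the result should be checked against your own Excel version.

```python
# -*- coding: utf-8 -*-
import codecs
import io
import os
import tempfile


def write_excel_csv(path, lines):
    """Write text lines to path as UTF-8 with a leading BOM (hypothetical helper)."""
    # 'utf-8-sig' emits the three-byte BOM (EF BB BF) once, before the first data.
    # newline='' stops Python from translating the explicit '\r\n' line endings.
    with io.open(path, 'w', encoding='utf-8-sig', newline='') as f:
        for line in lines:
            f.write(line + u'\r\n')


# Illustrative path; any writable location works.
path = os.path.join(tempfile.gettempdir(), 'test_sig.csv')
write_excel_csv(path, [u'马克,testing,123'])

# Read the raw bytes back to confirm the file starts with the UTF-8 BOM.
with open(path, 'rb') as f:
    data = f.read()
print(data.startswith(codecs.BOM_UTF8))
```

Reading the file back with the same 'utf-8-sig' codec strips the BOM again, 
so the round trip returns the original text.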

-Mark




