how to write unicode to a txt file?

Peter Otten __peter__ at web.de
Wed Jan 17 11:36:40 EST 2007


Frank Potter wrote:

> I want to change an srt file to unicode format so mplayer can display
> Chinese subtitles properly.
> I did it like this:
> 
> txt=open('dmd-guardian-cd1.srt').read()
> txt=unicode(txt,'gb18030')
> open('dmd-guardian-cd1.srt','w').write(txt)
> 
> But it seems that python can't directly write unicode to a file,
> I got an error at the 3rd line:
> UnicodeEncodeError: 'ascii' codec can't encode characters in position
> 85-96: ordinal not in range(128)
> 
> How to save the unicode string to the file, please?
> Thanks!

You have to tell Python what encoding to use (i.e. how to translate the
codepoints into bytes):

>>> txt = u"ähnlicher als gewöhnlich üblich"
>>> import codecs
>>> codecs.open("tmp.txt", "w", "utf8").write(txt)
>>> codecs.open("tmp.txt", "r", "utf8").read()
u'\xe4hnlicher als gew\xf6hnlich \xfcblich'

You would perhaps use 'gb18030' instead of 'utf8'.
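
Applied to the original problem, a minimal sketch might look like the
following. It assumes the subtitle file is GB18030-encoded and that the
player wants UTF-8; the helper name convert_encoding and the output
file name are made up for illustration. It uses io.open (Python 2.6+
and 3.x) rather than codecs.open, and writes to a second file so the
original is not overwritten if decoding fails.

```python
import io


def convert_encoding(src_path, dst_path,
                     src_encoding="gb18030", dst_encoding="utf-8"):
    """Re-encode a text file (hypothetical helper, not a library function)."""
    # Decode the source bytes into Unicode text. newline="" turns off
    # newline translation, so CRLF line endings are kept as they are.
    with io.open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()

    # Encode the text on the way out. The encoding is named explicitly,
    # so Python never falls back to the ASCII codec.
    with io.open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        dst.write(text)

    return text
```

You would then call something like
convert_encoding('dmd-guardian-cd1.srt', 'dmd-guardian-cd1.utf8.srt').
The whole file is read into memory, which should be fine for subtitle
files but may not suit very large inputs.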

Peter




