ascii to unicode line endings

Marc 'BlackJack' Rintsch bj_666 at gmx.net
Thu May 3 10:36:34 EDT 2007


In <1178196326.926359.187930 at p77g2000hsh.googlegroups.com>, fidtz wrote:

>>>> import codecs
>>>> testASCII = file("c:\\temp\\test1.txt",'w')
>>>> testASCII.write("\n")
>>>> testASCII.close()
>>>> testASCII = file("c:\\temp\\test1.txt",'r')
>>>> testASCII.read()
> '\n'
> Bit pattern on disk : \0x0D\0x0A
>>>> testASCII.seek(0)
>>>> testUNI = codecs.open("c:\\temp\\test2.txt",'w','utf16')
>>>> testUNI.write(testASCII.read())
>>>> testUNI.close()
>>>> testUNI = file("c:\\temp\\test2.txt",'r')
>>>> testUNI.read()
> '\xff\xfe\n\x00'
> Bit pattern on disk:\0xff\0xfe\0x0a\0x00
> Bit pattern I was expecting:\0xff\0xfe\0x0d\0x00\0x0a\0x00
>>>> testUNI.close()

Files opened with `codecs.open()` are always opened in binary mode, so no
newline translation takes place on write.  If you want '\n' to be translated
into a platform-specific character sequence, you have to do it yourself.
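
A minimal sketch of "doing it yourself", written for a current Python rather
than the 2007 interpreter in the quoted session.  The names `write_utf16` and
`demo` are made up for illustration, and the hard-coded "\r\n" default is an
assumption that the Windows convention is wanted; pass `os.linesep` to follow
the platform instead.

```python
import codecs
import os
import tempfile


def write_utf16(path, text, newline="\r\n"):
    # Hypothetical helper: translate line endings by hand, because
    # codecs.open() opens the underlying file in binary mode.
    # Collapse any existing "\r\n" first so it is not turned into "\r\r\n".
    translated = text.replace("\r\n", "\n").replace("\n", newline)
    with codecs.open(path, "w", "utf-16") as handle:
        handle.write(translated)


def demo():
    # Write a single newline to a temporary file and return the raw bytes.
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        write_utf16(path, "\n")
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(repr(demo()))
```

The file then holds a byte order mark followed by a two-byte code unit each
for the carriage return and the line feed, in the machine's native byte order.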

Ciao,
	Marc 'BlackJack' Rintsch


