Writing a Carriage Return in Unicode

MRAB python at mrabarnett.plus.com
Wed Nov 18 20:06:33 EST 2009


Doug wrote:
> Hi!
> 
> I am trying to write a UTF-8 file of UNICODE strings with a carriage
> return at the end of each line (code below).
> 
> import codecs
> 
> filOpen = codecs.open("c:\\temp\\unicode.txt",'w','utf-8')
> 
> str1 = u'This is a test.'
> str2 = u'This is the second line.'
> str3 = u'This is the third line.'
> 
> strCR = u"\u240D"
> 
> filOpen.write(str1 + strCR)
> filOpen.write(str2 + strCR)
> filOpen.write(str3 + strCR)
> 
> filOpen.close()
> 
> The output looks like
> This is a test.␍This is the second line.␍This is the third
> line.␍ when opened in Wordpad as a UNICODE file.
> 
> Thanks for your help!!

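u'\u240D' isn't a carriage return (that's u'\r'); it is U+240D SYMBOL
FOR CARRIAGE RETURN, a visible "CR" graphic. Windows programs normally
expect lines to end with '\r\n'. Just use u'\n' in your program and open
the file in text mode ('r' or 'w'), so that each '\n' is translated to
'\r\n' when written. Note that codecs.open always opens the underlying
file in binary mode and does no newline translation; io.open (Python 2.6
and later) does.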
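
As a rough sketch of the above (not the only way to do it): this swaps
codecs.open for io.open, and passes newline='\r\n' explicitly so the
result is the same on any platform rather than relying on the Windows
default. The file name and the variable names are made up for the
example.

```python
import io
import os
import tempfile
import unicodedata

# U+240D is a printable symbol, not the CR control character (U+000D).
symbol_name = unicodedata.name(u'\u240D')   # 'SYMBOL FOR CARRIAGE RETURN'
cr_code_point = ord(u'\r')                  # 0x0D

lines = [u'This is a test.',
         u'This is the second line.',
         u'This is the third line.']

# Hypothetical path in the temp directory, used only for illustration.
path = os.path.join(tempfile.gettempdir(), 'unicode_crlf_demo.txt')

# Assumption: newline='\r\n' is given explicitly so that every u'\n'
# written is translated to '\r\n' regardless of the platform.
with io.open(path, 'w', encoding='utf-8', newline='\r\n') as f:
    for line in lines:
        f.write(line + u'\n')

# Read the raw bytes back to see what actually landed on disk.
with open(path, 'rb') as f:
    raw = f.read()
```

On Windows, leaving out the newline argument should give the same bytes,
since text mode there translates '\n' to '\r\n' by default.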

Some Windows programs won't recognise UTF-8 text as UTF-8 in files
unless they start with a BOM; this will be handled automatically in
Python if you specify the encoding as 'utf-8-sig'.
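
A small sketch of the BOM handling, again assuming io.open rather than
codecs.open and using a made-up file name. It writes one line with
'utf-8-sig', checks the raw bytes for the BOM, then reads the file back
with the same encoding, which strips the BOM again.

```python
import codecs
import io
import os
import tempfile

# Hypothetical path in the temp directory, used only for illustration.
path = os.path.join(tempfile.gettempdir(), 'unicode_bom_demo.txt')

# 'utf-8-sig' writes the UTF-8 BOM once, at the start of the file.
with io.open(path, 'w', encoding='utf-8-sig', newline='\r\n') as f:
    f.write(u'This is a test.\n')

# Inspect the raw bytes: they should begin with the three BOM bytes.
with open(path, 'rb') as f:
    raw = f.read()
has_bom = raw.startswith(codecs.BOM_UTF8)

# Decoding with 'utf-8-sig' drops the BOM; text mode turns '\r\n' back
# into '\n' on reading.
with io.open(path, 'r', encoding='utf-8-sig') as f:
    text = f.read()
```

Reading the same file with plain 'utf-8' would leave u'\ufeff' at the
start of the text, so it is worth using 'utf-8-sig' on both sides.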


