[Tutor] unicode utf-16 and readlines [using the 'codecs' unicode file reading module]

Poor Yorick gp@pooryorick.com
Tue Jan 7 19:39:05 2003


Danny Yoo wrote:

>I have to admit I'm a bit confused; there shouldn't be any automatic
>handling of newlines when we use read(), since read() sucks all the text
>out of a file.
>
>Can you explain more what you mean by automatic newline handling?  Do you
>mean a conversion of '\r\n' to '\n'?
>
As you mentioned, strip() works correctly on the list items returned by
readlines() on a file opened with codecs.open(), so my problem is
entirely resolved.  Yes, I meant that readlines() on such a file returns
lines ending in '\r\n', where a standard file object returns just '\n':

 >>> import codecs
 >>> fh = codecs.open('0022data2.txt', 'r', 'utf-16')
 >>> a = fh.readlines()
 >>> a
[u'\u51fa\r\n']
 >>> fh = open('test1.txt', 'r')
 >>> a = fh.readlines()
 >>> a
['hello\n', 'goodbye\n', 'where\n', 'how\n', 'when']
 >>>
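
For anyone who wants to reproduce this, here is a minimal sketch.  It
assumes a throwaway temporary file with made-up contents (not my actual
0022data2.txt) and relies on the documented behaviour that codecs.open()
always opens the underlying file in binary mode, so no newline
translation takes place.  The names path, raw_lines and clean_lines are
only for illustration.

```python
import codecs
import os
import tempfile

# Hypothetical sample data standing in for a Windows-style UTF-16 file.
path = os.path.join(tempfile.mkdtemp(), 'sample-utf16.txt')
with open(path, 'wb') as f:
    f.write(u'\u51fa\r\nhello\r\n'.encode('utf-16'))

# codecs.open() reads the file in binary mode and only decodes it,
# so each '\r\n' comes through untouched.
fh = codecs.open(path, 'r', 'utf-16')
raw_lines = fh.readlines()
fh.close()

# Removing the line ending by hand gives the same text whether the
# file used '\r\n' or '\n'.
clean_lines = [line.rstrip(u'\r\n') for line in raw_lines]
```

Using rstrip(u'\r\n') rather than a bare strip() removes only the line
ending, so leading whitespace in the data is left alone.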

Perhaps you could tell me whether this inconsistency has any
implications for the Python programmer.

Poor Yorick
gp@pooryorick.com