getting rid of EOL character ?

Michael Hoffman cam.ac.uk at mh391.invalid
Sat Apr 28 05:25:59 EDT 2007


John Machin wrote:
> On 27/04/2007 11:19 PM, Michael Hoffman wrote:
>> stef wrote:
>>> hello,
>>>
>>> In the previous language I used,
>>> when reading a line by readline, the EOL character was removed.
> 
> Very interesting; how did you distinguish between EOF and an empty line? 
> Did you need to call an isEOF() method before each read?
> 
>>>
>>> Now I'm reading a text-file with CR+LF at the end of each line,
>>>    Datafile = open(filename,'r')
>>>    line = Datafile.readline()
>>>
>>> now this gives an extra empty line
>>>    print line
>>>
>>> and what I expect that should be correct, remove CR+LF,
>>> gives me one character too much removed
>>>    print line[,-2]
> 
> Stef, that would give you a syntax error. I presume that you meant to 
> type line[:-2]
> 
>>>
>>> while this gives what I need ???
>>>    print line[,-1]
>>>
>>> Is it correct that the 2 characters CR+LF are converted to 1 character ?
> 
> In text mode (the default), whatever is the line ending on your platform 
> is converted to a single "newline" '\n' which is the same as LF.
> 
> Using line[:-1] is NOT recommended, as the last line in your file may 
> not be terminated, and in that case you would lose the last data character.
> 
>>> Is there a more automatic way to remove the EOL from the string ?
>>
>> line = line.rstrip("\r\n") should take care of it. If you leave out 
>> the parameter, it will strip out all whitespace at the end of the 
>> line, which is what I do in most cases.
> 
> If you want *exactly* what is in the line, use line.rstrip('\n') -- this 
> will remove only the trailing newline (if it exists).
> 
> If you want to strip all trailing whitespace, use line.rstrip() as 
> Michael suggested.
> 
> Michael, note carefully that line.rstrip('\r\n') removes instances of 
> '\r' OR '\n' -- the arg is a set of characters to be removed, not a 
> suffix to be removed. In Stef's situation, it "works" only by accident. 
> Using that would not always give you the correct answer -- e.g. if your 
> (Windows) file had a line ending in CR CR LF [I've seen stranger].

I knew that about line.rstrip, but didn't consider the possibility of 
\r\r\n, while still wanting the first \r. Yuck.
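
To make the set-versus-suffix distinction concrete, here is a minimal
sketch. The helper name strip_eol is hypothetical (it is not a standard
library function); it removes at most one line terminator as a suffix,
whereas rstrip("\r\n") strips every trailing '\r' and '\n'. The sample
string assumes the line is seen with no newline translation applied.

```python
def strip_eol(line):
    """Remove at most one trailing line terminator (hypothetical helper)."""
    # Check the two-character terminator first so "\r\n" is removed as a unit.
    for eol in ("\r\n", "\n", "\r"):
        if line.endswith(eol):
            return line[:-len(eol)]
    return line  # unterminated line: nothing to remove


raw = "data\r\r\n"  # a line ending in CR CR LF, with no newline translation

chars_stripped = raw.rstrip("\r\n")  # argument is a set of characters: 'data'
suffix_stripped = strip_eol(raw)     # only the final CR LF goes: 'data\r'
```

With the CR CR LF line, rstrip("\r\n") also eats the first '\r', while
the suffix-based helper keeps it.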

Honestly, I almost always use line.rstrip()--it is seldom that I care 
about trailing whitespace.
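
For comparison, a small sketch of the three approaches discussed in this
thread, applied to a file whose last line is not terminated. The
io.StringIO object and its contents are made up for illustration and
stand in for a file opened in text mode.

```python
import io

# Stand-in for a text-mode file; the last line has no terminator.
datafile = io.StringIO("first  \nsecond\t\nlast")
lines = datafile.readlines()

# Slicing blindly drops one character, even when there is no newline.
sliced = [line[:-1] for line in lines]

# rstrip("\n") removes only a trailing newline, if there is one.
newline_only = [line.rstrip("\n") for line in lines]

# rstrip() with no argument removes all trailing whitespace.
all_trailing = [line.rstrip() for line in lines]
```

Here sliced ends with 'las' (the final data character is lost),
newline_only keeps the trailing spaces and tab, and all_trailing is
['first', 'second', 'last'].
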
-- 
Michael Hoffman


