Irregular last line in a text file, was Re: Regular expressions

Ian Kelly ian.g.kelly at gmail.com
Tue Nov 3 13:39:32 EST 2015


On Tue, Nov 3, 2015 at 11:33 AM, Ian Kelly <ian.g.kelly at gmail.com> wrote:
> On Tue, Nov 3, 2015 at 9:56 AM, Tim Chase <python.list at tim.thechases.com> wrote:
>> Or even more valuable to me:
>>
>>   with open(..., newline="strip") as f:
>>     assert all(not line.endswith(("\n", "\r")) for line in f)
>>
>> because I have countless loops that look something like
>>
>>   with open(...) as f:
>>     for line in f:
>>       line = line.rstrip('\r\n')
>>       process(line)
>
> What would happen if you read a file opened like this without
> iterating over lines?

I think I'd go with this:

>>> def strip_newlines(iterable):
...     for line in iterable:
...         yield line.rstrip('\r\n')
...
>>> list(strip_newlines(['one\n', 'two\r', 'three']))
['one', 'two', 'three']

Or if I care about optimizing the for loop (but we're talking about
file I/O, so probably not), this might be faster:

>>> import operator
>>> def strip_newlines(iterable):
...     return map(operator.methodcaller('rstrip', '\r\n'), iterable)
...
>>> list(strip_newlines(['one\n', 'two\r', 'three']))
['one', 'two', 'three']
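
Whether the map() version is actually faster is easy to measure rather than guess. The following is only a rough sketch using timeit: the names strip_gen and strip_map are renamed copies of the two functions above so they can coexist, and the sample data and repeat count are made up for illustration. Results will vary by machine and Python version, so no timings are shown.

```python
import operator
import timeit

def strip_gen(iterable):
    # Generator version: strip trailing CR/LF from each line.
    for line in iterable:
        yield line.rstrip('\r\n')

def strip_map(iterable):
    # map() version: same stripping, with the loop run by map().
    return map(operator.methodcaller('rstrip', '\r\n'), iterable)

# Hypothetical in-memory data; no file I/O is involved in this comparison.
lines = ['some text\r\n'] * 10000

# Sanity check that both versions produce identical output.
same = list(strip_gen(lines)) == list(strip_map(lines))

t_gen = timeit.timeit(lambda: list(strip_gen(lines)), number=3)
t_map = timeit.timeit(lambda: list(strip_map(lines)), number=3)
print('generator: %.4fs  map: %.4fs' % (t_gen, t_map))
```

With a real file, the time spent reading from disk will probably dominate either way.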

Then the iteration is just:

    for line in strip_newlines(f):
        process(line)
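
For completeness, here is a self-contained sketch of that loop run end to end. It assumes an io.StringIO object standing in for the open file, and uses str.upper as a placeholder for whatever per-line processing is wanted; both are illustrative choices, not part of the original suggestion.

```python
import io

def strip_newlines(iterable):
    # Yield each line with any trailing CR/LF characters removed.
    for line in iterable:
        yield line.rstrip('\r\n')

# Stand-in for a real file: mixed line endings and no newline on the last line.
f = io.StringIO('one\r\ntwo\nthree')

results = []
for line in strip_newlines(f):
    # Placeholder for process(line).
    results.append(line.upper())

print(results)  # ['ONE', 'TWO', 'THREE']
```

Because strip_newlines() accepts any iterable of strings, the same wrapper works on a list of lines, a real file object, or another generator.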


