NEWLINE character problem

Tim Peters tim.one at comcast.net
Fri Jan 30 10:49:11 EST 2004


[Nuff, replying to someone who has strings with various line-end
 conventions]

> Say your string is s; then you could use the following
> function (untested!) to make sure that s uses \n only:
>
>     def fix_lineendings(txt):
>         if txt.count('\r\n'):  # MS DOS
>             txt = txt.replace('\r\n', '\n')
>         elif txt.count('\r'):  # Mac
>             txt = txt.replace('\r', '\n')
>
>         return txt
>
> Simply write: s = fix_lineendings(s)

That's in the right direction.  There's no need to count the *number* of
each kind of oddball, and .count() does all the work that .replace() does
anyway if there aren't any oddballs.  So the one-liner:

    return s.replace('\r\n', '\n').replace('\r', '\n')

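does the same for a string that sticks to one convention, but quicker, and
it also fixes a string that mixes conventions (the "elif" above never
touches a lone \r once a \r\n has been seen).  Note that, as an internal
optimization, CPython's .replace() doesn't build a new string object unless
it finds something to replace, so it's actually slower to do an "if" test
first.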

>>> s = 'abcdefghi\n\n'  # doesn't contain any oddballs
>>> t = s.replace('\r\n', '\n').replace('\r', '\n')
>>> s == t
True  # nothing was replaced
>>> s is t
True  # more, the same string object was returned by both .replace()s
>>>
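
For anyone who wants to paste it in, here is a minimal sketch that wraps the
one-liner in a complete function.  The name normalize_newlines and the sample
string are made up for illustration; they aren't from the original post.  The
order of the two calls matters:  doing '\r\n' first turns each DOS line end
into a single '\n' instead of two.

```python
def normalize_newlines(s):
    """Return s with DOS ('\\r\\n') and old Mac ('\\r') line ends as '\\n'.

    Hypothetical helper name; the body is the one-liner discussed above.
    """
    # Replace '\r\n' before '\r'.  The other order would turn every
    # '\r\n' into '\n\n' and double the DOS line breaks.
    return s.replace('\r\n', '\n').replace('\r', '\n')


# Illustrative input mixing all three conventions.
mixed = 'dos\r\nmac\runix\n'
fixed = normalize_newlines(mixed)
print(repr(fixed))  # 'dos\nmac\nunix\n'
```

Usage is the same as before:  s = normalize_newlines(s)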




