NEWLINE character problem
Tim Peters
tim.one at comcast.net
Fri Jan 30 10:49:11 EST 2004
[Nuff, replying to someone who has strings with various line-end
conventions]
> Say your string is s; then you could use the following
> function (untested!) to make sure that s uses \n only:
>
> def fix_lineendings(txt):
>     if txt.count('\r\n'): # MS DOS
>         txt = txt.replace('\r\n', '\n')
>     elif txt.count('\r'): # Mac
>         txt = txt.replace('\r', '\n')
>
>     return txt
>
> Simply write: s = fix_lineendings(s)
That's in the right direction. There's no need to count the *number* of
each kind of oddball, and .count() does all the work that .replace() does
anyway if there aren't any oddballs. So the one-liner:
    return s.replace('\r\n', '\n').replace('\r', '\n')
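does the same job, but quicker (and it also copes with a string that mixes
both conventions, which the if/elif version does not). Note that, as an
internal optimization in CPython, .replace() doesn't build a new string
object unless it finds something to replace, so it's actually slower to do
an "if" test first.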
>>> s = 'abcdefghi\n\n'  # doesn't contain any oddballs
>>> t = s.replace('\r\n', '\n').replace('\r', '\n')
>>> s == t  # nothing was replaced
True
>>> s is t  # more, the same string object was returned by both .replace()s
True
>>>
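
For what it's worth, here is a small sketch of the one-liner wrapped in a
function. The name normalize_newlines and the sample string are made up for
illustration; they're not from the original post. It also shows why the
'\r\n' replacement has to come first: done the other way around, every DOS
line end turns into two newlines.

```python
def normalize_newlines(text):
    r"""Convert DOS (\r\n) and old Mac (\r) line ends to plain \n."""
    # Replace '\r\n' first; otherwise each '\r\n' would end up as '\n\n'.
    return text.replace('\r\n', '\n').replace('\r', '\n')


# A hypothetical string mixing all three conventions.
mixed = 'dos\r\nmac\runix\n'
print(repr(normalize_newlines(mixed)))   # 'dos\nmac\nunix\n'

# Swapping the order of the two replacements doubles the DOS line end.
wrong = mixed.replace('\r', '\n').replace('\r\n', '\n')
print(repr(wrong))                       # 'dos\n\nmac\nunix\n'
```

Note that the "s is t" result above is an implementation detail, so code
shouldn't rely on it; comparing with == is the safe test.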