negative lookahead question
Skip Montanaro
skip at pobox.com
Mon Apr 21 12:24:34 EDT 2003
This re.sub call lives in Lib/smtplib.py as a way to make line endings
canonical:
re.sub(r'(?:\r\n|\n|\r(?!\n))', CRLF, data)
This certainly seems to do what's desired, but it looks overly complex to
me. First, the non-grouping parens are unnecessary. Second, I don't think
the negative lookahead assertion is required. This simpler function call
seems to do the trick:
re.sub(r'\r\n|\n|\r', CRLF, data)
A simple test case containing a combination of different line endings seems
to yield identical results:
>>> data = 'line 0\r\nline 1\nline 2\rline 3\r\r\nline 4\n'
>>> re.sub(r'\r\n|\n|\r(?!\n)',CRLF,data) == re.sub(r'\r\n|\n|\r',CRLF,data)
True
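A single test string is thin evidence, so here is a rough sketch of a broader
check: compare the two patterns on every string up to a small length over the
alphabet 'a', '\r', '\n'. The helper name find_mismatches and the length limit
are made up for illustration, and CRLF is assumed to be "\r\n" as in smtplib.

```python
import itertools
import re

CRLF = "\r\n"  # assumption: same value as the CRLF constant in smtplib

WITH_LOOKAHEAD = re.compile(r'(?:\r\n|\n|\r(?!\n))')
SIMPLE = re.compile(r'\r\n|\n|\r')

def find_mismatches(max_len=6, alphabet="a\r\n"):
    """Return every string (up to max_len) on which the two patterns differ."""
    mismatches = []
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            data = "".join(chars)
            if WITH_LOOKAHEAD.sub(CRLF, data) != SIMPLE.sub(CRLF, data):
                mismatches.append(data)
    return mismatches

print(find_mismatches())  # prints [] (no differing inputs found)
```

This only covers short strings, so it is supporting evidence rather than a
proof, but any difference would have to show up around a '\r' and its
neighbors, which this range exercises.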
Is there a case where the negative lookahead assertion will produce correct
results but the simpler regular expression won't?
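One thing that does seem to matter is the order of the alternatives. The re
module tries them left to right, so with '\r\n' listed first a '\r' followed
by '\n' is always consumed as a pair. A small sketch of what happens if the
alternatives are reordered (the variable names are only for illustration, and
CRLF is again assumed to be "\r\n"):

```python
import re

CRLF = "\r\n"  # assumption: same value as the CRLF constant in smtplib
data = 'line 0\r\nline 1\nline 2\rline 3\r\r\nline 4\n'

# '\r\n' first: a CR LF pair is consumed as one match.
in_order = re.sub(r'\r\n|\n|\r', CRLF, data)

# Bare '\r' first: each CR LF pair is matched as two separate line endings.
reordered = re.sub(r'\r|\n|\r\n', CRLF, data)

# With '\r' first, the lookahead is what keeps the pair together.
reordered_guarded = re.sub(r'\r(?!\n)|\n|\r\n', CRLF, data)

print(reordered == in_order)          # prints False
print(reordered_guarded == in_order)  # prints True
```

So the lookahead would only earn its keep if the bare '\r' alternative came
before '\r\n', which is not the case in the smtplib pattern.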
(FYI, this isn't a performance question, but a readability question.)
Thx,
Skip