[New-bugs-announce] [issue2142] naive use of ''.join(difflib.unified_diff(...)) results in bogus diffs with inputs that don't end with end-of-line char

Trent Mick report at bugs.python.org
Mon Feb 18 21:16:57 CET 2008


New submission from Trent Mick:

When comparing content with difflib, if the resulting diff covers the
last line of one or both of the inputs that that line doesn't end with
an end-of-line character(s), then the generated diff lines don't include
an EOL. Fair enough.

Naive (and I suspect typical) usage of difflib.unified_diff(...) is:

  diff = ''.join(difflib.unified_diff(...))

This results in an *incorrect* unified diff for the conditions described
above.

>>> from difflib import *
>>> gen = unified_diff("one\ntwo\nthree".splitlines(1),
...                    "one\ntwo\ntrois".splitlines(1))
>>> print ''.join(gen)
---
+++
@@ -1,3 +1,3 @@
 one
 two
-three+trois


The proper behaviour would be:

>>> gen = unified_diff("one\ntwo\nthree".splitlines(1),
...                    "one\ntwo\ntrois".splitlines(1))
>>> print ''.join(gen)
---
+++
@@ -1,3 +1,3 @@
 one
 two
-three
\ No newline at end of file
+trois
\ No newline at end of file


I *believe* that "\ No newline at end of file" are the appropriate
markers -- that tools like "patch" will know how to use. At least this
is what "svn diff" generates.


I'll try to whip up a patch. 

Do others concur that this should be fixed?

----------
components: Library (Lib)
messages: 62543
nosy: trentm
severity: normal
status: open
title: naive use of ''.join(difflib.unified_diff(...)) results in bogus diffs with inputs that don't end with end-of-line char
type: behavior
versions: Python 2.5

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2142>
__________________________________


More information about the New-bugs-announce mailing list