[issue8702] difflib: unified_diff produces wrong patches (again)

Mark Dickinson report at bugs.python.org
Thu May 13 16:14:58 CEST 2010


Mark Dickinson <dickinsm at gmail.com> added the comment:

I think difflib is behaving as intended here; changing to feature request.

Could you please clarify what information is being lost?  I'm not seeing it.  As far as I can tell, because unified_diff produces a list of lines rather than a single string (as GNU diff effectively does), all the necessary information about newlines is preserved:

newton:py3k dickinsm$ echo -n "one
two" > 1.txt
newton:py3k dickinsm$ echo -n "one
two
" > 2.txt
newton:py3k dickinsm$ ./python.exe
Python 3.2a0 (py3k:81084:81085M, May 12 2010, 14:16:52) 
[GCC 4.2.1 (Apple Inc. build 5659)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from difflib import unified_diff
[47745 refs]
>>> list(unified_diff(list(open('1.txt')), list(open('2.txt'))))
['--- \n', '+++ \n', '@@ -1,2 +1,2 @@\n', ' one\n', '-two', '+two\n']
[53249 refs]

It looks to me as though the diff picks up the missing newline just fine.

The one problem with the above is that you can't do a ''.join() on it to give a meaningful diff, but I don't see that as a problem with the unified_diff function itself.
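To make the ''.join() issue concrete, here is a minimal sketch that reproduces the session above using in-memory lists instead of 1.txt and 2.txt (the variable names are mine, and splitlines(keepends=True) is assumed to split the same way iterating over the files does):

```python
from difflib import unified_diff

# Same content as 1.txt and 2.txt above, split the way list(open(...)) would.
a = "one\ntwo".splitlines(keepends=True)    # last line has no trailing newline
b = "one\ntwo\n".splitlines(keepends=True)  # last line ends with a newline

diff = list(unified_diff(a, b))

# As a list, the two versions of the last line stay distinguishable.
print(diff[-2:])

# Joined, the '-two' entry runs straight into '+two\n' on a single line.
joined = ''.join(diff)
print(repr(joined))
```

The list form keeps '-two' and '+two\n' as separate entries; only after joining does the boundary between them disappear.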

I'd be -1 on adding the "\ No newline at end of file" marker by default, since it complicates the unified_diff output unnecessarily (and would also affect backwards compatibility).  I wouldn't have any objection to an extra option for this, though.
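For what it's worth, something along these lines can already be done outside difflib.  The following is only a rough sketch of a wrapper, not a proposed patch: the name unified_diff_with_marker is hypothetical, and it assumes the default lineterm of '\n', so that any diff line not ending in a newline must come from an input line that lacked one.

```python
from difflib import unified_diff

# Marker text used by GNU diff; emitted here as a line of its own.
NO_NEWLINE_MARKER = "\\ No newline at end of file\n"


def unified_diff_with_marker(a, b, **kwargs):
    """Hypothetical wrapper: like unified_diff, but joinable with ''.join().

    Assumes the default lineterm='\\n'.  Any yielded line that does not
    end in a newline gets one appended, followed by the marker line.
    """
    for line in unified_diff(a, b, **kwargs):
        if line.endswith("\n"):
            yield line
        else:
            yield line + "\n"
            yield NO_NEWLINE_MARKER


a = "one\ntwo".splitlines(keepends=True)    # no newline at end
b = "one\ntwo\n".splitlines(keepends=True)  # newline at end

patch = ''.join(unified_diff_with_marker(a, b))
print(patch)
```

Keeping this in a wrapper leaves the existing unified_diff output untouched, which is the backwards-compatibility concern above.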

----------
nosy: +mark.dickinson
stage:  -> unit test needed
type: behavior -> feature request

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8702>
_______________________________________

