difflib.ndiff broken?

Tim Peters tim.peters at gmail.com
Thu Jul 15 20:24:34 EDT 2004


[Humpdydum]
> OK, forget it, sorry it was my mistake:

I didn't see a mistake, just a question.

> it wasn't obvious from the difflib docs, but it appears that ndiff points out the
> sub-line differences (lines that start with ?) only if it was able to figure out
> operations that could be applied to substrings on the line. Though often such
> operations are obvious by looking at the strings being compared,

They can be for a program but often aren't for people.  That's why
ndiff produces '?' lines when it thinks they might help.  This is a
heuristic -- a guess.  Sometimes it's not the same guess you'd make. 
There's always a sequence of operations that can be applied to change
any line into any other line, but *usually* they're uninteresting. 
'?' lines attempt to point out "minor edits".

> ndiff doesn't always find them, and so marks the whole line as + or -.

It marks two input lines that differ with - and + regardless of
whether it produces two ? lines too.
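
A small sketch of that point, using difflib.ndiff from the standard
library.  The sample strings and the helper name without_hints are made
up for illustration.  The first pair is too dissimilar to earn hint
lines; the second is close enough that ndiff adds them.  Either way the
- and + lines are present.

```python
import difflib

# Hypothetical sample inputs, chosen only for illustration.
unrelated = list(difflib.ndiff(["abc\n"], ["xyz\n"]))
similar = list(difflib.ndiff(["one\n"], ["ore\n"]))

def without_hints(lines):
    """Drop the '?' guide lines, keeping only the real diff lines."""
    return [line for line in lines if not line.startswith("? ")]

print("".join(unrelated), end="")
print("".join(similar), end="")

# Both results still carry one '-' line and one '+' line.
print(without_hints(unrelated))
print(without_hints(similar))
```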

> Anyone know of a web site that explains ndiff output? I couldn't figure out a
> good set of search terms in google, didn't get anything useful. Thanks,

ndiff is unique to Python, and you have the source code for it. 
Because '?' lines are fluff, precise docs for them would be
counterproductive.  They're meant to guide the eye to minor intraline
differences, and that's all.

If a ? line appears, there are always two of them, interleaved between
a -+ pair, in this pattern:

-
?
+
?
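
The pattern can be reproduced with a pair of lines that differ by one
replaced character.  This is only a sketch: the inputs are invented, and
the names old, new, result and markers are not part of difflib.

```python
import difflib

# Hypothetical one-line inputs that differ in a single character.
old = ["python2\n"]
new = ["python3\n"]

result = list(difflib.ndiff(old, new))

# The first character of each output line is its marker.
markers = [line[0] for line in result]
print(markers)

# Printed together, each '?' line sits directly under the line it annotates.
print("".join(result), end="")
```

Because every output line starts with a two-character prefix, the
columns of a ? line match the columns of the line above it.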

Each ? line implicitly refers to the line immediately above it.  Four
meaningful characters appear in ? lines.  A caret (^) means the
character immediately above it was replaced, in going from the - to
the + line.  '-' means the character immediately above it was deleted;
'+' means it was inserted; and a blank means the character immediately
above it is the same in both (- and +) lines.  A '-' can appear only
in the ? line following a - line, and a '+' can appear only in the ?
line following a + line, because we're picturing the edits needed to
change the - line into the + line.
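
A rough illustration of the three non-blank characters, one pair of
lines for each.  The helper show() is hypothetical, and the word pairs
are borrowed from the example in the difflib documentation.  One
observation from running it, not from the description above: when a
hint line would contain nothing but blanks, difflib appears to leave it
out, so a pure deletion or pure insertion comes with a single ? line.

```python
import difflib

def show(old_line, new_line):
    """Return ndiff output for a single pair of lines as a list of strings."""
    return list(difflib.ndiff([old_line + "\n"], [new_line + "\n"]))

replaced = show("one", "ore")     # 'n' replaced by 'r': '^' under both lines
deleted = show("three", "tree")   # 'h' deleted: '-' under the - line
inserted = show("tree", "three")  # 'h' inserted: '+' under the + line

for result in (replaced, deleted, inserted):
    print("".join(result))
```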


