How to ignore white space changes using difflib?
Grant Edwards
invalid at invalid
Wed Apr 8 11:56:03 EDT 2009
I'm trying to use difflib to compare strings ignoring changes
to white-space (space/tab). According to the doc page, you can
do this by specifying a "charjunk" parameter to filter out
characters:
charjunk: A function that accepts a character (a string of
length 1), and returns true if the character is junk, or false if
not. The default is module-level function
IS_CHARACTER_JUNK(), which filters out whitespace characters
(a blank or tab; note: bad idea to include newline in
this!).
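As a quick sanity check of the quoted description, here is a minimal sketch that calls the default filter on a few single characters. The `results` dictionary and the sample characters are only illustrative; `difflib.IS_CHARACTER_JUNK` is the module-level function named in the doc text above.

```python
import difflib

# Classify a few sample characters with the default charjunk filter.
# Per the quoted docs, only blank and tab should count as junk.
results = {}
for ch in [" ", "\t", "\n", "x"]:
    results[ch] = difflib.IS_CHARACTER_JUNK(ch)
    print("%r -> %s" % (ch, results[ch]))
```

So the filter function itself appears to classify blanks and tabs as described; that much can be checked independently of ndiff().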
But I simply can't get it to work: I get exactly the same
results with or without white-space filtering.
Here's my test program:
#!/usr/bin/python
import difflib
d1 = ["this is string one","this is string two","this is string three"]
d2 = ["this is string one","this is string two","this is string three"]
def iswhite(c):
    return c in " \t"
print "--------------------no filtering--------------------"
delta = difflib.ndiff(d1,d2)
for line in delta:
    print line
print "----------------------------------------------------"
print
print "--------------------IS_CHARACTER_JUNK--------------------"
delta = difflib.ndiff(d1,d2,charjunk=difflib.IS_CHARACTER_JUNK)
for line in delta:
    print line
print "----------------------------------------------------"
print
print "--------------------iswhite--------------------"
delta = difflib.ndiff(d1,d2,charjunk=iswhite)
for line in delta:
    print line
print "----------------------------------------------------"
And here's the output:
--------------------no filtering--------------------
this is string one
- this is string two
? --
+ this is string two
? + +
this is string three
----------------------------------------------------
--------------------IS_CHARACTER_JUNK--------------------
this is string one
- this is string two
? --
+ this is string two
? + +
this is string three
----------------------------------------------------
--------------------iswhite--------------------
this is string one
- this is string two
? --
+ this is string two
? + +
this is string three
----------------------------------------------------
What am I doing wrong?
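
One possible workaround, offered as a sketch under assumptions rather than a confirmed fix: collapse runs of white-space in each line before handing the lists to ndiff(), so that lines differing only in spacing compare equal. The helper name normalize_ws is hypothetical, and the sample strings below use made-up spacing, not necessarily the spacing in the test program above.

```python
import difflib

def normalize_ws(line):
    # Hypothetical helper: collapse any run of white-space to one blank
    # and strip leading/trailing white-space.
    return " ".join(line.split())

# Sample data with made-up spacing differences (double blank vs. tab).
d1 = ["this is string one", "this is  string two", "this is string three"]
d2 = ["this is string one", "this is string\ttwo", "this is string three"]

delta = list(difflib.ndiff([normalize_ws(s) for s in d1],
                           [normalize_ws(s) for s in d2]))
for line in delta:
    print(line)
```

With this approach the diff is computed on the normalized text, so the output shows the collapsed lines rather than the originals; whether that is acceptable depends on what the diff is used for.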
--
Grant Edwards grante Yow! I'll show you MY
at telex number if you show me
visi.com YOURS ...