DiffLib Question

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Wed May 2 12:32:21 EDT 2007


On Wed, 02 May 2007 06:26:13 -0300, whitewave <fruels at gmail.com> wrote:

>     Thank you for your reply. But I don't fully understand what
> charjunk and linejunk are all about. I'm a bit of a newbie in Python using
> difflib. Am I using the right code here? How will I implement
> linejunk and charjunk using the following code?

Usually, Differ receives two sequences of lines, each line being a
sequence of characters (a string). It uses a SequenceMatcher to compare
lines; the linejunk argument is used to ignore certain lines. For each
pair of similar lines, it uses another SequenceMatcher to compare the
characters inside those lines; the charjunk argument is used to ignore
certain characters.
Since you are feeding Differ a single string (not a list of text lines),
the "lines" it sees are just characters. To ignore whitespace and
newlines in this case, one should use the linejunk argument:

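import difflib

def ignore_ws_nl(c):
    # Treat whitespace and newline characters as junk.
    return c in " \t\n\r"

# d1 and d2 are the two strings being compared.
a = difflib.Differ(linejunk=ignore_ws_nl).compare(d1, d2)
dif = list(a)
print(''.join(dif))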

   I  n     a  d  d  i  t  i  o  n  ,     t  h  e     c  o  n  s  i  d  e  r  e  d     p  r  o  b  l  e  m     d  o  e  s     n  o  t     h  a  v  e     a     m  e  a  n  i  n  g  f  u  l     t  r  a  d  i  t  i  o  n  a  l     t  y  p  e     o  f-  +
   a  d  j  o  i  n  t-
+    p  r  o  b  l  e  m     e  v  e  n     f  o  r     t  h  e     s  i  m  p  l  e     f  o  r  m  s     o  f     t  h  e     d  i  f  f  e  r  e  n  t  i  a  l     e  q  u  a  t  i  o  n     a  n  d     t  h  e     n  o  n  l  o  c  a  l     c  o  n  d  i  t  i  o  n  s  .     D  u  e-  +
   t  o     t  h  e  s  e     f  a  c  t  s  ,     s  o  m  e     s  e  r  i  o  u  s     d  i  f  f  i  c  u  l  t  i  e  s     a  r  i  s  e     i  n     t  h  e     a  p  p  l  i  c  a  t  i  o  n     o  f     t  h  e     c  l  a  s  s  i  c  a  l     m  e  t  h  o  d  s     t  o     s  u  c  h     a-  +
   p  r  o  b  l  e  m  .+
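
For comparison, here is a minimal sketch of the two ways of calling Differ
described above: once with plain strings (so it works on characters), and
once with lists of lines (the usual usage). The sample strings d1 and d2 are
made up for illustration and are not the original poster's data; the names
char_diff and line_diff are likewise only illustrative.

```python
import difflib

def ignore_ws_nl(c):
    # Same predicate as above: True for whitespace and newline characters.
    return c in " \t\n\r"

# Hypothetical sample texts: same words, but the line breaks are in
# different places.
d1 = "type of adjoint\nproblem\n"
d2 = "type of\nadjoint problem\n"

# 1) Strings passed directly: Differ iterates over single characters,
#    so the linejunk function is called with one character at a time.
char_diff = list(difflib.Differ(linejunk=ignore_ws_nl).compare(d1, d2))
print(''.join(char_diff))

# 2) Lists of lines (line endings kept): the usual usage. Here charjunk
#    applies to the characters inside pairs of similar lines.
lines1 = d1.splitlines(True)
lines2 = d2.splitlines(True)
line_diff = list(
    difflib.Differ(charjunk=difflib.IS_CHARACTER_JUNK).compare(lines1, lines2)
)
print(''.join(line_diff))
```

With these sample strings, the character-based diff should mark only the
moved spaces and newlines as removed or added, while the line-based diff
reports every line as changed, because no line of d1 is identical to a line
of d2.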

I hope this is what you were looking for.

-- 
Gabriel Genellina


