ignoring or replacing white lines in a diff

Martin A. Brown martin at linux-ip.net
Thu Jan 14 15:57:29 EST 2016


Hello Adriaan,

>Maybe someone here has a clue what is going wrong here? Any help is 
>appreciated.

Have you tried out this tool, which does precisely what you are 
trying to do yourself?

  https://pypi.python.org/pypi/xmldiff

I can't vouch for it specifically, as I am simply a user, but I know 
that I have used it happily in the past.  (Other CLI tools, including 
non-Python tools such as xmllint, can also produce predictable, 
reproducible XML formatting.)
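If you would rather stay inside the standard library, here is a 
minimal sketch of the same "normalize, then compare" idea.  It assumes 
Python 3.8 or later (for xml.etree.ElementTree.canonicalize); the 
helper name normalized() and the sample strings are made up for 
illustration and are not from your test suite.

```python
from xml.etree.ElementTree import canonicalize


def normalized(xml_text):
    """Return a canonical (C14N 2.0) serialization of xml_text.

    strip_text=True strips whitespace around text content, so
    whitespace-only text between tags (indentation, blank lines)
    disappears before the comparison.
    """
    return canonicalize(xml_text, strip_text=True)


# Hypothetical sample data: same structure, different whitespace.
pretty = "<root>\n  <item>1</item>\n\n  <item>2</item>\n</root>"
compact = "<root><item>1</item><item>2</item></root>"

same = normalized(pretty) == normalized(compact)
```

Comparing the two normalized strings sidesteps diff and its 
whitespace options entirely, at the cost of also ignoring leading and 
trailing whitespace inside text nodes.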

>I'm writing a regression test for a module that generates XML.

Very good.  Good == Testing.

>I'm using diff to compare the results with a pregenerated one from an
>earlier version.

[
Interesting.  I can only speculate randomly about the whitespace 
issue.  Have you examined (with the CLI tools hexdump, od or your 
favorite byte dumper) the two different XML outputs?
]
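If you prefer to do the byte-level comparison from Python rather than 
with hexdump or od, something like the following sketch may help.  The 
function name first_difference() and the sample byte strings are my 
own inventions for illustration.

```python
def first_difference(a, b, context=8):
    """Return (offset, slice_of_a, slice_of_b) at the first differing
    byte of two bytes objects, or None if they are identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return (offset,
                    a[offset:offset + context],
                    b[offset:offset + context])
    if len(a) != len(b):
        # One input is a strict prefix of the other.
        n = min(len(a), len(b))
        return (n, a[n:n + context], b[n:n + context])
    return None


# Hypothetical example: CRLF versus LF line endings.
result = first_difference(b"<a>\r\n</a>", b"<a>\n</a>")
```

Printing the returned slices shows escapes such as \r, \n and \t 
explicitly, which makes stray whitespace easy to spot.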

Back to the lands of Python

>      cmd   = ["diff", "-w", "-I '^[[:space:]]*$'", "./xml/%s.xml" % name, "test.xml"]

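It looks like a quoting issue.  Assuming you pass this list to 
subprocess without shell=True, no shell ever sees it, so the single 
quotes are not stripped and "-I" arrives glued to its pattern as one 
argument.  You should be able to run your Python program under a 
system call tracer to see what is actually getting exec()d.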

I'm accustomed to using strace, but it seems that Mac OS X uses 
dtruss.  Anyway, I think your cmd is turning into this (as far as 
your kernel is concerned):

   token 1: diff
   token 2: -w
   token 3: -I '^[[:space:]]*$'
   token 4: ./xml/name.xml
   token 5: test.xml
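If a system call tracer is not handy, you can get a rough idea of 
what the child process receives without leaving Python.  This sketch 
swaps diff for a tiny Python child that just echoes its arguments; the 
helper name argv_seen_by_child() is hypothetical, and I am assuming 
the list is handed to subprocess without shell=True.

```python
import json
import subprocess
import sys


def argv_seen_by_child(args):
    """Run a child Python process with args and return the argument
    list exactly as that child saw it."""
    child = "import json, sys; print(json.dumps(sys.argv[1:]))"
    out = subprocess.check_output([sys.executable, "-c", child] + list(args))
    return json.loads(out.decode())


# The option and its pattern as one list element (quotes included) ...
one_token = argv_seen_by_child(["-w", "-I '^[[:space:]]*$'", "test.xml"])
# ... versus the option and the pattern as separate elements.
two_tokens = argv_seen_by_child(["-w", "-I", "^[[:space:]]*$", "test.xml"])
```

In the first case the child sees three arguments, and the second one 
still contains the literal single quotes.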

Try this (untested):

      cmd = ["diff", "-w", "-I", "^[[:space:]]*$", "./xml/%s.xml" % name, "test.xml"]
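One way to double-check a hand-written argument list is to let 
shlex.split() tokenize the command line the way a POSIX shell would. 
This is only a sketch: the file name "example" stands in for your name 
variable, and shlex follows POSIX quoting rules, which is an 
assumption about the shell you would otherwise type this into.

```python
import shlex

# The command as you would type it at a shell prompt.
# "example" is a placeholder for the value of the name variable.
command_line = "diff -w -I '^[[:space:]]*$' ./xml/example.xml test.xml"

# shlex.split() removes the quotes and splits on unquoted whitespace,
# yielding one list element per argument.
tokens = shlex.split(command_line)
```

The resulting list has "-I" and the bare pattern as separate 
elements, with no quote characters left in the pattern.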

But, perhaps the xmldiff module will be what you want.

-Martin

-- 
Martin A. Brown
http://linux-ip.net/


