ignoring or replacing white lines in a diff

Adriaan Renting renting at astron.nl
Thu Jan 14 15:22:23 EST 2016


Maybe someone here has a clue about what is going wrong? Any help is
appreciated.

I'm writing a regression test for a module that generates XML.

I'm using diff to compare the results with a pregenerated one from an
earlier version.

I'm running into two problems:

diff doesn't seem to behave properly with the -B option (diff (GNU
diffutils) 2.8.1 on OSX 10.9): blank lines are still reported as
differences.

Replacing -B with -I '^[[:space:]]*$' fixes it on the command line,
even though the two should be equivalent according to:
http://www.gnu.org/software/diffutils/manual/html_node/Blank-Lines.html#Blank-Lines

(for Python problem continue below)

MacRenting 21:00-159> diff -w -B test.xml xml/Ticket_6923.xml
3,5c3,5
<   <version>2.15.0</version>
<   <template version="2.15.0" author="Alwin de Jong,Adriaan Renting" changedBy="Adriaan Renting">
<   <description>XML Template generator version 2.15.0</description>
---
>           <version>2.6.0</version>
>           <template version="2.6.0" author="Alwin de Jong" changedBy="Alwin de Jong">
>           <description>XML Template generator version 2.6.0</description>
113d112
<
163d161
<
213d210
<
258d254
<
369d364
<
419d413
<
469d462
<
514d506
<
625d616
<
675d665
<
725d714
<
770d758
<
881d868
<
931d917
<
981d966
<
1026d1010
<
1137d1120
<
1187d1169
<
1237d1218
<
1282d1262
<

/Users/renting/src/CEP4-DevelopClusterModel-Story-Task8432-SAS/XML_generator/test
MacRenting 21:00-160> diff -w -I '^[[:space:]]*$' test.xml xml/Ticket_6923.xml
3,5c3,5
<   <version>2.15.0</version>
<   <template version="2.15.0" author="Alwin de Jong,Adriaan Renting" changedBy="Adriaan Renting">
<   <description>XML Template generator version 2.15.0</description>
---
>           <version>2.6.0</version>
>           <template version="2.6.0" author="Alwin de Jong" changedBy="Alwin de Jong">
>           <description>XML Template generator version 2.6.0</description>


Now I try to use this in Python:

      cmd   = ["diff", "-w", "-I '^[[:space:]]*$'", "./xml/%s.xml" % name, "test.xml"]
      ## -w ignores differences in whitespace
      ## -I '^[[:space:]]*$' because -B doesn't work for blank lines (on OSX?)
      p     = subprocess.Popen(cmd, stdin=open('/dev/null'),
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)
      logs  = p.communicate()
      diffs = logs[0].splitlines() #stdout
      print "diff reply was %i lines long" % len(diffs)

This doesn't work. I've tried escaping the various bits, like the * and
$, even though with single quotes that should not be needed.
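
A likely cause, offered as a sketch rather than a confirmed diagnosis:
when subprocess.Popen is given a list, no shell is involved, so each list
element reaches diff as exactly one argument. The element
"-I '^[[:space:]]*$'" therefore arrives as a single argument that still
contains the space and the literal single quotes, which is not what the
shell passes on the command line. The helper names build_diff_cmd and
run_diff below are hypothetical; the options and the pattern are the ones
from the post.

```python
import os
import shlex
import subprocess

# Pattern from the post, without shell quotes: nothing would strip them here.
BLANK_LINE_RE = "^[[:space:]]*$"


def build_diff_cmd(reference, candidate):
    """Return the argv list for diff, one list element per argument."""
    # "-I" and its regular expression are two separate elements.
    return ["diff", "-w", "-I", BLANK_LINE_RE, reference, candidate]


def run_diff(reference, candidate):
    """Run diff on the two files and return its stdout as a list of lines."""
    with open(os.devnull) as devnull:
        p = subprocess.Popen(build_diff_cmd(reference, candidate),
                             stdin=devnull,
                             stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE,
                             universal_newlines=True)
        out, _err = p.communicate()
    return out.splitlines()


# shlex.split mimics how a POSIX shell tokenizes the working command line.
shell_argv = shlex.split(
    "diff -w -I '^[[:space:]]*$' ./xml/Ticket_6923.xml test.xml")
```

Here shell_argv and build_diff_cmd("./xml/Ticket_6923.xml", "test.xml")
are the same list, which is the point of the comparison. Passing the whole
command as one string with shell=True would also let the shell do the
splitting, but the list form avoids the shell entirely.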

I then tried removing the blank lines from the file first:

      import fileinput
      for line in fileinput.FileInput("test.xml",inplace=1):
        if line.rstrip():
          print line

This makes it worse, as it adds an empty line for each line in the
file.
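
The extra empty lines are most likely because each line read by fileinput
still ends with its own newline, and print then appends a second one. A
minimal sketch of the same in-place filter that writes each line
unchanged, assuming Python 3 (the function name strip_blank_lines is
hypothetical):

```python
import fileinput
import sys


def strip_blank_lines(path):
    """Rewrite path in place, dropping empty and whitespace-only lines."""
    # With inplace=True, fileinput redirects sys.stdout into the file.
    with fileinput.input(files=[path], inplace=True) as stream:
        for line in stream:
            if line.strip():
                # The line keeps its own trailing newline; add nothing.
                sys.stdout.write(line)
```

In Python 2 the equivalent change to the loop above is a trailing comma,
print line, which suppresses the newline that print would otherwise add.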

I've tried various other options. The only thing I can think of is
ditching Python and rewriting the whole script in Bash.
(That would be quite complicated, as it loops over various things and
prints some formatted output in between, and I'm not very fluent in Bash.)

Any suggestions?

Thanks for any help provided.

Adriaan Renting.


Adriaan Renting        | Email: renting at astron.nl
Software Engineer Radio Observatory
ASTRON                 | Phone: +31 521 595 100 (797 direct)
P.O. Box 2             | GSM:   +31 6 24 25 17 28
NL-7990 AA Dwingeloo   | FAX:   +31 521 595 101
The Netherlands        | Web: http://www.astron.nl/~renting/




