File Compare with difflib.context_diff

Chris Rebert clp2 at rebertia.com
Wed Mar 18 18:33:42 EDT 2009


On Wed, Mar 18, 2009 at 2:30 PM, JohnV <loftmaster at gmail.com> wrote:
> I have a txt file that gets appended with data over a time event.  The
> data comes from an RFID reader and is dumped to the file by the RFID
> software.  I want to poll that file several times over the time period
> of the event to capture the current data in the RFID reader.
>
> When I read the data I want to be able to compare the current data to
> the data from the last time I read it and only process the data
> appended since the last time it was read.
>
> The first time I read the data, it might look like this:
> AU08JEDD011485H14472402210
> AU08JEDD020163C14472502210
> AU08JEDD005029C14480102210
> AU08JEDD004923H14482002210
> AU08AWOL000799H14483902210
>
> The next time it might look like this (with data appended to it)
> AU08JEDD011485H14472402210
> AU08JEDD020163C14472502210
> AU08JEDD005029C14480102210
> AU08JEDD004923H14482002210
> AU08AWOL000799H14483902210
> AU08AWOL000120H14495902210
> AU08ARPU050241H14511702210
> IF08DRTO008074H14520202210
> IF08DRTO008089H14521102210
> IF08DRTO008077H14553602210
> IF08CHES000023H14594902210
>
> What I want to do is compare the old data (let's say it is saved to a
> file called 'lastdata.txt') with the new data (let's say it is saved to
> a file called 'currentdata.txt') and save the new appended data to a
> variable which I HTTP POST to a website where I process the data for
> display to interested parties.  In the example below I am trying to
> save the new appended data to a file called "out.txt"
>
> I have looked at difflib.context_diff but I cannot get the syntax
> correct.  This is what I have taken from an example from this page
> http://docs.python.org/library/difflib.html.  One thing I do not
> understand is what do I do with: fromfile='before.py',
> tofile='after.py' in the example code.
>
> **********
>
> import sys
> import difflib
>
> sys.stdout = open("out.txt","w")
>
>
> f1 = open(r'C:\Users\Owner\Desktop\lastdata.txt', 'r')
> read_data1 = f1.read()
> f1.close()
>
> f2 = open(r'C:\Users\Owner\Desktop\currentdata.txt', 'r')
> read_data2 = f2.read()
> f2.close()
>
> for line in context_diff(read_data1, read_data2, fromfile='before.py',
> tofile='after.py'):
> sys.stdout.write(line)
>
>
> ***************
>
> for line in context_diff(read_data1, read_data2, fromfile='before.py',
> tofile='after.py'): is the line that causes the syntax error.
>
> I would hope that when the script worked that "out.txt" would have the
> appended data.  I would then copy currentdata.txt to lastdata.txt.  No
> need to clear out the data in currentdata.txt as the next dump will
> overwrite that data.
>
> Any help or insights appreciated, thanks...

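Your version has two problems: "import difflib" does not bring
context_diff into scope (import the name directly, or call
difflib.context_diff), and context_diff expects lists of lines rather
than the single string that read() returns. Completely untested: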

from difflib import context_diff

OLD_PATH = r'C:\Users\Owner\Desktop\lastdata.txt'
NEW_PATH = r'C:\Users\Owner\Desktop\currentdata.txt'

out = open("out.txt", 'w')

old = open(OLD_PATH, 'r')
old_lines = list(old)
old.close()

new = open(NEW_PATH, 'r')
new_lines = list(new)
new.close()

# fromfile/tofile are only labels printed in the diff header
for line in context_diff(old_lines, new_lines, fromfile=OLD_PATH,
                         tofile=NEW_PATH):
    out.write(line)

out.close()
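
Since you only want the appended records rather than the whole diff, here is a minimal sketch of one way to pull them out. It relies on context_diff prefixing inserted lines with '+ '. The helper name appended_lines and the short file labels are my own, purely for illustration, and it assumes every line in both files ends with a newline.

```python
from difflib import context_diff


def appended_lines(old_lines, new_lines):
    """Return the lines that context_diff reports as inserted."""
    added = []
    # The labels only show up in the diff header; any strings will do.
    for line in context_diff(old_lines, new_lines,
                             fromfile='lastdata.txt',
                             tofile='currentdata.txt'):
        # Inserted lines carry a '+ ' prefix; header and context lines do not.
        if line.startswith('+ '):
            added.append(line[2:])
    return added


# Small example using records from the original post.
old_lines = [
    'AU08JEDD004923H14482002210\n',
    'AU08AWOL000799H14483902210\n',
]
new_lines = old_lines + [
    'AU08AWOL000120H14495902210\n',
    'AU08ARPU050241H14511702210\n',
]
print(''.join(appended_lines(old_lines, new_lines)), end='')
```

If the file really is append-only, new_lines[len(old_lines):] should give the same list without difflib at all; the diff is mainly useful if earlier lines can also change.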


Cheers,
Chris

-- 
I have a blog:
http://blog.rebertia.com


