Reading then sending new parts of a log file

Dave Angel davea at ieee.org
Wed Jun 24 17:43:20 EDT 2009


Chuck Connors wrote:
> Hey guys.  I'm trying to work up a little program that will send any
> new lines written to a file (log file from my home automation
> software) to me via instant message.  I've gotten the instant message
> sending part figured out using xmpppy.
>
> I've done a few things with Python in the past but I am in no means
> more than a beginner/hacker.  Can someone tell me what commands/
> modules/etc exist for monitoring and parsing a file for new
> information that I can then send to my IM sending function?  I am not
> asking for anyone to write the code but merely a push in the right
> direction.
>
> Thanks!
>
>   
I assume you have a reason not to put the logic into the program that's 
creating the log file.

It would also help to know the Python version and the OS it's running on.

So your problem is to monitor a text file, and detect whenever it grows, 
taking the new parts and incrementally doing something with them.  
Here's a fragment from my tail program:

import os
import time

def getRest(options, filename, oldstat, stat, callback):
    # Note: more can be negative, if the file shrank while we were waiting
    more = stat.st_size - oldstat.st_size
    if more > 0:
        with open(filename, "rb") as infile:    # closes the file when done
            infile.seek(oldstat.st_size)
            # BUGBUG perhaps should break this into multiple reads, if over 8k
            buf = infile.read(more)
        callback(buf)       # process the new text

def follow(options, filename, oldstat, callback):
    while True:
        stat = os.stat(filename)
        if (stat.st_mtime > oldstat.st_mtime
                or stat.st_size != oldstat.st_size):
            getRest(options, filename, oldstat, stat, callback)
            oldstat = stat
        else:
            time.sleep(options.sec_to_wait)


The concept here is that we only do real work when the stat() of a file 
has changed.  Then, if the size is larger than last time, we process the 
new text.

options is an object with various optional attributes.  In this case, I 
think the only one used was sec_to_wait, which is how long we should 
delay before re-checking the stat.  If it's too small, you waste CPU time.
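
For trying this out, options does not need to be anything fancy. Here is a minimal sketch, assuming Python 3.3 or later for types.SimpleNamespace (on older versions a small class with a sec_to_wait attribute does the same job). The name poll_once is hypothetical, not part of my tail program; it is just one pass of the follow() loop, pulled out so it can be run without looping forever:

```python
import os
import tempfile
from types import SimpleNamespace

# Stand-in for the options object; sec_to_wait is the only attribute needed.
options = SimpleNamespace(sec_to_wait=1.0)

def poll_once(filename, oldstat, callback):
    """One pass of the follow() loop; returns the stat to remember next time."""
    stat = os.stat(filename)
    if stat.st_mtime > oldstat.st_mtime or stat.st_size != oldstat.st_size:
        more = stat.st_size - oldstat.st_size
        if more > 0:
            with open(filename, "rb") as infile:
                infile.seek(oldstat.st_size)
                callback(infile.read(more))
    return stat

# Small demonstration on a temporary file.
chunks = []
fd, path = tempfile.mkstemp()
os.close(fd)
oldstat = os.stat(path)             # remember the (empty) starting state
with open(path, "ab") as f:
    f.write(b"door opened\n")       # pretend the logger wrote a line
oldstat = poll_once(path, oldstat, chunks.append)
os.remove(path)
```

After this runs, chunks holds the single new piece of text as bytes. To run the real loop you would build oldstat the same way, with os.stat(filename), just before calling follow().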


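Your callback will have to deal with breaking things into messages, 
probably at convenient line breaks.  Keep in mind that buf arrives as raw 
bytes (the file is opened in binary mode) and may end in the middle of a 
line, so hold on to any trailing partial line until the rest shows up.  
And of course the whole thing might want to be turned inside out, and 
coded as a generator.  But it's a starting point, as you asked.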
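
As a rough idea of what the inside-out version could look like, here is a sketch of a generator that yields one complete line at a time. It is not from my tail program. The name follow_lines and the parameters pos and max_idle_polls are made up for illustration, and the UTF-8 decoding is an assumption about your log file:

```python
import os
import time

def follow_lines(filename, sec_to_wait=1.0, max_idle_polls=None, pos=None):
    """Yield each complete line that appears in filename past offset pos.

    pos=None means start at the current end of the file (only new text).
    max_idle_polls is a hypothetical knob: stop after that many consecutive
    polls with nothing new (None means keep polling forever).
    """
    if pos is None:
        pos = os.stat(filename).st_size
    pending = b""                       # partial line waiting for its newline
    idle = 0
    while max_idle_polls is None or idle < max_idle_polls:
        size = os.stat(filename).st_size
        if size < pos:                  # file shrank; resume from its new end
            pos = size
            pending = b""
        if size > pos:
            with open(filename, "rb") as infile:
                infile.seek(pos)
                data = infile.read(size - pos)
            pos += len(data)
            pending += data
            lines = pending.split(b"\n")
            pending = lines.pop()       # last piece has no newline yet
            for line in lines:
                yield line.rstrip(b"\r").decode("utf-8", "replace")
            idle = 0
        else:
            idle += 1
            time.sleep(sec_to_wait)
```

The caller then becomes a plain for loop over follow_lines(filename), handing each line to whatever function sends the instant message. A line that has not received its newline yet stays in pending until the logger finishes writing it.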




