Python and stale file handles

tgiles tgiles at gmail.com
Wed Apr 16 22:22:00 EDT 2008


Hi, All!

I started programming in Python again after a hiatus of several
years and have run into a sticky problem that I can't seem to fix,
regardless of how hard I try. It starts with tailing a log file.

Basically, I'm trying to tail a log file and send the contents
elsewhere in the script (here, I call it processor()). My first
iteration below works perfectly fine, as long as the log file itself
(logfile.log) keeps getting written to.

I have a shell script that constantly writes to logfile.log. If I
happen to kill it off and restart it (overwriting the log file with
more entries), then the Python script stops sending anything at all
out.

import os
import sys
import time

def processor(message, address):
    # do something clever here
    pass

# Destination handed to processor(); placeholder value, set as needed
addr = None

# Set the filename and open the file
filename = 'logfile.log'
file = open(filename, 'r')

# Find the size of the file (index 6 is st_size) and move to the end
st_results = os.stat(filename)
st_size = st_results[6]
file.seek(st_size)

while 1:
    where = file.tell()
    line = file.readline()
    if not line:
        # Nothing new yet: wait, then return to the same offset
        time.sleep(1)
        file.seek(where)
    else:
        sys.stdout.write(line)  # already has newline
        data = line
        if not data:
            break
        else:
            processor(data, addr)
            print("Sending message '%s'....." % data)

someotherstuffhere()

===

This is perfectly normal behavior since the same thing happens when I
do a tail -f on the log file. However, I was hoping to build a bit
of cleverness into the Python script, so that it would notice a
change in the log file and compensate for it.
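
To make the stall concrete, here is a minimal, self-contained sketch of
what the reader sees. It assumes the restarted writer truncates the same
file in place (as a shell ">" redirect would). The temporary file and
the name stalled_read_demo are made up for illustration, and binary mode
is used so that offsets are plain byte counts.

```python
import os
import tempfile


def stalled_read_demo():
    """Return (old_offset, new_size, line) for a reader that was parked
    at the end of a file which is then truncated and rewritten."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        # Writer: the original log contents (3 lines, 33 bytes).
        with open(path, "wb") as writer:
            writer.write(b"old line 1\nold line 2\nold line 3\n")

        # Reader: open and jump to the end, as the tailing script does.
        reader = open(path, "rb")
        reader.seek(os.stat(path).st_size)
        old_offset = reader.tell()

        # Writer restarts: mode "wb" truncates the same file in place
        # and writes fewer bytes than before (9 bytes).
        with open(path, "wb") as writer:
            writer.write(b"new line\n")

        new_size = os.stat(path).st_size
        # The reader's offset is now past the end of the file.
        line = reader.readline()
        reader.close()
        return old_offset, new_size, line
    finally:
        os.remove(path)


old_offset, new_size, line = stalled_read_demo()
```

Here the reader is parked at byte 33 while the rewritten file is only
9 bytes long, so readline() returns an empty string and keeps doing so
until the file grows past the old offset.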

So, I wrote up a new script that opens the file to begin with,
takes a quick measurement of the file's size (to see if it's
suddenly stuck), and then reopens the log file if there's something
dodgy going on.

However, it's not quite working the way that I really intended it to.
It will either start reading the file from the beginning (instead of
tailing from the end) or just sit there confuzzled until I kill it
off.

===


import time, os

filename = 'logfile.log'

def processor(message):
    # do something clever here
    pass

def checkfile(file, filename):
    # Measure the file size twice, five seconds apart
    print("checking file, first pass")
    pass1 = os.stat(filename)
    pass1_size = pass1[6]

    time.sleep(5)

    print("file check, 2nd pass")
    pass2 = os.stat(filename)
    pass2_size = pass2[6]
    if pass1_size == pass2_size:
        # Size did not change: assume the file is stuck and reopen it
        print("reopening file")
        file.close()
        file = open(filename, 'r')
    else:
        print("file is OK")
    # Hand back the (possibly reopened) file so the loop keeps using it
    return file


# Open the file once up front; checkfile() may replace this handle
file = open(filename, 'r')

while 1:
    file = checkfile(file, filename)
    where = file.tell()
    line = file.readline()
    print("reading file %d" % where)
    if not line:
        print("sleeping here")
        time.sleep(5)
        print("seeking file here")
        file.seek(where)
    else:
        # line already has a newline
        data = line
        print("readying line")
        if not data:
            print("no data, breaking here")
            break
        else:
            print("sending line")
            processor(data)
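
For comparison, a minimal sketch of a different staleness check: instead
of comparing two sizes five seconds apart, compare the open handle
(os.fstat) against the path (os.stat). The helper names is_stale and
read_new_lines are hypothetical, the sketch assumes a platform where
st_dev and st_ino identify a file, and it has not been tried against the
shell script described above.

```python
import os


def is_stale(handle, path):
    """Return True if `path` no longer matches what `handle` is reading."""
    try:
        on_disk = os.stat(path)
    except OSError:
        # The path is missing (for example mid-restart): treat as stale.
        return True
    opened = os.fstat(handle.fileno())
    # Case 1: the path now names a different file (new device or inode).
    replaced = (on_disk.st_dev, on_disk.st_ino) != (opened.st_dev, opened.st_ino)
    # Case 2: same file, but it was truncated below our read offset.
    truncated = on_disk.st_size < handle.tell()
    return replaced or truncated


def read_new_lines(handle, path):
    """Return (handle, lines): complete new lines, reopening first if stale."""
    if is_stale(handle, path):
        try:
            fresh = open(path, "rb")
        except (IOError, OSError):
            # Nothing to reopen yet: keep the old handle and try again later.
            return handle, []
        handle.close()
        handle = fresh  # a fresh handle starts at offset 0
    lines = []
    while True:
        where = handle.tell()
        line = handle.readline()
        if not line.endswith(b"\n"):
            # Either no data or a half-written line: rewind and stop.
            handle.seek(where)
            break
        lines.append(line)
    return handle, lines
```

After a reopen this reads from the start of the file, on the assumption
that everything in a freshly written log is new. It cannot notice a file
that was truncated and then regrew past the old offset between two
checks, so it may not cover every case.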

So, have any thoughts on how to keep a Python script from bugging out
after a tailed file has been refreshed? I'd love to hear any thoughts
you may have on the matter, even if it's of the 'that's the way things
work' variety.

Cheers, and thanks in advance for any ideas on how to get around the
issue.

tom


