Python and stale file handles
bockman at virgilio.it
Thu Apr 17 08:43:36 EDT 2008
On 17 Apr, 04:22, tgiles <tgi... at gmail.com> wrote:
> Hi, All!
>
> I started back programming Python again after a hiatus of several
> years and run into a sticky problem that I can't seem to fix,
> regardless of how hard I try- it it starts with tailing a log file.
>
> Basically, I'm trying to tail a log file and send the contents
> elsewhere in the script (here, I call it processor()). My first
> iteration below works perfectly fine- as long as the log file itself
> (logfile.log) keeps getting written to.
>
> I have a shell script constantly writes to the logfile.log... If I
> happen to kill it off and restart it (overwriting the log file with
> more entries) then the python script will stop sending anything at all
> out.
>
> import time, os
>
> def processor(message, address):
>     # do something clever here
>
> # Set the filename and open the file
> filename = 'logfile.log'
> file = open(filename, 'r')
>
> # Find the size of the file and move to the end
> st_results = os.stat(filename)
> st_size = st_results[6]
> file.seek(st_size)
>
> while 1:
>     where = file.tell()
>     line = file.readline()
>     if not line:
>         time.sleep(1)
>         file.seek(where)
>     else:
>         print line,  # already has newline
>         data = line
>         if not data:
>             break
>         else:
>             processor(data, addr)
>             print "Sending message '", data, "'....."
>
> someotherstuffhere()
>
> ===
>
> This is perfectly normal behavior since the same thing happens when I
> do a tail -f on the log file. However, I was hoping to build in a bit
> of cleverness in the python script- that it would note that there was
> a change in the log file and could compensate for it.
>
> So, I wrote up a new script that opens the file to begin with,
> attempts to do a quick file measurement of the file (to see if it's
> suddenly stuck) and then reopen the log file if there's something
> dodgy going on.
>
> However, it's not quite working the way that I really intended it to.
> It will either start reading the file from the beginning (instead of
> tailing from the end) or just sit there confuzzled until I kill it
> off.
>
> ===
>
> import time, os
>
> filename = logfile.log
>
> def processor(message):
>     # do something clever here
>
> def checkfile(filename):
>     file = open(filename, 'r')
>     print "checking file, first pass"
>     pass1 = os.stat(filename)
>     pass1_size = pass1[6]
>
>     time.sleep(5)
>
>     print "file check, 2nd pass"
>     pass2 = os.stat(filename)
>     pass2_size = pass2[6]
>     if pass1_size == pass2_size:
>         print "reopening file"
>         file.close()
>         file = open(filename, 'r')
>     else:
>         print "file is OK"
>         pass
>
> while 1:
>     checkfile(filename)
>     where = file.tell()
>     line = file.readline()
>     print "reading file", where
>     if not line:
>         print "sleeping here"
>         time.sleep(5)
>         print "seeking file here"
>         file.seek(where)
>     else:
>         # print line, # already has newline
>         data = line
>         print "readying line"
>         if not data:
>             print "no data, breaking here"
>             break
>         else:
>             print "sending line"
>             processor(data)
>
> So, have any thoughts on how to keep a Python script from bugging out
> after a tailed file has been refreshed? I'd love to hear any thoughts
> you my have on the matter, even if it's of the 'that's the way things
> work' variety.
>
> Cheers, and thanks in advance for any ideas on how to get around the
> issue.
>
> tom
Possibly, restarting the program that writes the log file creates a
new file rather than appending to the old one?

I think you should always reopen the file between the first and the
second pass of your checkfile function, and then:
- if the file has the same size, it is probably the same file (but it
would be better to check the modification time too!), so seek to the
end of it;
- otherwise, it's a new file, so start reading it from the beginning.

To reduce the number of seeks, you could perform checkfile only if you
did not get any data for N cycles.
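Something along these lines might work (just a sketch; the Tail class
and its method names are my own invention, and I also compare the
inode number from os.stat, which is a more reliable "is it the same
file?" test than the size on Unix):

```python
import os


class Tail(object):
    """Follow a log file, surviving the file being replaced or truncated."""

    def __init__(self, filename, seek_end=True):
        self.filename = filename
        self.file = open(filename, 'r')
        if seek_end:
            # start tailing from the current end of the file
            self.file.seek(0, os.SEEK_END)
        self.inode = os.fstat(self.file.fileno()).st_ino

    def _maybe_reopen(self):
        try:
            st = os.stat(self.filename)
        except OSError:
            return  # file briefly missing (e.g. during rotation); try later
        if st.st_ino != self.inode or st.st_size < self.file.tell():
            # different inode -> replaced; smaller than our offset -> truncated.
            # Either way, reopen and read from the beginning.
            self.file.close()
            self.file = open(self.filename, 'r')
            self.inode = os.fstat(self.file.fileno()).st_ino

    def new_lines(self):
        """Return the complete lines written since the last call."""
        self._maybe_reopen()
        return self.file.readlines()
```

The main loop then just calls new_lines() periodically (sleeping when
it returns nothing) and feeds each line to processor(); the reopen
check could equally be done only every N empty cycles, as suggested
above.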
Ciao
-----
FB
More information about the Python-list
mailing list