Python and stale file handles

bockman at virgilio.it bockman at virgilio.it
Thu Apr 17 08:43:36 EDT 2008


On 17 Apr, 04:22, tgiles <tgi... at gmail.com> wrote:
> Hi, All!
>
> I started programming in Python again after a hiatus of several
> years and ran into a sticky problem that I can't seem to fix,
> regardless of how hard I try- it starts with tailing a log file.
>
> Basically, I'm trying to tail a log file and send the contents
> elsewhere in the script (here, I call it processor()). My first
> iteration below works perfectly fine- as long as the log file itself
> (logfile.log) keeps getting written to.
>
> I have a shell script that constantly writes to the logfile.log... If I
> happen to kill it off and restart it (overwriting the log file with
> more entries) then the python script will stop sending anything at all
> out.
>
> import time, os
>
> def processor(message,address):
>         #do something clever here
>
> #Set the filename and open the file
> filename = 'logfile.log'
> file = open(filename,'r')
>
> #Find the size of the file and move to the end
> st_results = os.stat(filename)
> st_size = st_results[6]
> file.seek(st_size)
>
> while 1:
>     where = file.tell()
>     line = file.readline()
>     if not line:
>         time.sleep(1)
>         file.seek(where)
>     else:
>         print line, # already has newline
>         data = line
>         if not data:
>             break
>         else:
>                 processor(data,addr)
>                 print "Sending message '",data,"'....."
>
> someotherstuffhere()
>
> ===
>
> This is perfectly normal behavior since the same thing happens when I
> do a tail -f on the log file. However, I was hoping to build in a bit
> of cleverness in the python script- that it would note that there was
> a change in the log file and could compensate for it.
>
> So, I wrote up a new script that opens the file to begin with,
> attempts to do a quick file measurement of the file (to see if it's
> suddenly stuck) and then reopen the log file if there's something
> dodgy going on.
>
> However, it's not quite working the way that I really intended it to.
> It will either start reading the file from the beginning (instead of
> tailing from the end) or just sit there confuzzled until I kill it
> off.
>
> ===
>
> import time, os
>
> filename = 'logfile.log'
>
> def processor(message):
>     # do something clever here
>
> def checkfile(filename):
>     file = open(filename,'r')
>     print "checking file, first pass"
>     pass1 = os.stat(filename)
>     pass1_size = pass1[6]
>
>     time.sleep(5)
>
>     print "file check, 2nd pass"
>     pass2 = os.stat(filename)
>     pass2_size = pass2[6]
>     if pass1_size == pass2_size:
>         print "reopening file"
>         file.close()
>         file = open(filename,'r')
>     else:
>         print "file is OK"
>         pass
>
> while 1:
>     checkfile(filename)
>     where = file.tell()
>     line = file.readline()
>     print "reading file", where
>     if not line:
>         print "sleeping here"
>         time.sleep(5)
>         print "seeking file here"
>         file.seek(where)
>     else:
>         # print line, # already has newline
>         data = line
>         print "readying line"
>         if not data:
>             print "no data, breaking here"
>             break
>         else:
>             print "sending line"
>             processor(data)
>
> So, have any thoughts on how to keep a Python script from bugging out
> after a tailed file has been refreshed? I'd love to hear any thoughts
> you may have on the matter, even if it's of the 'that's the way things
> work' variety.
>
> Cheers, and thanks in advance for any ideas on how to get around the
> issue.
>
> tom

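Possibly, restarting the program that writes the log file creates a
new file rather than appending to the old one? If so, the handle your
script holds still refers to the old file (or to an offset past the end
of a truncated one), so readline() never sees the new entries.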

I think you should always reopen the file between the first and the
second pass of your checkfile function, and then:
- if the file has the same size, it is probably the same file (but it
  would be better to check the update time!), so seek to the end of it
- otherwise, it's a new file, so start reading it from the beginning
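
The two cases above could be sketched roughly as follows. This is only
an illustration in Python 3 syntax, not tested against your setup:
reopen_for_tail, last_size and last_mtime are names I made up, and
"same size and same update time" is a heuristic for "same file", not a
guarantee.

```python
import os

def reopen_for_tail(filename, last_size, last_mtime):
    """Reopen filename and decide where to resume reading.

    Hypothetical helper: last_size and last_mtime are the values the
    caller recorded from os.stat() at the previous check.
    """
    f = open(filename, 'r')
    st = os.fstat(f.fileno())  # stat the file that was actually opened
    if st.st_size == last_size and st.st_mtime == last_mtime:
        # Probably the same, untouched file: keep tailing from the end.
        f.seek(0, os.SEEK_END)
    else:
        # Size or update time changed: treat it as a new file and
        # read it from the beginning.
        f.seek(0)
    return f, st.st_size, st.st_mtime
```

The caller would keep the returned size and time for the next check.
Note that a file which merely grew is also treated as new here, so
this only makes sense when checkfile runs after a quiet period.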

To reduce the number of seeks, you could perform checkfile only if you
did not get any data for N cycles.
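
A minimal sketch of that "check only after N idle cycles" idea, again
in Python 3 syntax. The Tailer class and its idle_limit parameter are
hypothetical. Instead of comparing sizes between two passes, this
version compares inode numbers and the current read offset, which is a
different test from the one I described above and assumes a Unix-like
system.

```python
import os

class Tailer:
    """Follow a file, re-checking it only after idle_limit empty polls."""

    def __init__(self, filename, idle_limit=3):
        self.filename = filename
        self.idle_limit = idle_limit      # the "N cycles" from the text
        self.idle = 0
        self.f = open(filename, 'rb')     # binary, so tell() is a byte offset
        self.f.seek(0, os.SEEK_END)       # start tailing from the end

    def _check_file(self):
        """Reopen the file if it looks replaced or truncated."""
        try:
            on_disk = os.stat(self.filename)
        except FileNotFoundError:
            return                        # writer has not recreated it yet
        ours = os.fstat(self.f.fileno())
        replaced = (on_disk.st_ino, on_disk.st_dev) != (ours.st_ino, ours.st_dev)
        truncated = on_disk.st_size < self.f.tell()
        if replaced or truncated:
            self.f.close()
            self.f = open(self.filename, 'rb')   # new file: read from the start

    def poll(self):
        """Return the lines added since the last call (possibly none)."""
        lines = self.f.readlines()
        if lines:
            self.idle = 0
            return [line.decode('utf-8', errors='replace') for line in lines]
        self.idle += 1
        if self.idle >= self.idle_limit:
            self.idle = 0
            self._check_file()
        return []
```

In your loop you would call poll(), pass each returned line to
processor(), and sleep for a second or so between calls.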

Ciao
-----
FB


