[Tutor] File handling: open a file at specified byte?

Mon Feb 20 02:09:25 CET 2006

Look at the file object's tell() and seek() methods.

tell() returns the current byte position in the file, and 
seek() moves to a specific byte position.
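
As a rough sketch of how the two methods could be combined for the 
"bookmark" idea, assuming the offset is kept in a small side file: 
the names load_offset and read_new_lines and the bookmark file are 
made up for illustration, and treating a log that is shorter than 
the bookmark as "rotated" is an assumption, not something tell() or 
seek() do for you.

```python
import os


def load_offset(bookmark_path):
    """Return the saved byte offset, or 0 if no bookmark exists yet."""
    try:
        with open(bookmark_path) as f:
            return int(f.read().strip() or 0)
    except FileNotFoundError:
        return 0


def read_new_lines(log_path, bookmark_path):
    """Return the complete lines added since the last call (hypothetical helper)."""
    offset = load_offset(bookmark_path)

    # Assumption: a file shorter than the bookmark was rotated, so start over.
    if os.path.getsize(log_path) < offset:
        offset = 0

    lines = []
    # Binary mode, so that positions are plain byte counts.
    with open(log_path, "rb") as log:
        log.seek(offset)              # jump straight to the bookmark
        while True:
            line = log.readline()
            if not line.endswith(b"\n"):
                break                 # end of file, or a half-written last line
            lines.append(line)
            offset = log.tell()       # position just past the last full line

    # Save the new bookmark for the next run.
    with open(bookmark_path, "w") as f:
        f.write(str(offset))
    return lines
```

Note that tell() already gives the position of the next unread byte, 
so after reading 845673231 bytes you would seek() to 845673231 on the 
next run, not 845673232. A half-written final line is left alone here 
and picked up on the following run.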

HTH,

Alan G
Author of the learn to program web tutor
http://www.freenetpages.co.uk/hp/alan.gauld

----- Original Message ----- 
From: "Brian Gustin" <brian at daviesinc.com>
To: <tutor at python.org>
Sent: Monday, February 20, 2006 12:18 AM
Subject: [Tutor] File handling: open a file at specified byte?

> Hi. This is one I can't seem to find a solid answer on:
> 
> First, some background:
> I have a log file that I wrote a Python parser for, and it works 
> great, but in the interest of saving time and memory, and also to be 
> able to read the currently active log file, say every 10 minutes, and 
> update the static file, I was trying to find some way to get 
> Python to do this:
> 
> Open the log file, read lines up to end of file, and *very important* make a 
> note of the bytes read, and stash this somewhere (i.e. "mark" the file),
> and then handle the parsing of said file until all lines have been read 
> and parsed, write the new files, and close the file handle.
> Say, 10 minutes later, for example, the script would then check the 
> bytes read, and *very important* start reading the file *from* the 
> point it marked (i.e. pick up at the point it bookmarked) and read from 
> that point.
> Since the log file will be active (webserver log file) it will have new 
> data to be read, but I don't want to have to read through the *entire* 
> log file all over again just to get to the new data. I want to be able 
> to "bookmark" where the log file was read "up to" last time, and then 
> open the file later at that point.
> 
> My current script works well, but only reads the "day old" log file 
> (post log rotate), and it does very well, parsing as much as 3 GB in as 
> little as 2 minutes if the server isn't heavily loaded when the parser 
> runs. Basically, the webserver runs Tux, which writes a log file for 
> *all* domains on a server, and the script takes the Tux log and parses 
> it, extracting the domain that each log entry is for, and writes a 
> new line into the domain's Apache-format CLF log file (this way we are 
> able to run awstats on individual domains and get relatively accurate 
> stats).
> 
> So, my question is: is there any way to do what I want?
> 
> Open a live log file, read lines to x bytes (say 845673231 bytes), 
> make a note of this, and 10 minutes later open the same file again *AT* 
> 845673232 bytes, starting with the next byte after the bookmarked 
> point, read to end of file, and update the bookmark.
> 
> 
> Thanks for any pointers. Advice appreciated.
> Bri!
> 
> 
>