efficient 'tail' implementation

Magnus Lycka lycka at carmen.se
Fri Dec 9 08:03:55 EST 2005


s99999999s2003 at yahoo.com wrote:
> hi
> 
> I have a very large file (e.g. over 200 MB), and I am going to use
> Python to code a "tail"
> command to get the last few lines of the file. What is a good algorithm
> for this type of task in Python for very big files?
> Initially, I thought of reading everything into an array from the file
> and just getting the last few elements (lines), but since it's a very big
> file, I don't think that is efficient.
> thanks

To read the last x bytes of a file, you could do:

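 >>> import os
 >>> x = 2000  # number of bytes to read from the end
 >>> f = open('my_big_file', 'rb')  # binary mode, so offsets are byte offsets
 >>> l = os.fstat(f.fileno()).st_size  # total file size in bytes
 >>> f.seek(max(l - x, 0))  # clamp at 0 in case the file is shorter than x
 >>> f.read()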

Maybe that's a start. I didn't try it on anything bigger than 16 MB,
but it was more or less instantaneous at that size.

If you want the last X lines and know that no line is longer than N
bytes (counting the line ending), f.seek(max(l-X*N, 0));
f.readlines()[-X:] should give you what you need... (I think... I
didn't test it.)
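
Here is a rough sketch of that idea wrapped up as a function, in case it
helps. The name tail_lines and its parameters are made up for
illustration. It assumes the file is opened in binary mode and that
max_line_len really is an upper bound on line length in bytes, line
ending included. If that bound is wrong, the first line returned may be
cut off.

```python
import os

def tail_lines(path, num_lines, max_line_len):
    """Return the last num_lines lines of path as a list of bytes objects.

    Assumes no line, including its line ending, is longer than
    max_line_len bytes.
    """
    if num_lines <= 0:
        # Guard: a slice of [-0:] would return every line read.
        return []
    with open(path, 'rb') as f:
        size = os.fstat(f.fileno()).st_size
        # Read only a window at the end of the file that is large
        # enough to hold num_lines lines of maximum length.
        window = num_lines * max_line_len
        f.seek(max(size - window, 0))
        # The first line in the window may be partial; the slice keeps
        # only the last num_lines, which are complete if the length
        # assumption holds.
        return f.readlines()[-num_lines:]
```

Something like tail_lines('my_big_file', 10, 200) would then read at
most 2000 bytes no matter how big the file is. The lines come back as
bytes, so decode them if you need text.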


