efficient 'tail' implementation

bonono at gmail.com
Thu Dec 8 01:43:00 EST 2005


s99999999s2... at yahoo.com wrote:
> hi
>
> I have a very large file (e.g. over 200 MB), and I am going to use
> Python to code a "tail"
> command to get the last few lines of the file. What is a good algorithm
> for this type of task in Python for very big files?
> Initially, I thought of reading everything into an array from the file
> and just getting the last few elements (lines), but since it's a very big
> file, I don't think that is efficient.
> thanks
I don't think this is a Python-specific issue but a generic problem for
all "file as byte stream" systems. The problem is that a "line" is not a
property of the file but of its content (some big-iron systems use
"records" for lines, which can be addressed in O(1)).

So the simplest approach is to just read and discard lines until you
reach the ones you want.

for line in f:
    if is_what_i_want(line):   # placeholder test, supply your own
        do_something(line)     # placeholder action
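
For the "tail" case specifically, the read-and-discard idea above can be
sketched roughly as follows. This is only an illustration, not something
from the original question: the function name tail_forward and its
parameters are made up here, and it assumes the file is opened in binary
mode with "\n"-terminated lines. It still reads the whole file once, but
never holds more than n lines in memory.

```python
from collections import deque


def tail_forward(path, n=10):
    """Return the last n lines of the file at path (hypothetical helper).

    Reads the file front to back; the bounded deque silently drops
    older lines, so memory use stays proportional to n lines.
    """
    with open(path, "rb") as f:
        return list(deque(f, maxlen=n))
```

Something like tail_forward("big.log", 5) would then give the last five
lines as bytes objects ("big.log" being a stand-in file name).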

If you really want to, you can do the reverse lookup like this:

import os
f.seek(0, os.SEEK_END)   # jump to the end of the file
x = f.tell()             # x is now the file size in bytes

then loop backward byte by byte until you find your stuff. This is quite
cumbersome and may not be faster, depending on your content.
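
A minimal sketch of that backward search, with one deliberate change: it
steps back in blocks instead of single bytes, which is usually a cheaper
way to do the same thing. The name tail_backward and the block_size
parameter are assumptions made for this example, and it assumes lines
end in "\n" and the file is opened in binary mode.

```python
import os


def tail_backward(path, n=10, block_size=4096):
    """Return the last n lines of the file at path (hypothetical helper)."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()          # start at the end of the file
        data = b""
        # Step backward block by block until the buffer holds more than
        # n newlines (so the first wanted line is complete) or we reach
        # the start of the file.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
        return data.splitlines(keepends=True)[-n:]
```

For a 200 MB file this touches only the last few blocks, but as noted
above, whether it actually wins depends on the content (very long lines
mean many blocks get read anyway).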



