efficient 'tail' implementation

Nick Craig-Wood nick at craig-wood.com
Thu Dec 8 10:30:04 EST 2005


Gerald Klix <Gerald.Klix at klix.ch> wrote:
>  As long as memory mapped files are available, the fastest
>  method is to map the whole file into memory and use the
>  mapping's rfind method to search for an end of line.

Actually mmap doesn't appear to have an rfind method :-(
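
Later Python releases did add an rfind method to mmap objects, so on
those versions the idea quoted above could be sketched roughly as
follows.  This is only an illustration written for Python 3: the name
tail_rfind is made up here, lines come back as bytes, and "\n" line
endings are assumed.

```python
import mmap
import os

def tail_rfind(filename, nlines):
    """Return the last nlines lines of filename as a list of bytes."""
    if nlines <= 0 or os.path.getsize(filename) == 0:
        # Zero-length files cannot be mapped, and there is nothing to return
        return []
    with open(filename, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            end = len(m)
            # Ignore the newline that terminates the last line, if present
            if m[end - 1:end] == b"\n":
                end -= 1
            pos = end
            for _ in range(nlines):
                # Find the newline that precedes the current line start
                pos = m.rfind(b"\n", 0, pos)
                if pos == -1:
                    break  # the file holds fewer than nlines lines
            # pos is -1 when the start of the file was reached
            return m[pos + 1:end].split(b"\n")
```

Only the bytes of the requested lines are copied out of the mapping, so
the cost depends on the size of the tail rather than on the file.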

Here is a tested solution using mmap, based on your code.  It gets
inefficient if the number of lines to be tailed is large, since each
pass copies and re-splits the whole search window.

import os
import sys
import mmap

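def main(nlines, filename):
    reportFile = open( filename )
    length = os.fstat( reportFile.fileno() ).st_size
    if length == 0:
        # Don't map zero length files, Windows will barf
        reportFile.close()
        return
    try:
        # Unix signature: mmap(fileno, length, flags, prot)
        mapping = mmap.mmap( reportFile.fileno(), length,
                mmap.MAP_PRIVATE, mmap.PROT_READ )
    except AttributeError:
        # Windows has no MAP_PRIVATE: mmap(fileno, length, tagname, access)
        mapping = mmap.mmap(
            reportFile.fileno(),
            0, None,
            mmap.ACCESS_READ )
    search = 1024
    lines = []
    while 1:
        if search > length:
            search = length
        # Copy the last `search` bytes and split them into lines
        tail = mapping[length-search:]
        lines = tail.split(os.linesep)
        # lines[0] may be a partial line and the last item is empty when
        # the file ends with a newline, so nlines + 2 items are needed
        if len(lines) >= nlines + 2 or search == length:
            break
        search *= 2
    mapping.close()
    reportFile.close()
    # Drop the empty item left after the final newline
    if lines[-1] == "":
        lines.pop()
    # Keep the last nlines items (none at all if nlines is 0)
    lines = lines[max(len(lines) - nlines, 0):]
    print "\n".join(lines)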
             
if __name__ == "__main__":
    if len(sys.argv) != 3:
        print "Syntax: %s n file" % sys.argv[0]
    else:
        main(int(sys.argv[1]), sys.argv[2])
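
The same doubling-window idea can also be written as a function for
Python 3, where slices of a mapping are bytes rather than str.  This is
a sketch under a few assumptions, not a drop-in replacement: tail_lines
and its initial parameter are names invented for the example, and lines
are split on "\n" only.

```python
import mmap
import os

def tail_lines(filename, nlines, initial=1024):
    """Return the last nlines lines of filename as a list of bytes."""
    with open(filename, "rb") as f:
        length = os.fstat(f.fileno()).st_size
        if length == 0 or nlines <= 0:
            return []  # zero-length files cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapping:
            search = max(1, initial)  # guard against a non-positive window
            while True:
                search = min(search, length)
                # Copy the last `search` bytes and split them into lines
                lines = mapping[length - search:].split(b"\n")
                # lines[0] may be partial and the last piece is empty when
                # the file ends with a newline, hence nlines + 2
                if len(lines) >= nlines + 2 or search == length:
                    break
                search *= 2
    # Drop the empty piece left after the final newline
    if lines[-1] == b"":
        lines.pop()
    return lines[-nlines:]
```

Passing a small initial value, e.g. tail_lines("some.log", 10, initial=64),
forces several doubling passes and is a handy way to exercise the loop;
"some.log" is just a placeholder file name.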

-- 
Nick Craig-Wood <nick at craig-wood.com> -- http://www.craig-wood.com/nick


