efficient 'tail' implementation

Gerald Klix Gerald.Klix at klix.ch
Thu Dec 8 02:44:52 EST 2005


As long as memory-mapped files are available, the fastest
method is to map the whole file into memory and use the
mapping's rfind method to search for an end of line.

The following code snippets may be useful:
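     import mmap
     import os

     # Snippet from inside a loop over report files, hence the 'continue'.
     reportFile = open( filename )
     length = os.fstat( reportFile.fileno() ).st_size
     if length == 0:
         # Don't map zero-length files; Windows will barf
         continue
     try:
         # Unix signature: mmap( fileno, length, flags, prot )
         mapping = mmap.mmap( reportFile.fileno(), length,
                 mmap.MAP_PRIVATE, mmap.PROT_READ )
     except AttributeError:
         # No MAP_PRIVATE/PROT_READ on Windows, where the signature is
         # mmap( fileno, length, tagname, access ); length 0 maps the whole file
         mapping = mmap.mmap(
                 reportFile.fileno(),
                 0, None,
                 mmap.ACCESS_READ )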

Then you can use
     mapping.rfind( os.linesep )
to find the end of the next-to-last line, and so on.

This is very fast, because nearly all of the work is done by rfind, which
is implemented in C, and by the OS's paging logic.
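
To make the "and so on" concrete, here is a minimal sketch of a complete
tail built on repeated rfind calls. It is only one way to do it, and it
makes assumptions the snippets above do not: the name tail_mmap and its
parameters are made up for illustration, it targets Python 3 (so it
searches for the byte b"\n" rather than os.linesep), and it uses the
access= keyword form of mmap.mmap instead of the try/except above. On
files with "\r\n" line endings the returned lines keep their trailing "\r".

```python
import mmap
import os


def tail_mmap(path, n=10):
    """Return the last n lines of path as a list of bytes objects."""
    if n <= 0 or os.path.getsize(path) == 0:
        return []  # zero-length files cannot be mapped
    with open(path, "rb") as f:
        # Length 0 maps the whole file.
        mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            end = len(mapping)
            # Skip one trailing newline so it does not count as an extra line.
            if mapping[end - 1:end] == b"\n":
                end -= 1
            pos = end
            for _ in range(n):
                # Search backwards, strictly before the previous hit.
                pos = mapping.rfind(b"\n", 0, pos)
                if pos == -1:
                    break  # fewer than n lines: start at the beginning
            start = pos + 1
            return mapping[start:end].split(b"\n")
        finally:
            mapping.close()
```

Calling tail_mmap(filename, 3) would return the last three lines; only the
pages that rfind actually touches need to be read from disk.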

HTH,
Gerald

bonono at gmail.com wrote:
> Mike Meyer wrote:
> 
>>It would probably be more efficient to read blocks backwards and paste
>>them together, but I'm not going to get into that.
>>
> 
> That actually is a pretty good idea. Just reverse the buffer and do a
> split, the last line becomes the first line and so on. The logic then
> would be no different than reading from beginning of file. Just need to
> keep the last "half line" of the reversed buffer if the wanted one
> happens to be across buffer boundary.
> 
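
For comparison, the block-wise idea quoted above could be sketched roughly
as follows. This is not the exact scheme described there: instead of
reversing the buffer, the sketch simply prepends each block it reads, which
takes care of a line that straddles a block boundary automatically. The
name tail_blocks and the block_size default are assumptions chosen for
illustration, and the code targets Python 3.

```python
import os


def tail_blocks(path, n=10, block_size=4096):
    """Return the last n lines of path, reading blocks from the end."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Keep reading until more than n newlines are in hand, so the
        # first wanted line is known to be complete, or until the start
        # of the file is reached.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    return data.splitlines()[-n:]
```

Unlike the mmap version this works where memory mapping is unavailable, at
the cost of doing the buffer bookkeeping in Python.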


