Seek the one billionth line in a file containing 3 billion lines.

Terry Reedy tjreedy at udel.edu
Wed Aug 8 17:07:30 EDT 2007


"Marc 'BlackJack' Rintsch" <bj_666 at gmx.net> wrote in message 
news:5htl5qF3md0abU1 at mid.uni-berlin.de...
| On Wed, 08 Aug 2007 09:54:26 +0200, Méta-MCI (MVP) wrote:
|
| > Create a "index" (a file with 3,453,299,000 tuples :
| > line_number + start_byte) ; this file has fix-length lines.
| > slow, OK, but once.
|
| Why storing the line number?  The first start offset is for the first
| line, the second start offset for the second line and so on.

Somewhat ironically, given that the OP's problem stems from variable line
lengths, this requires that the offsets be of fixed length.  On a true 64-bit
OS (not Win64, apparently) with 64-bit ints that would work great.
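
For illustration, here is a minimal sketch of such an offsets-only index in
Python.  It sidesteps the native int size by packing each offset explicitly
as an unsigned 64-bit value with the struct module, so every index entry is
exactly 8 bytes and entry n sits at byte n * 8.  The function names
build_index and read_line, the 0-based line numbering, and the little-endian
layout are assumptions made for this example, not anything proposed in the
thread.

```python
import struct

# One unsigned 64-bit little-endian offset per line: always 8 bytes.
OFFSET = struct.Struct("<Q")


def build_index(data_path, index_path):
    """Write the start offset of every line of data_path to index_path.

    Hypothetical helper: one sequential pass over the data file.
    """
    with open(data_path, "rb") as data, open(index_path, "wb") as index:
        pos = 0
        for line in data:
            index.write(OFFSET.pack(pos))
            pos += len(line)  # binary mode, so len() counts bytes


def read_line(data_path, index_path, n):
    """Return line n (0-based) as bytes, using the fixed-width index."""
    if n < 0:
        raise IndexError("line number out of range")
    with open(index_path, "rb") as index:
        # Entry n lives at a computable position because entries are fixed width.
        index.seek(n * OFFSET.size)
        raw = index.read(OFFSET.size)
        if len(raw) < OFFSET.size:
            raise IndexError("line number out of range")
        (start,) = OFFSET.unpack(raw)
    with open(data_path, "rb") as data:
        data.seek(start)
        return data.readline()
```

With 8-byte entries, an index for the 3,453,299,000 lines mentioned above
would take about 27.6 GB, and building it still costs one full pass over the
data.  After that, fetching any line is two seeks and two short reads.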
