efficient text file search.

Steve Holden steve at holdenweb.com
Mon Sep 11 11:05:31 EDT 2006


noro wrote:
> Bill Scherer wrote:
> 
>>noro wrote:
>>
>>
>>>Is there a more efficient method to find a string in a text file than:
>>>
>>>f=file('somefile')
>>>for line in f:
>>>   if 'string' in line:
>>>        print 'FOUND'
>>>
>>>?
>>>
>>>BTW:
>>>does "for line in f: " read a block of lines into memory, or does it
>>>simply call f.readline() many times?
>>>
>>>thanks
>>>amit
>>>
>>>
>>
>>If your file is sorted by some key in the data, you can build a very
>>fast binary search with mmap in Python.
> 
> 
> Can you add some more info, or point me to a link? I haven't found
> anything about binary search with mmap() in the Python documentation.
>
> The files are very big...
>
[please don't "top-post": add your latest comments at the end so the 
story reads from the beginning].

I think this is probably not going to help you. A binary search is only 
useful if you want to locate a value in an ordered list. Your original 
posting made it seem like the text you are looking for could appear at 
any position in any line of the file. In that case a binary search 
doesn't do you any good at all (in fact it complicates things and slows 
them down unnecessarily), because you'd still need to look at every line.
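
To make the "you still have to look at everything" point concrete, here 
is a minimal sketch, assuming Python 3 and a non-empty bytes needle. The 
name file_contains is made up for this illustration. It maps the file 
with mmap and searches it for a substring; this is still a linear scan of 
the whole file, not a binary search.

```python
import mmap
import os


def file_contains(path, needle):
    """Return True if the bytes `needle` occur anywhere in the file.

    Illustrative helper (hypothetical name). Assumes `needle` is a
    non-empty bytes object. The search is linear in the file size.
    """
    with open(path, "rb") as f:
        # An empty file cannot be memory-mapped, and it cannot contain
        # a non-empty needle either.
        if os.fstat(f.fileno()).st_size == 0:
            return False
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # mmap.find returns -1 when the needle is not present.
            return mm.find(needle) != -1
```

Usage would look like file_contains('somefile', b'string'). Note that the 
needle may match across a line boundary if it contains a newline, which 
the line-by-line loop in the original question would not do.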

Plus, if the lines are of variable length then you'd need to start by 
creating an index of them, meaning you'd have to go right through the 
file anyway.
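
As a rough sketch of what that index would involve (assuming Python 3, a 
file whose lines are sorted in plain byte order, and a search for a whole 
line rather than a substring), the two steps could look like this. The 
names build_line_index and find_sorted_line are hypothetical. Building 
the index is the full pass over the file mentioned above; only the 
lookups afterwards are logarithmic.

```python
def build_line_index(path):
    """Return the byte offset of the start of every line.

    This reads the entire file once, which is the cost of indexing
    variable-length lines.
    """
    offsets = []
    pos = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(pos)
            pos += len(line)
    return offsets


def find_sorted_line(path, offsets, key):
    """Binary-search a byte-sorted file for a line equal to `key`.

    `key` is bytes without the trailing newline. Returns the matching
    line, or None if no line equals `key`.
    """
    with open(path, "rb") as f:
        lo, hi = 0, len(offsets)
        while lo < hi:
            mid = (lo + hi) // 2
            f.seek(offsets[mid])
            line = f.readline().rstrip(b"\r\n")
            if line < key:
                lo = mid + 1
            elif line > key:
                hi = mid
            else:
                return line
    return None
```

This only pays off when the same file is searched many times for exact 
keys; it does nothing for text that can appear anywhere in a line.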

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb       http://holdenweb.blogspot.com
Recent Ramblings     http://del.icio.us/steve.holden



