efficient text file search.

Mon Sep 11 07:54:18 EDT 2006

"noro" <amit.man at gmail.com> wrote in message
news:1157974937.286653.5500 at h48g2000cwc.googlegroups.com...
> :)
>
> via python...
>
> Luuk wrote:
>> "noro" <amit.man at gmail.com> wrote in message
>> news:1157973527.817462.207420 at h48g2000cwc.googlegroups.com...
>> > Is there a more efficient method to find a string in a text file than:
>> >
>> > f=file('somefile')
>> > for line in f:
>> >    if 'string' in line:
>> >         print 'FOUND'
>> >
>>
>>
>> yes, more efficient would be:
>> grep (http://www.gnu.org/software/grep/)
>

ok, a more serious answer:

some googling turned up the following.
Second paragraph of chapter 14 of http://www.amk.ca/python/2.1/

The speed of line-oriented file I/O has been improved because people 
often complain about its lack of speed, and because it's often been used as 
a naïve benchmark. The readline() method of file objects has therefore been 
rewritten to be much faster. The exact amount of the speedup will vary from 
platform to platform depending on how slow the C library's getc() was, but 
is around 66%, and potentially much faster on some particular operating 
systems. Tim Peters did much of the benchmarking and coding for this change, 
motivated by a discussion in comp.lang.python.

A new module and method for file objects was also added, contributed by Jeff 
Epler. The new method, xreadlines(), is similar to the existing xrange() 
built-in. xreadlines() returns an opaque sequence object that only supports 
being iterated over, reading a line on every iteration but not reading the 
entire file into memory as the existing readlines() method does. You'd use 
it like this:

for line in sys.stdin.xreadlines():
    # ... do something for each line ...
    ...

For a fuller discussion of the line I/O changes, see the python-dev summary 
for January 1-15, 2001 at http://www.amk.ca/python/dev/2001-01-1.html.
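
For what it's worth, on a current Python the same lazy, line-at-a-time
reading is available by looping over the file object itself
(xreadlines() is not available in Python 3). Below is a minimal sketch,
assuming Python 3; the function name contains_string is made up for
illustration and does not come from the article quoted above. It stops
reading at the first matching line instead of scanning the whole file.

```python
def contains_string(path, needle):
    """Return True as soon as any line of the file at `path` contains `needle`.

    `contains_string` is a hypothetical helper name used for illustration.
    """
    with open(path, "r") as f:
        # Iterating over the file object yields one line at a time,
        # so the whole file is never loaded into memory at once.
        # any() stops consuming lines at the first match.
        return any(needle in line for line in f)
```

Called as contains_string('somefile', 'string'), it corresponds to the
loop in the original question, minus the print. Because it works line by
line, a match that spans a line break is not found.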