efficient text file search -solution

Sion Arrowsmith siona at chiark.greenend.org.uk
Tue Sep 12 12:04:49 EDT 2006


noro <amit.man at gmail.com> wrote:
>OK, am not sure why, but
>
>fList=file('somefile').read()
>if fList.find('string') != -1:
>   print 'FOUND'
>
>works much much faster.
>
>it is strange since i thought 'for line in file('somefile')' is
>optimized and reads pages into memory,

Step back and think about what each is doing at a high level of
description: file.read reads the contents of the file into memory
in one go, end of story. file.[x]readlines reads (some or all of)
the contents of the file into memory, does a linear search on it
for end of line characters, and copies out the line(s) into some
new bits of memory. Line-by-line processing has a *lot* more work
to do (unless you're read()ing a really big file which is going to
make heavy demands on memory allocation) and it should be no
surprise that it's slower.
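
To make the comparison concrete, here is a minimal sketch of the two
approaches side by side. It is only an illustration, not a benchmark:
the function names and the make_sample_file helper are made up for
this example, and it uses open() and the "in" operator (current
Python) rather than the file() builtin and find() from the quoted
code.

```python
import os
import tempfile


def found_whole_file(path, needle):
    """Read the whole file into memory, then do one substring search."""
    with open(path) as f:
        return needle in f.read()


def found_line_by_line(path, needle):
    """Split the file into lines and search each line in turn.

    Every line is located, copied into a new string object and
    searched separately, which is the extra work described above.
    """
    with open(path) as f:
        for line in f:
            if needle in line:
                return True
    return False


def make_sample_file(n_lines=1000, needle="string"):
    """Hypothetical helper: write a temp file whose last line holds needle."""
    fd, path = tempfile.mkstemp(suffix=".txt", text=True)
    with os.fdopen(fd, "w") as f:
        for i in range(n_lines):
            f.write("filler line %d\n" % i)
        f.write("here is the %s\n" % needle)
    return path
```

Both functions give the same answer as long as the search text does
not itself contain a newline; a needle that spans two lines can only
be found by the whole-file version. To compare speeds on your own
data, time each call with the timeit module on a file of realistic
size rather than trusting numbers from someone else's machine.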

-- 
\S -- siona at chiark.greenend.org.uk -- http://www.chaos.org.uk/~sion/
  ___  |  "Frankly I have no feelings towards penguins one way or the other"
  \X/  |    -- Arthur C. Clarke
   her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump


