[FEEDBACK] Is this script efficient...is there a better way?

Sean 'Shaleh' Perry shalehperry at attbi.com
Wed Sep 11 17:53:23 EDT 2002


On Wednesday 11 September 2002 14:01, Bob X wrote:
>
> # build list of keywords
> kw = [ "some", "words" ]
>
> # loop through the list and print the lines to a file
> for line in inFile.readlines():
>      for badword in kw:
>          if line.find(badword) > -1:
>              result = '%s %s' % (badword, line)
>              print result            # Print the result
>              outFile.write(result)   # Write the result
>
> # close the files
> inFile.close()
> outFile.close()
>
> # let me know when it's done
> print "Finished processing file..."

1) readlines() loads the entire file into a list, so with a 30+ MB file you 
have just used at least 30 MB of memory.  Try xreadlines() instead; it reads 
the file one line at a time and is much more memory friendly.

2) Do you expect to find more than one keyword in a particular line?  If not, 
you could save some iterations by breaking out of the inner line.find() loop 
as soon as one keyword is found.
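
A minimal sketch of both suggestions together, not a drop-in replacement for 
the original script.  It assumes the file object is iterated directly, which 
is also lazy and is used here because xreadlines() no longer exists on 
current Python versions.  The helper name filter_lines and the in-memory 
sample data are hypothetical, and the on-screen print from the original is 
left out to keep the example short.

```python
import io


def filter_lines(in_file, out_file, keywords):
    """Write 'keyword line' for every line that contains a keyword.

    Hypothetical helper: the name and signature are not from the
    original script.
    """
    matches = 0
    # Iterating over the file object yields one line at a time, so the
    # whole file is never held in memory at once.
    for line in in_file:
        for badword in keywords:
            if line.find(badword) > -1:
                out_file.write('%s %s' % (badword, line))
                matches += 1
                break  # first keyword found; skip the remaining keywords
    return matches


# In-memory stand-ins for the real input and output files (assumed data).
source = io.StringIO("some text here\nnothing to see\nmore words and some\n")
sink = io.StringIO()
count = filter_lines(source, sink, ["some", "words"])
```

Note that with the break in place a line containing several keywords is 
written only once, tagged with the first keyword that matched.  Drop the 
break if every match should be reported.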

As a final comment, be aware that the more keywords you look for, the slower 
this will be, since every line is checked against every keyword.  With this 
approach there is no way around that; it is just something to keep in mind.
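
To make the cost concrete, here is a rough sketch that counts how many 
line.find() calls the nested loop makes when there is no early exit.  The 
function name count_checks and the sample lines are made up for 
illustration.

```python
def count_checks(lines, keywords):
    """Count the find() calls made by the nested loop with no early exit."""
    checks = 0
    for line in lines:
        for badword in keywords:
            line.find(badword)  # same test as the original script
            checks += 1
    return checks


# Assumed sample data: three lines, none of which contain a keyword.
lines = ["alpha beta\n", "gamma delta\n", "epsilon\n"]
few = count_checks(lines, ["some", "words"])
many = count_checks(lines, ["some", "words", "more", "terms"])
```

The count is simply the number of lines times the number of keywords, so 
doubling the keyword list doubles the work in the worst case.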



