search in textfiles

Alex Martelli aleaxit at yahoo.com
Thu May 10 06:52:00 EDT 2001


"Martin Johansson" <045521104 at telia.com> wrote in message
news:0ftK6.9630$sk3.2623766 at newsb.telia.net...
> Hi!
> Are there any useful modules that I can use for searching for keywords in
> textfiles and then present the textfiles that have the keywords in them on
> a homepage?

Sure.  cgi.py is a useful module if you want to wrap this "search and
show filenames" functionality as CGI.  HTMLgen, or a zillion other
ones including my tiny little YAPTU
(see http://aspn.activestate.com/ASPN/Python/Cookbook/Recipe/52305)
are useful modules to generate or templatize HTML output from data
your Python program holds.  fileinput is one useful module to read
many files.  re is useful to build a function to tell you "does
this line/file contain any of these keywords", with ease of matching
in a case-independent way if that's what you want.

The core of the desired functionality might be a small custom
module such as (call it filefinder.py, for example):

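import fileinput
def filesMatching(files, match_func):
    # Return the sorted names of those files in which match_func
    # is true for at least one line.
    filenames = {}    # dictionary used as a set of filenames
    for line in fileinput.input(files):
        if match_func(line):
            filenames[fileinput.filename()] = 1
            # one hit is enough: skip the rest of this file
            fileinput.nextfile()
    result = filenames.keys()
    result.sort()
    return result

import re
def matchAnyOf(keywords):
    # Build a case-insensitive "any of these keywords" matcher;
    # re.escape makes each keyword match literally.
    return re.compile('|'.join(
        map(re.escape,keywords)), re.I
        ).search

def filesContaining(files, keywords):
    return filesMatching(files, matchAnyOf(keywords))

if __name__=='__main__':
    import sys, glob
    # search every .txt file in the current directory for the
    # keywords given on the command line
    result = filesContaining(glob.glob("*.txt"), sys.argv[1:])
    for filename in result:
        print filename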

When run in a freshly-unpacked Python 2.1 distribution,
for example:

D:\Python21>python filefinder.py walk getbuildinfo
NEWS.txt
README.txt

D:\Python21>
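
The module above is written for Python 2.1 (print statement, sorting
the list returned by keys() in place).  For anyone on Python 3, a rough
equivalent might look like the sketch below.  The snake_case names and
the guard for an empty file list are assumptions added for illustration,
not part of the original module.

```python
import fileinput
import re


def files_matching(files, match_func):
    """Return a sorted list of the files having at least one matching line."""
    files = list(files)
    if not files:
        # In Python 3, fileinput reads standard input when handed an
        # empty file list, so return early instead.
        return []
    found = set()
    with fileinput.input(files=files) as stream:
        for line in stream:
            if match_func(line):
                found.add(stream.filename())
                # One hit is enough: skip the rest of this file.
                stream.nextfile()
    return sorted(found)


def match_any_of(keywords):
    """Build a case-insensitive matcher for any of the literal keywords."""
    pattern = re.compile("|".join(map(re.escape, keywords)), re.I)
    return pattern.search


def files_containing(files, keywords):
    """Return the sorted names of files containing any of the keywords."""
    return files_matching(files, match_any_of(keywords))
```

A call such as files_containing(glob.glob("*.txt"), sys.argv[1:])
mirrors the command-line use shown above.  Note that an empty keyword
list produces an empty pattern, which matches every line, so a caller
may want to check for that case first.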


So your CGI script (or whatever) can import filefinder,
call filefinder.filesContaining with the appropriate
list of files to be searched, and keywords to be looked
for, then present the resulting sorted list of
filenames however it wants to.
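
As one possible way to do the presenting, here is a minimal Python 3
sketch that turns such a list of filenames into a small HTML page.
The function name render_results and the page layout are hypothetical,
chosen only for illustration; a templating tool such as the ones
mentioned earlier could do the same job.

```python
from html import escape


def render_results(filenames, keywords):
    """Build a small HTML page listing the files that matched."""
    title = escape("Files containing: " + ", ".join(keywords))
    if filenames:
        # Escape each name so characters like '<' or '&' display literally.
        items = "\n".join(
            "  <li>{}</li>".format(escape(name)) for name in filenames
        )
        body = "<ul>\n{}\n</ul>".format(items)
    else:
        body = "<p>No files matched.</p>"
    return (
        "<html><head><title>{0}</title></head>\n"
        "<body>\n<h1>{0}</h1>\n{1}\n</body></html>"
    ).format(title, body)


# Hypothetical result list, standing in for what filesContaining returns.
print(render_results(["NEWS.txt", "README.txt"], ["walk", "getbuildinfo"]))
```

A CGI script would print a "Content-Type: text/html" header followed
by a blank line before sending this page.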


Alex





