Extreme newbie.. looking for a push in the right direction. Need to find text in a file and then display

Hans Nowak hnowak at cuci.nl
Tue Aug 28 04:47:47 EDT 2001


>===== Original Message From toflatpy2 at oaktown.org (toflat) =====
>Hey There,
>
>I'm still pretty new to this and hope this question isn't too lame. I've
>been digging around (and learning from) the documentation but just can't
>figure out how to do this.
>
>I have a file with a repeating predictable format.
>
>ie:
>
><name>
>fname lname
>1
>name
>address
>whatever
>yada..
></end name> 12345
>
>So.. the record format is always:
>Line 1. opening <name> tag
>Line 2. record title (most often a first and last name)
>Line 3. a number (this number has no significance)
>Lines 4-?. the record information.
>Last Line. </end name> 12345
>
>I would like to write a script to search the file for a name and then
>display the record.
>
>So it would...
>A> search for the name (record title).
>B> upon finding a match, the line containing the match is saved
>(record title)
>C> the number on the next line is replaced with a blank line in the results
>D> each following line is saved (added to record) up to but not
>including...
>E> the end marker (</end name> 12345)
>
>The script then displays the result to the terminal
>
>ex:
>
>searching the file for a record with "o'maley" in the title.
>source file:
>
><name>
>Frank O'Maley
>1
>Frank O'Maley
>1234 VeryDark Alley
>Dublin, Ireland
>011 123 45 45 9
>yada yada
></end name> 12345
>
>Would return:
>
>Frank O'Maley
>
>Frank O'Maley
>1234 VeryDark Alley
>Dublin, Ireland
>011 123 45 45 9
>yada yada
>
>I've been poring over the documentation and think I understand how to
>use the re module to find what I'm looking for. However, I'm at a loss
>when it comes to actually using what it finds. I don't know how I can
>get it to save the line that contains the search result. And I do not
>know how to get it to keep adding lines (and modifying them) until it
>comes to the end marker.

You probably won't need the re module. First you need to find <name> tags; 
this can be done by looping over the list of lines and ignoring everything but 
the tag. Upon finding one, you read the next line (which contains the first 
and last name) and check it for the name you're looking for.

A very crude example to get you going:

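NAME = "o'maley"  # the text to look for in the record title

f = open("file")  # text mode, so each line is a string
lines = f.readlines()
f.close()

for i in range(len(lines) - 1):
    line = lines[i]
    if line.strip() == "<name>":
        # Aha, found one; the record title is on the next line
        nextline = lines[i+1]
        if match(nextline, NAME):
            # this record starts at line i, which can be saved
            # print the rest of the lines (only the title is shown here)
            print(nextline.strip())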
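
To fill in the "print the rest of the lines" part, here is one possible sketch. It assumes the record layout described in the question (title on the line after <name>, a number on the line after that, and an end marker starting with "</end name>"). The names find_record and SAMPLE are made up for this illustration, and the matching is just a case-insensitive substring test standing in for whatever criteria you really need.

```python
def find_record(lines, name):
    """Return the first record whose title contains `name`, as a list of
    lines, or None if no title matches."""
    wanted = name.lower()
    for i in range(len(lines) - 1):
        if lines[i].strip() != "<name>":
            continue
        title = lines[i + 1].strip()
        if wanted not in title.lower():
            continue
        # Title first, then a blank line in place of the number on line 3.
        record = [title, ""]
        # Everything after the number belongs to the record,
        # up to but not including the end marker.
        for line in lines[i + 3:]:
            if line.strip().startswith("</end name>"):
                break
            record.append(line.rstrip("\r\n"))
        return record
    return None


# The sample record from the question, split into lines.
SAMPLE = """<name>
Frank O'Maley
1
Frank O'Maley
1234 VeryDark Alley
Dublin, Ireland
011 123 45 45 9
yada yada
</end name> 12345
""".splitlines(True)

if __name__ == "__main__":
    record = find_record(SAMPLE, "o'maley")
    if record is not None:
        print("\n".join(record))
```

With a real file you would pass in the result of readlines() instead of SAMPLE. Only the first matching record is returned; collecting all matches would need a list of records instead.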

You will have to write the match function yourself, because I don't know 
exactly on what criteria you are searching and matching. Here's a simple 
example:

>>> import string
>>> def match(line, name):
	idx = string.find(line.lower(), name.lower())
	return (idx >= 0)

>>> match("Fred O'Hara", "o'hara")
1
>>> match("John McDouglas", "dougie")
0

(Yes, I like to mix the string module and string methods, aside... ;-)
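
If you are running Python 3, note that the string.find function is no longer in the string module there, and comparisons give True/False rather than 1/0. A rough equivalent, keeping the same function name and arguments as above, might look like this:

```python
def match(line, name):
    """Case-insensitive substring test: is `name` somewhere in `line`?"""
    return name.lower() in line.lower()


if __name__ == "__main__":
    print(match("Fred O'Hara", "o'hara"))
    print(match("John McDouglas", "dougie"))
```

The "in" operator does the same job as checking that find() returned an index of 0 or more.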

HTH,

--Hans Nowak




