Parsing by Line Data
Eddie Corns
eddie at holyrood.ed.ac.uk
Thu Jun 17 13:43:56 EDT 2004
python1 <python1 at spamless.net> writes:
>Having slight trouble conceptualizing a way to write this script. The
>problem is that I have a bunch of lines in a file, for example:
>01A\n
>02B\n
>01A\n
>02B\n
>02C\n
>01A\n
>02B\n
>.
>.
>.
>The lines beginning with '01' are the 'header' records, whereas the
>lines beginning with '02' are detail. There can be several detail lines
>to a header.
>I'm looking for a way to put the '01' and subsequent '02' line data into
>one list, and to break into another list when the next '01' record is found.
>How would you do this? I'm used to using 'readlines()' to pull the file
>data line by line, but in this case, determining the break-point will
>need to be done by reading the '01' from the line ahead. Would you need
>to read the whole file into a string and use a regex to break where a
>'\n01' is found?
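
def gen_records(src):
    # Collect lines into rec; a '01' header closes the previous record.
    rec = []
    for line in src:
        if line.startswith('01'):
            if rec: yield rec
            rec = [line]
        else:
            rec.append(line)
    # Don't lose the last record when the input runs out.
    if rec: yield rec

# open() rather than file(), which newer Pythons no longer have.
inf = open('input-file')
for record in gen_records(inf):
    do_something_to_list(record)
inf.close()
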
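
In case it helps to see it run, here is a minimal self-contained sketch of the same generator applied to the sample lines from the question. The io.StringIO object is only a stand-in for the real input file, and stripping the newlines for display is my own addition; it assumes Python 3.

```python
import io

def gen_records(src):
    """Yield lists of lines, starting a new list at each '01' header."""
    rec = []
    for line in src:
        if line.startswith('01'):
            # A header closes whatever record was being collected.
            if rec:
                yield rec
            rec = [line]
        else:
            rec.append(line)
    # Emit the final record once the input is exhausted.
    if rec:
        yield rec

# Stand-in for the input file, built from the lines in the question.
sample = io.StringIO("01A\n02B\n01A\n02B\n02C\n01A\n02B\n")

# Strip the trailing newlines only to make the printed result easier to read.
records = [[line.rstrip("\n") for line in rec] for rec in gen_records(sample)]
print(records)
# [['01A', '02B'], ['01A', '02B', '02C'], ['01A', '02B']]
```

No lookahead is needed because each record is only handed out when the next '01' arrives (or the input ends). One thing to watch: if any '02' lines come before the first '01', they come out as a first record with no header.
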
Eddie