extract certain values from file with re
Scott David Daniels
scott.daniels at acm.org
Fri Oct 6 13:01:50 EDT 2006
bearophileHUGS at lycos.com wrote:
<a fine solution component for the second problem>
Use his solution like this:
    datafile = open(data_file_name, 'r')
    for line in datafile:
        if 'U-Mom' in line:
            print float(line.split("|")[4])
    datafile.close()
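For anyone on a newer interpreter, here is a rough sketch of the same idea in Python 3 syntax. The helper name `u_mom_values` and the sample lines are made up for illustration; the only thing taken from the snippet above is the assumption that the wanted value is the fifth `|`-separated field of a line containing 'U-Mom'.

```python
import io

def u_mom_values(lines):
    """Yield the fifth '|'-separated field of each line mentioning 'U-Mom'."""
    for line in lines:
        if 'U-Mom' in line:
            # float() tolerates the surrounding blanks left by split('|')
            yield float(line.split('|')[4])

# Hypothetical sample data; the real file's layout may well differ.
sample = io.StringIO(
    "| U-Mom | 0.93 | 1.2E-04 | 3.4E-03 | 12345 |\n"
    "| P-Mass | 0.88 | 2.0E-05 | 6.1E-04 | 222 |\n"
)
values = list(u_mom_values(sample))
print(values)
```

With a real file, `with open(data_file_name) as datafile:` around the loop would also take care of closing it.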
For the earlier problem:
    def data_specific(source):
        global headings  # in case some other bit wants to read them
        saw_top = False
        gen = iter(source)
        for line in gen:
            cut = line.split(None, 1)
            if len(cut) > 1 and (cut[0] == 'ITER'
                    and 'GLOBAL ABSOLUTE RESIDUAL' in cut[1]):
                break
        else:
            return
        headings = gen.next().split()  # column headings
        starts = range(11, 74, 9) + range(75, 138, 9)  # for fixed-width
        for line in gen:
            data = line.split()
            if data and data != ['...']:  # suppress blank lines
                if data[0] == '&&&&&&':  # found the terminator
                    break
                assert line[10] == ' ' and line[74] == ' '
                yield [int(line[:10])] + [
                    float(line[n : n+9]) for n in starts]

    datafile = open(data_file_name, 'r')
    for row in data_specific(datafile):
        print row  # or row[headings.index('MASS')] or whatever
    datafile.close()
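The fixed-width step on its own might look like the sketch below, written in Python 3 syntax (where the two `range` objects have to be turned into lists before they can be added). The function name `parse_fixed_width` and the sample row are invented for illustration. The row deliberately lets negative values fill their 9-character columns, which is my assumption about the kind of data where slicing by column is safer than `split()`.

```python
# 14 fields of 9 characters each, with a blank at columns 10 and 74
starts = list(range(11, 74, 9)) + list(range(75, 138, 9))

def parse_fixed_width(line):
    """Return [iteration number, 14 floats] from one fixed-width row."""
    assert line[10] == ' ' and line[74] == ' '
    return [int(line[:10])] + [float(line[n:n + 9]) for n in starts]

# Made-up row in the assumed layout; odd-numbered fields are negative,
# so they use all 9 characters and touch the field before them.
numbers = [(-1) ** i * i * 0.5 for i in range(14)]
fields = ['%9.2E' % x for x in numbers]
row = '%10d' % 7 + ' ' + ''.join(fields[:7]) + ' ' + ''.join(fields[7:])

parsed = parse_fixed_width(row)
print(parsed)
print(len(row.split()))  # fewer than 15 pieces: split() cannot separate touching fields
```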
The general theme here is: don't use re unless it is a good solution.
Sometimes you know which columns things are in, sometimes you know a
separator, sometimes there is a mix, and sometimes you do need a regular
expression. Save re for when you need to do pattern matching.
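As a contrast, here is a small sketch (Python 3 syntax) of a case where re does seem to earn its keep: the numbers sit at no fixed column and behind no single separator. The log line and the `FLOAT` pattern are hypothetical, and the pattern only covers exponent notation, not every float spelling.

```python
import re

# Hypothetical free-form line: no fixed columns, no single separator.
line = "step 12: residual dropped to 3.2e-05 (target 1e-06)"

# A plain substring test is enough to decide whether the line is interesting.
interesting = 'residual' in line

# Pattern matching is only needed to dig the exponent-style numbers out.
FLOAT = re.compile(r'[-+]?\d+(?:\.\d*)?[eE][-+]?\d+')
found = [float(text) for text in FLOAT.findall(line)]
print(interesting, found)
```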
--Scott David Daniels
scott.daniels at acm.org