[Tutor] mapping header row to data rows in file

Wed Jun 26 18:10:47 CEST 2013

Sivaram Neelakantan wrote:

> I have a file with 100s of columns going thus
> 
> name age sex ....
> AA   23  M ...
> AB   26  M ....
> 
> while I can read the first row as
> 
> header = file.readline().split()
> 
> how do I get to map each col name in header to the subsequent data rows?
> As in
> name = AA
> age = 23
> sex = M
> 
> when processing the first data record and then refreshing it with the 2nd
> data row after I process it in a loop?  Is the zip function, the way to
> go?

zip() is a good starting point if you want to put the rows into dicts:

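def reader(instream):
    # Split every line on whitespace; the first row holds the column names.
    rows = (line.split() for line in instream)
    names = next(rows)
    # Pair each column name with the value in the same position.
    return (dict(zip(names, values)) for values in rows)

# FILENAME is the path to your data file.
with open(FILENAME, "r") as f:
    for row in reader(f):
        print(row["name"])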
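To see what the dict approach gives you for each record, here is a small self-contained sketch. The name dict_rows, the in-memory sample data, and the blank-line check are my own additions for illustration; they are not part of the code above. Note that all values come back as strings, so convert them yourself if you need numbers.

```python
import io

def dict_rows(instream):
    """Yield one dict per data row, keyed by the names in the header row."""
    rows = (line.split() for line in instream)
    names = next(rows)              # first row holds the column names
    for values in rows:
        if values:                  # assumption: blank lines should be skipped
            yield dict(zip(names, values))

# Made-up sample standing in for the real file.
sample = io.StringIO("name age sex\nAA   23  M\nAB   26  M\n")
records = list(dict_rows(sample))

for record in records:
    # Each pass through the loop sees a fresh dict for the current row.
    print(record["name"], int(record["age"]), record["sex"])
```

Be aware that zip() stops at the shorter of its arguments, so a row with fewer values than the header has columns silently produces a dict with fewer keys.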

If you are sure that the column headers are valid Python identifiers, you can
alternatively use a namedtuple:

from collections import namedtuple

def reader(instream):
    rows = (line.split() for line in instream)
    names = next(rows)
    # Build a tuple class with one attribute per column name.
    Row = namedtuple("Row", names)
    return (Row(*values) for values in rows)

# FILENAME is the path to your data file.
with open(FILENAME, "r") as f:
    for row in reader(f):
        print(row.name)
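If you cannot be sure about the headers, namedtuple accepts rename=True, which replaces invalid or duplicate field names with positional names such as _0, _1 and so on. The sketch below is only an illustration: tuple_rows and the sample data (with a column deliberately called "class", a Python keyword) are made up, and blank lines are skipped as an assumption.

```python
import io
from collections import namedtuple

def tuple_rows(instream):
    """Return an iterator of namedtuples, one per non-blank data row."""
    rows = (line.split() for line in instream)
    names = next(rows)
    # rename=True swaps invalid or duplicate names for _<position>.
    Row = namedtuple("Row", names, rename=True)
    return (Row(*values) for values in rows if values)

# Made-up sample; "class" is a keyword, so it cannot be a field name.
sample = io.StringIO("name age class\nAA 23 X\nAB 26 Y\n")
rows = list(tuple_rows(sample))

for row in rows:
    print(row.name, row.age, row._2)   # third column was renamed to _2
```

Unlike the dict version, a row with too few or too many values raises a TypeError here, which can be a useful check on the file.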

You might also have a look at csv.DictReader, which takes the column names
from the first row and returns each following row as a dict.
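A minimal sketch of the csv.DictReader route, assuming the columns are separated by one or more spaces as in the sample from the question. Setting delimiter=" " together with skipinitialspace=True is one way to treat runs of spaces as a single separator; the sample data is made up, and a file with tabs or trailing spaces would need different handling.

```python
import csv
import io

# Made-up sample standing in for the real file. For a real file, open it
# with open(path, newline="") as the csv module documentation recommends.
sample = io.StringIO("name age sex\nAA   23  M\nAB   26  M\n")

# The first row supplies the field names automatically.
reader = csv.DictReader(sample, delimiter=" ", skipinitialspace=True)
records = [dict(row) for row in reader]

for record in records:
    print(record["name"], record["age"], record["sex"])
```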