text file parsing (awk -> python)
Peter Otten
__peter__ at web.de
Wed Nov 22 07:55:44 EST 2006
Daniel Nogradi wrote:
> I have an awk program that parses a text file which I would like to
> rewrite in python. The text file has multi-line records separated by
> empty lines and each single-line field has two subfields:
>
> node 10
> x -1
> y 1
>
> node 11
> x -2
> y 1
>
> node 12
> x -3
> y 1
>
> and this I would like to parse into a list of dictionaries like so:
>
> mydict[0] = { 'node':10, 'x':-1, 'y':1 }
> mydict[1] = { 'node':11, 'x':-2, 'y':1 }
> mydict[2] = { 'node':12, 'x':-3, 'y':1 }
>
> But the names of the fields (node, x, y) keep changing from file to
> file, and even their number is not fixed; sometimes it is (node, x, y, z).
>
> What would be the simplest way to do this?
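data = """node 10
x -1
y 1

node 11
x -2
y 1

node 12
x -3
y 1
"""

def open(filename):
    # Stand-in for the built-in open() so the example runs without a file.
    # cStringIO is Python 2; on Python 3 use io.StringIO instead.
    from cStringIO import StringIO
    return StringIO(data)

# Per-field converters; fields not listed here are kept as strings.
converters = dict(
    x=int,
    y=int
)

def name_value(line):
    # Split "name value" on the first run of whitespace only.
    name, value = line.split(None, 1)
    return name, converters.get(name, str.rstrip)(value)

if __name__ == "__main__":
    from itertools import groupby
    records = []

    # groupby yields alternating runs of blank and non-blank lines;
    # each non-blank run is one record.
    for empty, record in groupby(open("records.txt"), key=str.isspace):
        if not empty:
            records.append(dict(name_value(line) for line in record))

    import pprint
    pprint.pprint(records)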
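
Since the field names change from file to file, a fixed converters table may not list every field in advance. Below is a minimal sketch of one alternative, assuming Python 3 and assuming that any value which parses as an integer should become an int. The helper names convert and parse_records are made up for illustration and are not part of the code above. Lines with a name but no value are not handled.

```python
from itertools import groupby


def convert(value):
    """Return value as an int if it parses as one, else as a stripped string."""
    value = value.strip()
    try:
        return int(value)
    except ValueError:
        return value


def parse_records(lines):
    """Turn blank-line-separated "name value" lines into a list of dicts."""
    records = []
    # Same grouping idea as above: runs of non-blank lines form one record.
    for empty, group in groupby(lines, key=str.isspace):
        if not empty:
            fields = (line.split(None, 1) for line in group)
            records.append({name: convert(value) for name, value in fields})
    return records


# Hypothetical sample input; the second record has an extra field z.
sample = "node 10\nx -1\ny 1\n\nnode 11\nx -2\ny 1\nz 4\n"
print(parse_records(sample.splitlines(keepends=True)))
```

With this variant 'node' also comes out as an int, which matches the output asked for in the question, at the cost of no longer being able to keep a numeric-looking field as a string.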