Parsing ascii file

Eddie Corns eddie at holyrood.ed.ac.uk
Thu Jun 17 07:18:58 EDT 2004


"diablo" <dlicheri at btinternet.com> writes:

>Hello ,

>I have a file that contains the following data (example) and does NOT have
>any line feeds:

>11    22    33    44    55    66    77    88    99    00    aa    bb    cc
>dd  ....to 128th byte     11    22    33    44    55    66    77    88    99
>00    aa    bb    cc    dd .... and so on

>record 1 starts at 0 and finishes at 128, record 2 starts at 129 and
>finishes at 256 and so on. There can be as many as 5000 records per file. I
>would like to parse the file and retrieve the value of the field at bytes 64-65
>and conduct an arithmetical operation on the field (sum them all up).

>Can I do this with python?

>if I was to use awk it would look something like this :

>cat <filename> | fold -w 128 | awk ' { SUM=SUM + substr($0,64,2) } END
>{print SUM}'

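You can use stdin.read(128) to get consecutive records and slicing to extract
the fields.  Python slices are 0-based and exclude the end index, so the Awk
substr($0,64,2) corresponds to record[63:65].  Something like:

from sys import stdin

total = 0  # avoid shadowing the built-in sum()
while True:
    record = stdin.read(128)  # one fixed-length record per read
    if not record:  # an empty string means end of input
        break
    total += int(record[63:65])  # columns 64-65, counting from 1
print(total)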
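
For use outside a pipeline, the same loop could be wrapped in a function that
takes any file-like object.  This is only a sketch: the name sum_field, its
parameters, the make_record helper and the choice to stop at a short trailing
chunk (such as a final newline) are all assumptions, not part of the original
question.  The default slice [63:65] covers columns 64-65 counted from 1, as
in the Awk one-liner.

```python
import io

def sum_field(stream, record_len=128, start=63, end=65):
    """Sum the integer at record[start:end] of each fixed-length record."""
    total = 0
    while True:
        record = stream.read(record_len)
        if len(record) < record_len:
            # End of input, or a short trailing chunk (e.g. a final newline).
            break
        total += int(record[start:end])
    return total

def make_record(value):
    # Hypothetical test data: a 128-character record that is blank
    # except for a two-character value in columns 64-65.
    return " " * 63 + value + " " * 63

data = io.StringIO(make_record("12") + make_record("30") + "\n")
result = sum_field(data)
print(result)
```

Passing sys.stdin instead of the StringIO object would give the same
behaviour as the loop above, apart from the handling of a short last record.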

Frankly, I'd stick with the Awk version unless it's a pedagogical exercise.
Actually I'd go further and have a script that simply sums up all the numbers
in the input, and add 'cut' to the pipeline to extract the columns first.
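
A minimal sketch of such a summing script, assuming every whitespace-separated
token in the input is an integer.  The function name sum_numbers and the
sample data are made up for illustration.

```python
def sum_numbers(lines):
    """Add up every whitespace-separated integer in an iterable of lines."""
    total = 0
    for line in lines:
        for token in line.split():
            total += int(token)
    return total

# Sample input shaped like the output of "cut -c64-65": one field per line.
fields = ["12\n", "30\n", " 7\n"]
result = sum_numbers(fields)
print(result)
```

Saved as a hypothetical sumall.py that imports sys and ends with
print(sum_numbers(sys.stdin)), it would slot into the pipeline as:

fold -w 128 <filename> | cut -c64-65 | python sumall.py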

Eddie


