read file with multiple data per line

Piet van Oostrum piet at cs.uu.nl
Tue Apr 14 07:49:35 EDT 2009


>>>>> Eduardo <edumlopes at gmail.com> (E) wrote:

>E> Hello all,
>E> I googled a lot but couldn't find anything that I could consider a
>E> possible solution (though I am fairly new to the language and I think
>E> this is the main cause of my failure).

>E> This is the beginning of the file I have to parse:

>E>  Modified System
>E>        32728
>E>     2NHST1   C1   56   3.263   2.528  16.345

>E> and this is the end:

>E>     3.6539    6.4644   20.0000

>E> This line has 7 formatted fields [5 digits integer, 5 digits
>E> character, 5 digits character, 5 digits integer, three %8.3f fields]:

>E>     2NHST1   C1   56   3.263   2.528  16.345

>E> and this one has 3 %10.4f fields:
>E>     3.6539    6.4644   20.0000

>E> Those rules cannot be ignored or the programs I use to simulate and
>E> analyze the results won't work.

>E> This file describes the xyz coordinates and atom type of all the atoms
>E> of the "system" I wish to simulate, but I must sort all groups of
>E> molecules together and that's what I planned to do with Python code.
>E> I tried to accomplish this task using Fortran, which is my main coding
>E> skill, but it proved to be unstable, so I decided to handle files
>E> using a more appropriate language while keeping the number
>E> crunching tasks written in Fortran.

As I understand it, the first two lines are special, and after them either
the third line alone or the third and fourth lines together are repeated.

Something like this will parse the lines. After each line you can
process the f* variables.

# Field widths follow the format you describe: 5+5+5+5+8+8+8 = 44
# characters for an atom line and 3*10 = 30 for a line of %10.4f fields.
with open('testinput', 'rt') as inp:
    line1 = inp.readline()  # first special line (title)
    line2 = inp.readline()  # second special line (32728 in your example)

    for line in inp:
        line = line.rstrip('\n')
        if len(line) == 44:
            f1 = int(line[0:5])
            f2 = line[5:10]
            f3 = line[10:15]
            f4 = int(line[15:20])
            f5 = float(line[20:28])
            f6 = float(line[28:36])
            f7 = float(line[36:44])
            print(f1, f2, f3, f4, f5, f6, f7)
        elif len(line) == 30:
            f1 = float(line[0:10])
            f2 = float(line[10:20])
            f3 = float(line[20:30])
            print(f1, f2, f3)
        else:
            print("Sorry, I don't understand this format: %s" % line)
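
Since the format rules cannot be ignored, the parsed fields also have to be
written back with exactly the same widths. The following is only a sketch of
how that could look. The names ATOM_FMT, BOX_FMT, parse_atom, format_atom and
sort_by_group are made up for this example, and I am assuming that "sorting
groups of molecules together" means ordering the atom lines by the second
field. The two character fields are kept exactly as sliced, padding included,
so a plain %5s reproduces them.

```python
# Hypothetical helpers; the widths are taken from the format described above.
ATOM_FMT = "%5d%5s%5s%5d%8.3f%8.3f%8.3f"   # 44 characters in total
BOX_FMT = "%10.4f%10.4f%10.4f"             # 30 characters in total


def parse_atom(line):
    """Split one 44-character atom line into a tuple of seven fields."""
    return (int(line[0:5]), line[5:10], line[10:15], int(line[15:20]),
            float(line[20:28]), float(line[28:36]), float(line[36:44]))


def format_atom(fields):
    """Rebuild the fixed-width line from the seven parsed fields."""
    return ATOM_FMT % tuple(fields)


def sort_by_group(atoms):
    """Order atoms by the second field (assumed to name the group).

    sorted() is stable, so atoms inside one group keep their original
    relative order.
    """
    return sorted(atoms, key=lambda fields: fields[1])
```

You would collect the tuples from parse_atom in a list while reading, call
sort_by_group on it, and write format_atom(fields) + '\n' for each entry,
followed by the last line formatted with BOX_FMT. This sketch does not
renumber the integer fields after sorting; whether that is needed depends on
what your simulation programs expect.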

-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: piet at vanoostrum.org


