read file with multiple data per line

Eduardo edumlopes at gmail.com
Tue Apr 14 12:56:08 EDT 2009


On Apr 14, 12:32 am, Steven D'Aprano
<ste... at REMOVE.THIS.cybersource.com.au> wrote:
> On Tue, 14 Apr 2009 00:15:18 -0700, Eduardo wrote:
> > Hello all,
>
> > I googled a lot but couldn't find anything that I could consider a
> > possible solution (though I am fairly new to the language and I think
> > this is the main cause of my failure).
>
> You haven't actually said what the problem is. What are you having
> trouble doing?
>
> --
> Steven

Sorry about that, Steven. My main problem is devising a way to read all
the content of that file into a dictionary or other structure where I
could group atoms by molecule name.
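
A minimal sketch of that grouping step, using the fixed-width layout
described in the quoted message below. It assumes the second field
(NHST1 in the sample line) is the molecule name, which is only a guess
from the sample. The names parse_atom_line and group_by_molecule are
made up for illustration.

```python
from collections import defaultdict


def parse_atom_line(line):
    """Split one 44-column atom line into its seven fixed-width fields."""
    return (
        int(line[0:5]),        # 5-digit integer
        line[5:10].strip(),    # 5-character field, ASSUMED to be the molecule name
        line[10:15].strip(),   # 5-character field, ASSUMED to be the atom name
        int(line[15:20]),      # 5-digit integer
        float(line[20:28]),    # three %8.3f coordinates
        float(line[28:36]),
        float(line[36:44]),
    )


def group_by_molecule(lines):
    """Return {molecule name: [atom tuples]} for an iterable of file lines."""
    lines = iter(lines)
    next(lines)  # first special line (title)
    next(lines)  # second special line (the count)
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if len(line) != 44:
            continue  # not an atom line, e.g. the final 3-field line
        atom = parse_atom_line(line)
        groups[atom[1]].append(atom)
    return groups


# The sample lines from the original post stand in for the real file.
sample = [
    " Modified System\n",
    "       32728\n",
    "    2NHST1   C1   56   3.263   2.528  16.345\n",
    "    3.6539    6.4644   20.0000\n",
]
groups = group_by_molecule(sample)
```

With a real file, passing the open file object instead of the sample
list should work the same way, since both yield one line at a time.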

On Apr 14, 4:49 am, Piet van Oostrum <p... at cs.uu.nl> wrote:
> >>>>> Eduardo <edumlo... at gmail.com> (E) wrote:
> >E> Hello all,
> >E> I googled a lot but couldn't find anything that I could consider a
> >E> possible solution (though I am fairly new to the language and I think
> >E> this is the main cause of my failure).
> >E> This is the beginning of the file I have to parse:
> >E>  Modified System
> >E>        32728
> >E>     2NHST1   C1   56   3.263   2.528  16.345
> >E> and this is the end:
> >E>     3.6539    6.4644   20.0000
> >E> This line has 7 formatted fields [5-digit integer, 5-character
> >E> string, 5-character string, 5-digit integer, three %8.3f fields]:
> >E>     2NHST1   C1   56   3.263   2.528  16.345
> >E> and this one has 3 %10.4f fields:
> >E>     3.6539    6.4644   20.0000
> >E> Those rules cannot be ignored or the programs I use to simulate and
> >E> analyze the results won't work.
> >E> This file describes the xyz coordinates and atom type of all the atoms
> >E> of the "system" I wish to simulate, but I must sort all groups of
> >E> molecules together and that's what I planned to do with Python code.
> >E> I tried to accomplish this task using Fortran, which is my main coding
> >E> skill, but it proved to be unstable, so I decided to handle files
> >E> using a more appropriate language while keeping the number
> >E> crunching tasks written in Fortran.
>
> I understand that the first two lines are special and that the third
> line, or the third and fourth lines are repeated.
>
> Something like this will parse the lines. After each line you can
> process the f* variables.
>
> inp = open('testinput', 'rt')
>
> line1 = inp.readline()
> line2 = inp.readline()
>
> for line in inp:
>     line = line.rstrip('\n')
>     if len(line) == 44:
>         f1 = int(line[0:5])
>         f2 = line[5:10]
>         f3 = line[10:15]
>         f4 = int(line[15:20])
>         f5 = float(line[20:28])
>         f6 = float(line[28:36])
>         f7 = float(line[36:44])
>         print("%s %s %s %s %s %s %s" % (f1, f2, f3, f4, f5, f6, f7))
>     elif len(line) == 30:
>         f1 = float(line[0:10])
>         f2 = float(line[10:20])
>         f3 = float(line[20:30])
>         print("%s %s %s" % (f1, f2, f3))
>     else:
>         print("Sorry, I don't understand this format: %s" % line)
>
> --
> Piet van Oostrum <p... at cs.uu.nl>
> URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
> Private email: p... at vanoostrum.org

Thank you very much, Piet. I will try your suggestion.
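
Since the output has to keep exactly the same column layout, here is a
rough sketch of how parsed records might be sorted and written back. The
format strings are taken from the field description in the original
post. It assumes the second field is the molecule name and that the
character fields are right-justified, as they appear in the sample line.
ATOM_FORMAT, LAST_LINE_FORMAT and the helper names are hypothetical.

```python
# Fixed-width layouts from the original post:
# 5-digit int, two 5-character fields, 5-digit int, three %8.3f fields.
ATOM_FORMAT = "%5d%5s%5s%5d%8.3f%8.3f%8.3f"
# The final line of the file: three %10.4f fields.
LAST_LINE_FORMAT = "%10.4f%10.4f%10.4f"


def format_atom(atom):
    """Format a 7-field atom tuple back into a 44-column line."""
    return ATOM_FORMAT % atom


def sort_by_molecule(atoms):
    """Bring atoms with the same molecule name (ASSUMED to be field 2) together.

    sorted() is stable, so atoms keep their original order within a group.
    """
    return sorted(atoms, key=lambda atom: atom[1])


# Illustrative records only; the second and third are invented for the demo.
atoms = [
    (2, "NHST1", "C1", 56, 3.263, 2.528, 16.345),
    (1, "ABC", "C1", 57, 1.0, 2.0, 3.0),
    (2, "NHST1", "C2", 58, 3.3, 2.6, 16.4),
]

out_lines = [format_atom(atom) for atom in sort_by_molecule(atoms)]
last_line = LAST_LINE_FORMAT % (3.6539, 6.4644, 20.0)

for text in out_lines + [last_line]:
    print(text)
```

The two special lines at the top of the file would need to be written
out first, unchanged, before the atom lines.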