[Tutor] next element in list

rahmad akbar matbioinfo at gmail.com
Wed Feb 26 14:11:53 CET 2014


David, Peter

roger that and thanks so much!!


On Wed, Feb 26, 2014 at 1:29 PM, Peter Otten <__peter__ at web.de> wrote:

> rahmad akbar wrote:
>
> > hey guys
> >
> > i have this file i wish to parse, the file looks something like below.
> > there are only four entries here (AaaI, AacLI, AaeI, AagI). the complete
> > file contains thousands of entries
> >
> >     =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> >     REBASE, The Restriction Enzyme Database   http://rebase.neb.com
> >     Copyright (c)  Dr. Richard J. Roberts, 2014.   All rights reserved.
> >     =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> >
> > Rich Roberts                                                    Jan 30 2014
> >
> > AaaI (XmaIII)                     C^GGCCG
> > AacLI (BamHI)                     GGATCC
> > AaeI (BamHI)                      GGATCC
> > AagI (ClaI)                       AT^CGAT
> >
> >
> > the strategy was to mark the string 'Rich Roberts' as the start. i wrote
> > the following function. but then i realized i couldn't call something like
> > .next() on the var in_file, which is a list. so i added a flag, start =
> > False, which is set to True once 'Rich Roberts' is found. is there
> > any simpler way to move to the next element in the list, like a built-in
> > method or something like that?
> >
> > def read_bionet(bionetfile):
> >   res_enzime_dict = {}
> >   in_file = open(bionetfile, 'r').readlines()
> >   start = False
> >   for line in in_file:
> >     if line.startswith('Rich Roberts'):
> >       start = True
> >       continue  # don't store the marker line itself
> >     if start and len(line) >= 10:
> >       line = line.split()
> >       res_enzime_dict[line[0]] = line[-1]
> >   return res_enzime_dict
>
> As David says, don't call readlines(), which reads the lines of the file
> into a list; iterate over the file directly:
>
> def read_bionet(bionetfile):
>     with open(bionetfile) as in_file:
>         # skip header
>         for line in in_file:
>             if line.startswith("Rich Roberts"):
>                 break
>
>         # populate dict
>         res_enzimes = {}
>         for line in in_file: # continues after the line with R. R.
>             if len(line) >= 10:
>                 parts = line.split()
>                 res_enzimes[parts[0]] = parts[-1]
>
>         # file will be closed now rather than at
>         # the garbage collector's discretion
>
>     return res_enzimes
>
>
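
On the original question about lists: a list has no .next() method, but the
built-in iter() returns an iterator over it, and the built-in next() advances
that iterator. A for loop over the same iterator then resumes wherever the
previous loop stopped, so the skip-then-parse pattern above should also work
on a list. This is only a minimal sketch: parse_enzymes, marker and sample
are names made up for illustration, and the len(parts) >= 2 check is a
substitution for the len(line) >= 10 test used in the thread.

```python
def parse_enzymes(lines, marker="Rich Roberts"):
    """Map enzyme name -> recognition site from any iterable of lines."""
    it = iter(lines)  # works for a list, an open file, or any other iterable

    # Consume lines up to and including the marker line.
    for line in it:
        if line.startswith(marker):
            break

    enzymes = {}
    for line in it:  # resumes right after the marker line
        parts = line.split()
        if len(parts) >= 2:  # assumption: skip blank or one-word lines
            enzymes[parts[0]] = parts[-1]
    return enzymes


# Hypothetical sample data shaped like the excerpt in the thread.
sample = [
    "REBASE, The Restriction Enzyme Database\n",
    "\n",
    "Rich Roberts                      Jan 30 2014\n",
    "\n",
    "AaaI (XmaIII)                     C^GGCCG\n",
    "AacLI (BamHI)                     GGATCC\n",
]
result = parse_enzymes(sample)
```

If the marker never appears, the first loop uses up the iterator and the
function returns an empty dict.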



-- 
many thanks
mat