[Tutor] next element in list
Peter Otten
__peter__ at web.de
Wed Feb 26 13:29:04 CET 2014
rahmad akbar wrote:
> hey guys
>
> i have this file i wish to parse, the file looks something like below.
> there are only four entries here (AaaI, AacLI, AaeI, AagI). the complete
> file contains thousands of entries
>
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> REBASE, The Restriction Enzyme Database http://rebase.neb.com
> Copyright (c) Dr. Richard J. Roberts, 2014. All rights reserved.
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>
> Rich Roberts Jan 30 2014
>
> AaaI (XmaIII) C^GGCCG
> AacLI (BamHI) GGATCC
> AaeI (BamHI) GGATCC
> AagI (ClaI) AT^CGAT
>
>
> the strategy was to mark the string 'Rich Roberts' as the start. i wrote
> the following function. but then i realized i couldn't do something like
> .next() to the var in_file which is a list. so i added a flag start =
> False which will be turned to True once 'Rich Roberts' is found. is there
> any simpler way to move to the next element in the list, like a built in
> method or something like that?
>
> def read_bionet(bionetfile):
>     res_enzime_dict = {}
>     in_file = open(bionetfile, 'r').readlines()
>     start = False
>     for line in in_file:
>         if line.startswith('Rich Roberts'):
>             start = True
>         if start and len(line) >= 10:
>             line = line.split()
>             res_enzime_dict[line[0]] = line[-1]
>     return res_enzime_dict
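
As David says, don't call readlines(), which reads all the lines of the file
into a list. Iterate over the file directly instead. A file object is its own
iterator, so a second for loop picks up where the first one stopped:
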
def read_bionet(bionetfile):
    with open(bionetfile) as in_file:
        # skip header
        for line in in_file:
            if line.startswith("Rich Roberts"):
                break

        # populate dict
        res_enzimes = {}
        for line in in_file:  # continues after the line with R. R.
            if len(line) >= 10:
                parts = line.split()
                res_enzimes[parts[0]] = parts[-1]

    # file will be closed now rather than at
    # the garbage collector's discretion
    return res_enzimes
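
To answer the literal question: if you ever do have a list and want to step
through it with next(), you can wrap it in iter() first. The following is
only a sketch of that idea. The name parse_lines and the sample data are made
up for illustration (the sample is modelled on the excerpt quoted above), and
the parsing logic is the same as in read_bionet():

```python
def parse_lines(lines):
    """Return {enzyme name: recognition site} from a list of lines."""
    # A list is not an iterator, but iter() gives you one that
    # remembers its position between loops.
    it = iter(lines)

    # skip header: consume items up to and including the marker line
    for line in it:
        if line.startswith("Rich Roberts"):
            break

    # the same iterator continues after the marker line
    res_enzimes = {}
    for line in it:
        if len(line) >= 10:
            parts = line.split()
            res_enzimes[parts[0]] = parts[-1]
    return res_enzimes


# hypothetical sample data
sample = [
    "REBASE, The Restriction Enzyme Database\n",
    "\n",
    "Rich Roberts Jan 30 2014\n",
    "\n",
    "AaaI (XmaIII) C^GGCCG\n",
    "AacLI (BamHI) GGATCC\n",
]

it = iter(sample)
first = next(it)   # the built-in next() fetches one item at a time
second = next(it)

result = parse_lines(sample)
```

The built-in next(it) works in Python 2.6+ and in Python 3, unlike the
it.next() method, which exists only in Python 2. For a file you do not need
iter() at all, which is why iterating over the file directly is simpler.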