[Tutor] Problem When Iterating Over Large Test Files

Lee Harr missive at hotmail.com
Thu Jul 19 05:23:22 CEST 2012


>   grep ^TTCTGTGAGTGATTTCCTGCAAGACAGGAATGTCAGT$> with no results

How about:
grep TTCTGTGAGTGATTTCCTGCAAGACAGGAATGTCAGT outfile
Just in case there is some non-printing character in there...
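
If grep still finds nothing, one way to look for hidden characters from Python is to print the repr() of any line that contains the sequence. This is only a rough sketch: the helper name lines_containing and the sample lines are made up for illustration, not taken from your script.

```python
def lines_containing(lines, target):
    """Return (line_number, repr(line)) for each line that contains target."""
    return [(n, repr(line)) for n, line in enumerate(lines, 1) if target in line]

# Hypothetical sample data; the second line hides a carriage return.
sample = ["ACGT\n", "TTCTGTGAG\r\n", "GGGG\n"]
hits = lines_containing(sample, "TTCTGTGAG")
for n, shown in hits:
    # repr() makes characters such as \r or \t visible.
    print(n, shown)
```

With a real file you could pass the open file object in place of the sample list.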

Beyond that ... my guess would be that you are either not reading the file you think you are, or not writing the file you think you are  :o)
out = each.replace('/gzip', '/rem_clusters2')
Seems pretty bulletproof, but maybe just print each and out here to make sure...
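
Something like the following would do for that check. It is a sketch only: the two paths are invented examples and output_path is a name I made up to wrap the replace() call. It also shows the one way that line could surprise you: if '/gzip' is not in the path, replace() quietly returns the string unchanged.

```python
def output_path(each):
    """Derive the output path the same way the replace() line does."""
    return each.replace('/gzip', '/rem_clusters2')

# Hypothetical paths; the second one has no '/gzip' in it.
for each in ['/data/gzip/sample_1.fastq', '/data/other/sample_2.fastq']:
    out = output_path(each)
    print(repr(each), '->', repr(out))
    if out == each:
        # replace() found nothing to replace, so input and output paths match.
        print('  warning: output path equals input path')
```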

Also, I'm curious... Reading your code, I sort of feel like I do when I am listening to a non-native speaker. I always get the urge to throw out the correct "Americanisms" for people -- to help them fit in better. So, I hope it does not make me a jerk, but ...
infile = open(each, 'r') # I'd probably drop the 'r' also...
while not check_for_end_of_file:
reads += 1
head, sep, tail = id_line_1.partition(' ') # or, if I'm only using the one thing ...
_, _, meaningful_name = id_line_1.partition(' ') # maybe call it "selector", then ...
if selector in ('1:N:0:', '2:N:0:'):
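
Put together, those suggestions might look roughly like this. It is a sketch under some assumptions: I am guessing that each id line looks like '@name 1:N:0:', I use a plain for loop over the file rather than the while loop, and count_selected plus the sample data are invented so the example runs on its own.

```python
import io

def count_selected(infile):
    """Count id lines, and how many have selector '1:N:0:' or '2:N:0:'."""
    reads = 0
    selected = 0
    for id_line_1 in infile:  # iterating a file yields one line at a time
        reads += 1
        # Keep only the part after the first space; strip the newline first.
        _, _, selector = id_line_1.rstrip('\n').partition(' ')
        if selector in ('1:N:0:', '2:N:0:'):
            selected += 1
    return reads, selected

# Hypothetical input: one id line per record, for illustration only.
sample = io.StringIO("@read1 1:N:0:\n@read2 1:Y:0:\n@read3 2:N:0:\n")
result = count_selected(sample)
print(result)  # (3, 2)
```

Looping over the file object directly reads one line at a time, so it should cope with large files without a separate end-of-file flag.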

Hope this helps.