[Tutor] Faster procedure to filter two lists . Please help

Max Noel maxnoel_fr at yahoo.fr
Sat Jan 15 01:04:16 CET 2005


On Jan 14, 2005, at 23:28, kumar s wrote:

>>>> for i in range(len(what)):
> 	ele = split(what[i],'\t')
> 	cor1 = ele[0]
> 	for k in range(len(my_report)):
> 		cols = split(my_report[k],'\t')
> 		cor = cols[0]
> 		if cor1 == cor:
> 			print cor+'\t'+ele[1]+'\t'+cols[1]+'\t'+cols[2]
>
>
> 164:623	6649	TCATGGCTGACAACCCATCTTGGGA	
> 484:11	6687	ATTATCATCACATGCAGCTTCACGC	
> 490:339	6759	GAATGGGGCCGCCAGAACACAGACA	
> 247:57	6880	AGTCCTCGTGGAACTACAACTTCAT	
> 113:623	6901	TCATGGGTGTTCGGCATGACCCCAA	

	Okay, so the idea is that the first column of each row is a key, and 
you want to display only the rows of what whose key also appears as the 
first column (the key) of some row in my_report, right?

	As Danny said, you should use dictionaries for this, with a structure 
along the lines of:

what = {
    '164:623': '6649\tTCATGGCTGACAACCCATCTTGGGA',
    '484:11': '6687\tATTATCATCACATGCAGCTTCACGC',
    '490:339': '6759\tGAATGGGGCCGCCAGAACACAGACA',
    # ... and so on for the remaining rows
}
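
	A minimal sketch of how such a dictionary could be built from the raw 
tab-separated lines and then used for the lookup. The helper names 
index_by_key and join_rows and the contents of the what list are made up 
for illustration; the other rows come from the quoted sample and are 
assumed here to belong to my_report. It also assumes each key occurs only 
once in my_report (a dictionary keeps just the last row for a repeated 
key). The single-argument print(...) form is used so that it should run 
under both Python 2 and Python 3.

```python
# Rows from the quoted sample, assumed here to be my_report.
my_report = [
    "164:623\t6649\tTCATGGCTGACAACCCATCTTGGGA",
    "484:11\t6687\tATTATCATCACATGCAGCTTCACGC",
    "490:339\t6759\tGAATGGGGCCGCCAGAACACAGACA",
]

# Hypothetical rows for 'what' (key, then one more column); not real data.
what = [
    "484:11\tprobe_A",
    "999:1\tprobe_B",
    "164:623\tprobe_C",
]


def index_by_key(lines):
    """Map the first tab-separated column of each line to the other columns."""
    table = {}
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        table[cols[0]] = cols[1:]
    return table


def join_rows(what, my_report):
    """Return merged rows for every line of 'what' whose key is in my_report."""
    report = index_by_key(my_report)  # built once, before the loop
    joined = []
    for line in what:
        ele = line.rstrip("\n").split("\t")
        cols = report.get(ele[0])  # hash lookup instead of an inner loop
        if cols is not None:
            # Same columns as the original print: key, ele[1], cols[1], cols[2]
            joined.append("\t".join([ele[0], ele[1], cols[0], cols[1]]))
    return joined


for row in join_rows(what, my_report):
    print(row)
```

	Each list is walked only once here, so the work grows with 
len(what) + len(my_report) rather than with their product.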


	Lacking that, as Danny said, nested loops are a huge time sink. Also, 
you should avoid index-based, C-style for loops such as 
"for i in range(len(what))" -- Python-style for loops over the items 
themselves (equivalent to Perl's foreach) are much more elegant (and 
probably faster) in that case. Here's how I would do it with your data 
structures (warning, untested code, test before use):

# First, create a list where each element is one of the keys in my_report.
# Also, strings have a split() method, which by default splits on any
# whitespace (tabs included).
headers = [element.split()[0] for element in my_report]

for element in what:
    # Okay, the nested loop is still (more or less) there, but it occurs
    # within an 'in' operator, and is therefore executed in C -- much faster.
    if element.split()[0] in headers:
        print element

	Also, it's shorter -- 4 lines, comments aside. Nevertheless, as Danny 
suggested, an approach using dictionaries would blow this away, 
speed-wise.
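
	As a smaller step in the same direction, the list version above could 
keep its shape and only swap the list of keys for a set, which is 
hash-based like a dictionary, so each 'in' test no longer scans every 
key. This is a sketch, not a measured comparison: the rows in what are 
invented for illustration, the other rows are assumed to be my_report, 
and the built-in set type assumes Python 2.4 or later.

```python
# Rows from the quoted sample, assumed here to be my_report.
my_report = [
    "164:623\t6649\tTCATGGCTGACAACCCATCTTGGGA",
    "484:11\t6687\tATTATCATCACATGCAGCTTCACGC",
]

# Hypothetical rows for 'what'; not real data.
what = [
    "484:11\tprobe_A",
    "999:1\tprobe_B",
    "164:623\tprobe_C",
]

# Same keys as the list version, but stored in a set for hash-based lookup.
# Assumes no line is empty, otherwise split()[0] raises IndexError.
headers = set(line.split()[0] for line in my_report)

# Keep only the rows of 'what' whose key appears in my_report.
matches = [line for line in what if line.split()[0] in headers]

for line in matches:
    print(line)
```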


Hope that helps,
-- Max
maxnoel_fr at yahoo dot fr -- ICQ #85274019
"Look at you hacker... A pathetic creature of meat and bone, panting 
and sweating as you run through my corridors... How can you challenge a 
perfect, immortal machine?"


