[Tutor] Faster procedure to filter two lists . Please help

Alan Gauld alan.gauld at freenet.co.uk
Sat Jan 15 10:17:33 CET 2005


> >>>for i in range(len(what)):
> ele = split(what[i],'\t')
> cor1 = ele[0]

It will be faster if you stop using range(len()) in your for loops
and iterate over the list directly:

for item in what:

> for k in range(len(my_report)):
> cols = split(my_report[k],'\t')
> cor = cols[0]

And again:
for item in my_report:

> if cor1 == cor:
> print cor+'\t'+ele[1]+'\t'+cols[1]+'\t'+cols[2]

And it will be faster if you stop adding the strings 
together. Each addition creates a new string. Use the 
formatting operation instead:

print "%s\t%s\t%s\t%s" % (cor,ele[1],cols[1],cols[2])
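Putting those two changes together, the nested loops might look
something like the sketch below. The sample data and the function name
match_lines are made up for illustration, and it uses the string
method item.split('\t') rather than the string module's split() from
the original post. The single-argument print(row) runs under both
Python 2 and Python 3.

```python
# Hypothetical sample data: tab-separated lines, first column is the key.
what = ["a\t1", "b\t2", "c\t3"]
my_report = ["a\tx\ty", "c\tp\tq", "d\tm\tn"]

def match_lines(what, my_report):
    """Nested-loop version: direct iteration plus % formatting."""
    out = []
    for item in what:                # no range(len(...)) and no indexing
        ele = item.split('\t')
        for line in my_report:
            cols = line.split('\t')
            if ele[0] == cols[0]:
                # one formatting operation instead of a chain of + additions
                out.append("%s\t%s\t%s\t%s" % (cols[0], ele[1], cols[1], cols[2]))
    return out

result = match_lines(what, my_report)
for row in result:
    print(row)
```

This still does the inner loop and its splits once for every line in
what, so it only trims the overhead; it does not change the overall
amount of work.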

The other thing to consider is using a dictionary.
You are effectively using the cor bit as a key, so if 
instead of creating two lists you create two dictionaries
you will avoid all the splitting stuff inside the loops and
have instant access:

for key in what:    # what and report are dictionaries keyed on the first column
  try:
    print "%s\t%s\t%s\t%s" % (key, what[key][0], 
                              report[key][0],report[key][1])
  except KeyError:
    pass    # key is not in report; or print an error message

By missing out a loop(*) and some splits it should speed 
up significantly, for the cost of some small added 
complexity in building the dictionaries in the first place.

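(*)In fact 3 loops, because you are no longer calling range(len()),
   which in Python 2 builds a complete list of indexes and so
   effectively loops over the collection too. (len() on its own
   is cheap.)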
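Building the dictionaries could be done along these lines. This is
only a sketch: build_dict and the sample lines are invented names and
data, and it assumes each key appears once per file (a repeated key
would overwrite the earlier row, which the nested-loop version would
not do).

```python
def build_dict(lines):
    """Map the first tab-separated column to the list of remaining columns."""
    table = {}
    for line in lines:
        cols = line.rstrip('\n').split('\t')   # each line is split exactly once
        table[cols[0]] = cols[1:]              # a repeated key overwrites the earlier row
    return table

# Hypothetical sample data standing in for the two input lists.
what_lines = ["a\t1", "b\t2", "c\t3"]
report_lines = ["a\tx\ty", "c\tp\tq", "d\tm\tn"]

what = build_dict(what_lines)
report = build_dict(report_lines)

matches = []
for key in what:
    try:
        matches.append("%s\t%s\t%s\t%s" % (key, what[key][0],
                                           report[key][0], report[key][1]))
    except KeyError:
        pass    # key is in what but not in report

for row in sorted(matches):    # sorted, since dictionary order may vary
    print(row)
```

Each input is now walked once to build its dictionary, and the
matching pass does one lookup per key instead of scanning the whole
report every time.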

HTH,

Alan G
Author of the Learn to Program web tutor
http://www.freenetpages.co.uk/hp/alan.gauld

