A simple 'else' decreases speed by 1000 times.

husam h.jehadalwan at student.kun.nl
Sat Mar 2 06:46:52 EST 2002


Short intro:
One else statement decreases speed by a factor of 1000. Why is that?

Long intro:

The only thing that this function does is the following:
It takes two dictionaries. The first is a dictionary with 10000 keys.
The value of each key is a list containing one integer item. The values
of the second dictionary are file names. These files must be opened one
by one and scanned for the presence of two integers, let's call them
'new' and 'previous' ('prv' in the code). The link between the files is
that the 'new' integer in file number one represents the 'previous'
integer found in file number two, and so on. If the code finds a 'prv'
integer equal to the last item of the list of any dict_1[key], the
corresponding 'new' integer must be appended to the list of the same key.
The files differ in length. The first one contains 11000 lines; the
others have arbitrary lengths, but they are generally shorter than the
first one.
The problem is the following: a representative line from the output of
this code looks like this:

key nr.   key      value
-------   ---      -----
3400      12.345   [45, 90000, 4003, 1203]

The item '1203' is the 'new' ID of '4003', which in turn is the new ID of
'90000', and so on. Each of these items is added in one round as a
result of scanning one file. From the position of the items in the list
I can tell from which file they were derived.
As you see, the value of a key grows according to the number of files
processed. But, since the files differ in length, not all the keys will
grow. The symptom is that when a key value is not modified in the third
round, simply because there was no match, the new item in the fourth
round will still be added, and therefore the length of the list will be 3
and hence I cannot tell whether the last item is derived from the fourth
file or from the third one. For this reason I thought: let's add a zero
to the list when there is no match by using a simple 'else' statement.
This statement makes the code 1000 times slower. Why is that?
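
The intended bookkeeping, one slot per file, can be sketched on toy data. This is only a minimal sketch under stated assumptions: the names `last_real_id`, `extend_chains` and `rounds` are hypothetical, each round is assumed to be an in-memory dict mapping a 'previous' ID to its 'new' ID instead of a parsed file, and 0 is assumed never to occur as a real ID.

```python
def last_real_id(chain):
    """Return the most recent non-zero entry (0 is assumed to mean 'no match')."""
    for item in reversed(chain):
        if item != 0:
            return item
    return None


def extend_chains(chains, rounds):
    """Append exactly one entry per round to every chain."""
    for prv_to_new in rounds:
        for chain in chains.values():
            # The new ID if this round has a match, otherwise the placeholder 0.
            chain.append(prv_to_new.get(last_real_id(chain), 0))
    return chains


chains = {12.345: [45]}
rounds = [{45: 90000}, {7: 8}, {90000: 4003}]  # the second round has no match
extend_chains(chains, rounds)
# chains[12.345] is now [45, 90000, 0, 4003]
```

With one entry per round, the position of an item in the list identifies the file it came from, and a 0 marks a file that produced no match for that key.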

import string
import cPickle

def NewToOld(n,o):

      # n = a dictionary with 10000 keys; o = a dictionary with 3 keys, each
      # of which maps to a file name.

      count=1
      files_keys=o.keys()
      files_keys.sort()
      print files_keys
      for key in files_keys:
           print 'Scanning ',o[key],' file....'
           SaveNew = open('rmsd_ranks.pickle','w')

           openfile=open(o[key])
           lines=openfile.readlines()
           openfile.close()
           for i in n.keys():
                for line in lines:
                     words = string.split(line)
                     if len(words) > 2:
                          if words[0]=='G_DATA':
                               prv=int(words[2])
                               new=int(words[1])
                               if prv == n[i][-1]:
                                    n[i].append(new)
                                    print count, '\t', i , '\t',n[i]
                                    count = count +1
                                    break
                               else:
                                    n[i].append(0)

           savednew = cPickle.dump(n, SaveNew)
           SaveNew.close()
      return n
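
One possible reading of the slowdown: in the listing above the 'else' is paired with the innermost 'if', so it runs once for every 'G_DATA' line that does not match, not once per key per file. The toy comparison below is a sketch with hypothetical names (`pad_per_line`, `pad_per_file`, `pairs`) and in-memory (prv, new) pairs instead of files. It contrasts that placement with a `for ... else`, whose else branch runs only when the loop ends without a `break`.

```python
def pad_per_line(chain, pairs):
    """Mimic an else attached to the inner if: one 0 per non-matching line."""
    for prv, new in pairs:
        if prv == chain[-1]:
            chain.append(new)
            break
        else:
            chain.append(0)
    return chain


def pad_per_file(chain, pairs):
    """Use for/else: a single 0, and only when no line matched."""
    for prv, new in pairs:
        if prv == chain[-1]:
            chain.append(new)
            break
    else:
        chain.append(0)
    return chain


# 11000 toy lines; the only match for the ID 10999 is on the last line.
pairs = [(i, i + 100000) for i in range(11000)]

per_line = pad_per_line([10999], pairs)  # 11001 items: the ID plus 11000 zeros
per_file = pad_per_file([10999], pairs)  # [10999, 110999]
```

In this toy run the per-line version never finds the match, because after the first miss the last item of the list is 0. With 10000 keys and an 11000-line file, the same placement could mean on the order of 10000 x 11000 = 110 million appends per file, all of which are then pickled.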



