List performance and CSV
Stephan
usenet.filter at gmail.com
Sat Oct 8 11:19:43 EDT 2005
Hello,
I'm working on a simple project in Python that reads in two csv files
and compares items in one file with items in another for matches. I
read the files in using the csv module, adding each line into a list.
Then I run the comparison on the lists. This works fine, but I'm
curious about performance.
Here's the main part of my code:
######
import csv

file1 = open("CustomerList.csv")
CustomerList = csv.reader(file1)
Customers = []
#Read the contents of the CSV file into memory
for CustomerRecord in CustomerList:
    Customers.append(CustomerRecord)
#not shown here: the second file CustomersToMatch
#is loaded in a similar manner
#loop through each record and find matches on column 2
#breaking out of inner loop when a match is found
for loop1 in range(len(CustomersToMatch)):
    for loop2 in range(len(Customers)):
        if CustomersToMatch[loop1][2] == Customers[loop2][2]:
            CustomersToMatch[loop1][1] = Customers[loop2][1]
            break
######
With this code, it takes roughly 10 minutes on a 2 GHz x86 box to
compare two lists of 20,000 records. Is that good? Out of curiosity,
I tried psyco and saw no difference. Is there a better Python syntax to
use?
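For comparison, the nested loops above can do up to 20,000 x 20,000 = 400 million comparisons in the worst case. One common alternative is to index one list by the match column in a dict first, so each lookup no longer scans the whole list. The following is only a sketch under assumptions: the function name copy_matches and the sample rows are hypothetical, and it has not been timed against the code above.

```python
def copy_matches(customers, customers_to_match):
    """Copy column 1 from the first record in customers whose column 2 matches."""
    # Build the index once: column 2 value -> first record seen with that value.
    # setdefault keeps the first record, mirroring the "break on first match"
    # behaviour of the nested loops.
    first_by_key = {}
    for record in customers:
        first_by_key.setdefault(record[2], record)

    # One dict lookup per record instead of a scan over the whole list.
    for record in customers_to_match:
        match = first_by_key.get(record[2])
        if match is not None:
            record[1] = match[1]
    return customers_to_match


# Hypothetical sample rows, standing in for the rows read by csv.reader.
sample_customers = [["a", "X", "k1"], ["b", "Y", "k2"], ["c", "Z", "k1"]]
sample_to_match = [["q", "?", "k1"], ["r", "?", "k3"]]
result = copy_matches(sample_customers, sample_to_match)
```

Records with no matching key are left unchanged, as in the original loops.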
Thanks,
-Stephan