Is there a faster way to do this?

RPM1 rpm9deleteme at earthlink.net
Tue Aug 5 22:14:10 EDT 2008


ronald.johnson at gmail.com wrote:
> I have a csv file containing product information that is 700+ MB in
> size. I'm trying to go through and pull out unique product ID's only
> as there are a lot of multiples. My problem is that I am appending the
> ProductID to an array and then searching through that array each time
> to see if I've seen the product ID before. So each search takes longer
> and longer. I let the script run for 2 hours before killing it and had
> only run through less than 1/10 of the file.
> 

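I think you need to learn about Python's dictionary data type. Checking
whether a key is already in a dictionary takes roughly constant time no
matter how many keys it holds, whereas searching a list means scanning
it element by element, so each lookup gets slower as the list grows.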
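As a rough sketch of what that could look like (untested against your
data): the function name unique_product_ids, the idea that the product
ID sits in the first column, and the absence of a header row are all my
assumptions, since the original post doesn't describe the file layout.
It reads the file one row at a time, so the 700+ MB never has to sit in
memory at once; only the distinct IDs do.

```python
import csv


def unique_product_ids(path, column=0):
    """Return the distinct values of one CSV column, in first-seen order.

    `path` and `column` are hypothetical parameters: adjust them to
    wherever the product ID actually lives in your file.
    """
    seen = {}  # dict keys are hashed, so the membership check is O(1) on average
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if row:  # csv.reader yields [] for blank lines; skip them
                # setdefault only adds the key the first time it is seen
                seen.setdefault(row[column], None)
    # dicts keep insertion order (Python 3.7+), so this is first-seen order
    return list(seen)
```

If you don't care about the order the IDs first appeared in, a set
would do the same job with seen.add(row[column]).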


