dict would be very slow for big data

Steven D'Aprano steven at REMOVE.THIS.cybersource.com.au
Tue May 12 04:31:09 EDT 2009


On Mon, 11 May 2009 20:28:13 -0700, forrest yang wrote:

> hi
> i am trying to insert a lot of data into a dict, which may be around
> 10,000,000 entries.
> after inserting 100,000 items, the insert rate becomes very slow, about
> 50,000/s, and the entire time used for this task would be very long too.
> would anyone know a solution for this case?

You haven't given us enough information to answer.


How are you generating the data?

What are the keys and the values?

Are you sure it is slow to insert into the dict, or could some other part 
of your processing be slow?
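One way to check is to time the dict insertion separately from whatever 
produces the data. A rough sketch (the generated keys and values below 
are just stand-ins for your real data):

import time

# stand-in data -- replace with however you actually build your items
items = [("key%d" % i, i) for i in range(1000000)]

start = time.time()
d = {}
for key, value in items:
    d[key] = value
print("dict insertion took %.2f seconds" % (time.time() - start))

If the insertion itself is fast in isolation, the bottleneck is somewhere 
else (reading the data, parsing it, and so on).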

Does performance change if you turn garbage collection off?

import gc
gc.disable()
# insert your items
gc.enable()
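
If you want to measure the difference, something along these lines will 
do (just a sketch; the five million integer keys are only a placeholder 
for your data):

import gc
import time

def bulk_insert(n):
    d = {}
    for i in range(n):
        d[i] = i
    return d

# with the collector running
start = time.time()
bulk_insert(5000000)
print("gc enabled:  %.2f s" % (time.time() - start))

# with the collector paused during the inserts
gc.disable()
try:
    start = time.time()
    bulk_insert(5000000)
    print("gc disabled: %.2f s" % (time.time() - start))
finally:
    gc.enable()

The cyclic collector is triggered as the number of live objects grows, 
so pausing it during a large bulk insert sometimes makes a noticeable 
difference.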


Can you show us a sample of the data, and the code you are using to 
insert it into the dict?

Do you have enough memory? If you watch an external tool like top while 
the program runs, can you see memory exhaustion, high CPU load or some 
other problem?
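
If you'd rather check from inside the program, the resource module gives 
a rough number on Unix-like systems (a small sketch; note that ru_maxrss 
is reported in kilobytes on Linux but in bytes on Mac OS X):

import resource

usage = resource.getrusage(resource.RUSAGE_SELF)
print("peak resident set size: %d" % usage.ru_maxrss)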




-- 
Steven


