gather information from various files efficiently

Jeff Shannon jeff at ccvcorp.com
Tue Dec 14 13:25:03 EST 2004


Klaus Neuner wrote:

>Yet, I have got 43 such files. Together they are 4.1M
>large. In the future, they will probably become much larger. 
>At the moment, the process takes several hours. As it is a process
>that I have to run very often, I would like it to be faster. 

Others have shown how you can make your dictionary code more efficient, 
which should provide a big speed boost, especially if there are many 
keys in your dicts.
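
For what it's worth, the usual change amounts to something like the
following (untested sketch -- the sample lines and field layout here are
made up, but the point is that setdefault() finds-or-creates a key in one
step instead of a separate membership test plus a second lookup):

    # Hypothetical sample lines standing in for one of your input files.
    lines = ["apple 1", "banana 2", "apple 3"]

    groups = {}
    for line in lines:
        key, value = line.split()
        # setdefault() returns the existing list for this key, or inserts
        # and returns a new empty list if the key isn't there yet.
        groups.setdefault(key, []).append(value)

    # groups is now {'apple': ['1', '3'], 'banana': ['2']}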

However, if you're taking this long to read files each time, perhaps 
there's a better high-level approach than just a brute-force scan of 
every file every time.  You don't say anything about where those files 
are coming from, or how they're created.  Are they relatively static?  
(That is to say, are they (nearly) the same files being read on each 
run?)  Do you control the process that creates the files?  Given the 
right conditions, you may be able to store your data in a shelve, or 
even a proper database, saving you lots of time parsing through these 
files on each run.  Even if it's entirely new data on each run, you may 
be able to find a more efficient way of transferring data from whatever 
the source is into your program.
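
Something along these lines might work (a rough, untested sketch --
parse_file() is just a stand-in for whatever parsing you do now, and
'parsed.cache' is an arbitrary name for the shelve file).  It re-parses a
file only when its modification time has changed; everything else comes
straight off the shelf on disk:

    import os, shelve

    def parse_file(path):
        # Placeholder for your real parsing code; assumes one
        # whitespace-separated key/value pair per line.
        d = {}
        for line in open(path):
            key, value = line.split(None, 1)
            d[key] = value.strip()
        return d

    def load(paths, cache_name='parsed.cache'):
        cache = shelve.open(cache_name)
        data = {}
        for path in paths:
            mtime = os.path.getmtime(path)
            entry = cache.get(path)
            if entry is None or entry[0] != mtime:
                # File is new or has changed since last run: re-parse it
                # and store (mtime, parsed dict) back into the shelf.
                entry = (mtime, parse_file(path))
                cache[path] = entry
            data[path] = entry[1]
        cache.close()
        return data

On later runs, only the files that actually changed get re-parsed; whether
that helps depends on how static your 43 files really are.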

Jeff Shannon
Technician/Programmer
Credit International



