[Tutor] how best to store and process variable amounts of paired data

Magnus Lyckå magnus at thinkware.se
Fri Apr 23 06:37:28 EDT 2004


At 13:34 2004-04-22 -0400, Brian van den Broek wrote:
>I'm starting a project to write a bunch of functions for parsing the 
>datafiles of a particular application I use. Thanks to help from the group 
>I now understand how to work with files :-) but I have a question about 
>efficient storage of the information I extract.

Are you just asking about data structures during program
execution, or are you asking about persistent storage?

>So, since there may be 1000's of (id, title) pairs, I am wanting to choose 
>the best method -- best here being defined as some compromise between high 
>speed and small memory footprint.

Thousands doesn't sound very big on modern hardware... As a
better measure, how big are the biggest files you need to
process?

As Don Knuth says, premature optimization is the root of all evil.

Unless the file sizes get bigger than maybe 10% of the amount of
RAM in your machine, I wouldn't worry about performance until I
actually experienced performance problems.

Try to solve the problem in the simplest and most logical way, and
worry about performance if there is a problem. Don't chase ghosts.

Try either of the approaches you suggested, implement it (that
should be fairly quick, and if you do it well, most of the code
will be reusable if you want to change approach) and see if it
works well.

If you get the performance you need, you are done.
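
If you do want numbers rather than guesses, something like the
following rough sketch could be used to compare the two approaches.
The sample data, the 5000-pair size and the function names are my own
assumptions, not anything from your application, and it is written
for current Python 3:

```python
import sys
import timeit

# Hypothetical sample data standing in for pairs parsed from a file.
pairs = [(i, "title %d" % i) for i in range(5000)]
as_dict = dict(pairs)


def lookup_in_list(wanted_id):
    """Linear scan through the list of (id, title) tuples."""
    for item_id, title in pairs:
        if item_id == wanted_id:
            return title
    return None


def lookup_in_dict(wanted_id):
    """Direct lookup by key in the dictionary."""
    return as_dict.get(wanted_id)


# Time the worst case for the list: the wanted id is the last one.
list_time = timeit.timeit(lambda: lookup_in_list(4999), number=200)
dict_time = timeit.timeit(lambda: lookup_in_dict(4999), number=200)
print("list scan: %.4f s, dict lookup: %.4f s" % (list_time, dict_time))

# sys.getsizeof reports only the container itself, not the tuples
# and strings it refers to, so treat these as rough indications.
print("list: %d bytes, dict: %d bytes"
      % (sys.getsizeof(pairs), sys.getsizeof(as_dict)))
```

The timings will vary from machine to machine; the point is only
that measuring is cheap compared to guessing.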

It seems to me that a dictionary or a list of (key, value) tuples
is better than two separate lists (I imagine the scenario where
your two lists are suddenly of different lengths...).
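
To illustrate the scenario I have in mind, here is a small sketch.
The ids and titles are made up, and I am assuming a parsing step that
happened to lose one title:

```python
# Two separate lists: nothing ties an id to its title except position.
ids = ["a1", "a2", "a3"]
titles = ["First note", "Second note"]  # one title was lost while parsing

# zip() stops at the shorter list, so the mismatch passes silently
# and the id "a3" simply disappears from the result.
paired_from_two_lists = list(zip(ids, titles))

# One tuple per record: an id and its title are always stored together,
# so the two can not drift apart.
records = []
for item_id, title in [("a1", "First note"), ("a2", "Second note")]:
    records.append((item_id, title))

print(len(ids), len(titles), len(paired_from_two_lists))
print(records)
```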

With the list, you retain order of input. With the dictionary,
you get very fast access to data if you know the exact key. You
know what you need, I don't...
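
A minimal sketch of that trade-off, using made-up ids and titles.
(In current Python, 3.7 and later, a dictionary also keeps insertion
order, which was not the case when this was written, but the
difference in lookup style remains.)

```python
# Hypothetical (id, title) pairs in the order they appeared in the file.
records = [("b7", "Zebra"), ("a2", "Apple"), ("c9", "Mango")]

# List of tuples: iterating gives the pairs back in input order.
titles_in_input_order = [title for _, title in records]

# Dictionary: fast access when the exact key is known.
by_id = dict(records)
title = by_id["a2"]

# .get() returns None instead of raising KeyError for an unknown key.
missing = by_id.get("zz")

print(titles_in_input_order)
print(title, missing)
```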


--
Magnus Lycka (It's really Lyckå), magnus at thinkware.se
Thinkware AB, Sweden, www.thinkware.se
I code Python ~ The Agile Programming Language 



