Running queries on large data structure

Christoph Haas email at christoph-haas.de
Thu Aug 3 10:39:30 EDT 2006


On Wednesday 02 August 2006 22:24, Christoph Haas wrote:
> I have written an application in Perl some time ago (I was young and
> needed the money) that parses multiple large text files containing
> nested data structures and allows the user to run quick queries on the
> data. [...]

I suppose my former posting was too long and concrete. So allow me to try 
it in a different way. :)

The situation is that I have input data that takes ~1 minute to parse, while 
the users need to run queries on it within seconds. I can think of two 
ways:

(1) Database
    (very quick, but the input data is deeply nested and it would be
     ugly to convert it into some relational shape for the database)
(2) cPickle
    (Read the data every now and then, parse it, and write the nested
     Python data structure into a pickled file. Then let the other
     application that does the queries unpickle the variable and use it
     time and again.)
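For what it's worth, option (2) is only a few lines. Here is a minimal 
sketch, using a made-up nested dict in place of the real parsed data; in 
modern Python the plain pickle module is used, while Python 2's cPickle 
offers the same dump/load interface, just faster:

```python
import pickle

# Hypothetical nested structure standing in for the real parsed input data.
parsed = {
    "hosts": {
        "alpha": {"services": ["http", "smtp"], "uptime": 99.9},
        "beta": {"services": ["dns"], "uptime": 98.7},
    }
}

# Parser side: dump the structure once, after the ~1 minute parse run.
with open("parsed.pickle", "wb") as f:
    pickle.dump(parsed, f, protocol=pickle.HIGHEST_PROTOCOL)

# Query side (e.g. the CGI): load it back and query it directly.
with open("parsed.pickle", "rb") as f:
    data = pickle.load(f)

print(data["hosts"]["alpha"]["services"])  # → ['http', 'smtp']
```

Loading a pickle of this kind is typically a matter of seconds even for 
fairly large structures, so the CGI only pays the parse cost when the 
pickled file is regenerated.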

So the question is: would you rather force the data into a relational 
database and write object-relational wrappers around it? Or would you 
pickle it, load it later, and work on the data directly? The latter 
application is currently a CGI. I'm open to whatever. :)

Thanks for any enlightenment.

 Christoph



More information about the Python-list mailing list