Intersection of multiple lists/list of lists

Thomas Weholt thomas at bibsyst.no
Wed Oct 6 02:57:28 EDT 1999


Aahz Maruch wrote:
> 
> In article <37F9BADF.8B3C4221 at bibsyst.no>,
> Thomas Weholt  <thomas at bibsyst.no> wrote:
> >
> >I got a program where the user will search a database based on keywords.
> 
> I'd recommend using a real database for this.  Suggestions include MySQL
> and the Python-based gadfly.

How, in your opinion, does Berkeley DB, used through the shelve/pickle
modules that ship with the standard Python distribution to store simple
objects, compare to MySQL or any other DBMS in terms of speed, size and
ease of operation? I read somewhere that Gadfly is more appropriate when
the amount of data stored is small. I'm talking about millions of records
in my database, maybe tens of millions. (No, I'm not kidding.) I'm
scanning ALL files in my CD-ROM collection, and when I scan all the
clip-art, picture and source code CDs, that amounts to a lot of data.
Does Berkeley DB cope with this, or do I have to use something like
MySQL?

I'd like to avoid SQL and stay with dictionary-like approaches,
preferably the solutions provided with the standard Python package. I
want to store simple objects with few attributes (20 or fewer, plain
strings or numbers) and very few methods, and fetch them by an id made
up of a series of numbers like 321.321.32 (= cd_id.path_id.file_id).
The number of objects will be huge, the objects themselves small. The
bottom line: do I have to look for something other than Berkeley DB?
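
To make the dictionary-like approach concrete, here is a minimal sketch of
storing and fetching small records by a cd_id.path_id.file_id key with the
standard shelve module. It is an illustration under assumptions, not a
tested answer to the scaling question: the helper names (make_key,
store_record, fetch_record, demo) and the sample fields are hypothetical,
records are plain dicts rather than class instances, and which dbm backend
shelve ends up using depends on the Python installation, so it may or may
not be Berkeley DB.

```python
import os
import shelve
import tempfile


def make_key(cd_id, path_id, file_id):
    """Build the 'cd_id.path_id.file_id' string key described in the post."""
    return "%d.%d.%d" % (cd_id, path_id, file_id)


def store_record(db, cd_id, path_id, file_id, attributes):
    """Store a small dict of plain attributes under its composite key."""
    db[make_key(cd_id, path_id, file_id)] = dict(attributes)


def fetch_record(db, cd_id, path_id, file_id):
    """Return the stored record, or None if the key is absent."""
    return db.get(make_key(cd_id, path_id, file_id))


def demo():
    # Hypothetical sample data; the field names are illustrative only.
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "catalog")
        with shelve.open(path) as db:
            store_record(db, 321, 321, 32, {"name": "logo.gif", "size": 2048})
        # Reopen the shelf to show the record survives closing the file.
        with shelve.open(path) as db:
            return fetch_record(db, 321, 321, 32)


if __name__ == "__main__":
    print(demo())
```

Each lookup by key touches only the one pickled record, which is what makes
this layout attractive for many small objects; whether a given backend holds
up at tens of millions of keys would still have to be measured.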

And just for the record, I am using Linux. 

Thanks!

Best regards,
Thomas Weholt