best small database?

Larry Bates larry.bates at websafe.com
Mon Sep 11 19:27:41 EDT 2006


Blair P. Houghton wrote:
> Larry Bates wrote:
>> The filesystem is almost always the
>> most efficient place to store files, not as blobs in a
>> database.
> 
> I could get all theoretical about why that's not so in most cases,
> but there are plenty of cases where it is so (especially when the
> person doing the DB doesn't get the idea behind all filesystems,
> which is that they are themselves simplified databases), so
> I won't*.
> 
> In this case, the filesystem may be the best place to
> do the work, because it's the cheapest to implement
> and maintain.
> 
> --Blair
> 
> * - okay, I will
> 
> 1.  Since the filesystem is a database, making accesses
> to it after being directed there by a database means you're
> using two database systems (and an intervening operating
> system)  to do one thing.  Serious databases work from
> disks with no filesystem to get rid of that extra layer entirely.
> There are benefits to having things in files reachable by
> ordinary tools, and to having the OS mediating access to
> the data, but you need to be sure you need those benefits
> and can afford the overhead.  Academic in most cases,
> including the one that started this thread.
> 
> 2.  When using the filesystem as the database
> you only get one kind of native association, and have to
> use semantics in the directory and filenames to give you
> hints as to the type stored at a particular location.  You get a
> few pieces of accounting data (mod times, etc.) in the
> directory listing, but can't associate anything else with
> the file directly, at least not unless you create another
> file that has the associated data in it, or stuff the extra
> data in the file itself, but then that makes each file
> a database...see where it goes?  Sometimes it's better
> to come up with a schema you can extend rationally to
> fit the problem you are trying to solve.
> 
> --Blair
> 
Not quite sure why my response "bothered" you so much, but it
appears it did.  I'll admit that I was doing my best to read
the OP's mind in my answer.

Item 1 - The OP specifically said he wanted to store hundreds
of files.  You rarely need a database to store hundreds of anything,
and the overhead of installing and maintaining one isn't typically
worth the effort.  Store the info in a text file, read the
entire file into memory, and do linear searches.  Python can search
hundreds of items in a list faster than you can even begin an SQL
query.
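
A minimal sketch of that text-file approach, assuming a hypothetical
tab-separated catalog with a header row (the field names "name",
"customer" and "path" are made up for illustration):

```python
import csv

def load_records(lines):
    """Parse tab-separated lines (e.g. an open file) into a list of dicts.

    The first line is assumed to be a header row naming the fields.
    """
    return list(csv.DictReader(lines, delimiter="\t"))

def find(records, **criteria):
    """Linear search: return every record whose fields match all criteria."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]

# Usage, assuming a hypothetical catalog.txt on disk:
#     with open("catalog.txt", newline="") as f:
#         records = load_records(f)
#     matches = find(records, customer="acme")
```

With only hundreds of records, the whole list stays in memory and each
lookup is a single pass over it, so there is nothing to install or tune.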

Item 2 - You will note that I said "If you need multiple indexes
into these files, then use a database, but only for the indexes
that point to the files on the filesystem".  You sometimes need
multiple indexes (which databases are GREAT at providing).
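
As a rough sketch of the "database only for the indexes" idea: the
example below uses Python's sqlite3 module, which is my choice for
illustration and not something named in this thread.  The table and
column names are hypothetical.  The files themselves stay on the
filesystem; the database stores only their paths plus the searchable
fields.

```python
import sqlite3

def open_index(db_path=":memory:"):
    """Open (or create) the index database; file contents are NOT stored here."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        " path TEXT PRIMARY KEY,"   # location of the file on the filesystem
        " customer TEXT,"
        " scanned_on TEXT)"
    )
    # One index per field we want to look files up by.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_customer ON documents(customer)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_scanned ON documents(scanned_on)")
    return conn

def add_document(conn, path, customer, scanned_on):
    """Record where a file lives and the fields it can be found by."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO documents (path, customer, scanned_on) VALUES (?, ?, ?)",
            (path, customer, scanned_on),
        )

def paths_for_customer(conn, customer):
    """Return the filesystem paths of all files indexed under a customer."""
    rows = conn.execute(
        "SELECT path FROM documents WHERE customer = ? ORDER BY path",
        (customer,),
    )
    return [row[0] for row in rows]
```

A lookup returns paths, and the application then opens those files
with ordinary file I/O.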

As far as "rational extension" is concerned, I think I can relate.
As a developer of imaging systems that store multiple millions of
scanned pieces of paper online for customers, I can promise you
the file system is quite efficient at storing files (and that is
what the OP asked for in the original post) and way better than
storing in Oracle blobs.  Can you store them in the database?
Absolutely.  Is it efficient and manageable?  It has been our
experience that it is not.  Ever tried to upgrade Oracle 9 to
Oracle 10 with a TB of blobs?

-Larry


