[Mailman-Developers] Requirements for a new archiver
Brad Knowles
brad.knowles at skynet.be
Wed Oct 29 15:25:53 EST 2003
At 11:54 AM -0800 2003/10/29, Peter C. Norton wrote:
> It always confounds me that people will go for database voodoo and
> deride filesystems when a filesystem is a highly specialised database
> in and of itself.
I am aware of that. I was aware of that when I first gave my
invited talk entitled "Design and Implementation of Highly Scalable
E-mail Systems", which you can find at
<http://www.shub-internet.org/brad/papers/dihses/>.
Note that Eric Allman (author of the original Ingres database,
among many other things) and Kirk McKusick (author of the Berkeley
Fast File System) were in the audience. I did not embarrass myself.
> Databases aren't meant to be storage for abstract binary data.
> They're meant to be a searchable index of data of types they
> understand.
Correct. And despite all claims to the contrary from the
vendors, no database properly "understands" binary large objects, nor
does any of them give you another datatype it actually understands
that would be suitable for the storage of e-mail message bodies.
> Assuming I had a clean slate to start a database project for a mail
> store, personally I'd much rather prototype it in something like
> postgresql where I could add data types to deal with email. I could
> then make header types, text types, mime types classes, etc. Then I
> could test to see if it was a good idea to implement it.
IMO, that would be an exercise in futility. We've been down this
road a million times before. We don't need to go down it again to
know that the result is not likely to be successful, especially when
we have alternatives that are proven to work well -- we store the
message meta-data in the database, and then the message bodies in a
separate message store akin to INN timecaf/timehash "heaps" (see
<http://www.shub-internet.org/brad/papers/dihses/lisa2000/sld090.htm>).
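To make the split concrete, here is a minimal sketch, assuming SQLite
for the meta-data and one file per body under a directory tree
bucketed by arrival time. It is only loosely in the spirit of INN's
timehash and does not reproduce INN's actual on-disk format. The
table layout and the names body_path, open_index, store_message and
load_body are hypothetical, chosen purely for illustration.

```python
import hashlib
import os
import sqlite3
import time


def body_path(root, received, message_id):
    """Return a time-bucketed file path for a message body.

    Assumption: one file per body, in a directory derived from the
    arrival time. Only loosely inspired by INN timehash.
    """
    bucket = time.strftime("%Y/%m/%d/%H", time.gmtime(received))
    digest = hashlib.sha1(message_id.encode("utf-8")).hexdigest()
    return os.path.join(root, bucket, digest)


def open_index(db_file):
    """Open the meta-data database (hypothetical schema)."""
    db = sqlite3.connect(db_file)
    db.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " message_id TEXT PRIMARY KEY,"
        " sender TEXT, subject TEXT, received INTEGER, body_path TEXT)"
    )
    return db


def store_message(db, root, message_id, sender, subject, received, body):
    """Write the body to the file store; record only meta-data in the DB."""
    path = body_path(root, received, message_id)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as fh:
        fh.write(body)
    db.execute(
        "INSERT INTO messages VALUES (?, ?, ?, ?, ?)",
        (message_id, sender, subject, received, path),
    )
    db.commit()
    return path


def load_body(db, message_id):
    """Look up the path in the DB, then read the body from the file store."""
    row = db.execute(
        "SELECT body_path FROM messages WHERE message_id = ?", (message_id,)
    ).fetchone()
    if row is None:
        return None
    with open(row[0], "rb") as fh:
        return fh.read()
```

The database only ever sees short, typed columns that it can index
and search; the opaque body never passes through it.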
> I think using a standard sql database for doing mail operations is
> asking for trouble. Standard databases don't know how to parse
> rfc822/2822 headers and that means that you've got to either write a
> whole lot of stored procedures in a clunky query language (or
> java!?!?!) and then maintain it, or you've got to do it all in the
> imap/pop3/whatever server which means a whole lot of yammering traffic
> between the database and the I/P/W server all the time, which == slow.
You don't ask the database to understand or parse RFC2822 headers
or messages. That's up to your application. You just store the
meta-data using the formats known to the database, and the message
bodies according to the method above.
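As an illustration of keeping the parsing in the application, here is
a rough sketch using Python's standard email package to pull a few
header fields out of a raw message and hand the database nothing but
text and integer columns. The headers table and the functions
extract_metadata and index_message are hypothetical names, and which
fields are worth indexing is an assumption.

```python
import sqlite3
from email import message_from_bytes
from email.utils import parseaddr, parsedate_to_datetime


def extract_metadata(raw):
    """Parse RFC 2822 headers in the application, not in the database."""
    msg = message_from_bytes(raw)
    sent_epoch = None
    date_header = msg["Date"]
    if date_header:
        try:
            sent_epoch = int(parsedate_to_datetime(date_header).timestamp())
        except (TypeError, ValueError):
            # Unparseable Date header: leave the column NULL.
            sent_epoch = None
    return {
        "message_id": msg["Message-ID"],
        "sender": parseaddr(msg.get("From", ""))[1],
        "subject": msg.get("Subject", ""),
        "sent_epoch": sent_epoch,
    }


def index_message(db, raw):
    """Store only plain column types the database already understands."""
    meta = extract_metadata(raw)
    db.execute(
        "CREATE TABLE IF NOT EXISTS headers ("
        " message_id TEXT PRIMARY KEY,"
        " sender TEXT, subject TEXT, sent_epoch INTEGER)"
    )
    db.execute(
        "INSERT INTO headers VALUES (?, ?, ?, ?)",
        (meta["message_id"], meta["sender"], meta["subject"], meta["sent_epoch"]),
    )
    db.commit()
    return meta
```

Searching by sender or date is then an ordinary indexed query, with
no stored procedures and no header parsing on the database side.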
--
Brad Knowles, <brad.knowles at skynet.be>
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-Benjamin Franklin, Historical Review of Pennsylvania.
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+
!w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++)