[Mailman-Developers] Requirements for a new archiver
Brad Knowles
brad.knowles at skynet.be
Wed Oct 29 16:14:52 EST 2003
At 12:37 PM -0800 2003/10/29, Peter C. Norton wrote:
> It seems like you're only partially agreeing/disagreeing with me
> (optimist/pessimist). Disagreeing: you're saying that using datatypes
> in the database which are appropriate to the kind of data being stored
> (mail messages) is an exercise in futility.
Not quite. I believe that there are no databases in existence
which have data types that are actually appropriate for the storage
of message bodies.
> But, agreeing: that
> storing these in a database in another way is OK.
Not quite. Store meta-data, yes. The entire message, no.
Store things like who the message is from, who the message is
addressed to, the date, the message-id as it was found in the
headers, etc. Basically, store just about everything in the
message headers that a client would be likely to ask about. That's
all well and good.
But when it comes to storing the message body itself, it should
be stored in wire format (i.e., precisely as it came in), in the
filesystem. Then pointers to the location in the filesystem should
be put into the database.
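
To make that split concrete, here is a rough sketch of what I have in mind. It is only an illustration under stated assumptions: the function name archive_message, the "messages" table and its columns, the use of SQLite, and the content-hash file layout are all hypothetical choices of mine, not anything Mailman does today.

```python
import hashlib
import sqlite3
from email import message_from_bytes
from pathlib import Path


def _header(msg, name):
    """Return a header as plain text, or None if it is absent."""
    value = msg[name]
    return None if value is None else str(value)


def archive_message(raw: bytes, store_dir: Path, db: sqlite3.Connection) -> Path:
    """Write the untouched wire-format bytes to disk and index the headers.

    Hypothetical layout: files are named after the SHA-256 of their content.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "path TEXT PRIMARY KEY, message_id TEXT, sender TEXT, "
        "recipient TEXT, date TEXT, subject TEXT)"
    )

    # The body goes to the filesystem exactly as it came in.
    digest = hashlib.sha256(raw).hexdigest()
    path = store_dir / digest[:2] / (digest + ".eml")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(raw)

    # Only meta-data and a pointer to the file go into the database.
    msg = message_from_bytes(raw)
    db.execute(
        "INSERT OR REPLACE INTO messages VALUES (?, ?, ?, ?, ?, ?)",
        (
            str(path),
            _header(msg, "Message-ID"),
            _header(msg, "From"),
            _header(msg, "To"),
            _header(msg, "Date"),
            _header(msg, "Subject"),
        ),
    )
    db.commit()
    return path
```

The file on disk stays the authoritative copy; the row is just a fast way to find it.
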
One key factor here is that all of the information in the
database should be able to be re-created from the message bodies
alone, if there should happen to be a catastrophic system crash.
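
As a sketch of what that recovery might look like, assuming the stored messages are ".eml" files somewhere under one directory and that the index is a single SQLite table (the table name, columns, and the rebuild_index function are hypothetical, made up for illustration):

```python
import sqlite3
from email.parser import BytesParser
from pathlib import Path


def rebuild_index(store_dir: Path, db: sqlite3.Connection) -> int:
    """Throw away the index and re-create it from the files on disk."""
    db.execute("DROP TABLE IF EXISTS messages")
    db.execute(
        "CREATE TABLE messages ("
        "path TEXT PRIMARY KEY, message_id TEXT, sender TEXT, "
        "recipient TEXT, date TEXT, subject TEXT)"
    )

    def header(msg, name):
        value = msg[name]
        return None if value is None else str(value)

    count = 0
    parser = BytesParser()
    for path in sorted(store_dir.rglob("*.eml")):
        with path.open("rb") as fh:
            # Only the headers are needed to repopulate the meta-data.
            msg = parser.parse(fh, headersonly=True)
        db.execute(
            "INSERT INTO messages VALUES (?, ?, ?, ?, ?, ?)",
            (
                str(path),
                header(msg, "Message-ID"),
                header(msg, "From"),
                header(msg, "To"),
                header(msg, "Date"),
                header(msg, "Subject"),
            ),
        )
        count += 1
    db.commit()
    return count
```

Nothing in the table is needed as input, which is the point: the database is disposable and the message files are not.
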
The sole purpose of the database is to speed up access to the
messages and the message content -- indeed, to speed it up enough so
that randomly accessing most any piece of information about any
message from any sender to any recipient in any mailbox should become
something feasible to contemplate.
Put another way, the database is there to make what is difficult
and slow (on the large scale) quick and easy, and to make what
would be totally impossible (on any reasonable scale) at least
something that can now be considered.
> I don't get why
> you'd just want to store these as text when you have databases that
> can be made more suitable to the problem.
I don't believe that there are any databases in existence that
"... can be made more suitable to the problem."
> So all the parsing happens in the database client side. Which is slow.
Yup. I don't see any way around that.
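
For what it's worth, the client-side work would look roughly like the sketch below: ask the database where the message lives, then open and parse the file yourself. The "messages" table with "path" and "message_id" columns and the fetch_plain_text function are assumptions for illustration only.

```python
import sqlite3
from email import message_from_bytes, policy
from pathlib import Path


def fetch_plain_text(db: sqlite3.Connection, message_id: str):
    """Look up a message by Message-ID and return its text/plain body.

    Returns None if the message is not indexed or has no plain-text part.
    """
    row = db.execute(
        "SELECT path FROM messages WHERE message_id = ?", (message_id,)
    ).fetchone()
    if row is None:
        return None

    # The database only hands back a pointer; the MIME parsing is done here,
    # in the client, on the wire-format bytes read from the filesystem.
    raw = Path(row[0]).read_bytes()
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    return None if body is None else body.get_content()
```

The lookup is the fast part; the parse is paid once per message actually opened, not once per query.
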
--
Brad Knowles, <brad.knowles at skynet.be>
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-Benjamin Franklin, Historical Review of Pennsylvania.
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+
!w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++)