[Mailman-Developers] Requirements for a new archiver

Brad Knowles brad.knowles at skynet.be
Wed Oct 29 16:14:52 EST 2003


At 12:37 PM -0800 2003/10/29, Peter C. Norton wrote:

>  It seems like you're only partially agreeing/disagreeing with me
>  (optimist/pessimist).  Disagreeing: you're saying that using datatypes
>  in the database which are appropriate to the kind of data being stored
>  (mail messages) is an exercise in futility.

	Not quite.  I believe that there are no databases in existence 
which have data types that are actually appropriate for the storage 
of message bodies.

>                                                But, agreeing: that
>  storing these in a database in another way is OK.

	Not quite.  Store meta-data, yes.  The entire message, no.


	Store things like who the message is from, who the message is 
addressed to, the date, the message-id as it was found in the 
headers, etc.  Basically, store just about everything in the 
message headers that a client would be likely to ask about.  That's 
all well and good.
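
	A minimal sketch of that header extraction, assuming Python's 
standard email package.  The function name extract_metadata and the 
choice of fields are illustrative assumptions, not part of any 
existing archiver:

```python
from email import message_from_bytes
from email.utils import getaddresses, parsedate_to_datetime


def extract_metadata(raw: bytes) -> dict:
    """Pull out the header fields a client is likely to ask about.

    Hypothetical helper: the field list is an assumption, not a spec.
    """
    msg = message_from_bytes(raw)

    # The Date header may be missing or malformed; record None in that case.
    date_hdr = msg.get("Date")
    try:
        date = parsedate_to_datetime(date_hdr).isoformat() if date_hdr else None
    except (TypeError, ValueError):
        date = None

    # Collect the bare addresses from To and Cc.
    recipient_headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = [addr for _, addr in getaddresses(recipient_headers)]

    return {
        "message_id": msg.get("Message-ID"),  # kept as found in the headers
        "sender": msg.get("From"),
        "recipients": recipients,
        "date": date,
        "subject": msg.get("Subject"),
    }
```

	Only this dictionary would go into the database; the raw bytes 
themselves are left untouched.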

	But when it comes to storing the message body itself, it should 
be stored in wire format (i.e., precisely as it came in), in the 
filesystem.  Then pointers to the location in the filesystem should 
be put into the database.
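
	As a rough sketch of that layout, assuming SQLite for the 
database and a content-hash naming scheme for the files (the table 
layout, the ".eml" suffix, and the names open_index and 
archive_message are all assumptions made for illustration):

```python
import hashlib
import sqlite3
from email import message_from_bytes
from pathlib import Path


def open_index(db_path: str) -> sqlite3.Connection:
    """Open (or create) the hypothetical metadata index."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "path TEXT PRIMARY KEY, message_id TEXT, sender TEXT, date TEXT)"
    )
    return conn


def archive_message(conn: sqlite3.Connection, spool: Path, raw: bytes) -> Path:
    """Write the message unchanged to disk and index a pointer to it."""
    # Name the file after a hash of its bytes (an assumed scheme).
    digest = hashlib.sha256(raw).hexdigest()
    path = spool / digest[:2] / (digest + ".eml")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(raw)  # wire format: exactly the bytes that came in

    # Only header metadata and the file location go into the database.
    msg = message_from_bytes(raw)
    conn.execute(
        "INSERT OR REPLACE INTO messages VALUES (?, ?, ?, ?)",
        (str(path), msg.get("Message-ID"), msg.get("From"), msg.get("Date")),
    )
    conn.commit()
    return path
```

	The database row never holds the body, only the path needed to 
find it.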


	One key factor here is that all of the information in the 
database must be reproducible from the stored messages alone, in 
case of a catastrophic system crash.
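
	A sketch of what such a rebuild could look like, assuming the 
messages sit as individual ".eml" files under one spool directory and 
the index is a SQLite file (rebuild_index, the table layout, and the 
file suffix are hypothetical):

```python
import sqlite3
from email import message_from_bytes
from pathlib import Path


def rebuild_index(spool: Path, db_path: str) -> int:
    """Re-create the metadata index from the stored messages alone.

    Returns the number of messages indexed.
    """
    conn = sqlite3.connect(db_path)
    # Throw away whatever survived the crash and start from scratch.
    conn.execute("DROP TABLE IF EXISTS messages")
    conn.execute(
        "CREATE TABLE messages ("
        "path TEXT PRIMARY KEY, message_id TEXT, sender TEXT, date TEXT)"
    )

    count = 0
    for path in sorted(spool.rglob("*.eml")):
        # Everything in the row is derived from the file itself.
        msg = message_from_bytes(path.read_bytes())
        conn.execute(
            "INSERT INTO messages VALUES (?, ?, ?, ?)",
            (str(path), msg.get("Message-ID"), msg.get("From"), msg.get("Date")),
        )
        count += 1

    conn.commit()
    conn.close()
    return count
```

	Because nothing in the table comes from anywhere but the files, 
losing the database costs only the time needed to re-scan them.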

	The sole purpose of the database is to speed up access to the 
messages and the message content -- indeed, to speed it up enough so 
that randomly accessing most any piece of information about any 
message from any sender to any recipient in any mailbox should become 
something feasible to contemplate.

	Put another way, the database exists to make what is difficult 
and slow (on a large scale) quick and easy, and to make what would be 
totally impossible (on any reasonable scale) at least something that 
can now be considered.

>                                                     I don't get why
>  you'd just want to store these as text when you have databases that
>  can be made more suitable to the problem.

	I don't believe that there are any databases in existence that 
"... can be made more suitable to the problem."

>  So all the parsing happens in the database client side.  Which is slow.

	Yup.  I don't see any way around that.

-- 
Brad Knowles, <brad.knowles at skynet.be>

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
     -Benjamin Franklin, Historical Review of Pennsylvania.

GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+
!w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++)


