[Mailman-Developers] Requirements for a new archiver

Brad Knowles brad.knowles at skynet.be
Wed Oct 29 23:00:48 EST 2003


At 10:47 PM -0500 2003/10/29, Barry Warsaw wrote:

>  I'm not sure if Andrew Koenig is on this list, but he described an
>  algorithm he developed to quickly find messages in an mbox file.  If
>  he's here, maybe he can talk about it.

	7th edition mbox files are a pain.  There are other mailbox file 
formats that are much better and easier to parse (UW-IMAP .mbx being 
one).
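
	To illustrate why the format is awkward to parse: the only message
delimiter in an mbox file is a line starting with "From " that follows a
blank line, so finding message N means scanning the whole file at least
once. Below is a minimal sketch of a naive offset index. This is not
Andrew Koenig's algorithm, which is not described in this thread; the
function names `index_mbox` and `read_message` are made up for
illustration, and the sketch assumes From lines in message bodies have
already been munged to ">From ".

```python
import io


def index_mbox(stream):
    """Return the byte offset of every "From " separator line.

    `stream` must be a binary, seekable file object. A separator is
    assumed to be a line starting with b"From " that is either the
    first line of the file or directly preceded by a blank line.
    """
    offsets = []
    pos = 0
    prev_blank = True  # start of file counts as "after a blank line"
    for line in stream:
        if prev_blank and line.startswith(b"From "):
            offsets.append(pos)
        prev_blank = line in (b"\n", b"\r\n")
        pos += len(line)
    return offsets


def read_message(stream, offsets, n):
    """Return the raw bytes of message n, using a prebuilt index."""
    start = offsets[n]
    stream.seek(start)
    if n + 1 < len(offsets):
        return stream.read(offsets[n + 1] - start)
    return stream.read()  # last message runs to end of file
```

	Once the index exists, a single message can be fetched with one seek
and one read, but the index has to be rebuilt or extended whenever the
mbox file changes, and an unmunged "From " line in a body would be
misread as a separator.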

>  I really don't like mbox files, primarily because they require munging
>  From lines in the body of the message.  MMDF would be better, but I
>  think ideal from a philosophical point of view would be
>  one-message-per-file if it can be done efficiently cross-platform.

	Therein lies the problem.  Some filesystems make this more 
feasible than others, at least on larger scale systems.

>  Maybe file system experts here can provide pointers or advice on exactly
>  which file and operating systems make this approach feasible, even for
>  huge message counts.

	SGI's XFS on IRIX does a pretty good job, with hashed directory 
structures and an extent-based, journaled design.  Regrettably, 
I don't think that all of these features are fully supported under 
the Linux version of XFS, and that work has basically ground to a 
halt with the lay-offs of all the key SGI people who had been working 
on XFS.  Veritas VxFS also does a good job in this area.

	Other than SGI XFS for IRIX and Veritas VxFS, I don't know of any 
good solutions to this problem at the filesystem level.


	Kirk McKusick and Eric Allman agree with you that this is a 
proper filesystem problem that should be solved at the filesystem 
level (at least, that's what they've said to me when I brought this 
issue up with them), and they feel you should not attempt to solve 
filesystem problems with "tricks" like INN's timecaf, timehash, or 
CNFS cycbufs.

	However, while that's nice in theory, it doesn't necessarily 
help us here in the real world.
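
	For reference, the usual application-level workaround on filesystems
with linear directory lookups is to spread one-message-per-file storage
across hashed subdirectories, so that no single directory grows huge.
This is exactly the kind of "trick" discussed above, not a filesystem
fix. A minimal sketch follows; the names `message_path` and
`store_message`, the ".eml" suffix, and the two-level, two-hex-digit
fan-out are all assumptions chosen for illustration.

```python
import hashlib
import os


def message_path(root, message_id, levels=2, width=2):
    """Map a Message-ID to a path such as root/ab/cd/abcd...ef.eml.

    The directory names are taken from the leading hex digits of the
    SHA-1 of the Message-ID, so files spread evenly over
    16 ** (levels * width) leaf directories.
    """
    digest = hashlib.sha1(message_id.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, digest + ".eml")


def store_message(root, message_id, raw_bytes):
    """Write one message to its hashed location and return the path."""
    path = message_path(root, message_id)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(raw_bytes)
    os.replace(tmp, path)  # rename into place so readers never see a partial file
    return path
```

	With the defaults this gives 65,536 leaf directories, which keeps each
one small even for millions of messages, at the cost of extra directory
lookups per access and an on-disk layout that only the archiver
understands.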

-- 
Brad Knowles, <brad.knowles at skynet.be>

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
     -Benjamin Franklin, Historical Review of Pennsylvania.

GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+
!w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++)


