[Mailman-Developers] Requirements for a new archiver

Barry Warsaw barry at python.org
Wed Oct 29 22:47:59 EST 2003


On Wed, 2003-10-29 at 14:38, Chuq Von Rospach wrote:

> Hint: look at what INN did when they implemented cycbufs.
> 
> Effectively, you create 1-N files, or create files as needed. Each file 
> is N bytes long, pre-allocated on file creation. When you store 
> messages, they're written into the file sequentially (or any other way 
> you want. If you want to get into best fit allocations and turn this 
> into a malloc() style heap, be my guest).
> 
> Metadata to access the info is then a filename, and an lseek() pointer 
> into the file, and # of bytes to read, plus your normal identifying 
> info. It's fast, it makes efficient use of file pointers, it avoids the 
> worst aspects of the Unix file system, and I'm amazed nobody ever 
> thinks to use it for other purposes (or that it took that long for 
> Usenet people to discover it; I suggested a simpler variant of it back 
> in the 80s and was told inodes are our friends...)
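
For illustration, the scheme quoted above might be sketched in Python
roughly as follows.  This is not INN's code: the names Slot, Buffer,
store and fetch are invented for the example, it handles only a single
file, and it simply refuses to store a message once the file is full.

```python
import os
import tempfile
from typing import NamedTuple


class Slot(NamedTuple):
    """Metadata needed to find one stored message again (hypothetical name)."""
    filename: str
    offset: int   # position to seek to
    length: int   # number of bytes to read


class Buffer:
    """One fixed-size file that messages are written into sequentially."""

    def __init__(self, filename: str, size: int) -> None:
        self.filename = filename
        self.size = size
        self.next_free = 0
        # Set the file to its full length at creation time.  Whether the
        # blocks are physically reserved depends on the file system.
        with open(filename, "wb") as f:
            f.truncate(size)

    def store(self, message: bytes) -> Slot:
        if self.next_free + len(message) > self.size:
            raise ValueError("buffer full; a real store would use another file")
        with open(self.filename, "r+b") as f:
            f.seek(self.next_free)
            f.write(message)
        slot = Slot(self.filename, self.next_free, len(message))
        self.next_free += len(message)
        return slot


def fetch(slot: Slot) -> bytes:
    """Read a message back using only filename, offset and length."""
    with open(slot.filename, "rb") as f:
        f.seek(slot.offset)
        return f.read(slot.length)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "buf0")
    buf = Buffer(path, 1024)
    slot = buf.store(b"Subject: hello\n\nbody\n")
    print(slot.offset, slot.length, fetch(slot))
```

The Slot tuples are what would go into an index or database alongside
the usual identifying info; the message bytes themselves are never
altered on the way in or out.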


I'm not sure if Andrew Koenig is on this list, but he described an
algorithm he developed to quickly find messages in an mbox file.  If
he's here, maybe he can talk about it.

I really don't like mbox files, primarily because they require munging
"From " lines in the body of the message.  MMDF would be better, but I
think the ideal from a philosophical point of view would be
one-message-per-file, if it can be done efficiently cross-platform.
Maybe file system experts here can provide pointers or advice on exactly
which file and operating systems make this approach feasible, even for
huge message counts.
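
To make the munging complaint concrete, here is a small Python sketch
of the common ">" quoting of body lines that begin with "From ".  The
function names are made up for the example, and it models only the
simplest quoting variant, not any particular mail program's behavior.

```python
def escape_from_lines(body: str) -> str:
    """Prefix body lines that start with 'From ' with '>'."""
    out = []
    for line in body.splitlines(keepends=True):
        if line.startswith("From "):
            line = ">" + line
        out.append(line)
    return "".join(out)


def unescape_from_lines(body: str) -> str:
    """Naively strip one '>' from lines that start with '>From '.

    This cannot tell a quoted line from one that really began with
    '>From ' in the original message.
    """
    out = []
    for line in body.splitlines(keepends=True):
        if line.startswith(">From "):
            line = line[1:]
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    body = "From here on it gets odd.\n>From was already quoted.\n"
    stored = escape_from_lines(body)
    # The round trip does not restore the original body.
    print(unescape_from_lines(stored) == body)
```

With this simple variant the stored body is no longer the message that
was sent, and undoing the quoting is a guess.  One-message-per-file
storage avoids the issue because no in-band separator is needed.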

> you can even do expiration/purge/etc if you want, by moving stuff 
> around and changing the pointers.
> 
> I've even thought of using it as the backing store for a picture 
> library. With a nice relational database and a series of these "data 
> boxes", I think you can store data in the best and fastest possible 
> way...

It's a very interesting idea.

-Barry




