[Mailman-Developers] Grackle archive framework

Aamir Khan syst3m.w0rm at gmail.com
Sun Mar 18 06:25:19 CET 2012


On Sun, Mar 18, 2012 at 4:24 AM, Barry Warsaw <barry at list.org> wrote:

> On Mar 18, 2012, at 12:23 AM, Aamir Khan wrote:
>
> >On Fri, Feb 17, 2012 at 12:55 AM, Barry Warsaw <barry at list.org> wrote:
> >> On IRC, we talked about a storm + Python mailbox library based backend,
> >> with a REST+JSON wsgi based application vending the data.  This would
> >> allow us to integrate fairly easily with MM3 I think, and would possibly
> >> better enable some of the archiver work being done by Terri and others.
> >>
> >
> >I understand that we will store the messages in .mbox format. But I don't
> >understand why do we need to use storm for the archiving purpose.
>
> I meant to say "maildir".  Please let's not use mbox format!  It's way too
> easy to corrupt the file, as we did with a bug once in MM2.1, and we've
> paid the price ever since.
>

I read up on the differences between the maildir and mbox formats, and the
comparison clearly states that mbox is prone to corruption while maildir is
not. Maildir also has the advantage of avoiding file-locking problems.
But since we will be storing each message in a separate file, searching
through them will not be fast enough. Using a database alone also has
problems: it will use more disk space and consume more CPU cycles.

So we could store the messages in maildir format and keep a copy of them in
a database. Search requests could then be served by database queries backed
by a full-text search engine. But then there will be the problem of keeping
the maildir messages and the messages stored in the database synchronized.
What are your thoughts on this?
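
To make the idea concrete, here is a rough sketch, not a proposal for the
final design. It assumes the maildir is the source of truth and the database
is only a derived index that can be rebuilt at any time, which is one way
around the synchronization worry. The table layout and the function names
(open_index, archive, resync) are made up for illustration, and sqlite3
stands in for whatever database and full-text engine we end up choosing.

```python
import mailbox
import sqlite3
from email.message import EmailMessage


def open_index(path=":memory:"):
    """Open the (hypothetical) index database and create its table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " message_id TEXT PRIMARY KEY,"
        " maildir_key TEXT NOT NULL,"
        " subject TEXT)"
    )
    return db


def archive(md, db, msg):
    """Store msg in the maildir first, then record it in the index."""
    key = md.add(msg)  # the maildir copy is authoritative
    db.execute(
        "INSERT OR REPLACE INTO messages VALUES (?, ?, ?)",
        (msg["Message-ID"], key, msg["Subject"]),
    )
    db.commit()
    return key


def resync(md, db):
    """Rebuild the index from the maildir; returns the number of rows."""
    db.execute("DELETE FROM messages")
    count = 0
    for key, msg in md.iteritems():
        db.execute(
            "INSERT OR REPLACE INTO messages VALUES (?, ?, ?)",
            (msg["Message-ID"], key, msg["Subject"]),
        )
        count += 1
    db.commit()
    return count
```

If the index is ever lost or drifts, calling resync(md, db) rebuilds it from
the files on disk, so the database never has to be backed up as carefully as
the maildir itself.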

As for searching the archive, there are solutions like Elasticsearch,
Solr, and Lucene. Can we use one of them to search directly through the
maildir?

>
> As for archiving, it isn't strictly necessary to use storm, it's just a
> nice lightweight ORM I happen to like.  But I think it *does* make sense
> to back a full-fledged archiver with a database and a full-text search
> engine.  For example, using our RFC 5064+X-Message-ID-Hash scheme, the
> database would handle the lookup from hash to actual message storage
> location.
>
> Cheers,
> -Barry
>
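
If I understand the hash lookup correctly, it could look something like the
sketch below. I am assuming the X-Message-ID-Hash value is the Base32
encoding of the SHA-1 digest of the Message-ID with the angle brackets
removed; please correct me if the scheme is different. The dict is only a
stand-in for the database table, and the names register and locate are
hypothetical.

```python
import base64
import hashlib


def message_id_hash(message_id):
    """Hash a Message-ID (assumed scheme: Base32 of SHA-1, no brackets)."""
    mid = message_id.strip()
    if mid.startswith("<") and mid.endswith(">"):
        mid = mid[1:-1]
    digest = hashlib.sha1(mid.encode("utf-8")).digest()
    return base64.b32encode(digest).decode("ascii")


def register(index, message_id, location):
    """Record where a message is stored, keyed by its hash."""
    index[message_id_hash(message_id)] = location


def locate(index, hash_value):
    """Return the storage location for a hash, or None if unknown."""
    return index.get(hash_value)
```

Here location would be whatever identifies the stored message, for example
the maildir key, so that a URL containing the hash can be resolved to the
message file without scanning the maildir.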



-- 
Aamir Khan | 3rd Year  | Computer Science & Engineering | IIT Roorkee

