[Archiver-dev] Continuously crawling an email list

Thu Feb 10 15:51:50 CET 2011

Hi Matt.  For mail-archive.com we essentially do what you mention: we have a
global email address (archive at mail-archive.com) that list admins can
subscribe to their list, and then our service converts the received messages
into MHonArc archives.  When a list admin comes to us with a request to load
up old archives we chew through the old mbox files and pump them into our
MHonArc-digesting code.
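
To illustrate the idea of feeding old mbox files and newly received messages
through the same code, here is a minimal Python sketch. It is not the
mail-archive.com implementation (which is built around MHonArc); it only uses
the standard library's mailbox and email modules. The function names
(summarize, from_mbox, from_raw) and the dictionary fields are hypothetical,
chosen for the example.

```python
import email
import mailbox
from email import policy


def summarize(msg):
    """Pull out the fields an index might need (hypothetical schema)."""
    return {
        "message_id": str(msg.get("Message-ID", "")),
        "subject": str(msg.get("Subject", "")),
        "sender": str(msg.get("From", "")),
    }


def from_mbox(path):
    """Yield one summary per message in an old mbox archive."""
    box = mailbox.mbox(path, create=False)  # do not create a missing file
    try:
        for msg in box:
            yield summarize(msg)
    finally:
        box.close()


def from_raw(raw_bytes):
    """Summarize a single newly received message given as raw bytes."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    return summarize(msg)
```

Because both entry points return the same structure, whatever consumes the
summaries does not need separate paths for archived and newly arrived mail.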

I'm not aware of any scripts that would follow a list more easily. One
benefit of subscribing an address to the list is that the list admin knows
who you are and has the option of saying no.

Jeff

On Tue, Feb 8, 2011 at 11:20 AM, Matt Chaput <matt at sidefx.com> wrote:

> Hi, just saw this list, it seemed related to what I want to do, so I signed
> up :)
>
> I want to create a web app that indexes email messages as they appear in a
> MailMan list and makes them available for search.
>
> It seems like one way to do this would be to create an email account, sign
> it up to the list, and use an IMAP4 client to poll the account and download
> new messages.
>
> But is that the best way? For one thing, it doesn't allow the batch
> indexing of old list messages. For that I'd have to download tar'd archives
> and support separate indexing paths for old (archived), newish (downloaded
> recently) and new (just pulled out of the account) messages.
>
> Is there a good way to have a script "follow" an email list? And better
> yet, is there already code out there to do so ;)
>
> Thanks,
>
> Matt