[Mailman-Users] Deleting old msgs?

Richard Haas rhaas at rhaas.us
Sat Dec 3 17:50:41 CET 2011


Mark Sapiro <mark <at> msapiro.net> writes:

> 
> Frank Bell wrote:
> 
> >Is there any easier way? I just took over our mailman server and we have 
> >several years worth of messages in 190+ mboxs totaling approx 130 gig 
> >and a few 100K msgs.
> 
> You inspired me. I've created a script for pruning archives. See "NOTE
> ON PRUNING OLD MESSAGES:" in the FAQ at <http://wiki.list.org/x/2YA9>
> for links.
> 
> Since this is a brand new process, I suggest you make backup copies of
> the LISTNAME.mbox files before starting. The script has a --backup
> option, but I would make separate backups to be sure until you've run
> the script successfully.
> 

:-) Timing is everything ... I just finished integrating mbox-purge.pl 
(http://www.argon.org/~roderick/mbox-purge.html) with a 
withlist-callable module to do the same thing.

Having fewer layers would be welcome though, so thanks for this script, 
Mark.

One idea/request: Would you be willing to add the logic to write 
the pruned message data to a supplied path+filename? That would let 
the script dump the pruned data where it could be retained or aged via 
another scheme.
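
To make the request concrete, here is a minimal sketch of the behavior 
using Python's standard mailbox module. It is not Mark's script and not 
mbox-purge.pl; the names split_mbox, message_date, keep_path and 
pruned_path are made up for illustration, and it assumes the Date 
header is trustworthy enough to decide a message's age.

```python
import mailbox
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def message_date(msg):
    """Return the Date header as an aware datetime, or None if unusable."""
    raw = msg.get("Date")
    if not raw:
        return None
    try:
        when = parsedate_to_datetime(str(raw))
    except (TypeError, ValueError):
        # Older Pythons raise TypeError, newer ones ValueError, on bad dates.
        return None
    if when.tzinfo is None:
        # Assumption: treat dates without a usable zone as UTC.
        when = when.replace(tzinfo=timezone.utc)
    return when


def split_mbox(src_path, keep_path, pruned_path, cutoff):
    """Copy messages older than cutoff to pruned_path, all others to keep_path.

    Messages with no usable Date are kept. The source mbox is only read.
    """
    src = mailbox.mbox(src_path, create=False)
    keep = mailbox.mbox(keep_path)
    pruned = mailbox.mbox(pruned_path)
    counts = {"kept": 0, "pruned": 0}
    try:
        for msg in src:
            when = message_date(msg)
            if when is not None and when < cutoff:
                pruned.add(msg)
                counts["pruned"] += 1
            else:
                keep.add(msg)
                counts["kept"] += 1
        keep.flush()
        pruned.flush()
    finally:
        src.close()
        keep.close()
        pruned.close()
    return counts
```

In this sketch the original LISTNAME.mbox is left untouched, so swapping 
in the "keep" file and rebuilding the pipermail hierarchy would remain 
separate, deliberate steps.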

Our site (and this may be common elsewhere) periodically prunes 
archived .mbox messages once they are a year old and rebuilds the 
pipermail hierarchy, but keeps a compressed copy of the pruned data 
for another year (or longer) in case it is needed.  The compressed 
pruned .mbox text is considerably smaller (1/20th the size or better, 
on average) than the uncompressed .mbox plus the associated 
pipermail HTML hierarchy -- so keeping a copy is a relatively 
trivial insurance policy or "nice to have" for our lists.
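
As a rough illustration of the "keep a compressed copy" step, the sketch 
below gzips a pruned mbox file and reports the sizes before and after. 
The function name compress_pruned and the ".gz" naming are assumptions 
for the example, not part of any Mailman tool, and the ratio it reports 
covers only the .mbox text, not the pipermail HTML.

```python
import gzip
import os
import shutil


def compress_pruned(pruned_path, remove_original=False):
    """Gzip pruned_path to pruned_path + '.gz'.

    Returns (gz_path, original_bytes, compressed_bytes).
    """
    gz_path = pruned_path + ".gz"
    original_bytes = os.path.getsize(pruned_path)
    # Stream the copy so a multi-gigabyte mbox is never held in memory.
    with open(pruned_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    compressed_bytes = os.path.getsize(gz_path)
    if remove_original:
        os.remove(pruned_path)
    return gz_path, original_bytes, compressed_bytes
```

The uncompressed file is kept by default, so nothing is deleted until 
the compressed copy has been checked.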

We've found that pruning is essential once archives become 
multi-gigabyte, not because of the .mbox archives themselves, but 
because of the pipermail HTML files that result (particularly for 
archives with many small messages). We've seen 3 GB .mbox archives 
with nearly 1 million files in the pipermail hierarchy. Traversing 
or rebuilding that many files, particularly for hundreds of such 
lists, is non-trivial even on modern hardware and file systems.
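
For anyone wanting to size up their own archives, a small sketch like 
the following counts the files and bytes under one list's pipermail 
directory. The function name tree_stats is hypothetical and the 
directory to pass in depends on the installation; on a tree with 
around a million files the walk itself will take a while.

```python
import os


def tree_stats(root):
    """Return (file_count, total_bytes) for every file below root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_count += 1
            try:
                # lstat so symlinks are measured, not followed.
                total_bytes += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # File vanished or is unreadable: counted, size skipped.
                pass
    return file_count, total_bytes
```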

Thanks in advance for considering adding a way to save the pruned
data.


Richard






