[Mailman-Users] Huge duplication of posts

Chuq Von Rospach chuqui at plaidworks.com
Wed Nov 22 00:51:01 CET 2000


>     CVR> Mixed blessing -- lots of overhead for very occasional acts
>     CVR> of chaos -- and the dataset you need to keep can get huge.
>
>So, is there an 80/20 rule we can adopt?  Is the `catchup' button
>enough or do we need more?

I'd do a couple of things.

First, I'd check the posting date of what's being sucked in from 
Usenet. Anything older than {configurable} days is assumed to be a 
duplicate and dumped (that's the first line of defense for Usenet, 
FWIW).
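
A rough sketch of that age check in Python, for illustration only. The 
names is_too_old and MAX_AGE_DAYS are made up here and are not part of 
Mailman; the 7-day default just stands in for the {configurable} value. 
It assumes the article's Date header is available as a string.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

MAX_AGE_DAYS = 7  # hypothetical stand-in for the {configurable} cutoff


def is_too_old(date_header, now=None, max_age_days=MAX_AGE_DAYS):
    """Return True if the article's Date header is older than the cutoff."""
    now = now or datetime.now(timezone.utc)
    try:
        posted = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        # Unparseable or missing date: treat as suspect and drop it.
        return True
    if posted.tzinfo is None:
        # A "-0000" zone yields a naive datetime; assume UTC.
        posted = posted.replace(tzinfo=timezone.utc)
    return now - posted > timedelta(days=max_age_days)
```

Dropping articles with an unparseable date is a design choice in this 
sketch; holding them for the admin would be just as reasonable.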

Second, you can do one of two things, depending on how motivated you 
are. One is to keep track of all Message-IDs for {configurable} 
days and bounce dupes (that's the second line of defense for Usenet). 
The other is to monitor traffic and, if you hit some significant 
uptick, put the intake on hold until the admin can evaluate.
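
The traffic-monitoring option could look something like the sketch 
below. IntakeMonitor and its thresholds are hypothetical, not anything 
Mailman ships; "significant uptick" is approximated here as more than 
max_per_window arrivals inside a sliding time window, with timestamps 
given in seconds.

```python
from collections import deque


class IntakeMonitor:
    """Hold gateway intake when arrivals in a sliding window exceed a limit."""

    def __init__(self, max_per_window=200, window_seconds=3600):
        self.max_per_window = max_per_window  # assumed threshold
        self.window_seconds = window_seconds  # assumed window length
        self.arrivals = deque()
        self.on_hold = False

    def record(self, timestamp):
        """Record one arrival; return True if the article may be gated."""
        self.arrivals.append(timestamp)
        # Discard arrivals that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self.arrivals and self.arrivals[0] <= cutoff:
            self.arrivals.popleft()
        if len(self.arrivals) > self.max_per_window:
            self.on_hold = True  # stays set until an admin releases it
        return not self.on_hold

    def release(self):
        """Admin has reviewed the backlog; resume intake."""
        self.arrivals.clear()
        self.on_hold = False
```

The hold is deliberately sticky: once tripped, nothing passes until 
release() is called, which matches "until the admin can evaluate".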

Given how Mailman is attaching to an NNTP site, I'd say keeping all 
Message-IDs for a week and bouncing everything older than a week 
would do it without major work and/or sets of data. But you're 
starting to *act* like an NNTP server here, which, I guess, since 
we're dealing with NNTP data, shouldn't surprise me.
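
To make the one-week idea concrete, here is a minimal sketch. SeenCache 
and RETENTION are invented names, and the in-memory dict is an 
assumption (a real gateway would want something persistent). The point 
it tries to show is why the dataset stays small: anything posted more 
than a week ago is rejected on age alone, so its Message-ID never needs 
to be remembered.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)  # the "week" suggested above


class SeenCache:
    """Remember Message-IDs only for as long as the age cutoff allows."""

    def __init__(self, retention=RETENTION):
        self.retention = retention
        self.seen = {}  # Message-ID -> posting time

    def accept(self, message_id, posted, now):
        """Return True if the article is fresh and not seen before."""
        if now - posted > self.retention:
            return False  # too old: assumed duplicate, no lookup needed
        self._prune(now)
        if message_id in self.seen:
            return False  # duplicate within the retention window
        self.seen[message_id] = posted
        return True

    def _prune(self, now):
        # IDs of articles past the cutoff can go: a late copy of the
        # same article would be rejected by the age check anyway.
        expired = [mid for mid, t in self.seen.items()
                   if now - t > self.retention]
        for mid in expired:
            del self.seen[mid]
```

Pruning on every call is the simplest thing that works for a sketch; 
a periodic cleanup would do the same job with less churn.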

-- 
Chuq Von Rospach - Plaidworks Consulting (mailto:chuqui at plaidworks.com)
Apple Mail List Gnome (mailto:chuq at apple.com)

The vet said it was behavioral, but I prefer to think of it as genetic.
It cuts down on the liability -- Get Fuzzy



