[Mailman-Developers] Protecting email addresses from spam harvesters
Jay R. Ashworth
jra@baylink.com
Tue, 26 Feb 2002 12:36:49 -0500
On Tue, Feb 26, 2002 at 12:56:45AM -0500, Barry A. Warsaw wrote:
> JRA> I do see one problem here, and I don't know if you already
> JRA> address it below. [ looks ] You don't; it's this: if the
> JRA> list-owner addresses go through the MM machinery, as well,
> JRA> then they too can die if MM crashes the wrong way.
>
> JRA> This implies, as I believe has already been discussed, that
> JRA> the *server* admin address must be publicly accessible, not
> JRA> be piped into MailMan at all, and preferably, should actually
> JRA> not even be handled by the same machine... ("Single point of
> JRA> failure")
>
> Well, what machine it's handled by isn't Mailman's business, but you
> do have a point. Until recently, I recommended that you install
> aliases `mailman' and `mailman-owner', but now I recommend that
> `mailman' be an actual list, and it is from this list that things like
> password reminders appear to come from. Also, if the site list gets a
> bounce, it'll check all the existing lists for a match against the
> bouncing address.
Hmmm...
> You make the valid point that if the Mailman system were to break,
> you'd have no way to contact the site administrator, save for typical
> aliases like postmaster. It seems like you want:
>
> - A non-list, plain alias to contact the human in case of emergency
Yep; and it's fine if this is an alias; I agree with Chuq's opinion
about 'Real people', but I don't mind *sending* to a role account, as
long as the *reply* comes from a human, with a .sig file.
> - Some place that password reminders come from. Since this will be
> receiving bounces, it ought to be a real list.
Yeah, probably.
> - A site-wide list of maintainers of the site who can take care of
> normal operations (i.e. panicky unsubscription requests).
>
> Perhaps #3 can be the same as #1 for those sites that have a
> collaborative management arrangement. So the question is, what do we
> call the alias and what do we call the list? I have definitely seen
> people try to send mail commands to `mailman@python.org' and from my
> Majordomo days, this seems like a reasonable thing to (eventually)
> implement. Is it sufficient to recommend that postmaster@ point to a
> real human, not a list, and leave mailman@dom.ain a normal list?
Hmmm... I see the problem: mailman is the obvious alias for the server
admin, but I also see why you want to leave it a list.
*I* think that postmaster@ the mailing list machine (or domain) is a
good enough answer, but I think Chuq will accuse me of geeking out
again, and on this one, I'm afraid I'd agree with him.
The number of people on the net with *no* indoctrination at all is
truly stunning.
> If not, I'd still opt for `mailman' to be the site list, and add
> something like mailman-panic to be a human address. Or perhaps make
> mailman-owner pipe both to the wrapper and to postmaster. I dunno,
> I'm open to suggestions.
Well, here's the problem: it has to be predictable, because
1) you can't put it in every mail footer, 'cause you don't *want* people
using it unless something goes Horribly Wrong, but
2) if something *does* go Horribly Wrong, they won't be *able* to get
it from the website...
> > Mailman should avoid getting deeply into the spam detection and
> > prevention business, except for some really really basic stuff
> > (probably not much more or less than it does now). It should
> > integrate well with external spam detection programs like SpamAssassin
> > or commercial equivalents. E.g. if we always send the message through
> > SA, and the message gets some score, we could decide to hold messages
> > below say 5.0 on the Spamster Scale, discard anything above 5.0, etc.
>
> JRA> That sounds good, and if there isn't already a "plugin" API
> JRA> for that, we ought to give some thought to that...
>
> Agreed. I just have no idea what a reasonable API would be, although
> we're planning on doing some experiments with SA on {python,zope}.org
> to see what might make sense.
I suspect that, at least for Unixy installs, a system call to the
appropriate binary, with percentized arguments, will fill the bill
nicely; you can catch the exit value -- and if your package doesn't do
it that way, you can write a script to parse the output and send a
return value.
I, personally, would re-read the message from the file I put it in, in
case someone's package (wants to) rewrite the MIME to remove and
quarantine suspicious attachments.
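
Roughly what I have in mind, as a sketch only and not anything that exists
in Mailman today: the function name run_filter, the "%(file)s" placeholder
and the ".eml" suffix are all made up for illustration. It writes the
message to a temp file, runs whatever command the admin configured with the
filename substituted in, catches the exit value, and then re-reads the file
in case the checker rewrote it.

```python
import os
import shlex
import subprocess
import tempfile


def run_filter(command_template, message_text):
    """Run an external checker on a message.

    command_template is a hypothetical site-configured command line in
    which "%(file)s" stands for the path of the file holding the message.
    Returns (exit_status, message_text_after_the_checker_ran).
    """
    fd, path = tempfile.mkstemp(suffix=".eml")
    try:
        with os.fdopen(fd, "w") as fp:
            fp.write(message_text)
        # Split into arguments first, then substitute, so a temp path
        # containing spaces still ends up as a single argument.
        argv = [arg.replace("%(file)s", path)
                for arg in shlex.split(command_template)]
        status = subprocess.run(argv).returncode
        # Re-read the file: the checker may have rewritten the message,
        # e.g. to strip and quarantine an attachment.
        with open(path) as fp:
            return status, fp.read()
    finally:
        os.unlink(path)
```

What the exit values *mean* (0 is clean, nonzero is hold or discard, or
whatever) would be up to the site; a package that only prints a score can be
wrapped in a small script that turns the score into an exit value.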
> > #4 is interesting too. I'm not against putting the raw archive behind
> > a turing-test, since I suspect that very few people will ever want
> > it. It means that we won't be able to write an automated wget-ish
> > script to do off-site backups, but so be it.
>
> JRA> Is there a difference between raw and private that I'm
> JRA> missing? Do you mean the mbox format files?
>
> Yup. raw == mbox.
Ok. I've often found it quite useful to snarf those down for lists I'm
not on (yet); I wouldn't mind having to prove I was human, though.
My real problem was just that the obfuscation breaks Google, and since
"Get the glue right" is one of my loudest systems-design mantras...
> JRA> Well, that's probably the best point yet: this isn't
> JRA> *MailMan's* problem, except to the extent that we "recommend"
> JRA> Piper as our archiver.
>
> I don't know if I recommend it, in fact I try to dis-recommend it.
Sounds like a good call to me...
> Still, I think we do more good than harm in distributing an archiver
> that works out of the box. And the advantage of Pipermail is that for
> really really critical problems, we /can/ go in and hack on it. I'm
> torn, but still come down on the side of including Pipermail, even
> with all its warts.
Until Zest is a solution...
> > - I'll note that one of the early design decisions for Pipermail was
> > that public archives should be vended directly from the file system
> > for performance reasons. That decision may not be appropriate for
> > today's operations. Certainly maintaining two static versions of
> > the pages isn't feasible, so I think you have to vend one or the
> > other (probably the obfuscated version) from a cgi.
>
> JRA> No, but the performance reasons aren't as much of an issue
> JRA> now...
>
> Nope.
Optimizing for performance in the core design of a system is nearly
always a bad idea, at least on this end of the performance curve.
If you're redesigning Amadeus or SABRE, perhaps not.
Cheers,
-- jra
--
Jay R. Ashworth jra@baylink.com
Member of the Technical Staff Baylink RFC 2100
The Suncoast Freenet The Things I Think
Tampa Bay, Florida http://baylink.pitas.com +1 727 647 1274
"If you don't have a dream; how're you gonna have a dream come true?"
-- Captain Sensible, The Damned (from South Pacific's "Happy Talk")