[Mailman-Users] regexes in bounce_matching_headers

Joern Nettingsmeier nettings at folkwang-hochschule.de
Thu Apr 18 19:00:45 CEST 2002


[i posted this a while ago, but obviously my subscription had not
yet been processed when it reached the list.
sorry for any dupes that might occur. i'm in a bit of a hurry, so
i'm resending.]


hello mailman users !


a number of questions about regexes in mailman:


in bounce_matching_headers, the "Header:" field is parsed separately
and is not part of the regex, right ?
(i.e. a line
.*SPAM
would _not_ have the effect of catching all mails that contain
"SPAM" in any of their headers ?)
this is not quite clear to me.
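
a rough sketch of that reading, in case it helps the discussion. this is not mailman's actual code: it only assumes that each rule is split at the first colon into a header name and a regex, and that the regex is applied to that one header alone. the names split_rule and rule_matches are made up for illustration.

```python
import re

def split_rule(line):
    """split a 'Header: pattern' line into (header_name, compiled_pattern)."""
    name, sep, pattern = line.partition(":")
    if not sep:
        raise ValueError("no 'Header:' part in rule: %r" % line)
    # assumption: header names and patterns are both treated case-insensitively
    return name.strip().lower(), re.compile(pattern.strip(), re.IGNORECASE)

def rule_matches(rule_line, headers):
    """headers: dict mapping lower-cased header names to their values."""
    name, pattern = split_rule(rule_line)
    value = headers.get(name)
    return value is not None and pattern.search(value) is not None

headers = {"subject": "hello world", "x-flag": "SPAM"}
print(rule_matches("X-Flag: .*SPAM", headers))   # only the X-Flag header is examined
print(rule_matches("Subject: .*SPAM", headers))  # the Subject header has no SPAM in it
```

under this assumption a bare ".*SPAM" line has no header name to attach to, so it would not scan every header.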


in the "details" page, it says that matches are case-insensitive.
however, my shouting filter
Subject: .*[A-Z ]{10,}
seems to work. or am i seeing things ?
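
one possible explanation, sketched under the assumption that the rules are compiled with python's re.IGNORECASE flag (i have not checked the source): with that flag, [A-Z ] also matches lower-case letters, so the filter would fire on any run of ten letters or spaces. it would then appear to work on shouting, but would hit quiet subjects too.

```python
import re

SHOUT = r".*[A-Z ]{10,}"

quiet = "please read the attached notes"
loud = "BUY NOW CHEAP STUFF"

# case-insensitive: [A-Z ] also matches lower-case letters
ci = re.compile(SHOUT, re.IGNORECASE)
# case-sensitive: only real capitals and spaces count
cs = re.compile(SHOUT)

print(bool(ci.search(quiet)), bool(ci.search(loud)))
print(bool(cs.search(quiet)), bool(cs.search(loud)))
```

if matching really is case-insensitive, the two subjects cannot be told apart by this pattern.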


after a longish web-search, i dug up
http://www.python.org/doc/current/lib/re-syntax.html, which i found
quite essential to proper filtering.
i'd vote to have a pointer to that page included in the "details"
page for all options that take regexes. 
can you mention the python function that's used to parse regexes on
the "details" page as well, together with all the flags ? of course
i could RTFS, but it would make things easier for the average
mailman admin.


it would also be nice if that page reminded admins to check logs/config
for trivial syntax errors after changing rules.
i found out the hard way :)
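
a small sketch of how rules could be checked before pasting them in, using nothing but python's re.compile. check_patterns is a made-up helper, and splitting at the first colon is an assumption about the rule format.

```python
import re

def check_patterns(lines):
    """return a list of (line, error message) for patterns that fail to compile."""
    problems = []
    for line in lines:
        # assumption: everything after the first colon is the regex
        name, sep, pattern = line.partition(":")
        try:
            re.compile(pattern.strip())
        except re.error as exc:
            problems.append((line, str(exc)))
    return problems

# the second rule is missing its closing "]"
rules = ["Subject: .*(fwd?:.*){3,}", "Subject: .*[A-Z {10,}"]
for line, msg in check_patterns(rules):
    print("bad rule:", line, "->", msg)
```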


is there a table somewhere that lists which regex features are
available with a given python version? i found it quite tedious to
weed through the archives to find out, and the answers were not
always consistent. however, i have asked my admin to upgrade python,
so that problem may be gone already.


i would like to weed out forwarded chain letters and jokecasts
("[Fwd: [Fwd: [Fwd:....", i.e. three or more forwards).
this does not work:
Subject: .*(fw.*){3,}
it catches even single forwards, obviously the .* is "greedy".
neither does this:
Subject: .*(fw.{1,5}){3,}
can you let me in on the correct way to do it ?
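
one way this might be done, as a sketch only: require the colon after each "fw"/"fwd" and use a non-greedy .*? between them, so three separate forward markers are needed. it assumes case-insensitive matching anywhere in the subject, and TRIPLE_FWD is just an illustrative name. note that in a current python the first pattern above also needs three "fw" before it matches, so greediness alone may not explain the single-forward hits.

```python
import re

# three or more "fw:" / "fwd:" markers, with anything in between
TRIPLE_FWD = re.compile(r"(fwd?:.*?){3}", re.IGNORECASE)

subjects = [
    "[Fwd: [Fwd: [Fwd: joke]]]",
    "FW: FW: FW: funny",
    "Fw: Fwd: lunch",
    "[Fwd: meeting notes]",
]
for s in subjects:
    print(bool(TRIPLE_FWD.search(s)), s)
```

as a rule this would read "Subject: (fwd?:.*?){3}", assuming the python in use supports non-greedy qualifiers.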


how do i catch more subtle administrivia, like subjects containing
both the words "list" and "help" in any order in any place ?
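
two candidate patterns for this, sketched with python's re module: spelling out both orders with "|", or using one lookahead per word. the \b word boundaries are my own addition, so that "listing" or "helpers" do not count, and case-insensitive matching is assumed.

```python
import re

# alternation: spell out both orders explicitly
BOTH_ALT = re.compile(r"\blist\b.*\bhelp\b|\bhelp\b.*\blist\b", re.IGNORECASE)
# lookaheads: each (?=...) checks for one word without consuming text
BOTH_LOOK = re.compile(r"^(?=.*\blist\b)(?=.*\bhelp\b)", re.IGNORECASE)

subjects = [
    "help with this list please",
    "how do i leave the list? help!",
    "help wanted",
    "listing of helpers",
]
for s in subjects:
    print(bool(BOTH_ALT.search(s)), bool(BOTH_LOOK.search(s)), s)
```

the alternation form grows quickly with more than two words, while the lookahead form just needs one more (?=...) per word.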


do the "forbidden_posters" and "posters" fields accept regexes ?
i have found contradictory answers in the archive.


does anyone have a tried and tested "thou shalt not reply to
digests" filter that i could rip off ?
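
not tried and tested, but a starting point might look like the sketch below. it assumes that digest subjects have the shape "<listname> digest, Vol <number> ..." and that replies start with "Re:"; both are assumptions to check against real digests from your own list. DIGEST_REPLY and looks_like_digest_reply are made-up names.

```python
import re

# assumed digest subject shape: "<listname> digest, Vol <number> ..."
DIGEST_REPLY = re.compile(r"^\s*re:.*\bdigest, vol \d+", re.IGNORECASE)

def looks_like_digest_reply(subject):
    """true if the subject looks like a reply to a whole digest."""
    return DIGEST_REPLY.search(subject) is not None

print(looks_like_digest_reply("Re: Mailman-Users digest, Vol 1 #1234 - 7 msgs"))
print(looks_like_digest_reply("Re: regexes in bounce_matching_headers"))
print(looks_like_digest_reply("digest, Vol 3 delivery question"))
```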


plus i would really appreciate it if some people with more elaborate
spam filters could post theirs here with a very short description
of what they do. perhaps if we compile a bunch of useful examples, i
could write them up nicely and forward them to the mailman folks to
be included as documentation in the next release.


DISCLAIMER: i don't know python, and i started learning regexes
about 2 weeks ago. i'm just a stupid list admin. the problem is,
with spam and bullshit levels ever-increasing, most of us part-time
admins are forced to deal with filtering and other arcane stuff
without really being prepared, which is why i'm nagging for
examples...


best regards, and many thanks in advance,

jörn


--
Watch out where the huskies go and don't you eat
the yellow snow !
	- Frank Zappa




