[Mailman-Users] Still having trouble with an email filter vis-a-vis Chinese crap.

Mark Sapiro mark at msapiro.net
Sat May 5 16:20:26 EDT 2018


On 05/05/2018 12:44 PM, Kenneth G. Gordon wrote:
> In my mailman Privacy options/Spam filters/Regexps my expression:
> 
> ^Subject: =\?utf-8\?B\?
> 
> does NOT appear to work to discard all posts with that expression in the subject line.
> 
> That traffic always contains this:
> 
> Subject =?utf-8?B?


Do you really mean it doesn't contain the ':'?

You could try

^Subject:?\s*=\?utf-8\?B\?

which would match Subject followed by a colon or not and any amount of
white space.
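
For what it's worth, here is a minimal sketch in Python for comparing the
two patterns off-list. It assumes the filter applies each regexp with a
case-insensitive search against the header line; the sample headers and the
helper name `matches` are made up for illustration and are not Mailman code.

```python
import re

# Assumption: the spam filter does a case-insensitive search per pattern.
# re.MULTILINE lets '^' anchor at the start of any header line.
FLAGS = re.IGNORECASE | re.MULTILINE

STRICT = re.compile(r'^Subject: =\?utf-8\?B\?', FLAGS)
RELAXED = re.compile(r'^Subject:?\s*=\?utf-8\?B\?', FLAGS)


def matches(pattern, header):
    """Return True if the compiled pattern is found in the header text."""
    return pattern.search(header) is not None


# Hypothetical sample headers (the payload is just a short test string).
samples = [
    'Subject: =?utf-8?B?5L2g5aW9?=',    # colon, exactly one space
    'Subject:  =?UTF-8?B?5L2g5aW9?=',   # colon, two spaces, upper-case charset
    'Subject:=?utf-8?B?5L2g5aW9?=',     # colon, no space
    'Subject =?utf-8?B?5L2g5aW9?=',     # no colon at all
    'Subject: Hello',                   # plain ASCII subject
]

for header in samples:
    print(repr(header), matches(STRICT, header), matches(RELAXED, header))
```

Only the first sample satisfies the original pattern; the relaxed one
matches the first four and still leaves the plain subject alone.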


> before all the following garbage.
> 
> Sometimes that expression beginning with = is contained multiple times in the traffic I want 
> to discard.


That's because each '=?utf-8?B?...?=' is an RFC 2047 encoded 'word' and
there can be any number of them in the Subject: header.
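
To illustrate, the following sketch uses Python's standard
`email.header.decode_header` to count and decode the encoded words in a
header value. The sample value and the helper names are hypothetical; real
spam subjects will of course carry different payloads.

```python
import re
from email.header import decode_header

# Rough pattern for one RFC 2047 encoded word: =?charset?encoding?text?=
ENCODED_WORD = re.compile(r'=\?[^?\s]+\?[bq]\?[^?\s]*\?=', re.IGNORECASE)


def count_encoded_words(value):
    """Count the encoded words appearing in a raw header value."""
    return len(ENCODED_WORD.findall(value))


def decode_subject(value):
    """Decode a raw header value into a plain string."""
    parts = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            # Unencoded runs come back as bytes with charset None.
            parts.append(data.decode(charset or 'ascii', errors='replace'))
        else:
            parts.append(data)
    return ''.join(parts)


# Hypothetical Subject: value made of two utf-8, base-64 encoded words.
raw = '=?utf-8?B?5L2g5aW9?= =?utf-8?B?5LiW55WM?='

print(count_encoded_words(raw))
print(decode_subject(raw))
```

The white space between two adjacent encoded words is not part of the
decoded text, which is why a long subject can be split into several words.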


> As you can see above, I have escaped all the ? characters.
> 
> Is there something further I need to do to make this work as it should?


What you have will match a line beginning with 'Subject: =?utf-8?B?'
case insensitively, but only if there is a colon followed by exactly one
space.


> Obviously, I am still doing something wrong, but I can't see it.
> 
> I thought the escape character was the \, but maybe it is a / . ?


It is '\'.


> I am, presently, not all that happy, although I have cut down the Chinese garbage by about 
> 90% since I implemented other filters. There remains the 10% which is still very annoying.


As has been mentioned before, the above pattern will match any Subject:
header which begins with a base-64 RFC 2047 encoded word with a utf-8
encoding. This includes some non-English language subjects (more than
just Chinese) and also some English language subjects that might begin
with an emoji or other non-ascii symbol. It doesn't include Chinese
language subjects that might be encoded in gb-2312 or some other
non-utf-8 encoding.
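
A rough Python sketch of both effects follows. The subjects are invented
for illustration, `encoded_subject` is a hypothetical helper that builds a
single base-64 encoded word, and the case-insensitive match is again an
assumption about how the filter applies the regexp.

```python
import base64
import re

PATTERN = re.compile(r'^Subject:?\s*=\?utf-8\?B\?', re.IGNORECASE)


def encoded_subject(text, charset):
    """Build a Subject: line holding one base-64 RFC 2047 encoded word."""
    payload = base64.b64encode(text.encode(charset)).decode('ascii')
    return 'Subject: =?%s?B?%s?=' % (charset, payload)


# Hypothetical subjects a list might see.
emoji_english = encoded_subject('\U0001F389 Meeting moved to Friday', 'utf-8')
german = encoded_subject('Gr\u00fc\u00dfe aus K\u00f6ln', 'utf-8')
chinese_gb = encoded_subject('\u4f60\u597d', 'gb2312')
plain = 'Subject: Meeting moved to Friday'

for line in (emoji_english, german, chinese_gb, plain):
    print(bool(PATTERN.match(line)), line)
```

The English subject with a leading emoji and the German subject are both
caught, while the Chinese subject encoded as gb2312 is not.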

This may work for you, but in general might discard a wanted post.

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

