[Mailman-Users] UTF-8 From and Reply-to addresses not getting properly processed.

Sat Feb 15 23:00:16 EST 2020

On 2/15/20 5:58 PM, Lindsay Haisley wrote:
> We're running Mailman 2.1.18-1 and have a list which is having a porn
> spam problem. The list is set to discard posts from non-members, and
> the list moderator has set various filters to try to filter on words
> which contain "f***", as many do, however the Subject, From and Reply-
> to addresses are all UTF-8 strings, and are apparently confusing
> Mailman's decision-making functions, and these posts are ending up in
> the administrative requests list.  Here's a sample set of headers:

Exactly what filters are used?

header_filter_rules will RFC 2047 decode the headers.
mm_cfg.KNOWN_SPAMMERS and bounce_matching_headers do not, but since
bounce_matching_headers only holds the message, I'm guessing you aren't
using that, and since list owners can't set mm_cfg.KNOWN_SPAMMERS, I'm
guessing you aren't using that either.
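
To make concrete what "RFC 2047 decode" means here, this is a minimal sketch using the Python 3 standard library. It is not Mailman's own code (Mailman 2.1 runs on Python 2), and the helper name decode_rfc2047 and the sample Subject value are made up for illustration.

```python
from email.header import decode_header, make_header


def decode_rfc2047(value):
    """Return a header value with any RFC 2047 encoded-words decoded to text.

    Hypothetical helper for illustration; not part of Mailman.
    """
    return str(make_header(decode_header(value)))


# A made-up base64 encoded-word of the kind seen in the spam headers.
raw_subject = "=?UTF-8?B?SGVsbG8gV29ybGQ=?="
decoded_subject = decode_rfc2047(raw_subject)

# A header with no encoded-words passes through unchanged.
plain_subject = decode_rfc2047("Hello World")

print(decoded_subject)
```

A filter that sees only the raw value is matching against the base64 text, which is why a word filter can miss a header that looks readable once it is decoded.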

> MM is properly decoding the Subject in the message detail headers, but
> not the From address.
> 
> Is there any way to get Mailman to properly handle these?

If the only issue is the From: or other sender header, Mailman doesn't
RFC 2047 decode those in trying to determine if the sender is a member,
but what's the issue? If you are trying to match a specific address in
discard_these_nonmembers, I see the problem, but you can discard them by
setting generic_nonmember_action to discard.

If you only want to discard non-member posts with RFC 2047 encoded
From:, you could put something like

^[^@]+@[a-z0-9_.]+$

in hold_these_nonmembers to hold the ones whose From: is at least not
base64 encoded. With generic_nonmember_action set to discard, the
encoded ones would then be discarded.
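
As a rough check of what that pattern does and doesn't match, here is a small sketch in Python 3. The sample addresses are made up. It assumes the pattern is applied case-insensitively to the bare sender address, and that a fully encoded From: reaches the matcher as the raw encoded-word with no @ in it; neither assumption is verified against the Mailman source here.

```python
import re

# The pattern from above. re.IGNORECASE is an assumption about how
# the pattern gets applied, not something confirmed from Mailman.
PLAIN_ADDRESS = re.compile(r"^[^@]+@[a-z0-9_.]+$", re.IGNORECASE)


def is_plain_address(sender):
    """True if sender looks like an ordinary local@domain address."""
    return PLAIN_ADDRESS.search(sender) is not None


# Hypothetical sample senders.
plain = "someone@example.com"
encoded = "=?utf-8?b?c3BhbW1lckBleGFtcGxlLmNvbQ==?="
hyphenated = "someone@mail-host.example.com"

for sender in (plain, encoded, hyphenated):
    print(sender, is_plain_address(sender))
```

Only the first of the three matches. The encoded-word has no @ at all, so it would fall through to generic_nonmember_action. The third address fails only because the character class has no hyphen in it, so adding one would be worth considering if plain posts from hyphenated domains should be held too.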

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan