[Mailman-Users] Chinese characters spam filter?
Mark Sapiro
mark at msapiro.net
Tue Jul 12 14:47:48 EDT 2016
On 07/12/2016 12:03 AM, Stephen J. Turnbull wrote:
> Mark Sapiro writes:
> > On 7/8/16 6:04 PM, Yasuhito FUTATSUKI wrote:
> > >
> > > How about using 'backslashreplace' instead of 'replace' to encode to
> > > list's preferred language in Mailman/Handlers/SpamDetect.py ?
>
> I see you've already done this, but ...
>
> I would consider xmlrefreplace as well. xmlrefs are something most
> people (users/moderators) have seen, backslash they're not going to
> recognize unless they're programmers.
I have now switched to xmlcharrefreplace instead of backslashreplace, as
I agree this will be easier to explain and understand. I was uncertain
about this at first because I wasn't sure whether xmlcharrefreplace
would use entity names in some cases, but it appears that it only
uses numeric references.
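
To make the difference concrete, here is a minimal sketch of how the three
error handlers render the same header when the target charset can't
represent it. This is not the SpamDetect.py code; render_header, the sample
subject, and the choice of us-ascii as the list charset are assumptions for
illustration only, and it is written for Python 3 (Mailman 2.1 itself runs
on Python 2).

```python
def render_header(value, charset, errors):
    """Encode a header value to the list charset and return it as text.

    Hypothetical helper for illustration; not part of Mailman.
    """
    return value.encode(charset, errors=errors).decode(charset)


# Assumed sample subject: two CJK characters plus one accented Latin letter.
subject = "Re: \u4f60\u597d caf\u00e9"

for handler in ("replace", "backslashreplace", "xmlcharrefreplace"):
    print("%-18s %s" % (handler, render_header(subject, "us-ascii", handler)))

# replace            Re: ?? caf?
# backslashreplace   Re: \u4f60\u597d caf\xe9
# xmlcharrefreplace  Re: &#20320;&#22909; caf&#233;
```

With 'replace' every unencodable character collapses to '?', so a regexp
can't tell one from another. The other two handlers keep the code point,
and xmlcharrefreplace writes it as a decimal numeric reference, never as a
named entity.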
> At an earlier stage, you could also just do a trial re-encoding with
> the list preferred codec, set errors = 'strict', catch the Exception,
> and re-raise as a Hold (or Discard, according to per-list policy).
> (Then discard the output.) I would prefer this solution, I think, as
> creating regexps turns out to be an issue for many list owners.
>
> People would have to learn not to use emoji in headers, of course, or
> suffer moderation delays or even discards.
I think this would have too many undesired effects. Not just emoji, but
accented Latin or CJK characters, etc. in display names would, I think, be
real problems, even on English-language lists.
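
As a rough illustration of that concern, here is a sketch of the strict
trial-encoding check described in the quoted text. The function name
fits_list_charset and the sample display names are hypothetical, and no
Hold or Discard is raised here; it only shows which values would fail the
check. Python 3 again.

```python
def fits_list_charset(header_value, charset):
    """Return True if header_value encodes cleanly in the list charset.

    Hypothetical helper: a trial encode with errors='strict', with the
    encoded output thrown away.
    """
    try:
        header_value.encode(charset, errors="strict")
    except UnicodeEncodeError:
        return False
    return True


# Assumed sample display names on a list whose charset is us-ascii.
for name in ("Stephen J. Turnbull", "Jos\u00e9 Garc\u00eda", "\u4e8c\u6708"):
    print(ascii(name), fits_list_charset(name, "us-ascii"))
```

Only the first name passes. The accented Latin name would pass on an
iso-8859-1 list, but the CJK one still would not, so a strict check like
this would catch ordinary display names, not only spam.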
--
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan