From szybalski at gmail.com  Wed Apr 25 00:13:24 2018
From: szybalski at gmail.com (Lukasz Szybalski)
Date: Tue, 24 Apr 2018 23:13:24 -0500
Subject: [moin-user] Moin in Debian Stable and anti-spam features
Message-ID: <CAKkTUv0uFhx6bcwGtkQKpZiFLH1X_mrBK8LQJ-ohCF6c02rFRg@mail.gmail.com>

Hello,
I have been running a MoinMoin setup for a couple of years now
( http://lucasmanual.com/mywiki/ ). About 5 years ago I had to block new
user signups due to an uncontrolled amount of spam and spam users.

I was hoping to re-enable the registration process, but I wanted to know
more about current Moin capabilities for stopping spammers.
Captcha? "Are you a robot?" checks?


I know there is a page below, but it doesn't really provide any
meaningful copy/paste instructions on how to secure your site on day 1.
https://moinmo.in/AntiSpamFeatures

I wanted to hear some feedback from people who run public-facing MoinMoin
sites, for example the Debian wiki (https://wiki.debian.org/RecentChanges),
which does not seem to have any spam at all.


Thanks
Lucas

-- 
http://lucasmanual.com/ <http://lucasmanual.com/blog/>

From paul at boddie.org.uk  Wed Apr 25 06:29:31 2018
From: paul at boddie.org.uk (Paul Boddie)
Date: Wed, 25 Apr 2018 12:29:31 +0200
Subject: [moin-user] Moin in Debian Stable and anti-spam features
In-Reply-To: <CAKkTUv0uFhx6bcwGtkQKpZiFLH1X_mrBK8LQJ-ohCF6c02rFRg@mail.gmail.com>
References: <CAKkTUv0uFhx6bcwGtkQKpZiFLH1X_mrBK8LQJ-ohCF6c02rFRg@mail.gmail.com>
Message-ID: <201804251229.33601.paul@boddie.org.uk>

On Wednesday 25. April 2018 06.13.24 Lukasz Szybalski wrote:
> Hello,
> I have been running a MoinMoin setup for a couple of years now
> ( http://lucasmanual.com/mywiki/ ). About 5 years ago I had to block new
> user signups due to an uncontrolled amount of spam and spam users.
> 
> I was hoping to re-enable the registration process, but I wanted to know
> more about current Moin capabilities for stopping spammers.
> Captcha? "Are you a robot?" checks?

I haven't seen any recent developments around this. The Debian people can 
presumably say more, but they were using some kind of mail-based verification, 
which Moin does also support to some degree. This isn't sufficient to prevent 
spammer sign-ups, however.

> I know there is a page below, but it doesn't really provide any
> meaningful copy/paste instructions on how to secure your site on day 1.
> https://moinmo.in/AntiSpamFeatures

I think the basic features are inadequate these days. The spam pattern 
blacklisting is almost useless for public sites; textcha doesn't really cope 
with spamming particularly well any more.

It is even necessary to prevent people *trying* to register new accounts, as 
this can easily cause user account data to accumulate in large volumes, even 
when those users won't have editing rights. Out of the box, for public sites, 
the newaccount action shouldn't be enabled.
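
A minimal sketch of what that could look like, assuming Moin 1.9.x and its
actions_excluded configuration option. The fragment is not taken from this
thread, so please check it against HelpOnConfiguration for your version
before relying on it:

```python
# wikiconfig.py (fragment), assuming Moin 1.9.x
from MoinMoin.config import multiconfig


class Config(multiconfig.DefaultConfig):
    # Keep the default exclusions and additionally switch off the
    # self-service "newaccount" action.
    actions_excluded = multiconfig.DefaultConfig.actions_excluded + ['newaccount']
```

With the action excluded, accounts have to be created some other way, for
instance by an administrator.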

> I wanted to hear some feedback from people who run public-facing MoinMoin
> sites, for example the Debian wiki (https://wiki.debian.org/RecentChanges),
> which does not seem to have any spam at all.

It wouldn't surprise me if many sites had a tightly-controlled group of 
editing users and an external workflow for user registration. That ends up 
being acceptable because it actually promotes higher quality content, but it 
creates a burden around administering the site.

And sometimes these external workflows fail to filter out spammers, as I saw 
on one occasion with the Python Wiki where, amongst the requests to edit the 
wiki, a spammer managed to persuade the administrators that their request was 
genuine.

I did work on some Moin extensions to mitigate spamming. One put edits in a 
request queue, but even if that prevents spammers getting the satisfaction of 
seeing their spams published, the feedback loop is not strong enough to 
prevent them from trying anyway, burdening the administrators of the wiki.

Another extension I wrote, but had actually forgotten about, takes timing 
measurements on edits to detect automated spamming, a technique that systems 
like WordPress use against comment spam. Although this might be useful, I 
think you'd still need a collection of other measures for it to be 
effective.
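
To illustrate the timing idea (this is not the code of the extension, just a
sketch under assumptions): the edit form carries a signed timestamp, and a
submission that arrives implausibly quickly, or with a forged or stale
token, is refused. SECRET_KEY, MIN_SECONDS, MAX_SECONDS and both function
names are hypothetical.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"   # hypothetical server-side secret
MIN_SECONDS = 5             # assumed: anything faster looks automated
MAX_SECONDS = 3600          # assumed: older tokens are treated as stale


def _sign(timestamp):
    """Return the hex HMAC of a timestamp string."""
    return hmac.new(SECRET_KEY, timestamp.encode(), hashlib.sha256).hexdigest()


def issue_token(now=None):
    """Return 'timestamp:signature' for a hidden field in the edit form."""
    timestamp = str(int(time.time() if now is None else now))
    return timestamp + ":" + _sign(timestamp)


def edit_allowed(token, now=None):
    """Accept an edit only if the token is genuine and the timing plausible."""
    try:
        timestamp, signature = token.split(":", 1)
        issued = int(timestamp)
    except ValueError:
        return False
    if not hmac.compare_digest(signature.encode(), _sign(timestamp).encode()):
        return False
    elapsed = (time.time() if now is None else now) - issued
    return MIN_SECONDS <= elapsed <= MAX_SECONDS
```

The token would be issued when the editor is opened and checked when the
page is saved. A patient bot can simply wait, which is why this only helps
in combination with other measures.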

My conclusion these days is that trust-based mechanisms are probably the way 
forward. Like the external workflows that try and establish whether a new user 
is someone people "know" in some way, there could be an approach where 
existing users could approve others, and much of this could be automated. 
Maybe some way of retracting editing privileges and reverting compromised 
content would also be a part of such a solution.
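
A rough sketch of how such a trust mechanism might be modelled, purely as
an illustration: existing editors vouch for newcomers, and retracting
someone's privileges also retracts them from everyone they vouched for, so
that those accounts' edits can be reviewed and reverted. The class, its
methods and the cascading behaviour are all assumptions, not an existing
Moin feature.

```python
class TrustRegistry:
    """Tracks who may edit and which existing editor vouched for them."""

    def __init__(self, founders):
        # editor name -> name of the sponsor (None for founding editors)
        self.sponsor = {name: None for name in founders}

    def can_edit(self, user):
        return user in self.sponsor

    def vouch(self, sponsor, newcomer):
        """Let an existing editor approve a newcomer."""
        if not self.can_edit(sponsor):
            raise PermissionError("only existing editors can vouch")
        self.sponsor.setdefault(newcomer, sponsor)

    def revoke(self, user):
        """Remove a user and, transitively, everyone they vouched for.

        Returns the set of removed names, i.e. the accounts whose edits
        should be reviewed and possibly reverted.
        """
        removed = set()
        stack = [user]
        while stack:
            current = stack.pop()
            if current in self.sponsor and current not in removed:
                removed.add(current)
                stack.extend(n for n, s in self.sponsor.items() if s == current)
        for name in removed:
            del self.sponsor[name]
        return removed
```

Whether revocation should cascade like this is a design choice; it makes a
sponsor accountable for the people they let in, at the cost of occasionally
removing genuine users.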

Even though this message doesn't give any easy remedies, I hope it is still 
useful.

Paul

From cfreemes at ur.rochester.edu  Wed Apr 25 08:11:21 2018
From: cfreemes at ur.rochester.edu (Chris Freemesser)
Date: Wed, 25 Apr 2018 08:11:21 -0400
Subject: [moin-user] Moin in Debian Stable and anti-spam features
In-Reply-To: <201804251229.33601.paul@boddie.org.uk>
References: <CAKkTUv0uFhx6bcwGtkQKpZiFLH1X_mrBK8LQJ-ohCF6c02rFRg@mail.gmail.com>
 <201804251229.33601.paul@boddie.org.uk>
Message-ID: <9902dfc5-764e-0d80-b9fd-ee7f650eeadf@ur.rochester.edu>

On 04/25/2018 06:29 AM, Paul Boddie wrote:

> It is even necessary to prevent people *trying* to register new accounts, as
> this can easily cause user account data to accumulate in large volumes, even
> when those users won't have editing rights. Out of the box, for public sites,
> the newaccount action shouldn't be enabled.
Agreed. On my wiki server I have it disabled out of necessity.  On the 
(somewhat rare) occasion that I need to create a new account for one of 
my users, I'll re-enable it just long enough to create the account. 
Even if it only takes me 60 seconds to do so, the spambots will have 
created an account or two that I have to delete manually later.  I've 
thought about trying to find a way to limit access to the account 
creation page to just the IP range of my institution but haven't 
actually looked into it.
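
The check itself could be as small as the sketch below, which uses Python's
standard ipaddress module. The networks are documentation ranges standing
in for a real institution, and may_create_account is a hypothetical helper
that would still need to be wired into the web server or into Moin.

```python
import ipaddress

# Hypothetical institutional ranges (documentation addresses, not real ones).
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def may_create_account(remote_addr):
    """Return True if the client address lies inside an allowed network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a valid IP address at all
    return any(addr in net for net in ALLOWED_NETWORKS)
```

Behind a reverse proxy the address seen by the application may be the
proxy's, so the real client address would have to be taken from wherever
the proxy passes it on.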

Chris

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Chris Freemesser, Systems Administrator
Dept. of Brain & Cognitive Sciences +
The Center for Visual Science
University of Rochester
255 Meliora Hall
585-275-0786
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

From mscottreynolds at gmail.com  Sat Apr 28 23:27:10 2018
From: mscottreynolds at gmail.com (M. Scott Reynolds)
Date: Sun, 29 Apr 2018 03:27:10 +0000
Subject: [moin-user] Moin in Debian Stable and anti-spam features
In-Reply-To: <CAKkTUv0uFhx6bcwGtkQKpZiFLH1X_mrBK8LQJ-ohCF6c02rFRg@mail.gmail.com>
References: <CAKkTUv0uFhx6bcwGtkQKpZiFLH1X_mrBK8LQJ-ohCF6c02rFRg@mail.gmail.com>
Message-ID: <CAAc8TsUqP=yV8UJ=by9EyR_upOrFUaqMwBohvLLRd-Hm8UnA4Q@mail.gmail.com>

For my site I removed the prompt to create an account from the login page
and followed the instructions on this page,
https://moinmo.in/FeatureRequests/DisableUserCreation, to allow only
superusers to create accounts.

Scott R.



From szybalski at gmail.com  Sun Apr 29 12:04:40 2018
From: szybalski at gmail.com (Lukasz Szybalski)
Date: Sun, 29 Apr 2018 16:04:40 +0000
Subject: [moin-user] Moin in Debian Stable and anti-spam features
In-Reply-To: <201804251229.33601.paul@boddie.org.uk>
References: <CAKkTUv0uFhx6bcwGtkQKpZiFLH1X_mrBK8LQJ-ohCF6c02rFRg@mail.gmail.com>
 <201804251229.33601.paul@boddie.org.uk>
Message-ID: <CAKkTUv14T-5DS1kfkP17a1xiAiJPuLFQX-1SSECxGLxTAgc0+Q@mail.gmail.com>

Thank you.
So when we look at a growing community that needs to be open to the public
in order to expand, like GitHub, would it maybe make sense to allow
registration with an active GitHub account, i.e. "login/register with
GitHub account"?
(I think a similar Gmail login would have the same spam issue that you
described above.)
I wonder what other communities do.
What about Google's new "I'm not a robot" captcha?
Or what does medium.com do?

I really would like to grow the userbase, and get more content.

Thank you
Lucas



From steve at einval.com  Mon Apr 30 14:05:18 2018
From: steve at einval.com (Steve McIntyre)
Date: Mon, 30 Apr 2018 19:05:18 +0100
Subject: [moin-user] Moin in Debian Stable and anti-spam features
In-Reply-To: <CAKkTUv14T-5DS1kfkP17a1xiAiJPuLFQX-1SSECxGLxTAgc0+Q@mail.gmail.com>
References: <CAKkTUv0uFhx6bcwGtkQKpZiFLH1X_mrBK8LQJ-ohCF6c02rFRg@mail.gmail.com>
 <201804251229.33601.paul@boddie.org.uk>
 <CAKkTUv14T-5DS1kfkP17a1xiAiJPuLFQX-1SSECxGLxTAgc0+Q@mail.gmail.com>
Message-ID: <20180430180518.xiteowlwbhge26qx@tack.einval.com>

On Sun, Apr 29, 2018 at 04:04:40PM +0000, Lukasz Szybalski wrote:
>
>So when we look at a growing community that needs to be open to the public in
>order to expand, like GitHub, would it maybe make sense to allow registration
>with an active GitHub account, i.e. "login/register with GitHub account"?

That might work, yes.

>(I think a similar Gmail login would have the same spam issue that you
>described above.)

The free email hosting providers are always a source of spam. We can't
blacklist them as there are so many valid users. So we started by
blocking (but not blacklisting) signup attempts from all the free mail
providers (gmail, hotmail/outlook/live, yahoo, etc.). It helped.
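
As a sketch of that first step (not the actual Debian code): the signup is
refused when the address belongs to a free mail provider, but nothing is
added to a blacklist, because many genuine users have such addresses. The
domain list and the function name are hypothetical.

```python
# Hypothetical list, loosely following the providers named above.
FREE_MAIL_DOMAINS = {
    "gmail.com", "hotmail.com", "outlook.com", "live.com", "yahoo.com",
}


def is_free_mail(email):
    """Return True if the address belongs to a known free mail provider."""
    _, sep, domain = email.strip().lower().rpartition("@")
    return bool(sep) and domain in FREE_MAIL_DOMAINS
```

A signup handler would refuse the attempt when is_free_mail() returns True
and leave the source IP address alone.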

>I wonder what other communities do.
>What about Google's new "I'm not a robot" captcha?

Too easily broken for us, I'll be honest.

-- 
Steve McIntyre, Cambridge, UK.                                steve at einval.com
  Mature Sporty Personal
  More Innovation More Adult
  A Man in Dandism
  Powered Midship Specialty


From steve at einval.com  Mon Apr 30 14:00:22 2018
From: steve at einval.com (Steve McIntyre)
Date: Mon, 30 Apr 2018 19:00:22 +0100
Subject: [moin-user] Moin in Debian Stable and anti-spam features
In-Reply-To: <CAKkTUv0uFhx6bcwGtkQKpZiFLH1X_mrBK8LQJ-ohCF6c02rFRg@mail.gmail.com>
References: <CAKkTUv0uFhx6bcwGtkQKpZiFLH1X_mrBK8LQJ-ohCF6c02rFRg@mail.gmail.com>
Message-ID: <20180430180017.kotdhojvuthq3ej5@tack.einval.com>

Hi Lukasz,

I'm one of the admins for the Debian wiki, and I've done most of the
anti-spam system that we're using.

On Tue, Apr 24, 2018 at 11:13:24PM -0500, Lukasz Szybalski wrote:
>Hello,
>I have been running a MoinMoin setup for a couple of years now
>( http://lucasmanual.com/mywiki/ ). About 5 years ago I had to block new user
>signups due to an uncontrolled amount of spam and spam users.
>
>I was hoping to re-enable the registration process, but I wanted to know more
>about current Moin capabilities for stopping spammers.
>Captcha? "Are you a robot?" checks?
>
>
>I know there is a page below, but it doesn't really provide any
>meaningful copy/paste instructions on how to secure your site on day 1.
>https://moinmo.in/AntiSpamFeatures
>
>I wanted to hear some feedback from people who run public-facing MoinMoin
>sites, for example the Debian wiki (https://wiki.debian.org/RecentChanges),
>which does not seem to have any spam at all.

Over a number of years, we have used a few methods in succession to
kill spam, with varying degrees of success:

1. Using recaptcha

   https://salsa.debian.org/debian/moin/blob/master/debian/patches/recaptcha.patch

   This helped for a while, but was not 100% effective. Spammers broke
   recaptcha quite a while back, either via bots or by using cheap
   human labour. It also makes things harder for visually-impaired
   users, which is not ideal. We don't use this now.

2. Forcing validation via email

   https://salsa.debian.org/debian/moin/blob/master/debian/patches/mail-verification.patch

   Lots of spammers will attempt to use garbage email addresses, so
   this helped for a while. Unfortunately, lots of them also operate
   or use mail servers with throwaway accounts so this stopped being
   effective on its own a long time ago. It's still a part of our
   system.

3. External signup verification tool with heuristics

   https://salsa.debian.org/debian/moin/blob/master/debian/patches/external_account_creation_check.patch

   Then I added this as an option. Each time that a user attempts to
   create an account, this passes the source IP address, email address
   and requested username to an external script to run some heuristics
   and get a yes/no answer on whether to create the account (and then
   send the email verification challenge). **The exact script we're
   using is itself not public**, as I don't want to give all the
   details to spammers. However, the ideas are fairly simple:

   * simple syntax checks, e.g. is the email address valid?
   * check for obvious spam sources (networks, email domains) and
     maintain a blacklist for those
   * check for spammy-looking keywords in each of the source domain,
     email address and username parameters, and a set of patterns in
     those parameters that we've seen over time
   * check for known abusers using the www.stopforumspam.com API

   Each of these steps is scored, and "spamminess" scores that are too
   high will cause the signup to be rejected with an error
   message. Individual scores that are too high, or repeated
   *different* failures from the same IP address (and various other
   things) will also cause IP addresses to be blacklisted. We also ban
   netblocks if there are too many spammy IPs detected - I've written
   analysis tools to help here.

4. (Finally) We've disabled automatic account creation

   Even with all of those checks in place, occasionally we'd see
   spamming scum set up a new clean email domain (etc.) and get in,
   spewing garbage very quickly. What we now do is use the external
   checker to allow us to blacklist attackers quickly, *then* a new
   user is asked to mail us to ask for their email to be whitelisted.
   With a whitelisted address, the external verification script will
   bypass the checks the next time the user tries to sign up, and they
   can create their account as normal.
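
To give an idea of the shape of steps 3 and 4, here is a much-simplified
sketch. It is emphatically not the script we run: every list, weight,
threshold and name below is made up for illustration, the stopforumspam
lookup is replaced by a pluggable callable, and the IP/netblock
blacklisting logic is left out.

```python
import re

WHITELISTED_EMAILS = {"alice@example.org"}    # hypothetical, added on request
BLACKLISTED_DOMAINS = {"spam.example"}        # hypothetical
SPAMMY_WORDS = ("casino", "viagra", "pills")  # hypothetical
REJECT_THRESHOLD = 5                          # assumed
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def never_abuser(ip, email, username):
    """Placeholder for a lookup against a known-abuser database."""
    return False


def spam_score(ip, email, username, is_known_abuser=never_abuser):
    """Add up a "spamminess" score from a few simple heuristics."""
    score = 0
    if not EMAIL_RE.match(email):
        score += 5                            # syntax check failed
    if email.rpartition("@")[2].lower() in BLACKLISTED_DOMAINS:
        score += 5                            # obvious spam source
    text = (email + " " + username).lower()
    score += 2 * sum(word in text for word in SPAMMY_WORDS)
    if is_known_abuser(ip, email, username):
        score += 5                            # known abuser
    return score


def check_signup(ip, email, username, is_known_abuser=never_abuser):
    """Return 'create', 'reject' or 'ask_for_whitelisting'."""
    if email.lower() in WHITELISTED_EMAILS:
        return "create"                       # whitelisted: bypass the checks
    if spam_score(ip, email, username, is_known_abuser) >= REJECT_THRESHOLD:
        return "reject"
    return "ask_for_whitelisting"
```

A clean-looking but unknown address ends up in "ask_for_whitelisting",
which corresponds to the user being asked to mail the admins first.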

Blacklisted IPs on the Debian wiki are also blocked from reading the
site. I'm quite happy with that personally, but my co-admin wants to
change it to just block login/account creation. I think that will need
more changes in moin and I've not had the time to do it yet.

A better system would use finer-grained access control: maybe allow
account creation from known-good users/networks but enforce rigid
checks on others. An even better route would be to add moderation by
trusted users; edits from newly-created user accounts need to be
moderated and only after N successful edits will they be allowed to
edit directly.
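
The moderation idea could be modelled roughly like this. It is only a
sketch: nothing like it exists in our setup, and TRUSTED_AFTER (the "N"
above) and the class itself are hypothetical.

```python
TRUSTED_AFTER = 3  # the "N" above; the value is an assumption


class ModerationQueue:
    """Holds edits from new accounts until a trusted user approves them."""

    def __init__(self):
        self.approved_edits = {}  # username -> number of approved edits
        self.pending = []         # (username, page, text) awaiting review
        self.published = []       # (username, page, text) that went live

    def submit(self, user, page, text):
        """Publish directly for established users, queue for new ones."""
        if self.approved_edits.get(user, 0) >= TRUSTED_AFTER:
            self.published.append((user, page, text))
            return "published"
        self.pending.append((user, page, text))
        return "queued"

    def approve(self, index):
        """A moderator approves a pending edit, which counts towards N."""
        edit = self.pending.pop(index)
        self.published.append(edit)
        self.approved_edits[edit[0]] = self.approved_edits.get(edit[0], 0) + 1
        return edit
```

After TRUSTED_AFTER approved edits, submit() publishes that user's edits
without going through the queue.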

Even better would be torturing and killing spammers, but apparently
some people think that's wrong.

-- 
Steve McIntyre, Cambridge, UK.                                steve at einval.com
"Every time you use Tcl, God kills a kitten." -- Malcolm Ray