[Moin-user] Re: [MayaVi-users] Re: MayaVi Wiki vandalism

Bob Apthorpe apthorpe+moin at cynistar.net
Wed Nov 24 22:23:02 EST 2004


Hi,

Sebastian Haase wrote:
> Hi all, (I cc'ed this to moinmoin)
> I would expect the "moinmoin community" would/should have a fairly generic 
> solution for this.  Requiring logins might be an easy fix that we 
> could (presumably) all live with. (How hard is this to install?)
> Going back through the "revision history" of each page to a point that does 
> not contain spam words sounds like a "fun" script to write - ONCE-AND-FOR-ALL - 
> and then it could be shipped with moinmoin.
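
The revert idea quoted above can be sketched roughly as follows. This is only an illustration in Python: the function name, the in-memory list of revision texts, and the plain substring match are all assumptions, not anything MoinMoin ships or exposes.

```python
def last_clean_revision(revisions, spam_words):
    """Return the index of the newest revision with no spam word, or None.

    `revisions` is assumed to be a list of page texts ordered oldest to
    newest; how they are read from the wiki's storage is left out here.
    """
    lowered = [word.lower() for word in spam_words]
    # Walk backwards from the newest revision to the oldest.
    for index in range(len(revisions) - 1, -1, -1):
        text = revisions[index].lower()
        if not any(word in text for word in lowered):
            return index
    return None
```

A real script would still need to read the stored revisions and write the chosen one back, and a fixed word list will miss spam that uses new vocabulary.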

FWIW, I built a Perl module called Text::SpamAssassin that takes 
arbitrary text and metadata (for example, IP address, username, post 
subject, etc.), formats it as an RFC-822-compliant mail message and feeds 
that into a specially-tuned instance of Mail::SpamAssassin for analysis.
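
For Python readers, the "wrap the text in a mail message" step might look something like the sketch below. It uses only the standard library's email package; the function name, the X- header names and the wiki.invalid address are made up for illustration and are not what Text::SpamAssassin actually emits.

```python
from email.message import EmailMessage


def wiki_edit_to_message(text, author, ip_address, page_name):
    """Wrap a wiki edit in an RFC-822-style message.

    The header names and the sender address are illustrative assumptions.
    """
    msg = EmailMessage()
    msg["From"] = "wiki-edit@wiki.invalid"   # placeholder sender
    msg["Subject"] = page_name               # page title as the subject
    msg["X-Wiki-Author"] = author            # hypothetical metadata header
    msg["X-Originating-IP"] = ip_address     # hypothetical metadata header
    msg.set_content(text)                    # the edit becomes the body
    return msg
```

Calling msg.as_string() on the result gives text that a mail filter can read as an ordinary message.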

The upshot is that you get to leverage DNSBL and SURBL (www.surbl.org - 
blacklists frequently-spammed URLs) tests as well as all of 
SpamAssassin's rule-based tests without reinventing the wheel badly. 
I've built the Perl module, a wrapper script for simple command-line use 
(babycart - needs daemonizing), a sample SpamAssassin config file, and a 
fragment of PHP code to integrate it with the WordPress blog so the 
moderation flag is set if a post appears to be spam.

I know beans about Python but I have no doubt that a moderately-skilled 
programmer could figure out how to call babycart from within MoinMoin, 
etc. In this case it might be best to flag a revision as being spammy 
and provide an alert to the page owner and a means (link) to revert the 
change.
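
As a rough idea of the Python side, an external checker can be called with the standard subprocess module. The sketch below assumes the checker reads the text on stdin and signals spam with a non-zero exit status; that convention is an assumption for illustration, not a description of how babycart behaves, so check its actual interface first.

```python
import subprocess


def looks_like_spam(text, command):
    """Pipe text to an external checker and report whether it flagged spam.

    `command` is the argument list for the checker. Treating a non-zero
    exit status as "spam" is an assumed convention, not babycart's
    confirmed behaviour.
    """
    result = subprocess.run(
        command,
        input=text.encode("utf-8"),  # send the wiki text on stdin
        capture_output=True,         # keep the checker's output off the console
        check=False,                 # a non-zero exit is a verdict, not an error
    )
    return result.returncode != 0
```

The wiki would call this when a page is saved and, on a True result, mark the revision and notify the page owner rather than silently rejecting the edit.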

I haven't posted Text::SpamAssassin to CPAN because I think the module 
interface needs review and cleaning. I'm an old-skool procedural 
programmer so I view all my attempts at object design with a lot of 
skepticism. Better to let someone more skilled in OOD destupidify my 
interfaces before making an ass of myself on CPAN (oh, and pulling the 
rug out from under all the application programmers who used rev 0.01's 
interface when it all gets changed for rev 0.02. Been there, done that, 
don't want to do that to anyone if I can avoid it.)

The code is at 
http://www.austinimprov.com/~apthorpe/code/babycart/Text-SpamAssassin-1.2.tar.gz 
in case anyone wants to take it for a test drive. Consider it to be 
distributed under the Artistic License. Oh, one last thing - caveat 
utilitor. Bug reports, constructive criticism, suggestions and great 
wads of cash cheerfully accepted!

hth,

-- Bob

PS: A scary, er, motivated individual would find a way to tokenize the 
wiki content so SA's Bayesian analyzer could be used.

> On Wednesday 24 November 2004 03:48 am, Vincent Picavet wrote:
> 
>>Hi all,
>>I am only an intermittent reader of this ML, but I saw your problem with
>>the wiki.
>>I had a similar problem with a wiki I am responsible for. I think robots
>>have recently been developed by spammers to automatically insert links
>>to their drug/porn/... sites on wikis, in order to improve their
>>rank in the search engines.
>>The good news is that most of the time wiki engines save the history of
>>the pages. It usually takes three clicks to revert a page to its
>>previous state (as you say, in MoinMoin's case, click the info
>>icon in the top right corner of the page, then view a correct previous
>>version, edit it and save).
>>The bad news is that in my case, the attackers came back many times
>>and I had to require login to be able to modify a page.
>>Needless to say, there is no legal recourse against such
>>attacks; they often come from Russia or China or some other place where
>>a mail to abuse at xxx has no effect...
>>Good luck getting your wiki back.
>>Vincent
>>
>>
>>>>Forget it, I have monthly backups of the site.  I'll see what I can
>>>>do.  We might lose recent content though but this is easier than
>>>>going through each page and fixing content.
>>>
>>>I have the idea that moin allows the administrator to revert a wiki to
>>>a previous version. You can see which versions are stored by clicking
>>>on the info icon and in fact the vandalism started some time ago.



