[SciPy-dev] Server spam problems spam spam: spam

Pauli Virtanen pav at iki.fi
Mon Feb 23 20:46:03 EST 2009


Sun, 22 Feb 2009 13:40:20 -0800, Michael Abshoff wrote:
[clip]
> two tips of fighting spammers from the Sage project's wiki:
> 
>   * add a list of common Chinese words to LocalBadContent, i.e.
> 
> http://wiki.sagemath.org/LocalBadContent
> 
> Also make sure to clean out all the spammer attempts on the hard disk.
> I.e I deleted 6,000 directories in "pages" of the Cython wiki since Spam
> attempts are preserved and not actually deleted from disk. If you have a
> couple ten thousand of those in one directory this might make every wiki
> access painfully slow and impact the whole server.

Continuing Gael's work, I tried to expand the LocalBadContent list:

	http://scipy.org/LocalBadContent

I wonder how useful this will turn out to be in the end; this smells like 
an arms race... I doubt the additions cause problems for real pages, but 
if they do, some of them will need to be reverted.

[Btw, shouldn't LocalBadContent editing be restricted to those in 
EditorGroup? And could my account PauliVirtanen be added to the group?]

Another thing is that there are apparently ca. 11600 pages in the 
Scipy.org wiki. I'd make a wild guess that at most ~500 of these are 
valid content; the rest is spam. I'm not sure whether getting rid of the 
spam pages would improve Moin's performance.

Do we have any valid pages with CJK characters? Much of the spam seems 
to be Chinese, so mass-deleting at least this portion of it shouldn't be 
impossible to do, given Moin's database format.
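
A rough sketch of how such pages might be identified, for illustration only. 
It assumes each page is a directory containing a "current" file that names 
the active revision under "revisions/"; the function names, the character 
ranges, and the 0.3 threshold are all made up here and would need checking 
against the real data. It only lists candidates and deletes nothing.

```python
import re
from pathlib import Path

# Assumed ranges: CJK Unified Ideographs, Extension A, compatibility ideographs.
CJK_RE = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff\uf900-\ufaff]")


def cjk_fraction(text):
    """Return the fraction of non-whitespace characters that are CJK ideographs."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if CJK_RE.match(c)) / len(chars)


def find_cjk_pages(pages_dir, threshold=0.3):
    """Yield (page_dir_name, fraction) for pages whose current revision is mostly CJK.

    Assumed layout (hypothetical, verify first):
        <pages_dir>/<page>/current          -> text file holding a revision name
        <pages_dir>/<page>/revisions/<rev>  -> page text for that revision
    """
    for page in sorted(Path(pages_dir).iterdir()):
        if not page.is_dir():
            continue
        current = page / "current"
        if not current.is_file():
            continue
        rev = page / "revisions" / current.read_text().strip()
        if not rev.is_file():
            continue  # no text stored for the current revision
        frac = cjk_fraction(rev.read_text(encoding="utf-8", errors="replace"))
        if frac >= threshold:
            yield page.name, frac
```

The resulting list would still have to be reviewed by hand before anything 
is removed, since a valid page could quote CJK text.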

-- 
Pauli Virtanen



