[Python-Dev] Re: What to do about the Wiki?
Guido van Rossum
guido@python.org
Wed, 31 Jul 2002 12:16:56 -0400
> Guido> Juergen Hermann, Moinmoin's author, said he fixed a few things,
> Guido> but also said that Moinmoin is essentially vulnerable to
> Guido> "recursive wget" (e.g. someone trying to suck up the entire Wiki
> Guido> by following links). Apparently this is what brought the site
> Guido> down this weekend -- if I understand correctly, an in-memory log
> Guido> was growing too fast.
>
> I'm a bit confused by these statements. MoinMoin is a CGI script. I don't
> understand where "recursive wget" and "in-memory log" would come into play.
> I recently fired up two Wikis on the Mojam server. I never see any
> long-running process which would suggest there's an in-memory log which
> could grow without bound. The MoinMoin package does generate HTTP
> redirects, but while they might coax wget into firing off another request,
> it should be handled by a separate MoinMoin process on the server side. You
> should see the load grow significantly as the requests pour in, but
> shouldn't see any one MoinMoin process gobbling up all sorts of resources.
> Jürgen, can you elaborate on these themes a little more?
Juergen seems offline or too busy to respond. Here's what he wrote on
the matter. I guess he's reading the entire log into memory and
updating it there.
| Subject: [Pydotorg] wiki
| From: Juergen Hermann <jh@web.de>
| To: "pydotorg@python.org" <pydotorg@python.org>
| Date: Mon, 29 Jul 2002 20:32:31 +0200
| Hi!
|
| I looked into the wiki, and two things killed us:
|
| a) apart from google hits, some $!&%$""$% did a recursive wget. And the
| wiki spans a rather wide uri space...
|
| b) the event log grows much faster than I'm used to, thus some
| "simple" algorithms don't hold for this size.
|
|
| Solutions:
|
| a) I just updated the wiki software, the current cvs contains a
| robot/wget filter that forbids any access except to "view page" URIs
| (i.e. we remain open to google, but no more open than absolutely
| needed). If need be, we can forbid access altogether, or only allow
| google.
|
| b) I'll install a cron job that rotates the logs, to keep them short.
|
| I shortened the logs manually for now. So if you all agree, we could
| activate the wiki again.
|
|
| Ciao, Jürgen
Reading this again, I think we should give it another try.
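For readers wondering what the robot/wget filter in Juergen's solution (a) might amount to, here is a minimal sketch in Python. It is not MoinMoin's actual code. The user-agent list, the function names, and the assumption that a "view page" URI is one with no query string (or only `action=show`) are all hypothetical, chosen just to illustrate the idea of letting robots read pages but nothing else.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical substrings used to recognise robots; not MoinMoin's real list.
ROBOT_AGENTS = ("wget", "googlebot", "slurp", "crawler", "spider")


def is_robot(user_agent):
    """Guess whether a request comes from a robot, based on its User-Agent."""
    ua = (user_agent or "").lower()
    return any(tag in ua for tag in ROBOT_AGENTS)


def is_view_request(url):
    """True for a plain page view: no query string, or action=show only.

    Assumption: everything else (diff, info, edit, ...) is reached through
    a query string, which is what makes the URI space so wide for a crawler.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if not query:
        return True
    return set(query) == {"action"} and query["action"] == ["show"]


def allow_request(user_agent, url):
    """Humans may do anything; robots may only view pages."""
    if not is_robot(user_agent):
        return True
    return is_view_request(url)
```

With a filter like this, a recursive wget would still fetch each page once, but it could no longer wander into the per-page diff, info, and edit links.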
> Guido> I believe that Juergen has fixed the log-growing problem. Should
> Guido> we enable the Wiki again and hope for the best?
>
> With an XS4ALL person at the ready? Perhaps someone can keep a window open
> on creosote running something like
>
> while true ; do
>     ps auxww | egrep python | sort -r -n -k 5,5 | head -1
>     sleep 15
> done
>
> I'm running out for the next few hours. I'll be happy to run the while loop
> when I return.
We'll watch it here. I know who to contact to have it rebooted.
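The same check as the shell loop quoted above can be sketched in Python, should that be more convenient. This is only an illustration. The function names are made up, and it assumes the BSD-style `ps auxww` column layout in which the fifth column is VSZ, the column the `sort -k 5,5` in the loop keys on.

```python
import subprocess
import time


def largest_process(ps_output, pattern="python"):
    """Return the ps line matching `pattern` with the largest VSZ, or None."""
    best_line, best_vsz = None, -1
    for line in ps_output.splitlines():
        fields = line.split()
        if pattern not in line or len(fields) < 5:
            continue
        try:
            vsz = int(fields[4])  # fifth column: virtual memory size
        except ValueError:
            continue  # header line or unexpected format
        if vsz > best_vsz:
            best_line, best_vsz = line, vsz
    return best_line


def watch(interval=15):
    """Print the largest matching process every `interval` seconds."""
    while True:
        result = subprocess.run(["ps", "auxww"], capture_output=True, text=True)
        print(largest_process(result.stdout))
        time.sleep(interval)
```

Calling `watch()` runs until interrupted, like the shell loop. A steadily growing VSZ on a single line would point to one process holding too much in memory.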
--Guido van Rossum (home page: http://www.python.org/~guido/)