[Catalog-sig] why is the wiki being hit so hard?

"Martin v. Löwis" martin at v.loewis.de
Sun Aug 5 10:26:13 CEST 2007


> pardon for this completely useless quoting of irrelevant text
> but I tried just telling catalog-sig to go read this url
> http://search.msn.com.my/docs/siteowner.aspx?t=SEARCH_WEBMASTER_FAQ_MSNBotIndexing.htm&FORM=WFDD#D
> and check MSNbot is crawling my site too frequently.

msnbot is currently locked out entirely from crawling the wiki,
not by robots.txt, but by giving 403 for the IPs it comes from.
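
For illustration only, IP-based blocking of this kind can be sketched as below. This is not the wiki server's actual configuration, which is not shown in this message; the network ranges are placeholders from the reserved documentation blocks, and `BLOCKED_NETWORKS` and `status_for` are hypothetical names.

```python
import ipaddress

# Placeholder networks (reserved documentation ranges), NOT real crawler addresses.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def status_for(client_ip: str) -> int:
    """Return the HTTP status a request from client_ip would receive."""
    addr = ipaddress.ip_address(client_ip)
    # A client inside any blocked network is refused with 403 Forbidden,
    # regardless of what robots.txt says.
    if any(addr in net for net in BLOCKED_NETWORKS):
        return 403
    return 200


if __name__ == "__main__":
    for ip in ("192.0.2.17", "203.0.113.5"):
        print(ip, status_for(ip))
```

Unlike robots.txt, a block like this does not depend on the crawler cooperating.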

I have now added a robots.txt with a Crawl-delay of 20. IIUC,
this asks crawlers to access the site no more often than
once every 20 seconds. I then unblocked Yahoo! Slurp
and msnbot.
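
As a rough sketch of what such a rule looks like to a cooperating crawler, the snippet below parses a hypothetical robots.txt using the directive commonly spelled `Crawl-delay`. The file content is an assumption, since the actual robots.txt is not quoted in this message, and `crawl_delay_for` is an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file on the wiki may differ.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 20
"""


def crawl_delay_for(agent: str, robots_txt: str = ROBOTS_TXT):
    """Return the crawl delay in seconds that applies to the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


if __name__ == "__main__":
    for agent in ("msnbot", "Yahoo! Slurp"):
        print(agent, crawl_delay_for(agent))
```

The directive is a non-standard extension to robots.txt, so it only has an effect on crawlers that choose to honor it.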

Regards,
Martin

