[Catalog-sig] PyPI overloaded(?)
Richard Jones
richardjones at optushome.com.au
Wed Oct 18 06:43:49 CEST 2006
Sorry I didn't respond in a more immediate manner - I'm quite busy with work
and organising papers for OSDC '06.
On Wednesday 18 October 2006 04:08, Martin v. Löwis wrote:
> I'm not so sure this is the definite answer. If the system is
> overloaded, it might be because it consumes too many resources itself
> (in which case mirroring wouldn't help), or because something else
> on the machine is consuming too many resources (in which case the
> installation should be moved elsewhere entirely).
We still have the problem that the PyPI browse interface is quite
CPU-intensive, and if it's hit by a bot it'll definitely hurt overall
system performance.
We have a check in the browse code to see if the user agent matches:
import re
botre = re.compile(r'^$|brains|yeti|myie2|findlinks|ia_archiver|psycheclone|'
    r'badass|crawler|slurp|spider|bot|scooter|infoseek|looksmart|jeeves', re.I)
and if it does then the browse interface returns an empty page. This RE is
pretty complete - I use it to redirect bots to a dedicated ZEO client at work.
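Roughly, the check can be sketched like this. The names is_bot and browse,
and the render callback, are made up for illustration and are not the actual
PyPI code. I'm also assuming the pattern is applied with search() rather than
match(), since the '^$' alternative only makes sense as a test for an empty
user agent:

```python
import re

# The pattern from the message, split over two lines with implicit
# string concatenation.
botre = re.compile(
    r'^$|brains|yeti|myie2|findlinks|ia_archiver|psycheclone|'
    r'badass|crawler|slurp|spider|bot|scooter|infoseek|looksmart|jeeves',
    re.I)


def is_bot(user_agent):
    """Return True if the User-Agent string looks like a crawler.

    Assumption: a missing header is treated the same as an empty one,
    which the '^$' alternative matches.
    """
    return botre.search(user_agent or '') is not None


def browse(user_agent, render):
    """Hypothetical handler: empty page for bots, real page otherwise.

    render is a stand-in for whatever builds the expensive browse page.
    """
    if is_bot(user_agent):
        return ''
    return render()
```

The point of checking before rendering is that a bot never triggers the
expensive part of the request at all.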
I've added a robots.txt to http://cheeseshop.python.org (I always meant to,
but never got around to it). Unfortunately, I'm not sure
"Disallow: /pypi?:action=browse" will be handled properly. We'll see.
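One way to sanity-check a rule like that is to feed it to Python's own
robots.txt parser (urllib.robotparser in current Python). This only shows how
that one parser reads the rule; real crawlers may well treat the "?" and ":"
differently. The "User-agent: *" line and the allowed helper are assumptions
for the sketch, only the Disallow line is the real one:

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: only the Disallow line comes from the
# message; "User-agent: *" is a guess at how the rule is scoped.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pypi?:action=browse
"""

BASE = "http://cheeseshop.python.org"


def allowed(url, agent="*"):
    """Ask the standard-library parser whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


# Compare the browse URL against the plain index URL.
browse_allowed = allowed(BASE + "/pypi?:action=browse")
index_allowed = allowed(BASE + "/pypi")
```

If the rule works as intended, the browse URL should come back as not
allowed while the plain /pypi index stays fetchable.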
Richard