[Catalog-sig] PyPI overloaded(?)

Jim Fulton jim at zope.com
Wed Oct 18 13:20:05 CEST 2006


Richard Jones wrote:
> Sorry I didn't respond in a more immediate manner - I'm quite busy with work 
> and organising papers for OSDC '06.
> 
> On Wednesday 18 October 2006 04:08, Martin v. Löwis wrote:
>> I'm not so sure this is the definite answer. If the system is
>> overloaded, it might be because it consumes too many resources itself
>> (in which case mirroring wouldn't help), or because something else
>> on the machine is consuming too many resources (in which case the
>> installation should be moved elsewhere entirely).
> 
> We still have the problem that the PyPI browse interface is quite 
> CPU-intensive and if it's hit by a bot it'll definitely impact on overall 
> system performance.

Is that what happened the other day?

> We have a check in the browse code to see if the user agent matches:
> 
> botre = re.compile(r'^$|brains|yeti|myie2|findlinks|ia_archiver|psycheclone|'
>     r'badass|crawler|slurp|spider|bot|scooter|infoseek|looksmart|jeeves', re.I)
> 
> and if it does then the browse returns an empty page. This RE is pretty 
> complete - I use it to redirect bots to a dedicated ZEO client at work.
> 
> I've added a robots.txt to http://cheeseshop.python.org (I always meant to,
> but never got around to it). Unfortunately, I'm not sure
> "Disallow: /pypi?:action=browse" will be handled properly. We'll see.
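
For readers following along, the user-agent check described above might look
roughly like the sketch below. The pattern is the one quoted in the message;
the names is_bot, browse and render_page are hypothetical, and the use of
search() rather than match() is an assumption, since the actual PyPI browse
code is not shown here.

```python
import re

# Pattern quoted from the message. The empty alternative (^$) also catches
# requests that send an empty User-Agent string.
BOT_RE = re.compile(
    r'^$|brains|yeti|myie2|findlinks|ia_archiver|psycheclone|'
    r'badass|crawler|slurp|spider|bot|scooter|infoseek|looksmart|jeeves',
    re.I,
)


def is_bot(user_agent):
    """Return True if the User-Agent string looks like a crawler.

    A missing header (None) is treated like an empty one (assumption).
    """
    return BOT_RE.search(user_agent or '') is not None


def browse(user_agent, render_page):
    """Hypothetical browse handler: skip the expensive render for bots."""
    if is_bot(user_agent):
        return ''  # empty page, as described in the message
    return render_page()
```

Because the test runs before any rendering, a matching request costs one
regex search instead of a full browse query. It only helps against crawlers
that identify themselves, of course.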

Is there something somewhere that documents the architecture?
I'd like to try to offer helpful suggestions, but without more knowledge
of what's going on, that's hard to do.

Jim

-- 
Jim Fulton           mailto:jim at zope.com       Python Powered!
CTO                  (540) 361-1714            http://www.python.org
Zope Corporation     http://www.zope.com       http://www.zope.org

