[Catalog-sig] High-availability for PyPI, mirroring infrastructures?

Philipp von Weitershausen philipp at weitershausen.de
Mon Aug 11 12:03:32 CEST 2008


Tarek Ziade wrote:
> 2008/8/11 Andreas Jung <lists at zopyx.com>
> 
>     Hi there,
> 
>     Python eggs and zc.buildout are playing a major role in the Zope
>     world, both for development and deployment. PyPI right now is
>     apparently a single point of failure. Although the availability of
>     PyPI has become much better over time, the complete infrastructure
>     is not highly available, which is crucial when you are doing
>     commercial development.
> 
>     What are the prospects for addressing this issue? I have seen that
>     Ingeniweb maintains/maintained a PyPI mirror (it does not seem to be
>     up to date).
>     What is the currently recommended way of building a (private) mirror?
> 
> 
> Hi Andreas
> 
> To build our mirror, we use iw.eggproxy. It is a simple web proxy that 
> fetches files from PyPI on request.
> Every file is then kept locally for the next call.
> 
> I know Zope Corp has another script that generates a local copy by 
> scanning PyPI through XML-RPC,
> but it makes a full copy.

Yes, zc.mirrorpypislashsimple creates a full copy in the sense that it 
mirrors the pages of *every* package. However, it does not mirror the 
actual download archives. They are still retrieved from python.org.

I'm sure that it could be extended to download those as well, though.
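As a rough illustration of what such an extension might involve, here is 
a minimal sketch that pulls archive links out of one mirrored package 
page. It is not taken from zc.mirrorpypislashsimple; the names 
LinkCollector, archive_links and ARCHIVE_SUFFIXES are hypothetical, and 
both the suffix list and the assumption that archives appear as plain 
<a href> links on the page are guesses about the page layout.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed set of archive file endings; adjust to whatever the index serves.
ARCHIVE_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2", ".zip", ".egg")


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def archive_links(page_html, page_url):
    """Return absolute URLs of links on the page that look like archives."""
    collector = LinkCollector()
    collector.feed(page_html)
    links = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        # Compare only the path, so "#md5=..." fragments do not interfere.
        if urlparse(absolute).path.lower().endswith(ARCHIVE_SUFFIXES):
            links.append(absolute)
    return links
```

Each returned URL could then be fetched into the mirror directory, for 
example with urllib.request.urlretrieve, skipping files that are already 
present locally.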

>     How big (in GB) is PyPI currently? If a company were interested in
>     providing a mirroring server, how much disk space (and bandwidth)
>     would be needed?
> 
> 
> Around 5 GB, iirc

By extending zc.mirrorpypislashsimple to also download archives, one 
could just try it and see ;).
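
Once a mirror directory with the archives exists, measuring it is 
straightforward. A minimal sketch, assuming the mirror lives in a plain 
local directory tree (the function name directory_size_bytes is made up 
for this example):

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files below root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```

Dividing the result by 1024 ** 3 gives the size in GiB, which can be 
compared with the rough figure quoted above. Bandwidth would have to be 
measured separately, for instance from the web server logs.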


