[Catalog-sig] PyPI reverse download

M.-A. Lemburg mal at egenix.com
Wed Jul 28 00:06:11 CEST 2010


"Martin v. Löwis" wrote:
> Am 27.07.2010 22:46, schrieb M.-A. Lemburg:
>> "Martin v. Löwis" wrote:
>>> I'll be implementing a feature for PyPI where you can POST
>>> to a certain action (revdownload), and then PyPI will POST
>>> the file requested to a URL that was passed; this is needed
>>> to make blobs work on AppEngine.
>>>
>>> Any objections?
>>
>> Could you provide more detail on how this would work and why
>> this is needed for AppEngine ?
> 
> Ok, here is the long story.
> 
> First, I tried to use the approach of pypione, using blobs for
> distributions. That won't work because blobs are limited to 1MB.
> 
> Then I tried using lists of blobs instead. That won't work because
> the HTTP response size in urlfetch is limited to 10MB.
> 
> Then I tried using Range: headers to mirror large files in pieces.
> That won't work because I then wouldn't be able to serve the files
> to setuptools, unless that would also start to use Range: headers.
> 
> The only way to serve files larger than 10MB is the blobstore.
> 
> However, apps can neither read from nor write to the blobstore.
> The only way to read from it is to serve the file, and the only
> way to write to it is through a POST from the outside.
> 
> BTW, Google has kindly granted the app access to the blobstore
> (which is a for-fees feature only), and also kindly increased
> the store quota (which is 1GB in the free service, when a PyPI
> mirror needs about 15GB).
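
To make the Range: idea quoted above concrete, here is a minimal sketch
of how a large file could be split into Range: header values. The
function name and the 1MB default chunk size are illustrative
assumptions, not taken from the actual mirror code.

```python
def byte_ranges(total_size, chunk_size=1024 * 1024):
    """Yield HTTP Range header values that together cover total_size bytes.

    chunk_size is an assumed per-request limit (1MB here); adjust it to
    whatever the fetching side actually allows.
    """
    for start in range(0, total_size, chunk_size):
        # Range end offsets are inclusive, hence the -1.
        end = min(start + chunk_size, total_size) - 1
        yield "bytes=%d-%d" % (start, end)
```

Each value would be sent as the Range: header of one request, and the
pieces concatenated in order on the receiving side.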

Thanks for the details.

One aspect I still don't understand is why you'd want to upload
the whole PyPI mirror image in one go. Wouldn't it be better
to just upload the distribution files separately? (I don't
think any of those is more than a few tens of MB in size.)

Another aspect I don't (yet) understand is why these uploads
would have to be initiated from outside the main PyPI server.

I suspect that you want to use this feature to sync an
AppEngine mirror with the data on the main server. For that,
you'd only need to be able to upload data from that one
server to the AppEngine blobstore. This should be possible
without any external request to a PyPI RPC interface, simply
via a script run from a cron job or perhaps triggered by
a new distribution file upload.
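
A rough sketch of what such a push script might do for one file,
using only the Python standard library. The form field name "file",
the helper names and the upload URL are all assumptions made for
illustration; the real mirror app would dictate them.

```python
import urllib.request
import uuid


def encode_multipart(field_name, filename, payload, boundary=None):
    """Return (content_type, body) for a single-file multipart/form-data POST."""
    boundary = boundary or uuid.uuid4().hex
    head = (
        "--%s\r\n"
        'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n" % (boundary, field_name, filename)
    ).encode("ascii")
    tail = ("\r\n--%s--\r\n" % boundary).encode("ascii")
    return "multipart/form-data; boundary=%s" % boundary, head + payload + tail


def build_push_request(upload_url, filename, payload):
    """Build (but do not send) the POST that pushes one distribution file.

    upload_url is hypothetical: it stands for whatever URL the mirror
    hands out for uploads.
    """
    content_type, body = encode_multipart("file", filename, payload)
    return urllib.request.Request(
        upload_url,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

A cron job would loop over new or changed distribution files, build
one such request per file and send it with urllib.request.urlopen().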

The Amazon Cloudfront mirror would essentially work in the same
way.

-- 
Marc-Andre Lemburg
eGenix.com


