[Catalog-sig] PyPI limitations (was: Deprecate External Links)

Noah Kantrowitz noah at coderanger.net
Thu Feb 28 18:41:51 CET 2013


On Feb 28, 2013, at 2:14 AM, M.-A. Lemburg wrote:

> On 27.02.2013 19:11, Noah Kantrowitz wrote:
>> 
>> On Feb 27, 2013, at 9:28 AM, M.-A. Lemburg wrote:
>> 
>>> On 27.02.2013 18:05, Noah Kantrowitz wrote:
>>>> 
>>>> 
>>>> "M.-A. Lemburg" <mal at egenix.com> wrote:
>>>>>> I propose we deprecate the external links that PyPI has published
>>>>>> on the /simple/ indexes which exist because of the history of PyPI.
>>>>>> Ideally in some number of months (1? 2?) we would turn off adding
>>>>>> these links from new releases, leaving the existing ones intact and
>>>>>> then a few months later the existing links be removed completely.
>>>>> 
>>>>> -1.
>>>>> 
>>>>> There are many reasons for not hosting packages and distributions
>>>>> on PyPI itself.
>>>>> 
>>>> 
>>>> [citation needed]
>>> 
>>> We've been through this discussion a couple of times in the past.
>>> I'm sure the reasons will get listed again in this discussion :-)
>>> 
>>> Too many distribution files for PyPI to handle,
>> 
>> Again, please point at a specific package. I wasn't aware that PyPI limited uploads at all, but if it does we can certainly increase the number if there is a good reason.
> 
> PyPI limits the size of the distribution files (at 40MB),
> but it doesn't limit the number of distribution files.
> 
> However, taking our egenix-mx-base package as example, we have
> 120 distribution files for every single release. Uploading those
> to PyPI would not only take a long time, but also quickly get the
> PyPI storage requirements up to a few TB if just a few package
> authors start to do the same.
> 

I've got 1.5TB of available space on the cluster, with 2TB on the OSL file server and 8TB worth of iSCSI being racked some time in the next few months. Consider this not a problem :)
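
For a rough sense of scale, the figures quoted above can be turned into a worst-case estimate. This is only a sketch: it assumes every one of the 120 files sits right at the 40MB limit (real files are likely smaller), it uses decimal units, and the function names are made up for illustration.

```python
MAX_FILE_MB = 40          # per-file size limit quoted above
FILES_PER_RELEASE = 120   # egenix-mx-base figure quoted above


def worst_case_release_gb(files=FILES_PER_RELEASE, max_file_mb=MAX_FILE_MB):
    """Upper bound on one release's size in GB (assumes every file is at the cap)."""
    return files * max_file_mb / 1000


def releases_that_fit(capacity_tb, release_gb):
    """Number of whole releases of the given size that fit in capacity_tb."""
    return int(capacity_tb * 1000 // release_gb)


if __name__ == "__main__":
    per_release = worst_case_release_gb()
    print(per_release, "GB per release (worst case)")
    print(releases_that_fit(1.5, per_release), "such releases fit in 1.5TB")
```

Under those assumptions a release tops out at about 4.8GB, so the 1.5TB currently free would hold roughly 312 such releases.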

>>> no support for
>>> UCS2/UCS4 binary distributions, unsupported distribution file
>>> formats (e.g. our prebuilt format),
>> 
>> Not sure why PyPI would even care what charset the package files use, but if true that's certainly a bug and we can get that fixed. What file formats do pip/buildout support that PyPI doesn't support for uploads?
> 
> Not the charset of the package files :-) I'm talking about binary
> files for Python UCS2 vs. UCS4 builds. You have to ship both
> variants for Unix platforms.

Okay, does that not work or are you just pointing it out as an annoyance?
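
For anyone unfamiliar with the UCS2/UCS4 distinction: on Python 2, and on Python 3 before 3.3, the build variant can be read off sys.maxunicode. A minimal sketch follows; the helper name unicode_build is hypothetical.

```python
import sys


def unicode_build(maxunicode=None):
    """Return "ucs2" for a narrow build and "ucs4" for a wide build.

    maxunicode defaults to the running interpreter's sys.maxunicode;
    passing it explicitly is only there to make the helper easy to test.
    """
    if maxunicode is None:
        maxunicode = sys.maxunicode
    # Narrow (UCS2) builds report 0xFFFF; wide (UCS4) builds report 0x10FFFF.
    return "ucs2" if maxunicode == 0xFFFF else "ucs4"


if __name__ == "__main__":
    print(unicode_build())
```

From Python 3.3 onward sys.maxunicode is always 0x10FFFF, so the distinction only matters for older interpreters.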

> 
> Regarding file formats: PyPI applies a number of checks for
> the supported file formats which not only check the extension,
> but also look inside the files to only accept a certain number
> of formats.
> 
> See https://bitbucket.org/loewis/pypi/src/9863fa859e4b/verify_filetype.py?at=default
> for details.
> 
> I was under the impression that this would filter out our
> prebuilt format, but I just tried an upload and it does seem
> to pass the tests, so I have to correct the above - our
> prebuilt format is supported by PyPI (hey, one problem less
> to worry about ;-)).
> 
> About the prebuilt format:
> 
> We created the prebuilt binary package
> format a while ago to overcome issues with eggs not being
> flexible enough and not carrying enough information to differentiate
> between e.g. UCS2/UCS4 builds of Python or properly identifying
> platforms.
> 
> The format works with easy_install and pip, because the interface
> is the same as for sdist files: you unzip the archive, run
> "python setup.py ...commands..." and you're done.

If there is a format that one of the install tools (pip, buildout, easy_install, ?) supports but PyPI blocks uploads for, we should add it. As Martin mentioned, the file type filters are just for spam prevention, and as such need to be a whitelist, but it is quite easy to add new formats as needed.
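
To illustrate the whitelist idea, here is a minimal sketch of a check that looks at both the extension and the first bytes of the file. It is not the code in verify_filetype.py; the extension list, the ALLOWED table and the looks_allowed helper are assumptions made up for this example.

```python
# Hypothetical whitelist: extension -> leading "magic" bytes expected in the file.
ALLOWED = {
    ".zip": b"PK\x03\x04",     # ZIP local file header signature
    ".egg": b"PK\x03\x04",     # eggs are ZIP archives
    ".tar.gz": b"\x1f\x8b",    # gzip signature
}


def looks_allowed(filename, head):
    """Return True if the extension is whitelisted and the content matches it.

    head is the first few bytes of the uploaded file.
    """
    name = filename.lower()
    for ext, magic in ALLOWED.items():
        if name.endswith(ext):
            # The extension alone is not enough: the content must match too.
            return head.startswith(magic)
    return False  # anything not on the whitelist is rejected


if __name__ == "__main__":
    print(looks_allowed("pkg-1.0.zip", b"PK\x03\x04..."))
    print(looks_allowed("pkg-1.0.exe", b"MZ"))
```

With a table like this, adding a new format amounts to adding one entry.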

--Noah

