[Catalog-sig] pre-PEP: transition to release-file hosting at pypi site

Robert Collins robertc at robertcollins.net
Tue Mar 12 19:43:06 CET 2013


On 13 March 2013 07:18, Carl Meyer <carl at oddbird.net> wrote:
> It seems to me that there's a remarkable level of consensus developing
> here (though it may not look like it), and a small set of remaining open
> questions.
>
> The consensus (as I see it):

I think that is a fair summary.

One thing I'd like to mention, which I don't recall seeing so far, is
that PyPI is *really slow*. I don't mean 'the pypi web host is on a
bad link' - far from it.

pip, and I presume setuptools, spider to check dependencies and do the
external HTML scraping and so forth.

This takes an age when each new web host to talk to is a new DNS
lookup (say 0.3 seconds) + HTTP request (0.6 seconds) with possible
HTTPS setup in there too (up to 1.2 seconds). A project with dozens of
dependencies in its transitive dependency graph may take minutes
*just spidering*.
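
To put rough numbers on that, here is a back-of-the-envelope sketch using
the per-host figures above. It assumes (my simplification, not a
measurement) that every dependency lives on its own new host, that hosts
are contacted one after another, and that each host needs exactly one
request. The count of 40 hosts is purely illustrative.

```python
# Rough latency model; all times are in milliseconds to keep the sums exact.
DNS_LOOKUP_MS = 300     # one DNS lookup per new host
HTTP_REQUEST_MS = 600   # one HTTP request
HTTPS_SETUP_MS = 1200   # worst-case extra cost of HTTPS setup


def spider_time_ms(new_hosts, https=False):
    """Estimate the time spent contacting `new_hosts` distinct hosts serially."""
    per_host = DNS_LOOKUP_MS + HTTP_REQUEST_MS
    if https:
        per_host += HTTPS_SETUP_MS
    return new_hosts * per_host


# Hypothetical project whose dependencies are spread over 40 hosts.
print(spider_time_ms(40) / 1000)              # 36.0 seconds over plain HTTP
print(spider_time_ms(40, https=True) / 1000)  # 84.0 seconds with HTTPS everywhere
```

Real spidering usually makes more than one request per project, so these
totals are closer to a floor than a ceiling.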

Now, if you read those figures and go 'zomg that's slow' - well, yes.
Light speed isn't that fast, and while much of round-the-globe traffic
travels at light speed, a considerable chunk of the time isn't spent
that way.

Moving all releases to one HTTPS host (and ensuring persistent
connections are used for repeated index queries) [and then drop to
HTTP for release files so they can be squid cached] is the simplest
short-term solution to this, and I'm *really* excited to see it being
tackled.
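
As a minimal sketch of what "persistent connections for repeated index
queries" could look like on the client side, the function below opens one
connection and reuses it for every path. It uses the standard library's
http.client; the function name and the connection_factory parameter are
my own illustration, and this is not how pip or setuptools are actually
written.

```python
import http.client


def fetch_index_pages(host, paths, connection_factory=http.client.HTTPSConnection):
    """Fetch several index pages from one host over a single connection.

    DNS lookup and HTTPS setup are paid once, when the first request is
    sent, instead of once per page.
    """
    conn = connection_factory(host)
    pages = {}
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            # The body must be read in full before the connection is reused.
            pages[path] = response.read()
    finally:
        conn.close()
    return pages


# Hypothetical usage (host and paths are placeholders, not real endpoints):
# pages = fetch_index_pages("index.example", ["/simple/foo/", "/simple/bar/"])
```

The connection_factory argument is only there so the function can be
exercised with a stand-in connection, without touching the network.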

Longer term, I'd love to see PyPI offer an API to return transitive
data, to avoid the spidering altogether.
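
To illustrate what "transitive data" means here: given each package's
direct dependencies, the server could walk the graph once and hand back
the whole closure in a single response. The sketch below is only that
graph walk, over a made-up in-memory mapping; the package names and the
function are hypothetical, and no such PyPI API is implied to exist.

```python
from collections import deque


def transitive_dependencies(package, direct_deps):
    """Return every package reachable from `package` via `direct_deps`.

    `direct_deps` maps a package name to a list of its direct dependencies.
    """
    seen = set()
    queue = deque(direct_deps.get(package, ()))
    while queue:
        name = queue.popleft()
        if name in seen:
            continue  # already handled; also guards against cycles
        seen.add(name)
        queue.extend(direct_deps.get(name, ()))
    return seen


# Hypothetical dependency data, for illustration only.
deps = {
    "app": ["web", "db"],
    "web": ["http"],
    "db": ["http"],
    "http": [],
}
print(sorted(transitive_dependencies("app", deps)))  # ['db', 'http', 'web']
```

A client given that answer in one round trip would not need to fetch a
page per package to discover the same set.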

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Cloud Services

