[Python-checkins] peps: Create a new draft of PEP 470

donald.stufft python-checkins at python.org
Fri Oct 3 08:03:40 CEST 2014


https://hg.python.org/peps/rev/5c07588cf42d
changeset:   5570:5c07588cf42d
user:        Donald Stufft <donald at stufft.io>
date:        Fri Oct 03 02:03:34 2014 -0400
summary:
  Create a new draft of PEP 470

* Clarifies the text throughout
* Address comments from the previous threads

files:
  pep-0470.txt |  519 +++++++++++++++++++-------------------
  1 files changed, 258 insertions(+), 261 deletions(-)


diff --git a/pep-0470.txt b/pep-0470.txt
--- a/pep-0470.txt
+++ b/pep-0470.txt
@@ -1,5 +1,5 @@
 PEP: 470
-Title: Using Multi Index Support for External to PyPI Package File Hosting
+Title: Using Multi Repository Support for External to PyPI Package File Hosting
 Version: $Revision$
 Last-Modified: $Date$
 Author: Donald Stufft <donald at stufft.io>,
@@ -9,192 +9,212 @@
 Type: Process
 Content-Type: text/x-rst
 Created: 12-May-2014
-Post-History: 14-May-2014, 05-Jun-2014
+Post-History: 14-May-2014, 05-Jun-2014, 03-Oct-2014
+Replaces: 438
 
 
 Abstract
 ========
 
-This PEP proposes that the official means of having an installer locate and
-find package files which are hosted externally to PyPI become the use of
-multi index support instead of the practice of using external links on the
-simple installer API.
+This PEP proposes a mechanism for project authors to register with PyPI an
+external repository where their project's downloads can be located. This
+information can then be included as part of the simple API so that installers
+can use it to tell users where the item they are attempting to install is
+located and what they need to do to enable this additional repository. In
+addition to adding discovery information to make explicit multiple repositories
+easy to use, this PEP also deprecates and removes the implicit multiple
+repository support which currently functions through directly or indirectly
+linking offsite via the simple API. Finally this PEP also proposes deprecating
+and removing the functionality added by PEP 438, particularly the additional
+rel information and the meta tag to indicate the API version.
 
-It is important to remember that this is **not** about forcing anyone to host
-their files on PyPI. If someone does not wish to do so they will never be under
-any obligation too. They can still list their project in PyPI as an index, and
-the tooling will still allow them to host it elsewhere.
-
-This PEP strictly is concerned with the Simple Installer API and how automated
-installers interact with PyPI, it has no bearing on the informational pages
-which are primarily for human consumption.
+This PEP does *not* propose mandating that all authors upload their projects to
+PyPI in order to exist in the index nor does it propose any change to the human
+facing elements of PyPI.
 
 
 Rationale
 =========
 
-There is a long history documented in PEP 438 that explains why externally
-hosted files exist today in the state that they do on PyPI. For the sake of
-brevity I will not duplicate that and instead urge readers to first take a look
-at PEP 438 for background.
+Historically PyPI did not have any method of hosting files nor any method of
+automatically retrieving installables, it was instead focused on providing a
+central registry of names, to prevent naming collisions, and as a means of
+discovery for finding projects to use. In the course of time setuptools began
+to scrape these human facing pages, as well as pages linked from those pages,
+looking for things it could automatically download and install. Eventually this
+became the "Simple" API which used a similar URL structure however it
+eliminated any of the extraneous links and information to make the API more
+efficient. Additionally PyPI grew the ability for a project to upload release
+files directly to PyPI enabling PyPI to act as a repository in addition to an
+index.
 
-There are currently two primary ways for a project to make itself available
-without directly hosting the package files on PyPI. They can either include
-links to the package files in the simpler installer API or they can publish
-a custom package index which contains their project.
+This gives PyPI two equally important roles that it plays in the Python
+ecosystem, that of index to enable easy discovery of Python projects and
+central repository to enable easy hosting, download, and installation of Python
+projects. Due to the history behind PyPI and the very organic growth it has
+experienced the lines between these two roles are blurry, and this blurriness
+has caused confusion for the end users of both of these roles and this has in
+turn caused ire between people attempting to use PyPI in different capacities,
+most often when end users want to use PyPI as a repository but the author wants
+to use PyPI solely as an index.
 
+By moving to using explicit multiple repositories we can make the lines between
+these two roles much more explicit and remove the "hidden" surprises caused
+by the current implementation of handling people who do not want to use PyPI
+as a repository. However simply moving to explicit multiple repositories is
+a regression in discoverability, and for that reason this PEP adds an extension
+to the current simple API which will enable easy discovery of the specific
+repository that a project can be found in.
 
-Custom Additional Index
------------------------
+PEP 438 attempted to solve this issue by allowing projects to explicitly
+declare if they were using the repository features or not, and if they were
+not, it had the installers classify the links it found as either "internal",
+"verifiable external" or "unverifiable external". PEP 438 was accepted and
+implemented in pip 1.4 (released on Jul 23, 2013) with the final transition
+implemented in pip 1.5 (released on Jan 2, 2014).
 
-Each installer which speaks to PyPI offers a mechanism for the user invoking
-that installer to provide additional custom locations to search for files
-during the dependency resolution phase. For pip these locations can be
-configured per invocation, per shell environment, per requirements file, per
-virtual environment, and per user. The mechanism for specifying additional
-locations have existed within pip and setuptools for many years, by comparison
-the mechanisms in PEP 438 and any other new mechanism will have existed for
-only a short period of time (if they exist at all currently).
+PEP 438 was successful in bringing about more people to utilize PyPI's
+repository features, an altogether good thing given the global CDN powering
+PyPI providing speed ups for a lot of people, however it did so by introducing
+a new point of confusion and pain for both the end users and the authors.
 
-The use of additional indexes instead of external links on the simple
-installer API provides a simple clean interface which is consistent with the
-way most Linux package systems work (apt-get, yum, etc). More importantly it
-works the same even for projects which are commercial or otherwise have their
-access restricted in some form (private networks, password, IP ACLs etc)
-while the external links method only realistically works for projects which
-do not have their access restricted.
 
-Compared to the complex rules which a project must be aware of to prevent
-themselves from being considered unsafely hosted setting up an index is fairly
-trivial and in the simplest case does not require anything more than a
-filesystem and a standard web server such as Nginx or Twisted Web. Even if
-using simple static hosting without autoindexing support, it is still
-straightforward to generate appropriate index pages as static HTML.
+Why Additional Repositories?
+----------------------------
 
-Example Index with Twisted Web
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The two common installer tools, pip and easy_install/setuptools, both support
+the concept of additional locations to search for files to satisfy the
+installation requirements and have done so for many years. This means that
+there is no need to "phase" in a new flag or concept and the solution to
+installing a project from a repository other than PyPI will function regardless
+of how old (within reason) the end user's installer is. Not only has this
+concept existed in the Python tooling for some time, but it is a concept that
+exists across languages and even extending to the OS level with OS package
+tools almost universally using multiple repository support making it extremely
+likely that someone is already familiar with the concept.
 
-1. Create a root directory for your index, for the purposes of the example
-   I'll assume you've chosen ``/var/www/index.example.com/``.
-2. Inside of this root directory, create a directory for each project such
-   as ``mkdir -p /var/www/index.example.com/{foo,bar,other}/``.
-3. Place the package files for each project in their respective folder,
-   creating paths like ``/var/www/index.example.com/foo/foo-1.0.tar.gz``.
-4. Configure Twisted Web to serve the root directory, ideally with TLS.
+Additionally, the multiple repository approach is a concept that is useful
+outside of the narrow scope of allowing projects which wish to be included on
+the index portion of PyPI but do not wish to utilize the repository portion
+of PyPI. This includes places where a company may wish to host a repository
+that contains their internal packages or where a project may wish to have
+multiple "channels" of releases, such as alpha, beta, release candidate, and
+final release.
+
+Setting up an external repository is very simple, it can be achieved with
+nothing more than a filesystem, some files to host, and any web server capable
+of serving files and generating an automated index of directories (commonly
+called "autoindex"). This can be as simple as:
 
 ::
 
+    $ mkdir -p /var/www/index.example.com/
+    $ mkdir -p /var/www/index.example.com/myproject/
+    $ mv ~/myproject-1.0.tar.gz /var/www/index.example.com/myproject/
     $ twistd -n web --path /var/www/index.example.com/
 
 
-Examples of Additional indexes with pip
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-**Invocation:**
+Using this additional location within pip is also simple and can be included
+on a per invocation, per shell, or per user basis. The pip 6.0 release will
+also include the ability to configure this on a per virtual environment or
+per machine basis. This can be as simple as:
 
 ::
 
-    $ pip install --extra-index-url https://pypi.example.com/ foobar
+    $ # As a CLI argument
+    $ pip install --extra-index-url https://index.example.com/ myproject
+    $ # As an environment variable
+    $ PIP_EXTRA_INDEX_URL=https://index.example.com/ pip install myproject
+    $ # With a configuration file
+    $ printf '[global]\nextra-index-url = https://index.example.com/\n' > ~/.pip/pip.conf
+    $ pip install myproject
 
-**Shell Environment:**
 
-::
+Why Not PEP 438 or Similar?
+---------------------------
 
-    $ export PIP_EXTRA_INDEX_URL=https://pypi.example.com/
-    $ pip install foobar
+While the additional search location support has existed in pip and setuptools
+for quite some time support for PEP 438 has only existed in pip since the 1.4
+version, and still has yet to be implemented in setuptools. The design of
+PEP 438 did mean that users still benefited for projects which did not require
+external files even with older installers, however for projects which *did*
+require external files, users are still silently being given either
+potentially unreliable or, even worse, unsafe files to download. This system
+is also unique to Python as it arises out of the history of PyPI, this means
+that it is almost certain that this concept will be foreign to most, if not all
+users, until they encounter it while attempting to use the Python toolchain.
 
-**Requirements File:**
+Additionally, the classification system proposed by PEP 438 has, in practice,
+turned out to be extremely confusing to end users, so much so that it is a
+position of this PEP that the situation as it stands is completely untenable.
+The common pattern for a user with this system is to attempt to install a
+project and possibly get an error message (or maybe not, if the project ever
+uploaded something to PyPI but later switched without removing old files), see
+that the error message suggests ``--allow-external``, reissue the command with
+that flag and most likely get another error message, see that this time the
+error message suggests also adding ``--allow-unverified``, and issue the
+command a third time, this time finally getting the thing they wish to
+install.
 
-::
+This UX failure exists for several reasons.
 
-    $ echo "--extra-index-url https://pypi.example.com/\nfoobar" > requirements.txt
-    $ pip install -r requirements.txt
+1. If pip can locate files at all for a project on the Simple API it will
+   simply use that instead of attempting to locate more. This is generally the
+   right thing to do as attempting to locate more would erase a large part of
+   the benefit of PEP 438. This means that if a project *ever* uploaded
+   a file that matches what the user has requested for install that will be
+   used regardless of how old it is.
 
-**Virtual Environment:**
+2. PEP 438 makes an implicit assumption that most projects would either upload
+   themselves to PyPI or would update themselves to directly linking to release
+   files. While a large number of projects *did* ultimately decide to upload
+   to PyPI, some of them did so only because the UX around PEP 438 was so
+   bad that they felt forced to do so. More concerning however, is the fact
+   that very few projects have opted to directly and safely link to files and
+   instead they still simply link to pages which must be scraped in order to
+   find the actual files, thus rendering the safe variant
+   (``--allow-external``) largely useless.
 
-::
+3. Even if an author wishes to directly link to their files, doing so safely is
+   non-obvious. It requires the inclusion of an MD5 hash (for historical
+   reasons) in the fragment of the URL. If they do not include this then their
+   files will be considered "unverified".
 
-    $ python -m venv myvenv
-    $ echo "[global]\nextra-index-url = https://pypi.example.com/" > myvenv/pip.conf
-    $ myvenv/bin/pip install foobar
+4. PEP 438 takes a security centric view and disallows any form of a global
+   opt in for unverified projects. While this is generally a good thing, it
+   creates extremely verbose and repetitive command invocations such as:
 
-**User:**
+   ::
 
-::
+      $ pip install --allow-external myproject --allow-unverified myproject myproject
+      $ pip install --allow-all-external --allow-unverified myproject myproject
 
-    $ echo "[global]\nextra-index-url = https://pypi.example.com/" >~/.pip/pip.conf
-    $ pip install foobar
 
+Multiple Repository/Index Support
+=================================
 
-External Links on the Simple Installer API
-------------------------------------------
+Installers SHOULD implement, or continue to offer, the ability to point the
+installer at multiple URL locations. The exact mechanisms for a user to
+indicate they wish to use an additional location is left up to each individual
+implementation.
 
-PEP 438 proposed a system of classifying file links as either internal,
-external, or unsafe. It recommended that by default only internal links would
-be installed by an installer however users could opt into external links on
-either a global or a per package basis. Additionally they could also opt into
-unsafe links on a per package basis.
+Additionally the mechanism for discovering an installation candidate when
+multiple repositories are being used is also up to each individual
+implementation, however once configured an implementation should not
+discourage, warn, or otherwise cast a negative light upon the use of a
+repository simply because it is not the default repository.
 
-This system has turned out to be *extremely* unfriendly towards the end users
-and it is the position of this PEP that the situation has become untenable. The
-situation as provided by PEP 438 requires an end user to be aware not only of
-the difference between internal, external, and unsafe, but also to be aware of
-what hosting mode the package they are trying to install is in, what links are
-available on that project's /simple/ page, whether or not those links have
-a properly formatted hash fragment, and what links are available from pages
-linked to from that project's /simple/ page.
+Currently both pip and setuptools implement multiple repository support by
+using the best installation candidate it can find from either repository,
+essentially treating it as if it were one large repository.
 
-There are a number of common confusion/pain points with this system that I
-have witnessed:
+Installers SHOULD also implement some mechanism for removing or otherwise
+disabling use of the default repository. The exact specifics of how that is
+achieved is up to each individual implementation.
 
-* Users unaware what the simple installer api is at all or how an installer
-  locates installable files.
-* Users unaware that even if the simple api links to a file, if it does
-  not include a ``#md5=...`` fragment that it will be counted as unsafe.
-* Users unaware that an installer can look at pages linked from the
-  simple api to determine additional links, or that any links found in this
-  fashion are considered unsafe.
-* Users are unaware and often surprised that PyPI supports hosting your files
-  someplace other than PyPI at all.
-
-In addition to that, the information that an installer is able to provide
-when an installation fails is pretty minimal. We are able to detect if there
-are externally hosted files directly linked from the simple installer api,
-however we cannot detect if there are files hosted on a linked page without
-fetching that page and doing so would cause a massive performance hit just to
-see if there might be a file there so that a better error message could be
-provided.
-
-Finally very few projects have properly linked to their external files so that
-they can be safely downloaded and verified. At the time of this writing there
-are a total of 65 projects which have files that are only available externally
-and are safely hosted.
-
-The end result of all of this, is that with PEP 438, when a user attempts to
-install a file that is not hosted on PyPI typically the steps they follow are:
-
-1. First, they attempt to install it normally, using ``pip install foobar``.
-   This fails because the file is not hosted on PyPI and PEP 438 has us default
-   to only hosted on PyPI. If pip detected any externally hosted files or other
-   pages that we *could* have attempted to find other files at it will give an
-   error message suggesting that they try ``--allow-external foobar``.
-2. They then attempt to install their package using
-   ``pip install --allow-external foobar foobar``. If they are lucky foobar is
-   one of the packages which is hosted externally and safely and this will
-   succeed. If they are unlucky they will get a different error message
-   suggesting that they *also* try ``--allow-unverified foobar``.
-3. They then attempt to install their package using
-   ``pip install --allow-external foobar --allow-unverified foobar foobar``
-   and this finally works.
-
-This is the same basic steps that practically everyone goes through every time
-they try to install something that is not hosted on PyPI. If they are lucky it'll
-only take them two steps, but typically it requires three steps. Worse there is
-no real indication to these people why one package might install after two
-but most require three. Even worse than that most of them will never get an
-externally hosted package that does not take three steps, so they will be
-increasingly annoyed and frustrated at the intermediate step and will likely
-eventually just start skipping it.
+End users wishing to limit what files they pull from which repository can
+simply use `devpi <http://doc.devpi.net/latest/>`_ to whitelist projects from
+PyPI or another repository.
 
 
 External Index Discovery
@@ -208,24 +228,44 @@
 
 To support projects that wish to externally host their files and to enable
 users to easily discover what additional indexes are required, PyPI will gain
-the ability for projects to register external index URLs and additionally an
+the ability for projects to register external index URLs along with an
 associated comment for each. These URLs will be made available on the simple
 page however they will not be linked or provided in a form that older
 installers will automatically search them.
 
+This ability will take the form of a ``<meta>`` tag. The name of this tag must
+be set to ``external-repository`` and the content will be a link to the location
+of the external repository. An optional data-description attribute will convey
+any comments or description that the author has provided.
+
+An example would look something like:
+
+::
+
+    <meta name="external-repository" content="https://index.example.com/" data-description="Primary Repository">
+    <meta name="external-repository" content="https://index.example.com/Ubuntu-14.04/" data-description="Wheels built for Ubuntu 14.04">
+
+
+When an external repository is added to a project, new uploads will no longer
+be permitted to that project. However any existing files will simply be hidden
+from the simple API and the web interface until all of the external repositories
+are removed, in which case they will be visible again. PyPI MUST warn authors
+if adding an external repository will hide files and that warning must persist
+on any of the project management pages for that particular project.
+
 When an installer fetches the simple page for a project, if it finds this
 additional meta-data and it cannot find any files for that project in it's
 configured URLs then it should use this data to tell the user how to add one
 or more of the additional URLs to search in. This message should include any
 comments that the project has included to enable them to communicate to the
-user and provide hints as to which URL they might want if some are only
-useful or compatible with certain platforms or situations. When the installer
+user and provide hints as to which URL they might want (e.g. if some are only
+useful or compatible with certain platforms or situations). When the installer
 has implemented the auto discovery mechanisms they should also deprecate any
 of the mechanisms added for PEP 438 (such as ``--allow-external``) for removal
 at the end of the deprecation period proposed by the PEP.
 
 This feature *must* be added to PyPI prior to starting the deprecation and
-removal process for link spidering.
+removal process for the implicit offsite hosting functionality.
 
 
 Deprecation and Removal of Link Spidering
@@ -278,20 +318,6 @@
 ``--allow-unverified`` in pip.
 
 
-PIL
----
-
-It's obvious from the numbers below that the vast bulk of the impact come from
-the PIL project. On 2014-05-17 an email was sent to the contact for PIL
-inquiring whether or not they would be willing to upload to PyPI. A response
-has not been received as of yet (2014-06-05) nor has any change in the hosting
-happened. Due to the popularity of PIL this PEP also proposes that during the
-deprecation period that PyPI Administrators will set the PIL download URL as
-the external index for that project. Allowing the users of PIL to take
-advantage of the auto discovery mechanisms although the project has seemingly
-become unmaintained.
-
-
 Impact
 ======
 
@@ -300,12 +326,16 @@
 projects it's unlikely that a maintainer will arrive to set the external index
 metadata which would allow the auto discovery mechanism to find it.
 
-Looking at the numbers factoring out PIL (which has been special cased above)
-the actual impact should be quite low, with it affecting just 6.9% of projects
-which host only externally or 2.8% which have their latest version hosted
-externally. This represents a mere 3883 unique IP addresses. The break down of
-this is that of those 3883 addresses, 100% of them installed something that
-could not be verified while only 3% installed something which could be.
+Looking at the numbers factoring out PIL (which has been special cased below)
+the actual impact should be quite low, with it affecting just 3.8% of projects
+which host any files only externally or 2.2% which have their latest version
+hosted only externally.
+
+6674 unique IP addresses have accessed the Simple API for these 3.8% of
+projects in a single day (2014-09-30). Of those, 99.5% of them installed
+something which could not be verified, and thus they were open to a Remote Code
+Execution via a Man-In-The-Middle attack, while 7.9% installed something which
+could be verified and only 0.4% only installed things which could be verified.
 
 
 Projects Which Rely on Externally Hosted files
@@ -320,9 +350,9 @@
 ============ ======= ================ =================== =======
 \             PyPI    External (old)   External (latest)   Total
 ============ ======= ================ =================== =======
- **Safe**     38716   31               35                  38782
- **Unsafe**   0       1659             1169                2828
- **Total**    38716   1690             1204                41610
+ **Safe**     43313   16               39                  43368
+ **Unsafe**   0       756              1092                1848
+ **Total**    43313   772              1131                45216
 ============ ======= ================ =================== =======
 
 
@@ -331,21 +361,22 @@
 
 This is determined by looking at the number of requests the
 ``/simple/<project>/`` page had gotten in a single day. The total number of
-requests during that day was 17,960,467.
+requests during that day was 10,623,831.
 
 ============================== ========
 Project                        Requests
 ============================== ========
-PIL                            13470
-mysql-connector-python         321
-salesforce-python-toolkit      54
-pyodbc                         50
-elementtree                    44
-atfork                         39
-RBTools                        29
-django-contrib-requestprovider 28
-wadofstuff-django-serializers  23
-Pygame                         21
+PIL                            63869
+Pygame                         2681
+mysql-connector-python         1562
+pyodbc                         724
+elementtree                    635
+salesforce-python-toolkit      316
+wxPython                       295
+PyXML                          251
+RBTools                        235
+python-graph-core              123
+cElementTree                   121
 ============================== ========
 
 
@@ -354,25 +385,38 @@
 
 This is determined by looking at the IP addresses of requests the
 ``/simple/<project>/`` page had gotten in a single day. The total number of
-unique IP addresses during that day was 105,587.
+unique IP addresses during that day was 124,604.
 
 ============================== ==========
 Project                        Unique IPs
 ============================== ==========
-PIL                            3515
-mysql-connector-python         117
-pyodbc                         34
-elementtree                    21
-RBTools                        19
-egenix-mx-base                 16
-Pygame                         14
-salesforce-python-toolkit      13
-django-contrib-requestprovider 12
-wxPython                       11
-python-apt                     10
+PIL                            4553
+mysql-connector-python         462
+Pygame                         202
+pyodbc                         181
+elementtree                    166
+wxPython                       126
+RBTools                        114
+PyXML                          87
+salesforce-python-toolkit      76
+pyDes                          76
 ============================== ==========
 
 
+PIL
+---
+
+It's obvious from the numbers above that the vast bulk of the impact comes from
+the PIL project. On 2014-05-17 an email was sent to the contact for PIL
+inquiring whether or not they would be willing to upload to PyPI. A response
+has not been received as of yet (2014-10-03) nor has any change in the hosting
+happened. Due to the popularity of PIL this PEP also proposes that during the
+deprecation period PyPI Administrators will set the PIL download URL as
+the external index for that project, allowing the users of PIL to take
+advantage of the auto discovery mechanisms although the project has seemingly
+become unmaintained.
+
+
 Rejected Proposals
 ==================
 
@@ -395,80 +439,33 @@
 
 These proposals are rejected because:
 
-* The classification "system" is complex, hard to explain, and requires an
-  intimate knowledge of how the simple API works in order to be able to reason
-  about which classification is required. This is reflected in the fact that
-  the code to implement it is complicated and hard to understand as well.
+* The classification system introduced in PEP 438 is an entirely unique concept
+  to PyPI which is not generically applicable even in the context of Python
+  packaging. Adding additional concepts comes at a cost.
 
-* People are generally surprised that PyPI allows externally linking to files
-  and doesn't require people to host on PyPI. In contrast most of them are
-  familiar with the concept of multiple software repositories such as is in
-  use by many OSs.
+* The classification system itself is non-obvious to explain and to
+  pre-determine what classification of link a project will require entails
+  inspecting the project's ``/simple/<project>/`` page, and possibly any
+  URLs linked from that page.
 
-* PyPI is fronted by a globally distributed CDN which has improved the
-  reliability and speed for end users. It is unlikely that any particular
-  external host has something comparable. This can lead to extremely bad
-  performance for end users when the external host is located in different
-  parts of the world or does not generally have good connectivity.
+* The ability to host externally while still being linked for automatic
+  discovery is mostly a historic relic which causes a fair amount of pain and
+  complexity for little reward.
 
-  As a data point, many users reported sub DSL speeds and latency when
-  accessing PyPI from parts of Europe and Asia prior to the use of the CDN.
+* The installer's ability to optimize or clean up the user interface is limited
+  due to the nature of the implicit link scraping which would need to be done.
+  This extends to the ``--allow-*`` options as well as the inability to
+  determine if a link is expected to fail or not.
 
-* PyPI has monitoring and an on-call rotation of sysadmins whom can respond to
-  downtime quickly, thus enabling a quicker response to downtime. Again it is
-  unlikely that any particular external host will have this. This can lead
-  to single packages in a dependency chain being un-installable. This will
-  often confuse users, who often times have no idea that this package relies
-  on an external host, and they cannot figure out why PyPI appears to be up
-  but the installer cannot find a package.
-
-* PyPI supports mirroring, both for private organizations and public mirrors.
-  The legal terms of uploading to PyPI ensure that mirror operators, both
-  public and private, have the right to distribute the software found on PyPI.
-  However software that is hosted externally does not have this, causing
-  private organizations to need to investigate each package individually and
-  manually to determine if the license allows them to mirror it.
-
-  For public mirrors this essentially means that these externally hosted
-  packages *cannot* be reasonably mirrored. This is particularly troublesome
-  in countries such as China where the bandwidth to outside of China is
-  highly congested making a mirror within China often times a massively better
-  experience.
-
-* Installers have no method to determine if they should expect any particular
-  URL to be available or not. It is not unusual for the simple API to reference
-  old packages and URLs which have long since stopped working. This causes
-  installers to have to assume that it is OK for any particular URL to not be
-  accessible. This causes problems where an URL is temporarily down or
-  otherwise unavailable (a common cause of this is using a copy of Python
-  linked against a really ancient copy of OpenSSL which is unable to verify
-  the SSL certificate on PyPI) but it *should* be expected to be up. In this
-  case installers will typically silently ignore this URL and later the user
-  will get a confusing error stating that the installer couldn't find any
-  versions instead of getting the real error message indicating that the URL
-  was unavailable.
-
-* In the long run, global opt in flags like ``--allow-all-external`` will
-  become little annoyances that developers cargo cult around in order to make
-  their installer work. When they run into a project that requires it they
-  will most likely simply add it to their configuration file for that installer
-  and continue on with whatever they were actually trying to do. This will
-  continue until they try to install their requirements on another computer
-  or attempt to deploy to a server where their install will fail again until
-  they add the "make it work" flag in their configuration file.
-
-* The URL classification only works for a certain subset of projects, however
-  it does not allow for any project which needs additional restrictions such
-  as Access Controls. This means that there would be two methods of doing the
-  same thing, linking to a file safely and hosting an index. Hosting an index
-  works in all situations and by relying on this we make for a more consistent
-  experience no matter the reason for external hosting.
-
-* The safe external hosting option hampers the ability of PyPI to upgrade it's
-  security infrastructure. For instance if MD5 becomes broken in the future
-  there will be no way for PyPI to upgrade the hashes of the projects which
-  rely on safe external hosting via MD5 while files that are hosted on PyPI
-  can simply be processed over with a new hash function.
+* The mechanism paints with a broad brush when enabling an option, while PEP
+  438 attempts to limit this with per package options. However a project that
+  has existed for an extended period of time may often times have several
+  different URLs listed in their simple index. It is not unusual for at least
+  one of these to no longer be under control of the project. While an
+  unregistered domain will sit there relatively harmless most of the time, pip
+  will continue to attempt to install from it on every discovery phase. This
+  means that an attacker simply needs to look at projects which rely on unsafe
+  external URLs and register expired domains to attack users.
 
 Copyright
 =========
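
As an illustration of the discovery mechanism described in the "External
Index Discovery" hunk above, the following is a minimal sketch of how an
installer might read the ``external-repository`` meta tags from a project's
simple page and turn them into a hint for the user. It is not taken from pip
or setuptools. The names ``ExternalRepositoryParser``,
``find_external_repositories`` and ``format_hint``, as well as the wording of
the message, are assumptions made for this example; only the tag name, the
``content`` attribute and the ``data-description`` attribute come from the
PEP text.

```python
from html.parser import HTMLParser


class ExternalRepositoryParser(HTMLParser):
    """Collect (url, description) pairs from external-repository meta tags."""

    def __init__(self):
        super().__init__()
        self.repositories = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "meta":
            return
        attributes = dict(attrs)
        url = attributes.get("content")
        if attributes.get("name") == "external-repository" and url:
            # data-description is optional, so it may be None.
            self.repositories.append((url, attributes.get("data-description")))


def find_external_repositories(simple_page_html):
    """Return the external repositories advertised on a simple page."""
    parser = ExternalRepositoryParser()
    parser.feed(simple_page_html)
    parser.close()
    return parser.repositories


def format_hint(project, repositories):
    """Build a hypothetical message explaining how to enable a repository."""
    lines = ["No files were found for %s in the configured URLs." % project]
    for url, description in repositories:
        suffix = " (%s)" % description if description else ""
        lines.append("  %s%s" % (url, suffix))
        lines.append("    pip install --extra-index-url %s %s" % (url, project))
    return "\n".join(lines)
```

An installer following this sketch would only show the hint when it finds no
files for the project in its configured URLs, and would never search the
advertised repositories on its own.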

-- 
Repository URL: https://hg.python.org/peps

