[Distutils] PEP 470, round 4 - Using Multi Repository Support for External to PyPI Package File Hosting

Donald Stufft donald at stufft.io
Fri Oct 3 21:08:26 CEST 2014


> On Oct 3, 2014, at 2:28 PM, holger krekel <holger at merlinux.eu> wrote:
> 
> On Sat, Oct 04, 2014 at 00:24 +1000, Nick Coghlan wrote:
>> On 3 October 2014 22:02, Donald Stufft <donald at stufft.io> wrote:
>> 
>>> As far as simplification goes, I don't believe it simplifies the implementation
>>> of PyPI at all, it just shuffles things around and creates work on my part
>>> in order to get PyPI supporting the new stuff. It does however let installers
>>> become simpler and it enables installers to present accurate error information
>>> that actually helps determine the root cause of a failure instead of the
>>> current silent failure with a confusing error message model.
>>> 
>>> I look forward to your suggestions, but I'm not hopeful. I've been thus far
>>> unable to determine a way to improve the current solution in a way that isn't
>>> just papering over one problem without solving the fundamental issue.
>> 
>> Donald's perspective here matches my own. 
> 
> I don't see the "the fundamental issue" that PEP470 tries to solve.
> The first para of the abstract says it wants to substitute the existing
> mechanism for registering external indexes with another one.  It doesn't
> say why.  And it doesn't say why this can't be done in a backward
> compatible manner which would be preferable (i hope we agree there).

The fundamental issue is that PyPI is really two things: an index and a
repository. Currently these two roles are blurred, and that lack of distinction
causes problems for both end users and authors. Those problems in turn create a
certain animosity towards people who don't want to use PyPI as their repository.
To address this, end users should be aware when they are installing things from
a repository other than PyPI, and they should also be aware when doing so
is unsafe on the wire.

PEP 438 attempted to solve this problem by having end users opt in to using a
repository other than PyPI. However, it is my belief that the pain of doing so
has outweighed the benefits of PEP 438. Thus PEP 470 attempts to "go back to
the drawing board" and questions the mechanism for hosting on an alternative
repository altogether.

> 
> And because the PEP doesn't precisely say what "fundamental issue"
> it solves it's a bit hard to present an alternative.  If it's about
> focusing on "multi-repository operations" and simplifying installer UI
> it could be done with full backward compat:
> 
> - add PyPI maintainer UI to add external indexes along with a message

Ok, this is part of PEP 470 too.

> 
> - change pip to disallow crawling to an external index it finds
>  but rather present a message that you need to add the index 
>  manually to your installer invocation. (pip already finds external
>  crawl URLs and it can also find the "new" ones - no need for
>  any breakage).

I had thought of similar things. My reasons for using a meta tag instead of an
<a href>, and for removing the old URLs instead of just adding this alongside
them, are:

1. I don’t *want* users of older versions of pip/easy_install to implicitly be
   fetching these things, they should be able to opt in as well and indeed all
   the mechanisms exist in pip/easy_install for them to already do so. The only
   thing that doesn’t exist is the discovery mechanism.

2. This doesn’t actually prevent breakage, it just links the breakage to the
   version of pip/easy_install someone is using at the cost that people with
   older clients are implicitly fetching things, some of which may or may not
   be safe.
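
To make the discovery idea concrete, here is a minimal sketch of how an
installer might read an advertised repository from a project page and tell the
user about it without ever fetching from it. The meta tag name ("repository"),
the sample page, and the function names are assumptions made up for
illustration; they are not taken from the PEP text or from pip itself.

```python
from html.parser import HTMLParser


class RepositoryMetaParser(HTMLParser):
    """Collect external repository URLs advertised via <meta> tags."""

    def __init__(self):
        super().__init__()
        self.repositories = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return  # <a href> links are deliberately ignored, never crawled
        attr_map = dict(attrs)
        # "repository" is a hypothetical tag name used only for this sketch.
        if attr_map.get("name") == "repository" and attr_map.get("content"):
            self.repositories.append(attr_map["content"])


def discover_repositories(page_html):
    """Return the repository URLs advertised on a project page."""
    parser = RepositoryMetaParser()
    parser.feed(page_html)
    parser.close()
    return parser.repositories


def missing_files_message(project, page_html):
    """Build an error message that names the advertised repositories."""
    repos = discover_repositories(page_html)
    if not repos:
        return f"No files found for {project}."
    lines = [
        f"No files found for {project} on this index.",
        "The project advertises these external repositories:",
    ]
    lines += [f"  {url}" for url in repos]
    lines.append("Add one explicitly (e.g. --extra-index-url) to opt in.")
    return "\n".join(lines)


SAMPLE_PAGE = """
<html><head>
<meta name="repository" content="https://example.com/simple/">
</head><body>
<a href="https://example.com/downloads/">old-style link</a>
</body></html>
"""

if __name__ == "__main__":
    print(missing_files_message("example", SAMPLE_PAGE))
```

The point of the sketch is that the installer only reports the advertised URL;
the user still has to add it by hand, which an older client can already do with
its existing options for extra indexes.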

Overall I think the goal of not breaking things is a good one; however, PyPI
isn’t a versioned thing where people can limit what version of things they run.
It’s important just from a maintenance aspect to be able to deprecate and
remove things over time. This will break things for people depending on those
things of course, so it’s always a balancing act about deciding *when* exactly
to remove something. I think that this is a good time to remove this particular
thing because the core functionality of its replacement has existed for a long
time, the actual use of the feature is quite low, and leaving it in presents an
issue with usability and security.

> 
> - tell all project maintainers which have "explicit file urls" 
>  that they need to move their release files to an offsite
>  own external index (or to pypi itself) within N months. 
>  Then disable the file urls (after examination of how many people
>  are affected) and remove related un-needed options in pip.

This is still breakage for people using an older version of pip/easy_install,
although a smaller set of things will break in this sense.

> 
> Of course, i leave out some details but overall think it's pretty much
> doable.  With this strategy, both old and new versions of pip would work
> fine with the changed PyPI.  It also wouldn't introduce very complicated
> transition phases or communication steps.
> 
> I postpone other issues with respect to clarity and security of
> PEP/multi-repo operations to first get clarity on the backward compat
> issue and general strategy.
> 
> best,
> holger
> 
> P.S.: Nick, i think my rough draft above satisfies all of your points
> below, although they only partly relate to what we discuss in the PEP
> IMHO.
> 
> 
>> I'll be interested to hear alternative proposals, but they should aim
>> to address at least the following user experience expectations:
>> 
>> 1. Easily allow external hosting to "just work" when appropriately
>> configured at the system, user or virtual environment level (pip
>> already supports this at the user level, and will support it at the
>> system and environment level in the next version).
>> 
>> 2. Easily allow package authors to tell PyPI "my releases are hosted
>> <here>" and have that advertised in such a way that tools can clearly
>> communicate it to users, without silently introducing unexpected
>> dependencies on third party services.
>> 
>> 3. Eliminate any and all references to the confusing "verifiable
>> external" and "unverifiable external" distinction from the user
>> experience (both when installing and when releasing packages).
>> 
>> 4. The repository aspects of PyPI should become *just* the default
>> package hosting location (i.e. the only one that is treated as opt-out
>> rather than opt-in by most client tools in their default
>> configuration). Aside from that aspect, hosting on PyPI should not
>> otherwise provide an enhanced user experience over hosting your own
>> package repository.
>> 
>> 5. Do all of the above while providing default behaviour that is
>> secure against most attackers below the nation state adversary level.
>> 
>> In my view, the most debatable part of Donald's latest proposal would
>> be the handling of projects that don't get updated to properly
>> register an external URL before the link spidering support is removed
>> from the client applications. That aspect should arguably include a
>> step where the decision on whether or not to disable that support is
>> based on *looking at the numbers again* before turning the feature off
>> on the server, and perhaps also monitoring for user complaints for a
>> period after it is first turned off, before the feature is removed
>> from the clients.
>> 
>> Regards,
>> Nick.
>> 
>> -- 
>> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


