[Distutils] Handling Case/Normalization Differences

Donald Stufft donald at stufft.io
Fri Aug 29 00:41:05 CEST 2014


Yes, it has since pip 1.4. The problem here, however, is that bandersnatch
mirrors are typically hosted by plain static web servers with no runtime logic,
so a mirror can't vary its behavior based on the client's version.

> On Aug 28, 2014, at 6:39 PM, Joe Smith <yasumoto7 at gmail.com> wrote:
> 
> Naive question: does pip send a User-Agent header (or something similar) containing a version number that the server can use to determine which behavior to default to?
> 
> That would allow a deprecation cycle of N months or so that will let people upgrade from 1.5 to 1.6. We could then watch usage of 1.5 decrease over time until it's a non-factor.
> 
> 
> On Thu, Aug 28, 2014 at 3:26 PM, Donald Stufft <donald at stufft.io> wrote:
> 
>> On Aug 28, 2014, at 6:09 PM, Donald Stufft <donald at stufft.io> wrote:
>> 
>> 
>>> On Aug 28, 2014, at 2:58 PM, Donald Stufft <donald at stufft.io> wrote:
>>> 
>>> Right now the “canonical” page for a particular project on PyPI is whatever the
>>> author happened to name their package (e.g. Django). This requires PyPI to have
>>> some "smarts" so that it can redirect things like /simple/django/ to
>>> /simple/Django/ otherwise someone doing ``pip install django`` would fall back
>>> to a much worse behavior.
>>> 
>>> If this redirect doesn't happen, then pip will issue a request for just
>>> /simple/ and look for a link that, when both sides are normalized, compares
>>> equal to the name it's looking for. It will then follow the link, get
>>> /simple/Django/ and everything works... Except it doesn't. The problem here
>>> comes from the external link classification that we have now. Pip sees the
>>> link to /simple/Django/ as an external link (because it lacks the required
>>> rels) and the installation finally fails.
>>> 
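
As a rough sketch of the fallback lookup described above: the thread does not
spell out the normalization rule, so this assumes lowercasing plus collapsing
runs of "-", "_" and "." into a single "-". The names normalize() and
find_project_link() are made up for illustration and are not pip's actual code.

```python
import re


def normalize(name):
    # Assumed rule (not stated in the thread): collapse runs of "-", "_"
    # and "." into one "-" and lowercase the result.
    return re.sub(r"[-_.]+", "-", name).lower()


def find_project_link(requested, index_links):
    """Return the first link name from /simple/ that equals the requested
    name once both sides are normalized, or None if nothing matches."""
    wanted = normalize(requested)
    for link in index_links:
        if normalize(link) == wanted:
            return link
    return None


# Hypothetical link names as they might appear on the /simple/ index page.
index_links = ["Django", "Flask-SQLAlchemy", "zope.interface"]
print(find_project_link("django", index_links))  # prints: Django
```

The scan only runs after the whole /simple/ page has been downloaded, which is
where the extra cost comes from.
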
>>> The /simple/ case rarely happens when installing from PyPI itself because of
>>> the redirect; however, it happens quite often when someone is attempting to
>>> install from a mirror instead. Even when everything works correctly, the penalty
>>> for not knowing exactly what name to type is at least 1 extra HTTP
>>> request, one of which (/simple/) requires pulling down a 2.1MB file.
>>> 
>>> To fix this I'm going to modify PyPI so that it uses the normalized name in
>>> the /simple/ URL and redirects everything else to the normalized name. I'm
>>> also going to submit a PR to bandersnatch so that it will use normalized names
>>> for its directories and such as well. These two changes will make it so that
>>> the client side will know ahead of time exactly what form the server expects
>>> any given name to be in. This will allow a change in pip that pre-normalizes
>>> all names, which will improve the interaction with mirrors and reduce the
>>> number of HTTP requests that a single ``pip install`` needs to make.
>>> 
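
A minimal sketch of what client-side pre-normalization could look like, so that
the first request already targets the form the server expects. The
normalization rule is an assumption (the thread does not define it), and
simple_url() and the mirror.example host are hypothetical, not pip's API.

```python
import re
from urllib.parse import urljoin


def normalize(name):
    # Assumed rule (not stated in the thread): collapse runs of "-", "_"
    # and "." into one "-" and lowercase the result.
    return re.sub(r"[-_.]+", "-", name).lower()


def simple_url(index_url, project):
    """Build the /simple/ page URL for a project from its normalized name."""
    base = index_url if index_url.endswith("/") else index_url + "/"
    return urljoin(base, normalize(project) + "/")


# Hypothetical mirror URL; both spellings resolve to the same page.
print(simple_url("https://mirror.example/simple/", "Django"))
print(simple_url("https://mirror.example/simple", "django"))
```

Both calls produce https://mirror.example/simple/django/, so no redirect and no
scan of the full /simple/ index is needed.
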
>>> ---
>>> Donald Stufft
>>> PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>>> 
>>> _______________________________________________
>>> Distutils-SIG maillist  -  Distutils-SIG at python.org
>>> https://mail.python.org/mailman/listinfo/distutils-sig
>> 
>> 
>> Hm, so here’s the problem.
>> 
>> I have this implemented and deployed to TestPyPI, and it works great!
>> 
>> However, the next step is to make the change to bandersnatch so that it saves
>> things using their normalized name instead of using their "proper" name. Doing
>> this will mean that anyone using pip 1.5 won't be able to install
>> anything from that mirror unless its name is specified as the normalized name
>> (e.g. ``pip install Django`` will fail without --allow-unverified but
>> ``pip install django`` will work). This would be fixed with pip 1.6 (since
>> it would know to "normalize" the name before fetching the URL).
>> 
>> The same thing will occur if we make the change in pip first: it would
>> normalize names, so you'd need to use --allow-unverified for everything because
>> it would act as if you typed ``pip install django`` instead of ``pip install
>> Django``.
>> 
>> To my knowledge, this will *only* affect pip 1.5.x.
>> 
>> So the only way forward I can see to make this change, which I think is a good
>> change and will remove a big "gotcha" from using a mirror, is to coordinate
>> a release of bandersnatch that coincides with pip 1.6, and tell people they
>> need to upgrade in lockstep.
>> 
>> Does anyone have any other ideas?
>> 
> 
> 
> Just thought of this: if the normalized name doesn’t match the "real" name,
> then add entries for both. This will make it so that both pip 1.5 and
> pip 1.6+ continue to work.
> 
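
A sketch of the "entries for both" idea, assuming a mirror lays out one
directory per project under simple/. The helper names, the directory layout and
the normalization rule are all assumptions for illustration; this is not
bandersnatch's actual code.

```python
import re
from pathlib import Path


def normalize(name):
    # Assumed rule (not stated in the thread): collapse runs of "-", "_"
    # and "." into one "-" and lowercase the result.
    return re.sub(r"[-_.]+", "-", name).lower()


def entry_names(project):
    """Directory names to publish: the normalized name, plus the "real"
    name whenever the two differ."""
    normalized = normalize(project)
    if normalized == project:
        return [normalized]
    return [normalized, project]


def write_simple_page(root, project, html):
    """Write the same index page under every entry name and return the paths."""
    written = []
    for name in entry_names(project):
        page = Path(root) / "simple" / name / "index.html"
        page.parent.mkdir(parents=True, exist_ok=True)
        page.write_text(html)
        written.append(page)
    return written


print(entry_names("Django"))    # prints: ['django', 'Django']
print(entry_names("requests"))  # prints: ['requests']
```

On a case-insensitive filesystem the two entries for a name like Django would
land in the same directory, which is harmless here since both get the same page.
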
> 
> 

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
