[Catalog-sig] Use user-specific site-packages by default?

Holger Krekel holger.krekel at gmail.com
Tue Feb 5 15:06:05 CET 2013


On Tue, Feb 5, 2013 at 2:13 PM, Lennart Regebro <regebro at gmail.com> wrote:

> On Tue, Feb 5, 2013 at 2:02 PM, Holger Krekel <holger.krekel at gmail.com>
> wrote:
> > Dropping the crawling over external pages needs _much_ more than just a
> few
> > months deprecation warnings, rather years.   There are many packages out
> > there, and it would break people's installations.
>
> No it won't. Nothing gets uninstalled. What stops working is
> installing those packages with pip/easy_install. And that will start
> again as soon as the maintainer uploads the latest version to PyPI,
> which she/he is likely to do quite quickly after people start
> complaining.
>
>
I wouldn't assume that maintainers are easily reachable.  I've contacted
the maintainers of at least three different packages (each with >1K
downloads) and never received a response.

And of course, I didn't mean to imply that already-installed packages would
suddenly break, but rather that installation instructions like "use pip
install X" will simply fail because some dependency "Y" does not get
installed, or because "Y" gets installed in some arbitrary lower version
which might contain serious bugs (including security bugs).  For example,
the referenced "lockfile" project has a "0.2" release on PyPI but is
currently at 0.9.
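
Just to illustrate the problem, here is a quick sketch using PyPI's
XML-RPC interface ("lockfile" is only the example from above; any
affected project works the same way):

    import xmlrpclib

    # Ask PyPI which releases it actually hosts; show_hidden=True lists
    # all of them, not just the latest.
    client = xmlrpclib.ServerProxy('https://pypi.python.org/pypi')
    releases = client.package_releases('lockfile', True)
    print "PyPI-hosted releases: %s" % (releases,)
    # If the current upstream release is missing from this list, pip can
    # only find it by crawling external pages -- or it silently installs
    # the stale version that *is* on PyPI.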


> > I certainly agree, though, that the current client-side crawling is a
> > nuisance and makes for unreliability of installation procedures.  I
> think we
> > should move the crawling to the server side and cache packages.
>
> That will mean that a man in the middle-attack might poison PyPI's
> cache. I don't think that's a feasible path forward.
>
>
Like I said (you snipped that part of the mail), it's a matter of policy.
Externally hosted packages could be downloaded immediately rather than on
demand, and the download and checksumming could be repeated over a period
of time and from different machines.  Of course a remotely stored package
could already be compromised when it is first fetched - but that
possibility always exists (even if an author signs a package with PGP, his
machine might be infiltrated, or the Jenkins build systems performing
automated releases, etc.).
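
Roughly what I have in mind, as a minimal sketch (the URL is a
placeholder, and a real implementation would persist the digests and
schedule the re-checks):

    import hashlib
    import urllib2

    def fetch_digest(url):
        # Download the external file and return its sha256 hex digest.
        data = urllib2.urlopen(url).read()
        return hashlib.sha256(data).hexdigest()

    # The first fetch (at registration time) records the reference checksum.
    url = 'http://example.com/dist/somepkg-1.0.tar.gz'  # placeholder
    reference = fetch_digest(url)

    # Repeated fetches -- later, and from different machines -- must match,
    # otherwise the externally hosted file has been tampered with.
    if fetch_digest(url) != reference:
        raise SystemExit('external file changed after caching, refusing it')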

> Packages do not need to be "cached", as they are not supposed to
> change. If you change the package you should really release a new
> version. (Unless you made a mistake and discovered it before anyone
> actually downloaded it). So what you are proposing is really that PyPI
> downloads the package from an untrusted source, if the maintainer
> doesn't upload it. I prefer that we demand that the maintainer upload
> it.
>
>
I actually think it might make sense to forbid referencing external files
for _future_ PyPI uploads (except "#egg=" references, probably).  A
maintainer trying to do that would then get a clear error and instructions
on how to proceed.  She is just trying to get a release out, so we have
her attention.
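
On the server side such a check could be as simple as the following
sketch (the metadata plumbing and the allowed-hosts policy are made up
for illustration):

    import urlparse

    ALLOWED_HOSTS = ('pypi.python.org',)  # assumption: files must live on PyPI

    def check_upload_links(links):
        # "links" would be the download_url/homepage URLs from the upload.
        for link in links:
            if '#egg=' in link:
                continue  # keep allowing direct "#egg=" references
            host = urlparse.urlparse(link).netloc
            if host not in ALLOWED_HOSTS:
                raise ValueError('external file reference %r not allowed; '
                                 'please upload the file to PyPI itself' % link)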

Changing the pip and distribute/easy_install defaults to require an
explicit option before installing packages found via link rel-types of
"download" or "homepage" might make sense as well.
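
The installer would then classify index links by their rel attribute and
skip the external ones unless that option is given -- a sketch only, the
real pip internals are structured differently:

    from HTMLParser import HTMLParser

    class RelLinkFilter(HTMLParser):
        # Collect hrefs from a simple-index page, split by rel type.
        def __init__(self):
            HTMLParser.__init__(self)
            self.internal, self.external = [], []

        def handle_starttag(self, tag, attrs):
            if tag != 'a':
                return
            attrs = dict(attrs)
            href = attrs.get('href')
            if not href:
                return
            if attrs.get('rel') in ('download', 'homepage'):
                self.external.append(href)  # followed only if explicitly allowed
            else:
                self.internal.append(href)

By default only the "internal" list would be followed.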

In the end, however, none of this prevents MITM attacks between a
downloader and pypi.python.org, or between an uploader and pypi.python.org
(often using basic auth over plain HTTP).  Signing methods like
https://wiki.archlinux.org/index.php/Pacman-key are key.  If a signature
is available (also at a download_url site), then we can rule out
undetected tampering, and there might be no need to break currently
working package releases.
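
For example, an installer could verify a detached signature before
unpacking anything (sketch; this assumes gpg is installed and the
author's key is already in the local keyring):

    import subprocess

    def verify_signature(tarball, signature):
        # gpg exits non-zero if the signature is bad or the key untrusted.
        rc = subprocess.call(['gpg', '--verify', signature, tarball])
        if rc != 0:
            raise SystemExit('signature check failed, refusing to install')

    verify_signature('somepkg-1.0.tar.gz', 'somepkg-1.0.tar.gz.asc')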

It certainly makes sense to fortify Python packaging and installation
procedures, but I'd like a bit more of a systematic view on it, including
reviews from security-focused people and a somewhat incremental, verified
approach to making it real and widely used.

best,
holger




