[Distutils] Using Wheel with zipimport

Vinay Sajip vinay_sajip at yahoo.co.uk
Wed Jan 29 14:59:33 CET 2014


Nick Coghlan <ncoghlan <at> gmail.com> writes:

> I believe Paul's concern is with anything that suggests that arbitrary
> *third party* code can be run from wheel files, when the reality is
> that it is fairly easy to accidentally write code that assumes it is
> installed on the filesystem in a way that isn't easy for a quick scan
> of the files in the zip archive to detect (especially since the PEP
> 376 installation database PEP doesn't include any support for
> arbitrary metapath importers).

That is a valid concern, but no one is suggesting that arbitrary third
party code can run from wheel files, just as zipimport makes no guarantees
about zipped code working.

> By contrast PEP 441 is a *distribution* utility - the creator of the
> application is expected to ensure that doing so actually works
> correctly before publishing their app that way, just as we would
> expect py2exe, py2app and cx-freeze users to do.

True, but there's no reason why wheels couldn't have some metadata
indicating that this diligence has been exercised by the wheel creator.
 
> With the "reference implementation" position that distlib is likely to
> occupy in a post-PEP-426/440/459 world, though, there's an additional
> legitimate concern about allowing end users to easily distinguish
> between "this API is fully supported by the PyPA as part of the
> reference implementation for metadata 2.0" and "this is an
> experimental packaging related API that may or may not be useful in
> general, and some members of the PyPA may still have grave
> reservations about it".
>
> At the moment, distlib contains both kinds of API, and it confuses
> *us*, let alone anyone else that isn't closely following along on
> distutils-sig. As long as distlib is serving the dual role of
> providing both "the reference implementation for metadata 2.0" and
> "some experimental packaging related APIs", we're going to get
> concerns like this one arising. If there was a clear way to
> distinguish them (ideally with a separate project for either the
> reference implementation or the experimental stuff, but even a
> distinct namespace within the distlib project would help a great
> deal), I suspect there would be less concern.

These are social concerns perhaps more than technical concerns, and to
me they lack specificity. Of course some of the APIs in distlib are
new and untried-except-by-me, but the way to allay concerns is to focus
on specifics, force out the details of the concerns and then see how best
they can be addressed. This is not doable with "zipped-eggs-were-bad"
rhetoric. Details generally help to identify what the real problem is. For
example, Donald raised the spectre of security vulnerabilities with his
mention of Mitre and CVEs, but there were no specifics beyond that.
I found a discussion where someone had set PYTHON_EGG_CACHE to /tmp. I can
certainly see the negative security implications of that, but the finger
was pointed at the applications using setuptools rather than at
setuptools itself. Even
though setuptools specifically added code as a remedy to warn when the
env var pointed to a world-writeable directory, this was seen as trying to
be helpful rather than patching a vulnerability. Of course, if I've
misunderstood something in that discussion or missed some other
security issue, then some pointers would help move the discussion along.
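
To make the PYTHON_EGG_CACHE point concrete, here is a minimal sketch of
the kind of check being described: warn when a cache directory is
world-writable. This is not the setuptools code; the function names are
made up for illustration, and it assumes POSIX permission bits.

```python
import os
import stat
import warnings


def is_world_writable(path):
    """Return True if 'other' users have write permission on path.

    Assumption: POSIX-style mode bits; this is illustrative only and is
    not the check that setuptools actually performs.
    """
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IWOTH)


def check_cache_dir(path):
    """Warn (rather than fail) if the cache directory is world-writable.

    Returns True when the directory looks safe, False otherwise.
    """
    if is_world_writable(path):
        warnings.warn("cache directory %r is world-writable" % path)
        return False
    return True
```

A caller could run check_cache_dir(os.environ["PYTHON_EGG_CACHE"]) before
extracting anything into that directory.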

> In the specific case of distlib.mount, if it's eventually combined
> with a metadata extension like "distlib.mount" which packages must
> export in order for the command to allow them to be automatically used
> that way, then I don't see anything wrong with it *in general* - it's
> a natural extension of the setuptools "zip_safe" flag, but with the
> ability to include additional details (like whether or not there are C
> extensions that need to be automatically extracted).

Are you talking just about adding wheels to sys.path, or do you mean
the extension-extraction stuff? Note that distlib's Wheel.mount does a
compatibility check and addition to sys.path, which I feel is not
especially controversial and better than just adding to sys.path,
which user code can now do, anyway. But nothing else happens, unless
specific metadata is provided in the wheel to enable it. While it's not
specifically a "distlib.mount" export, there is a facility to ask for
extensions to be extracted, and in the absence of metadata asking for this,
no extraction occurs.
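
As a rough illustration of "compatibility check, then addition to
sys.path", here is a stdlib-only sketch. It is not distlib's
implementation: mount_wheel and its helpers are hypothetical names, and
the check only accepts pure-Python wheels by looking at the tags in the
filename.

```python
import os
import sys


def wheel_tags(filename):
    """Return (python tag, abi tag, platform tag) from a wheel filename."""
    stem = filename[:-len(".whl")]
    # name-version[-build]-pyver-abi-plat: the tags are the last three fields
    return tuple(stem.split("-")[-3:])


def is_pure_python_compatible(filename):
    """Accept only pure-Python wheels tagged for the running major version."""
    pyver, abi, plat = wheel_tags(filename)
    wanted = ("py%d" % sys.version_info[0],
              "py%d%d" % sys.version_info[:2])
    # pyver may be a compressed tag set such as "py2.py3"
    tag_ok = any(tag in wanted for tag in pyver.split("."))
    return tag_ok and abi == "none" and plat == "any"


def mount_wheel(path, append=False):
    """Hypothetical helper: check compatibility, then add to sys.path."""
    if not is_pure_python_compatible(os.path.basename(path)):
        raise ValueError("wheel %r is not compatible with this Python" % path)
    if path not in sys.path:
        if append:
            sys.path.append(path)
        else:
            sys.path.insert(0, path)
```

Once the wheel is on sys.path, the ordinary zipimport machinery handles
the imports; nothing is extracted.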

> goes further than the current EXTENSIONS approach - this proposal
> would be akin to *requiring* an empty EXTENSIONS file, and/or the
> setuptools zip_safe flag in order to allow mounting of even the pure
> Python wheel. Such a conservative approach is also the antithesis of
> the setuptools "attempt to guess": if the package publisher doesn't
> explicitly opt in to zip support, then distlib.mount would assume that
> it is *not* supported (but may provide an API for the caller to
> override that, like "assume_zip_safe=True" or "force=True").

I have no problem with adding wheel metadata to allow/disallow even
adding to sys.path - it's effectively just like another step in the
compatibility check. It would make most sense to place this in the
WHEEL metadata, rather than pydist.json or similar, since it relates to
the contents of a particular wheel rather than the distribution in
general.
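
A sketch of what such an opt-in check might look like, assuming a
hypothetical "Mountable" key in the WHEEL file (no such key exists in
the wheel spec today; the name is invented purely for illustration).
The force argument mirrors the override Nick suggests above.

```python
import zipfile
from email.parser import Parser


def read_wheel_metadata(path):
    """Parse the key/value WHEEL file inside the wheel's .dist-info dir."""
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            if name.endswith(".dist-info/WHEEL"):
                text = zf.read(name).decode("utf-8")
                return Parser().parsestr(text)
    raise ValueError("no WHEEL metadata found in %r" % path)


def may_mount(path, force=False):
    """Return True only if the wheel explicitly opts in to mounting.

    'Mountable' is a hypothetical metadata key; with no opt-in the
    conservative answer is False, unless the caller passes force=True.
    """
    if force:
        return True
    meta = read_wheel_metadata(path)
    return meta.get("Mountable", "false").strip().lower() == "true"
```

A check like this would run alongside the tag compatibility check,
before the wheel is added to sys.path.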

> However, like Paul, I have some concerns about a still experimental
> API like that being in the metadata 2.0 reference implementation,
> since that will likely end up having to deal with stdlib-like levels
> of backwards compatibility requirements, and removing experimental
> APIs that we later decided we weren't happy with could prove
> problematic.

But we're talking about the Python 3.5 time-frame here, and 3.4 isn't even
out yet. ISTM there is plenty of time to get these sorts of issues ironed
out. While I tend to favour backward compatibility wherever possible,
distlib is nowhere near 1.0, and so distlib users (a small number, from
what I can see) could expect some API breakage if there's no sensible
alternative.

Regards,

Vinay Sajip


