[Distutils] Towards a simple and standard sdist format that isn't intertwined with distutils

Daniel Holth dholth at gmail.com
Fri Oct 2 22:24:25 CEST 2015


We need to embrace partial solutions and the fine folks who propose them so
the whole packaging ecosystem can have some progress. PEP 438 may not be a
good analogue to adding a new sdist format since the latter only adds new
things that you can do. A new sdist format will inconvenience a much more
limited set of people, mainly the pip authors and the OS package
maintainers.

Sorry but the section of the PEP that prefixes filenames with _ has
distracted the discussion away from the general idea.

Instead of multiple hooks why not a single object exposed through an entry
point that has several optional methods?

NO

    [build]
    requirements = "flit"
    build-wheels = "flit.pypackage_hooks:build_wheels"
    build-in-place = "flit.pypackage_hooks:build_in_place"

YES

    [build]
    build-system = flit

    class ABuildHook:
        def build_wheels(self): ...

    entry_points = {'new.sdist.build.hooks': ['flit = some_module:ABuildHook']}
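
A minimal sketch of how an installer might treat such a hook object, where every method is optional. The helper `call_hook`, the returned wheel filename, and the fallback values are made up for illustration; nothing here is part of the draft PEP.

```python
class ABuildHook:
    """Hypothetical build hook that implements only one optional method."""

    def build_wheels(self):
        # A real hook would build wheels and return their paths.
        return ["example-1.0-py3-none-any.whl"]


def call_hook(hook, name, default=None):
    """Call an optional hook method, or return `default` if it is absent."""
    method = getattr(hook, name, None)
    if method is None:
        return default
    return method()


hook = ABuildHook()
wheels = call_hook(hook, "build_wheels", default=[])
# build_in_place is not defined on this hook, so the default is returned.
in_place = call_hook(hook, "build_in_place", default="unsupported")
print(wheels, in_place)
```

With one object the installer only needs to resolve a single entry point; a build system then opts in to extra capabilities by defining more methods.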


On Fri, Oct 2, 2015 at 3:48 PM Daniel Holth <dholth at gmail.com> wrote:

> The MEBS idea is inspired by heroku buildpacks where you just ask a list
> of tools whether they can build something.
> https://devcenter.heroku.com/articles/buildpacks . The idea would be that
> pip would use MEBS instead of its setup.py-focused builder. The first
> available MEBS plugin would notice setup.py and do what pip does now (force
> setuptools, build in a subprocess).
>
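
A rough sketch of the buildpack-style lookup described above, where each plugin is asked in turn whether it can build a source tree. The plugin classes, the `detect` method, and the marker filenames are assumptions for illustration only, not an existing MEBS API.

```python
import os
import tempfile


class SetuptoolsPlugin:
    """Hypothetical plugin that claims any tree containing setup.py."""
    name = "setuptools"

    def detect(self, source_dir):
        return os.path.exists(os.path.join(source_dir, "setup.py"))


class FlitPlugin:
    """Hypothetical plugin that claims any tree containing flit.ini."""
    name = "flit"

    def detect(self, source_dir):
        return os.path.exists(os.path.join(source_dir, "flit.ini"))


def find_builder(source_dir, plugins):
    """Return the first plugin that says it can build source_dir, else None."""
    for plugin in plugins:
        if plugin.detect(source_dir):
            return plugin
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Before any marker file exists, no plugin claims the tree.
    nothing = find_builder(tmp, [FlitPlugin(), SetuptoolsPlugin()])
    # Create an empty setup.py so the setuptools plugin recognises the tree.
    open(os.path.join(tmp, "setup.py"), "w").close()
    chosen = find_builder(tmp, [FlitPlugin(), SetuptoolsPlugin()])
    print(chosen.name)
```
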
> You should know about flit https://github.com/takluyver/flit and Bento
> http://cournape.github.io/Bento/ which have their own lightweight
> metadata formats, which are transformed into standard Python formats by the
> respective tools.
>
> requires.txt is popular but I'm not a fan of it; it seems like it was
> invented by people who didn't want to have a proper setup.py for their
> project.
>
> We have to come up with something simpler than setup.py if we want to win
> over some of the people who don't understand how to write setup.py. Ideally
> any required new user-editable "which build system" metadata could be
> boiled down to a single line in setup.cfg. There would be 3 stages: VCS
> checkout (minimal metadata, a "generate machine readable metadata" step
> equivalent to "setup.py egg_info") -> new sdist (PEP 376 style static
> metadata that can be trusted) -> wheel.
>
> How pip builds a package from source:
>
> 1. download sdist; .egg-info directory is almost always present
> 2. run setup.py egg_info to get dependencies, because the static one is
>    not reliable, because too many requirements lists have 'if' statements
> 3. compile
>
> For all the talk about static metadata, the build script in general needs
> to remain a Turing-complete script. Build systems everywhere are programs
> to build other programs.
>
> I really like your idea about returning a list of built artifacts. Python
> packaging is strictly 1:1 source package -> output package, but rpm and deb
> can generate many packages from a single source package.
>
> I don't think we have to worry that much about Debian & RHEL. They will
> get over it if setup.py is no longer there. Change brings work but
> stagnation brings death.
>
> On Fri, Oct 2, 2015 at 2:41 PM Brett Cannon <brett at python.org> wrote:
>
>> On Fri, 2 Oct 2015 at 05:08 Donald Stufft <donald at stufft.io> wrote:
>>
>>> On October 2, 2015 at 12:54:03 AM, Nathaniel Smith (njs at pobox.com) wrote:
>>> > We realized that actually as far as we could tell, it wouldn't be that
>>> > hard at this point to clean up how sdists work so that it would be
>>> > possible to migrate away from distutils. So we wrote up a little draft
>>> > proposal.
>>> >
>>> > The main question is, does this approach seem sound?
>>>
>>> I've just read over your proposal, but I've also just woken up so I might
>>> be a little slow still! After reading what you have, I don't think that
>>> this proposal is the right way to go about improving sdists.
>>>
>>> The first thing that immediately stood out to me is that it's recommending
>>> that downstream redistributors like Debian, Fedora, etc. utilize Wheels
>>> instead of the sdist to build their packages from. However, that is not
>>> really going to fly with most (all?) of the downstream redistributors.
>>> Debian, for instance, has a policy that requires building all of its
>>> packages from source, not from anything else, and Wheels are not a source
>>> package. While it can theoretically work for pure Python packages, it
>>> quickly devolves into a mess when you factor in packages that have any C
>>> code whatsoever.
>>>
>>
>> So wouldn't they then download the sdist, build a wheel as an
>> intermediate, and then generate the .deb file? I mean as long as people
>> upload an sdist for those that want to build from source and a wheel for
>> convenience -- which is probably what most people providing wheels do
>> anyway -- then I don't see the problem.
>>
>>
>>>
>>> Overall, this feels more like a sidegrade than an upgrade. One major theme
>>> throughout the PEP is that we're going to push to rely heavily on wheels
>>> as the primary format of installation. While that works well for things
>>> like Debian, I don't think it's going to work as well for us. If we were
>>> only distributing pure Python packages, then yes absolutely; however, given
>>> that we are not, we have to worry about ABI issues. Given that there are so
>>> many different environments that a particular package might be installed
>>> into, all with different ABIs, we have to assume that installing from
>>> source is still going to be a primary path for end users to install and
>>> that we are never going to have a world where we can assume a Wheel in a
>>> repository.
>>>
>>> One of the problems with the current system, is that we have no mechanism
>>> by which to determine dependencies of a source distribution without
>>> downloading the file and executing some potentially untrusted code. This
>>> makes dependency resolution harder and much much slower than if we could
>>> read that information statically from a source distribution. This PEP
>>> doesn't offer anything in the way of solving this problem.
>>>
>>
>> Isn't that what the requirements and requirements-file fields in the
>> _pypackage file provide? Only if you use that requirements-dynamic would it
>> require executing arbitrary code to gather dependency information, or am I
>> missing something?
>>
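
To make "reading dependencies statically" concrete, here is a small sketch. It assumes the sdist ships a PKG-INFO style file with `Requires-Dist` fields, which is an assumption for illustration and not the `_pypackage` format from the draft. The sample metadata and the `static_requirements` helper are invented for the example.

```python
from email.parser import HeaderParser

# Invented sample metadata in the RFC 822 style used by PKG-INFO files.
PKG_INFO = """\
Metadata-Version: 2.0
Name: example
Version: 1.0
Requires-Dist: requests
Requires-Dist: six (>=1.9)
"""


def static_requirements(pkg_info_text):
    """Return declared requirements without executing any project code."""
    msg = HeaderParser().parsestr(pkg_info_text)
    # get_all returns None when the field is absent, so fall back to [].
    return msg.get_all("Requires-Dist") or []


reqs = static_requirements(PKG_INFO)
print(reqs)
```

Nothing from the package is imported or run here, which is what would let an index or resolver inspect an untrusted upload safely, provided the static file can be trusted to be accurate.
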
>>
>>>
>>> To a similar tune, this PEP also doesn't make it possible to really get at
>>> any other metadata without executing software. This makes it practically
>>> impossible to safely inspect an unknown or untrusted package to determine
>>> what it is and to get information about it. Right now PyPI relies on the
>>> uploading tool to send that information alongside of the file it is
>>> uploading, but honestly what it should be doing is extracting that
>>> information from within the file. This is sort of possible right now since
>>> distutils and setuptools both create a static metadata file within the
>>> source distribution, but we don't rely on that within PyPI because that
>>> information may or may not be accurate and may or may not exist. However
>>> the twine uploading tool *does* rely on that, and this PEP would break the
>>> ability for twine to upload a package without executing arbitrary code.
>>>
>>
>> Isn't that only if you use the dynamic fields?
>>
>>
>>>
>>> Overall, I don't think that this really solves most of the foundational
>>> problems with the current format. Largely it feels that what it achieves is
>>> shuffling around some logic (you need to create a hook that you reference
>>> from within a .cfg file instead of creating a setuptools extension or so)
>>> but without fixing most of the problems. The largest benefit I see to
>>> switching to this right now is that it would enable us to have build time
>>> dependencies that were controlled by pip rather than installed implicitly
>>> via the execution of the setup.py. That doesn't feel like a big enough
>>> benefit to me to do a mass shakeup of what we recommend and tell people to
>>> do. Having people adjust and change and do something new requires effort,
>>> and we need something to justify that effort to other people and I don't
>>> think that this PEP has something we can really use to justify that effort.
>>>
>>
>> From my naive perspective this proposal seems to help push forward a
>> decoupling of building using distutils/setuptools as the only way you can
>> properly build Python projects (which is what I think we are all after) and
>> will hopefully eventually free pip up to simply do orchestration.
>>
>>
>>>
>>> I *do* think that there is a core of some ideas here that are valuable, and
>>> in fact are similar to some ideas I've had. The main flaw I see here is
>>> that it doesn't really fix sdists, it takes a solution that would work for
>>> VCS checkouts and then reuses it for sdists. In my mind, the supported flow
>>> for package installation would be:
>>>
>>>     VCS/Bare Directory -> Source Distribution -> Wheel
>>>
>>> This would (eventually) be the only path that was supported for
>>> installation but you could "enter" the path at any stage. For example, if
>>> there is a Wheel already available, then you jump right on at the end and
>>> just install that, if there is a sdist available then pip first builds it
>>> into a wheel and then installs that, etc.
>>>
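
The "enter the path at any stage" flow can be sketched as follows. The stage names and the `remaining_steps` helper are hypothetical and only model the ordering described above; no real build or install is performed.

```python
# Ordered stages of the flow: VCS/Bare Directory -> Source Distribution -> Wheel
STAGES = ["vcs", "sdist", "wheel"]


def remaining_steps(available):
    """Enter the pipeline at the latest available stage and list what is left."""
    for index in range(len(STAGES) - 1, -1, -1):
        if STAGES[index] in available:
            start = index
            break
    else:
        raise ValueError("no usable artifact available")
    # Each remaining transition builds the next stage from the current one.
    steps = [
        "build {} from {}".format(STAGES[i + 1], STAGES[i])
        for i in range(start, len(STAGES) - 1)
    ]
    steps.append("install wheel")
    return steps


from_wheel = remaining_steps({"wheel"})
from_sdist = remaining_steps({"sdist", "vcs"})
from_vcs = remaining_steps({"vcs"})
print(from_wheel, from_sdist, from_vcs)
```
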
>>> I think your PEP is something like what the VCS/Bare Directory to sdist
>>> tooling could look like, but I don't think it's what the sdist to wheel
>>> path should look like.
>>>
>>
>> Is there another proposal I'm unaware of for the sdist -> wheel step that
>> is build tool-agnostic? I'm all for going with the best solution but there
>> has to be an actual alternative to compare against and I don't know of any
>> others right now and this proposal does seem to move things forward in a
>> reasonable fashion.