[Distutils] A possible refactor/streamlining of PEP 517

Nick Coghlan ncoghlan at gmail.com
Sun Jul 16 02:27:37 EDT 2017


On 16 July 2017 at 14:56, Nathaniel Smith <njs at pobox.com> wrote:
> On Sat, Jul 15, 2017 at 8:50 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> On 16 July 2017 at 04:33, Donald Stufft <donald at stufft.io> wrote:
>>> All of that is a long winded way of saying I don’t particularly care if the
>>> VCS -> wheel -> install path is spelled out *always* doing in-place builds
>>> or if we add a build directory to specify between out of place or in place.
>>> Having a robust mechanism in place for doing that means we can adjust how
>>> things *typically* work without going back to the PEP process and throwing
>>> everything away.
>>
>> +1
>>
>> The thing I like about the latest draft of the API is that it lets
>> frontends choose freely between three build strategies:
>>
>> 1. build_sdist failing is a fatal error for the overall build
>> 2. ask build_wheel to do its best to emulate a "via sdist" build with
>> what's available
>> 3. ask build_wheel for a wheel without worrying too much about
>> matching the sdist
>
> I guess here you're identifying (2) with "out-of-place builds" and (3)
> with "in-place builds"?
>
> But... that is not what the in-place/out-of-place distinction means in
> normal usage, it's not the distinction that any of those build systems
> you were surveying implement, and it's not the distinction specified
> in the current PEP text.
>
> If what we want is a distinction between "please give me a correct
> wheel" and "please give me a wheel but I don't care if it's broken",
> then wouldn't it make more sense to have a simple flag saying *that*?

No, because pip *also* wants the ability to request that the backend
put the intermediate build artifacts in a particular place, *and*
having that ability will likely prove beneficial given directory based
caching schemes in build automation pipelines (with BitBucket
Pipelines and OpenShift Image Streams being the two I'm personally
familiar with, but it's a logical enough approach to speeding up build
pipelines that I'm sure there are others).

It just turns out that we can piggyback off that in-place/out-of-tree
distinction to *also* indicate how much the frontend cares about
consistency with sdist builds. The PEP previously *didn't* say that,
but explicit text along those lines was added as part of
https://github.com/python/peps/pull/310/files based on this latest
discussion.

> And in what case would a frontend ever set this
> give_me_a_correct_wheel flag to False?

When the frontend either genuinely doesn't care (hopefully rare, but
not inconceivable), or else when it's building from an unpacked sdist
and hence can be confident that the artifacts will be consistent with
each other regardless of how the backend handles the situation.

The latter is expected to be very common, since it's the path that
will be followed for sdists published to PyPI. When handed an
arbitrary PEP 517 source tree to build, pip will likely try
"build_sdist -> in-place build_wheel" first and only fall back to
"out-of-tree build_wheel" if the initial build_sdist call fails.

This is the main reason piggybacking off the in-place/out-of-tree
distinction works so well for this purpose: if the frontend just
unpacked an sdist into a build directory, then the most obvious thing
for it to do is to do an in-place build in that directory.

It's only when the frontend is handed an arbitrary directory to build
that it doesn't know for sure is an unpacked sdist that the right
thing to do becomes markedly less clear, which is why we're offering
three options:

1. build_sdist -> unpack sdist -> in-place build_wheel (same as a PyPI download)
2. out-of-tree build_wheel (delegate the decision to the backend)
3. in-place build_wheel (explicitly decide not to worry about it)

We *think* 1 & 2 are going to be the most sensible options when given
an arbitrary directory, but allowing for 3 is a natural consequence of
supporting building from an unpacked sdist.
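
To make the fallback between options 1 and 2 concrete, here is a
minimal sketch of the frontend side. Everything in it is an
assumption for illustration: FakeBackend is a hypothetical stand-in
rather than any real backend, the hook signatures are simplified, and
treating build_directory=None as "in-place" is just my shorthand here,
not a quote from the PEP text.

```python
class FakeBackend:
    """Hypothetical stand-in for a build backend (illustration only)."""

    def __init__(self, sdist_works):
        self.sdist_works = sdist_works
        self.calls = []

    def build_sdist(self, sdist_directory):
        self.calls.append("build_sdist")
        if not self.sdist_works:
            # e.g. the VCS tooling needed to list files is missing
            raise RuntimeError("cannot build sdist in this environment")
        return "pkg-1.0.tar.gz"

    def build_wheel(self, wheel_directory, build_directory=None):
        # Assumption: no build_directory means an in-place build.
        mode = "in-place" if build_directory is None else "out-of-tree"
        self.calls.append("build_wheel:" + mode)
        return "pkg-1.0-py3-none-any.whl"


def build_from_arbitrary_tree(backend, wheel_directory, build_directory):
    """Try option 1 first, falling back to option 2 if build_sdist fails."""
    try:
        backend.build_sdist(build_directory)
    except Exception:
        # Option 2: let the backend decide how closely to emulate
        # a "via sdist" build using the directory we hand it.
        return backend.build_wheel(
            wheel_directory, build_directory=build_directory
        )
    # Option 1: a real frontend would unpack the sdist here and run the
    # hook inside the unpacked tree; that step is elided in this sketch.
    return backend.build_wheel(wheel_directory)
```

Option 3 is then just calling build_wheel without a build directory
and without attempting build_sdist first.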

> In particular, I think we could then say that since for
> distutils/setuptools, MANIFEST.in affects the sdist, then this
> language means that their build_wheel hook MUST also be sensitive to
> MANIFEST.in, and therefore it would need to be implemented internally
> as sdist->unpack->bdist_wheel. (Or if someone's ambitious we could
> even optimize that internally by skipping the pack/unpack step, which
> should make Donald happy :-).) And OTOH other backends that don't do
> this odd MANIFEST.in thing wouldn't have to worry about this.

We don't want to get into the business of micromanaging how backends
work (that's how we got into the current mess with distutils &
setuptools), we just want to make it possible for frontend developers
and backend developers to collaborate effectively over time. That
turned out to be the core problem with the previous
"prepare_input_for_build_wheel" hook: while it technically addressed
Paul & Donald's concerns as frontend developers, it placed too many
arbitrary constraints on how backend implementations worked and didn't
align well with the way full-fledged build systems handle requests for
out of tree builds.

By contrast, `please do an out-of-tree build using this directory`
handles the same scenario (build_sdist failed, but build_wheel can
still be made to work) in a more elegant fashion by allowing the front
end to state *what* it wants (i.e. something that gets as close as is
practical to the "build_sdist -> unpack sdist -> in-place build_wheel"
path given the limitations of the current environment), while
delegating the precise details of *how* that is done to the backend.

Some backends will implement those requests literally as "build_sdist
-> unpack sdist -> in-place build_wheel" (which is what the example in
the PEP does).

Some won't offer any more artifact consistency guarantees than they do
for the in-place build_wheel case (this is the current plan for flit
and presumably for enscons as well).

Some will be able to take their sdist manifest data into account
without actually preparing a full sdist archive (this might make sense
for a setuptools/distutils backend).

All 3 of those options are fine from an ecosystem level perspective,
and are ultimately a matter to be resolved between backend developers
and their users.
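
As a rough illustration of the third approach (taking manifest data
into account without building a full sdist archive), a backend could
stage only the manifest-listed files into the requested build
directory and build from there. This is a sketch under assumptions:
stage_manifest_files and its manifest argument are hypothetical names
of mine, not anything setuptools, distutils or the PEP defines.

```python
import os
import shutil


def stage_manifest_files(source_dir, build_directory, manifest):
    """Copy only the files an sdist would contain into build_directory.

    `manifest` is assumed to be an iterable of paths relative to
    source_dir; how a backend computes it is up to that backend.
    """
    staged = []
    for rel_path in manifest:
        src = os.path.join(source_dir, rel_path)
        dst = os.path.join(build_directory, rel_path)
        # Create any intermediate package directories first.
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)
        staged.append(rel_path)
    return sorted(staged)
```

Files sitting in the source tree but absent from the manifest never
reach the build directory, so they can't leak into the wheel.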

So we're now at a point where:

- key frontend developers agree the current spec allows them to
request what they need/want from backends
- key backend developers agree the current spec can be readily implemented
- there are still some open questions around exactly when it's
reasonable for hooks to fail, but we're only going to answer those
through real world experience, not further hypothetical speculation
- we have the ability to evolve the API in the future if some aspects
turn out to be particularly problematic

That means I'm going to *explicitly* ask that you accept that the PEP
is going to be accepted, and it's going to be accepted with the API in
its current form, even if you personally don't agree with our
reasoning for all of the technical details.

If your level of concern around the build_directory parameter
specifically is high enough that you don't want to be listed as a
co-author on PEP 517 anymore, then that's entirely reasonable: we can
add a separate Acknowledgments section to recognise your significant
input to the process without implying your endorsement of the final
result.

Still, as long as the accepted API ends up being supported in at
least pip, flit, and enscons, it honestly doesn't matter all that much
in practice what the rest of us think of the design. We're here as
design advisors, rather than being the ones that will necessarily need
to cope with the bug reports arising from any interoperability
challenges.

However, something you can definitely still influence is how the PEP
is *worded*, and how it explains its expectations to frontend and
backend developers - requests for clarification, rather than requests
for change. In particular, if you can figure out what the PEP would
have to say that it doesn't currently say for the design outcome to
seem logical to you, then I'd expect that to be a very helpful PR
(keeping in mind that https://github.com/python/peps/pull/311/files is
currently still open for review, and PR#310 was only merged recently).

Cheers,
Nick.

P.S. We're also going to have a subsequent update to the
specifications section of the Python Packaging User Guide, which will
likely initially just be a link to the PEP in a new subsection, but
will eventually involve being part of the expansion of that section
into a Python packaging interoperability reference guide:
https://github.com/pypa/python-packaging-user-guide/issues/319



-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

