[Distutils] Handling the binary dependency management problem

Nick Coghlan ncoghlan at gmail.com
Mon Dec 2 08:31:09 CET 2013


On 2 December 2013 09:38, Paul Moore <p.f.moore at gmail.com> wrote:
> On 1 December 2013 22:17, Nick Coghlan <ncoghlan at gmail.com> wrote:
>
>> For example, I installed Nikola into a virtualenv last night. That required
>> installing the development headers for libxml2 and libxslt, but the error
>> that tells you that is a C compiler one.
>>
>> I've been a C programmer longer than I have been a Python one, but I still
>> had to resort to Google to try to figure out what dev libraries I needed.
>
> But that's a *build* issue, surely? How does that relate to installing
> Nikola from a set of binary wheels?

Because libxml2 and libxslt aren't Python programs - they're external
shared libraries. You would have to build statically for a wheel to
work (which is, to be fair, exactly what we do in many cases - CPython
itself bundles all its dependencies on Windows).
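
As a rough illustration of what "external shared library" means in
practice, the sketch below asks the platform whether it can locate
libxml2 and libxslt at all. The helper name missing_shared_libs is
hypothetical, and "xml2"/"xslt" are assumed to be the short names the
linker uses for those libraries. Note that this only looks for the
runtime libraries; it says nothing about whether the development
headers needed for a source build are installed.

```python
from ctypes.util import find_library


def missing_shared_libs(names):
    """Return the names for which the platform finds no shared library.

    Hypothetical helper: find_library() returns None when it cannot
    locate a library, so anything it fails to find is reported back.
    """
    return [name for name in names if find_library(name) is None]


if __name__ == "__main__":
    # Assumed short names for libxml2 and libxslt; the result depends
    # entirely on what is installed on the local system.
    print(missing_shared_libs(["xml2", "xslt"]))
```

A wheel that statically links or bundles these libraries would not
care what this reports, which is the case being described here.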

> I understand you are thinking about non-Python libraries, but all I
> can say is that this has *never* been an issue to my knowledge in the
> Windows world. People either ship DLLs with the Python extension, or
> build statically. I understand that things are different in the Unix
> world, but to be blunt why should Windows users care?
>
>> Outside the scientific space, crypto libraries are also notoriously hard to
>> build, as are game engines and GUI toolkits. (I guess database bindings
>> could also be a problem in some cases)
>
> Build issues again...

Yes, that's the point. Wheels can solve the problem for cases where
all external dependencies can be statically linked, or otherwise
bundled inside the wheel. They *don't* work if you need multiple
Python projects binding to the *same* copy of the external dependency.

>> We have the option to leave handling the arbitrary binary dependency problem
>> to platforms, and I think we should take it.
>
> Again, can we please be clear here? On Windows, there is no issue that
> I am aware of. Wheels solve the binary distribution issue fine in that
> environment (I know this is true, I've been using wheels for months
> now - sure there may be specialist areas that need some further work
> because they haven't had as much use yet, but that's details)

Wheels work fine if they're self-contained (whether through static
linking or bundling), or depend only on other projects within the
PyPI ecosystem, or if you're building custom wheels from source that
only need to work in an environment you control. They *don't* work as
soon as multiple components need to share a common external binary
dependency that isn't part of the PyPI ecosystem, and you want to
share your wheels with someone else - at that point you need to have a
mechanism for providing the external dependencies as well.
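
One way to see which side of that line a given wheel falls on is to
look inside it, since a wheel is a zip archive. The following is a
minimal sketch, not a real tool: bundled_binaries and NATIVE_SUFFIXES
are hypothetical names, and the suffix list is an assumption about
what native binaries usually look like. It only shows what is bundled
in the archive; it cannot tell you which external libraries those
binaries still expect to find on the target system.

```python
import zipfile

# Assumed file suffixes for native extension modules and shared libraries.
NATIVE_SUFFIXES = (".so", ".pyd", ".dll", ".dylib")


def bundled_binaries(wheel):
    """List archive members of a wheel that look like native binaries.

    `wheel` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(wheel) as archive:
        return sorted(
            name
            for name in archive.namelist()
            # Also match versioned names such as libfoo.so.2
            if name.lower().endswith(NATIVE_SUFFIXES) or ".so." in name
        )
```

An empty result suggests a pure-Python wheel; a non-empty one means
the wheel ships compiled code, whose own link-time dependencies would
still need checking with platform tools.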

It's that second problem that some members of the scientific community
have solved through conda, and that's the one I am saying we should
postpone trying to solve at the pip level indefinitely (because it's
damn hard, and we don't need to - the people that care about that
feature generally won't care about the system integration features
that pip offers over conda).

>> This is why I suspect there will be a better near term effort/reward
>> trade-off in helping the conda folks improve the usability of their platform
>> than there is in trying to expand the wheel format to cover arbitrary binary
>> dependencies.
>
> Excuse me if I'm feeling a bit negative towards this announcement.
> I've spent many months working on, and promoting, the wheel + pip
> solution, to the point where it is now part of Python 3.4. And now
> you're saying that you expect us to abandon that effort and work on
> conda instead?

No, conda doesn't work for most of our use cases - it only works for
the "I want this pre-integrated stack of software on this system, and
I don't care about the details" use case.

However, this is an area where pip/virtualenv can fall short because
of the external shared binary dependency problem, and because there is
a lot of bad software out there with appallingly unreliable build
systems. If things are statically linked or bundled instead, then they
fall into the scope that I agree pip/virtualenv *should* be able to
handle (i.e. wheels that are self-contained aside from the
dependencies declared in their metadata).

It's just the "arbitrary external shared binary dependency" problem
that I want to put into the "will not solve" bucket. What that means
is that remotely built wheels would *not* try to use system libraries,
they would always use static linking and/or bundling for external
binary dependencies. If you're already doing that on Windows, then
*great*, that's working sensibly within the constraints of the design.

Anyone who wanted dynamic linking of external shared dependencies
would then need to either build from source (for integration with
their local environment), or use a third party pre-integrated stack
(like conda).

> I never saw wheel as a pure-Python solution, installs
> from source were fine for me in that area. The only reason I worked so
> hard on wheel was to solve the Windows binary distribution issue. If
> the new message is that people should not distribute wheels for (for
> example) lxml, pyyaml, pyzmq, numpy, scipy, pandas, gmpy, and pyside
> (to name a few that I use in wheel format relatively often) then
> effectively the work I've put in has been wasted.

If a distribution can be sensibly published as a self-contained wheel
with no external binary dependencies, then great, it makes sense to do
that (e.g. an lxml wheel containing or statically linked to libxml2
and libxslt).

The only problem I want to take off the table is the one where
multiple wheel files try to share a dynamically linked external binary
dependency.

> I'm hoping I've misunderstood here. Please clarify. Preferably with
> specifics for Windows (as "conda is a known stable platform" simply
> isn't true for me...) - I accept you're not a Windows user, so a
> pointer to already-existing documentation is fine (I couldn't find any
> myself).

Windows already has a culture of bundling all its dependencies, so
what I'm suggesting isn't the least bit radical there. It *is* radical
in other environments, as is the suggestion that the bundling
philosophy of the CPython Windows installer be extended to all
Windows-targeted wheel files (at least for folks that mostly work on
Linux).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

