[Distutils] Handling the binary dependency management problem

Nick Coghlan ncoghlan at gmail.com
Tue Dec 3 14:53:21 CET 2013


On 3 December 2013 22:49, Oscar Benjamin <oscar.j.benjamin at gmail.com> wrote:
> On 3 December 2013 11:54, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> I believe helping the conda devs to get it to play nice with virtual
>> environments is still a worthwhile exercise though (even if just by
>> pointing out areas where it *doesn't* currently interoperate well, as
>> we've been doing in the last day or so), and if the conda
>> bootstrapping issue is fixed by publishing wheels (or vendoring
>> dependencies), then "try conda if there's no wheel" may still be a
>> reasonable fallback recommendation.
>
> Well, for a start, conda (at least according to my failed build)
> overwrites the virtualenv activate scripts with its own scripts that
> do something completely different and can't even be called with the
> same signature. So it looks to me as if there is no intention of
> virtualenv compatibility.

Historically there hadn't been much work in that direction, but I
think there's recently been growing awareness of the importance of
compatibility with the standard tools (I'm not certain, but the
acceptance of PEP 453 may have had some impact there).

I also consider Travis a friend, and have bent his ear over some of
the compatibility issues, as well as the fact that pip has to handle
additional usage scenarios that just aren't relevant to most of the
scientific community, but are critical for professional application
developers and system integrators :)

The recent addition of "conda init" (which makes it possible to reuse
a venv or virtualenv environment) was a big step in the right
direction, and there's an issue filed about activate getting
clobbered: https://github.com/ContinuumIO/conda/issues/374 (before
conda init, you couldn't really mix conda and virtualenv, so the fact
that they both had activate scripts didn't matter; now it does, since
it affects the usability of conda init).

> As for "try conda if there's no wheel": according to what I've read,
> that seems to be what people who currently use conda do.
>
> I thought about another thing during the course of this thread. To
> what extent can Provides/Requires help out with the binary
> incompatibility problems? For example numpy really does provide
> multiple interfaces:
> 1) An importable Python module that can be used from Python code.
> 2) A C-API that can be used by compiled C-extensions.
> 3) BLAS/LAPACK libraries, exposed with a particular Fortran ABI to
> any other libraries in the same process.
>
> Perhaps the solution is that a build of a numpy wheel should clarify
> explicitly what it Provides at each level e.g.:
>
> Provides: numpy
> Provides: numpy-capi-v1
> Provides: numpy-openblas-g77
>
> Then a built wheel for scipy can Require the same things. Christoph
> Gohlke could provide a numpy wheel with:
>
> Provides: numpy
> Provides: numpy-capi-v1
> Provides: numpy-intelmkl

Hmm, I likely wouldn't build it into the core requirement system (that
all operates at the distribution level), but the latest metadata
updates split a bunch of the optional stuff out into extensions (see
https://bitbucket.org/pypa/pypi-metadata-formats/src/default/standard-metadata-extensions.rst).
What we're really after at this point is the ability to *detect*
conflicts if somebody tries to install incompatible builds into the
same virtual environment (e.g. you installed from a custom index
server originally, but later forget and install from PyPI).

So perhaps we could have a "python.expects" extension that lets a
distribution assert certain things about the metadata of other
distributions in the environment. Say numpy were to define a custom
extension declaring the binary interfaces it exports:

    "extensions": {
        "numpy.compatibility": {
            "api_version": 1,
            "fortran_abi": "openblas-g77"
        }
    }

And for the Gohlke rebuilds:

    "extensions": {
        "numpy.compatibility": {
            "api_version": 1,
            "fortran_abi": "intelmki"
        }
    }

Then another component might have in its metadata:

    "extensions": {
        "python.expects": {
            "numpy": {
                "extensions": {
                    "numpy.compatibility": {
                        "fortran_abi": "openblas-g77"
                    }
                }
            }
        }
    }

The above would be read as "this distribution expects the numpy
distribution in this environment to publish the 'numpy.compatibility'
extension in its metadata, with the 'fortran_abi' field set to
'openblas-g77'".

If you attempted to install that component into an environment with
the intelmkl Fortran ABI declared, the installation would fail, since
the expectation wouldn't match reality.
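
Roughly, the install-time check could then be a matter of comparing
nested dictionaries (just a sketch of that idea - the helper below and
the flat mapping of installed metadata are hypothetical, not an
existing pip API):

    # Sketch only: "installed" maps distribution names to their parsed
    # metadata dicts; neither this helper nor that mapping exists today.
    def find_expectation_conflicts(candidate, installed):
        """Compare a candidate's "python.expects" entries against the
        metadata published by distributions already in the environment."""
        expects = candidate.get("extensions", {}).get("python.expects", {})
        conflicts = []
        for dist_name, expected in expects.items():
            if dist_name not in installed:
                continue  # nothing installed yet, so nothing to conflict with
            published = installed[dist_name].get("extensions", {})
            for ext_name, expected_fields in expected.get("extensions", {}).items():
                actual_fields = published.get(ext_name, {})
                for field, expected_value in expected_fields.items():
                    actual_value = actual_fields.get(field)
                    if actual_value != expected_value:
                        conflicts.append(
                            (dist_name, ext_name, field, expected_value, actual_value))
        return conflicts

With the metadata above, installing the openblas-g77 scipy build into
an environment containing the intelmkl numpy build would report a
conflict on the "fortran_abi" field rather than silently succeeding.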

> And his scipy wheel can require the same. This would mean that pip
> would understand the binary dependency problems during dependency
> resolution and could reject an incompatible wheel at install time, as
> well as being able to find a compatible wheel automatically if one
> exists on the server. Unlike with hash-based dependencies, we can see
> that it is possible to depend on the numpy C-API without necessarily
> depending on any particular BLAS/LAPACK library and Fortran compiler
> combination.

I like the general idea of being able to detect conflicts through the
published metadata, but would like to use the extension mechanism to
avoid name conflicts.

> The confusing part would be that then a built wheel doesn't Provide
> the same thing as the corresponding sdist. How would anyone know what
> would be Provided by an sdist without first building it into a wheel?
> Would there need to be a way for pip to tell the sdist what pip wants
> it to Provide when building it?

I think that's a separate (harder) problem, but one the expectation
approach potentially solves, since we'd just have to pass the
expectations for a distribution through to its build process (and
individual distributions would have full control over defining which
expectations influence the build, most likely through custom
extensions).
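
Purely as an illustration of that last point (the hook name and its
signature below are invented for the example, not part of any defined
build interface), the build side might end up looking something like:

    # Hypothetical only: the idea is just that the caller hands its
    # expectations to the sdist's build code, which decides which of
    # them actually affect the build configuration.
    def build_with_expectations(expectations):
        """Build a wheel, honouring any expectations the caller passed in."""
        requested_abi = (
            expectations.get("numpy.compatibility", {}).get("fortran_abi")
            or "openblas-g77")  # distribution-chosen default
        # ... configure BLAS/LAPACK for requested_abi and run the build ...
        # and publish the matching metadata in the built wheel:
        return {
            "extensions": {
                "numpy.compatibility": {
                    "api_version": 1,
                    "fortran_abi": requested_abi,
                }
            }
        }

The consumer only states what it expects; the distribution being built
stays in charge of deciding how to interpret that.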

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

