(yet another) PEP idea to tackle binary wheels problem efficiently

Alexander Revin lyssdod at gmail.com
Sat Feb 16 15:29:03 EST 2019


Hi all,

I've been thoroughly reading various discussions, such as [1], [2] and
related ones regarding PEP 425, PEP 491 and PEP 513. I also happen to
use musl sometimes, so as discussed here [3] I thought it would be a
good idea to submit a new PEP regarding musl compatibility.

It's no secret that everyday wheel usage on Linux is far from
perfect. Some packages try to compile when no compiler is available
on the system, some rely on third-party dependencies and explode when
they cannot find the required headers, and so on. Add to this
cross-compilation support (quite complicated) and distros like Alpine,
or anything not running on x86 (how many piwheels.org-like sites
should emerge, one for each non-x86 platform?). For example, I would
like to seamlessly install numpy, pandas and scikit on a ppc machine
running Gentoo musl without wasting an hour on compilation, or "just"
use them in a standard x86 Alpine-based Docker image (basically what
[3] is all about).

Long story short, the current wheel filename format:

{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl

does not properly express what a package expects. Let's take a look at
the pandas wheels ([4]):

pandas-0.24.1-cp36-cp36m-manylinux1_x86_64.whl
pandas-0.24.1-cp36-cp36m-win_amd64.whl
pandas-0.24.1-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl

I see issues with each of them:
1. The first one won't work on Alpine or any other musl-based distro;
2. The second: how is amd64 different from x86_64?
3. The third's platform tag is just a nightmare.

Moreover, if you open the last one and inspect one of the
libraries, you'll find that:
$ file _libs/algos.cpython-36m-darwin.so
_libs/algos.cpython-36m-darwin.so: Mach-O universal binary with 2 architectures: [i386:Mach-O bundle i386] [x86_64]
_libs/algos.cpython-36m-darwin.so (for architecture i386): Mach-O bundle i386
_libs/algos.cpython-36m-darwin.so (for architecture x86_64): Mach-O 64-bit bundle x86_64

It's a universal library! So it is not x86_64 only, as mentioned in the
quite long macosx_10_various platform tag.
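
To make the fields of the current format concrete, here is a minimal
sketch that splits a wheel filename on "-" and expands the "."-separated
compressed tag sets described in PEP 425. The names WheelName and
parse_wheel_filename are made up for illustration and are not part of
any packaging library.

```python
from typing import List, NamedTuple, Optional


class WheelName(NamedTuple):
    """Fields of a wheel filename (illustrative type, not a real API)."""
    distribution: str
    version: str
    build: Optional[str]
    python_tags: List[str]
    abi_tags: List[str]
    platform_tags: List[str]


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into its dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between version and python tag.
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    # Each tag field may be a compressed set of tags joined by ".".
    return WheelName(dist, version, build,
                     py.split("."), abi.split("."), plat.split("."))


if __name__ == "__main__":
    name = ("pandas-0.24.1-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel."
            "macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl")
    for tag in parse_wheel_filename(name).platform_tags:
        print(tag)
```

Run on the third pandas wheel, this lists five separate platform tags
packed into one filename.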

TL;DR: What's my solution? To use something widely called a "Target
Triplet" [5], omitting the usual "vendor" field, so that
{platform tag} from PEP 425 will have the format <arch>_<os>[_<env>]:

pandas-0.24.1-cp36-cp36m-x86_64_linux_gnu.whl
pandas-0.24.1-cp36-cp36m-x86_64_linux_musl.whl
pandas-0.24.1-cp36-cp36m-x86_windows.whl
pandas-0.24.1-cp36-cp36m-x86_64_windows_msvs2010.whl
pandas-0.24.1-cp36-cp36m-x86_macosx_10_6.whl <-- won't be used for
anything Mojave+, see [6]
pandas-0.24.1-cp36-cp36m-aarch64_netbsd.whl
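
A rough sketch of how such a tag could be split back into its parts.
Because "x86_64" itself contains an underscore, the sketch assumes a
fixed list of architecture names (KNOWN_ARCHES, a hypothetical list
that would have to be defined by the PEP). It also treats everything
after the os name as <env>, so the "10_6" in the macosx example ends up
there; whether that is the right reading is an open question.

```python
from typing import NamedTuple, Optional

# Hypothetical list of architecture names; a real list would come from the PEP.
KNOWN_ARCHES = ("x86_64", "aarch64", "ppc", "x86")


class PlatformTag(NamedTuple):
    """Parts of an <arch>_<os>[_<env>] tag (illustrative type)."""
    arch: str
    os: str
    env: Optional[str]


def parse_platform_tag(tag: str) -> PlatformTag:
    """Split a proposed <arch>_<os>[_<env>] platform tag into its parts."""
    # Try longer names first so "x86_64" is not mistaken for "x86".
    for arch in sorted(KNOWN_ARCHES, key=len, reverse=True):
        if tag.startswith(arch + "_"):
            rest = tag[len(arch) + 1:]
            # Assumption: the os name has no underscore; the rest is env.
            os_name, _, env = rest.partition("_")
            if not os_name:
                break
            return PlatformTag(arch, os_name, env or None)
    raise ValueError(f"unrecognised platform tag: {tag!r}")


if __name__ == "__main__":
    for example in ("x86_64_linux_gnu", "x86_64_linux_musl", "x86_windows",
                    "x86_64_windows_msvs2010", "x86_macosx_10_6",
                    "aarch64_netbsd"):
        print(example, "->", parse_platform_tag(example))
```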

Main points here:
- Arch and os are specified more consistently;
- Environment is specified only when needed;
- Many combinations of os and env are possible :)
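
As an illustration of the first two points, an installer could derive
the tag of the running system roughly like this. It is only a sketch:
ARCH_ALIASES, OS_ALIASES and both function names are made up, and the
alias tables are assumptions rather than anything agreed upon. Note
that platform.libc_ver() may fail to identify non-glibc libraries, in
which case this sketch simply leaves <env> out.

```python
import platform

# Hypothetical alias tables: the same CPU or OS is reported under
# different names depending on the system.
ARCH_ALIASES = {"amd64": "x86_64", "arm64": "aarch64",
                "i386": "x86", "i686": "x86"}
OS_ALIASES = {"darwin": "macosx"}
LIBC_ALIASES = {"glibc": "gnu"}


def build_platform_tag(machine: str, system: str, libc: str = "") -> str:
    """Build an <arch>_<os>[_<env>] tag from raw system identifiers."""
    arch = ARCH_ALIASES.get(machine.lower(), machine.lower())
    os_name = OS_ALIASES.get(system.lower(), system.lower())
    env = LIBC_ALIASES.get(libc.lower(), libc.lower())
    # The environment is appended only when one is known.
    return f"{arch}_{os_name}_{env}" if env else f"{arch}_{os_name}"


def local_platform_tag() -> str:
    """Best-effort tag for the interpreter running this code."""
    libc_name, _libc_version = platform.libc_ver()
    return build_platform_tag(platform.machine(), platform.system(), libc_name)


if __name__ == "__main__":
    print(local_platform_tag())
```

With a table like this, "amd64" and "x86_64" collapse into a single
arch name, so the Windows and Linux wheels would be tagged the same way.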

Since most runtimes are hardcoded at build time anyway and change
with each Python release, explicit versioning shouldn't be a problem.

JavaScript and Rust [7] use a similar naming scheme. Though I don't
like either of them, at least the portability of their extensions looks
less broken than Python's (I've worked on a native Node.js extension in
the past).


What do you think?

Thanks,
Alex

[1] https://mail.python.org/archives/list/distutils-sig@python.org/thread/KCLRIN4PTUGZLLL7GOUM23S46ZZ2D4FU/
[2] https://github.com/pypa/packaging-problems/issues/69
[3] https://github.com/pypa/manylinux/issues/37
[4] https://pypi.org/project/pandas/#files
[5] https://wiki.osdev.org/Target_Triplet
[6] https://support.apple.com/en-us/HT208436
[7] https://doc.rust-lang.org/reference/conditional-compilation.html


