[Python-ideas] Re: Enhancing Zipapp

Andrew Barnert abarnert at yahoo.com
Wed Jan 8 13:08:42 EST 2020


On Jan 8, 2020, at 01:09, Abdur-Rahmaan Janhangeer <arj.python at gmail.com> wrote:
> 
> Using the wheel-included zip (A), we can generate another zip file (B) with
> the packages installed. That generated zip file is then executed.

But that generated zip B doesn’t have a trustable hash on it, so how can you execute it?

If you keep this all hidden inside the zipapp system, where malicious programs can’t find and modify the generated zips, then I suppose that’s fine. But at that point, why not just install the wheels inside zip A into an auto-generated only-for-zip-A venv cache directory or something, and then just run zip A as-is against that venv?
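To make the venv-cache idea concrete, here is a rough sketch, not a worked-out design. It assumes the cache is keyed by the SHA-256 of zip A's contents, so a modified zip never reuses another zip's venv. The names zip_digest, venv_cache_dir and ensure_venv, and the directory layout, are all made up for illustration; installing the bundled wheels into the venv is left out.

```python
import hashlib
import venv
from pathlib import Path


def zip_digest(zip_path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at zip_path."""
    h = hashlib.sha256()
    with open(zip_path, "rb") as f:
        # Read in chunks so a large zipapp is never loaded into memory whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def venv_cache_dir(zip_path, cache_root):
    """Hypothetical layout: one venv directory per distinct zip content."""
    return Path(cache_root) / zip_digest(zip_path)


def ensure_venv(zip_path, cache_root):
    """Create the per-zip venv on first run and reuse it afterwards."""
    target = venv_cache_dir(zip_path, cache_root)
    if not target.exists():
        # Assumption: a plain stdlib venv with pip is enough for the app.
        venv.EnvBuilder(with_pip=True).create(target)
    return target
```

Keying on the content hash rather than the file name means that renaming zip A keeps its cache, while editing zip A forces a fresh venv.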

> Zip format A solves the problem of cross-platforming.
> Normal solutions up to now use something like solution B, where you can't
> share your zips across OSes. 

You can still only share zips across OSs if you bundle in a wheel for each extension library for every possible platform. For in-house deployments where you only care about two platforms (your dev boxes and your deployment cluster boxes), that’s fine, but for a publicly released app that’s supposed to work “everywhere”, you pretty much have to download and redistribute every wheel on PyPI for every dependency, which could make your app pretty big, and require pretty frequent updates, and it still only lets you run on systems that have wheels for all your dependencies.
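As a rough illustration of the coverage problem, a builder could at least report which platforms the bundled wheels cover by reading the tags out of the wheel file names. This is only a sketch: wheel_tags and platforms_covered are hypothetical helpers, the file names in the usage example are just examples, and a real tool would more likely rely on an existing tag-handling library than on string splitting.

```python
def wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) from a wheel file name.

    Wheel names look like name-version[-build]-python-abi-platform.whl,
    so the last three dash-separated fields are the compatibility tags.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel file name: %r" % filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def platforms_covered(filenames):
    """Collect every platform tag named by the given wheel file names."""
    platforms = set()
    for name in filenames:
        # A tag field may hold several dot-separated tags.
        platforms.update(wheel_tags(name)[2].split("."))
    return platforms
```

For example, platforms_covered(["numpy-1.18.0-cp38-cp38-win_amd64.whl", "six-1.13.0-py2.py3-none-any.whl"]) gives {"win_amd64", "any"}, which makes it obvious that the numpy wheel in that bundle only helps 64-bit Windows users.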

If you’re already doing an effective “install” step in building zip B out of zip A, why not make that step just use a requirements file and download the dependencies from PyPI? You could still run zip B without being online, just not zip A.

Maybe you could optionally include wheels and they’d serve as a micro-repo sitting in front of PyPI, so when your dependencies are small you can distribute a version that works for 95% of your potential users without needing to do anything fancy, but it still works for the other 5% if they can reach PyPI.
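One way the micro-repo idea might look in practice is to point pip at the bundled wheel directory with --find-links while leaving the normal index enabled, so anything not bundled can still come from PyPI. This is a sketch under assumptions: pip_install_args is a hypothetical helper, and the requirements file, wheel directory and target directory are whatever the zipapp runner would choose.

```python
import sys


def pip_install_args(requirements, wheel_dir, target):
    """Build a pip command line that also looks in the bundled wheels.

    --find-links adds wheel_dir as an extra place to look for packages.
    The default index stays enabled, so anything missing from wheel_dir
    can still be fetched if the machine can reach PyPI.
    """
    return [
        sys.executable, "-m", "pip", "install",
        "--find-links", str(wheel_dir),
        "--requirement", str(requirements),
        "--target", str(target),
    ]
```

The resulting list could be handed to subprocess.run(). Adding --no-index would turn this into the fully offline variant, which only works when every needed wheel is bundled.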

(But maybe it would be simpler to just use the zip B as a cache in the first place. If I download Spam.zipapp for Win64 3.9, that’s a popular enough platform that you probably have a zip B version ready to go and just ship me that, so it works immediately. Now, if I copy that file to my Mac instead of downloading it fresh, oops, wrong wheels, so it downloads the right ones off PyPI and builds a new zipapp for my platform—and it still runs, it just takes a bit longer the first time. I’m not sure this is a good idea, but I’m not sure trying to include every wheel for every platform is a good idea either…)

But there’s a bigger problem than just distribution. Some extension modules are only extension modules for speed, like numpy. But many are there to interface with C libraries. If my app depends on PortAudio, distributing the extension module as wheels is easy, but it doesn’t do any good unless you have the C library installed and configured on your system. Which you probably don’t if you’re on Windows or Mac. A package manager like Homebrew or Choco can take care of that by just making my app’s package depend on the PortAudio package (and maybe even conda can?), but I don’t see how zipapps with wheels in, or anything else self-contained, can. And if most such packages eventually migrate to binding from Python (using cffi or ctypes) rather than from C (using an extension module), that actually makes your problem harder rather than easier, because now you can’t even tell from outside the code that there are external dependencies; you can distribute a single zipapp that works everywhere, but only in the sense that it starts running and quickly fails with an exception for most users.
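About the best a self-contained zipapp can do on its own is fail early and clearly. A minimal sketch, assuming the app knows the names of the C libraries it binds to: missing_native_libs is a hypothetical helper, and the find parameter exists only so the check can be tried out without the library installed.

```python
import ctypes.util


def missing_native_libs(names, find=ctypes.util.find_library):
    """Return the library names that could not be located on this system.

    find_library returns None when it cannot locate a shared library,
    so a None result is treated as "not installed".
    """
    return [name for name in names if find(name) is None]


def check_native_deps(names):
    """Exit with a readable message instead of failing deep inside the app."""
    missing = missing_native_libs(names)
    if missing:
        raise SystemExit(
            "Missing native libraries: " + ", ".join(missing)
            + ". Install them with your system package manager."
        )
```

Calling check_native_deps(["portaudio"]) at startup only turns the late exception into an early error message. It does not install anything, which is the part a package manager handles and a zipapp cannot.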



