Python, Be Bold! - The Draft

Abdur-Rahmaan Janhangeer arj.python at gmail.com
Mon Jan 6 05:21:14 EST 2020


Note: I prepared a draft based on the previous discussion, motivated by the vision of
an era where the world swarms with Python apps. This draft is not a PEP, at least
not yet. Its structure approaches that of a PEP but takes liberties as necessary. It
includes the information deemed essential. Thanks to list members for their input.

Abstract
======

This original proposal outlines 3 points to enhance app making in Python,
namely:

-- The formulation of a Python-specific single-file executable (Archive)
-- Better integration of the VM with the OS
-- Features to be supported by the native GUI library

This proposal aims to boost the dissemination of Python apps
and to further promote Python as the language of choice for developing
apps. It asks what can be improved beyond performance. Since
the previous thread focused on the archive, this draft focuses on it too.

Aim Expansion
============

One area where some difficulty remains in Python is packaging for
end-user consumption. At present, either the code is distributed in pure
Python form with installers, or native executables are built for each target
OS. Currently, by default, Python does not provide such utilities. This
proposal aims at finalising a Python-specific archive as the default VM
executable.

To further support the above, the proposal explores the impact on the
interpreter: how it should be modified. The modifications explored don't
touch the parser in any way. Instead of discussing how to better integrate
the official distribution with the system, this draft discusses only
archive-specific modifications.

A further push to choose Python might come from better GUI options, so that
apps become more powerful and more beautiful, with even faster development
time. The proposal aimed to show the limitations of the current GUI
option (tkinter) and which features could put app-making with the default
utilities a class above. That topic, however, is bulky enough to be discussed
in a different draft.

In the light of the above, this draft's topic will be:

Python-specific Executable Archive.


Python-specific Executable Archive
===========================

Inspiration
--------------

Java has a file format called .jar which allows bundled programs to be run
with a simple click.

Defining Executable
---------------------------

Before we begin, we'd like to define the term executable as used in the context
of this draft. It means an archive that is run by double-clicking, which is made
possible by file association. The closest existing example is the .pyz file format.
Although .jar files, for example, are not native executables, they are nonetheless
commonly referred to as executables, as in titles like this one:
How to create an executable jar in java
<https://www.programmergate.com/create-executable-jar-java/>.

Closest Python Implementation
-----------------------------------------

Python already has an archive bundling module called zipapp, which allows a
project to be bundled as a zip archive that has __main__.py as its entry point.
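
As a minimal sketch of what this looks like in practice, the snippet below
bundles a one-file project with zipapp.create_archive and runs the result.
The folder name "myapp" and the helper build_and_run are made up for
illustration; only the zipapp call itself comes from the standard library.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_and_run(message: str) -> str:
    """Bundle a one-file project into a .pyz archive and run it."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "myapp"  # hypothetical project folder
        project.mkdir()
        # __main__.py is the file the interpreter runs when given the archive.
        (project / "__main__.py").write_text(f"print({message!r})\n")

        archive = Path(tmp) / "myapp.pyz"
        zipapp.create_archive(project, target=archive)

        # Same as typing `python myapp.pyz` at the command line.
        result = subprocess.run(
            [sys.executable, str(archive)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(build_and_run("hello from a zip archive"))
```

Double-click execution then only depends on the .pyz extension being
associated with the interpreter on the target system.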

Advantages of zip-executables
-----------------------------------------

PEP441 <https://docs.python.org/3/library/zipapp.html> states one of the
advantages as: "These archives provide a great way to publish software that
needs to be distributed as a single file script but is complex enough to need
to be written as a collection of modules."

Oracle Docs
<https://docs.oracle.com/javase/tutorial/deployment/jar/basicsindex.html>
state the advantage of archive executables in the following terms:

"JAR files are packaged with the ZIP file format, so you can use them for tasks
such as lossless data compression, archiving, decompression, and archive unpacking.
These tasks are among the most common uses of JAR files, and you can
realize many JAR file benefits using only these basic features.

Even if you want to take advantage of advanced functionality provided by the
JAR file format such as electronic signing ..."

Other advantages include:
- Portability
- Security
- Versioning
- Dependency freezing

These are discussed alongside the specific examples below.

How can a .jar help?
----------------------------

Both Java and Python have a VM and bytecode, and both are labelled as
cross-platform languages. Java has the .jar format for distribution.
The .jar file gave rise to many formats, such as the .apk.


Brief
====

This draft proposes to alter zipapp with enhanced security and versioning,
among other features. The format can keep the .pyz extension or choose another one.


Existing solutions
==============

In this section we give an overview of existing file formats.

1) .jar
---------

An official intro
<https://docs.oracle.com/javase/8/docs/technotes/guides/jar/jar.html> runs
like this: "JAR file is a file format based on the popular ZIP
file format and is used for aggregating many files into one. A JAR file is
essentially a zip file that contains an optional META-INF directory. A JAR
file can be created by the command-line jar tool, or by using the
java.util.jar API in the Java platform. There is no restriction on the name
of a JAR file, it can be any legal file name on a particular platform.

In many cases, JAR files are not just simple archives of java classes files
and/or resources. They are used as building blocks for applications and extensions.
The META-INF directory, if it exists, is used to store package and extension
configuration data, including security, versioning, extension and services."

It follows that .jar is used not only for program bundling but also for
packaging modules.

- META-INF/MANIFEST.MF

The manifest file is a file containing meta information. A simple one
generated might look like this:

```
Manifest-Version: 1.0
Created-By: 11.0.3 (AdoptOpenJDK)
```
An advanced one adds many more things, such as author name, corporation name,
build tool name, etc.

The idea of a manifest file was further exploited by the .apk format.
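
To give a feel for how a Python tool could read such "Name: value" metadata,
here is a rough sketch that parses the simple manifest shown above with the
standard library's email header parser. This is an approximation chosen for
brevity: real JAR manifests have their own line-length and continuation rules
that this does not try to reproduce, and parse_manifest is a hypothetical
helper name.

```python
from email.parser import Parser

# The simple manifest from the example above.
MANIFEST_TEXT = """\
Manifest-Version: 1.0
Created-By: 11.0.3 (AdoptOpenJDK)
"""


def parse_manifest(text: str) -> dict:
    """Parse the main section of a manifest-style text into a dict.

    Assumption: the text only uses plain "Name: value" lines, which the
    email header parser happens to understand.
    """
    headers = Parser().parsestr(text, headersonly=True)
    return dict(headers.items())


if __name__ == "__main__":
    for key, value in parse_manifest(MANIFEST_TEXT).items():
        print(f"{key} -> {value}")
```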

- Digital Signature

Modified from here
<https://docs.oracle.com/javase/tutorial/deployment/jar/intro.html>,
signing a JAR file means: "When the JAR file is signed, you
also have the option of time stamping the signature. Similar to putting a date on
a paper document, time stamping the signature identifies when the JAR file was
signed. The time stamp can be used to verify that the certificate used
to sign the JAR file was valid at the time of signing. ... If you have downloaded
some code that's signed by a trusted entity, you can use that fact as a criterion
in deciding which security permissions to assign to the code. ... The Java
platform enables signing and verification by using special numbers called public
and private keys. Public keys and private keys come in pairs, and they play
complementary roles. The private key is the electronic "pen" with which you can
sign a file. As its name implies, your private key is known only to you so that no
one else can "forge" your signature. A file signed with your private key can
be verified only by the corresponding public key. ... One more element,
therefore, is required to make signing and verification work. That additional
element is the certificate that the signer includes in a signed JAR file. A
certificate is a digitally signed statement from a recognized certification
authority that indicates who owns a particular public key. Certification
authorities are entities (typically firms specializing in digital security) that are
trusted throughout the industry to sign and issue certificates for keys and
their owners. In the case of signed JAR files, the certificate indicates who
owns the public key contained in the JAR file.

When you sign a JAR file your public key is placed inside the archive along
with an associated certificate so that it's easily available for use by anyone
wanting to verify your signature.

To summarize digital signing:

- The signer signs the JAR file using a private key.
- The corresponding public key is placed in the JAR file, together with its
certificate, so that it is available for use by anyone who wants to verify the signature."

- Signature Files

When you sign a JAR file, each file in the archive is given a digest entry in the
archive's manifest. Here's an example of what such an entry might look like:

Name: test/classes/ClassOne.class
SHA1-Digest: TD1GZt8G11dXY2p4olSZPc5Rj64=

The digest values are hashes or encoded representations of the contents of the
files as they were at the time of signing. A file's digest will change if
and only if the file itself changes.
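
For a Python archive, an entry of the same shape could be produced with
hashlib and base64, as in the sketch below. The function name manifest_entry
and the SHA-256 default are assumptions of this example (the JAR sample above
uses SHA-1, which is no longer considered collision-resistant); the draft
leaves the choice of hash open.

```python
import base64
import hashlib


def manifest_entry(name: str, data: bytes, algorithm: str = "sha256") -> str:
    """Return a two-line "Name / Digest" entry for one archive member.

    The layout imitates the JAR example above; it is not a format that
    Python or zipapp defines.
    """
    digest = hashlib.new(algorithm, data).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return f"Name: {name}\n{algorithm.upper()}-Digest: {encoded}\n"


if __name__ == "__main__":
    original = manifest_entry("app/main.py", b"print('hello')\n")
    modified = manifest_entry("app/main.py", b"print('HELLO')\n")
    print(original)
    # Any change in the content changes the digest line.
    print("entries differ:", original != modified)
```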

When a JAR file is signed, a signature file is automatically generated and placed
in the JAR file's META-INF directory, the same directory that contains the
archive's manifest. Signature files have filenames with an .SF extension. Here is
an example of the contents of a signature file:

Signature-Version: 1.0
SHA1-Digest-Manifest: h1yS+K9T7DyHtZrtI+LxvgqaMYM=
Created-By: 1.7.0_06 (Oracle Corporation)

Name: test/classes/ClassOne.class
SHA1-Digest: fcav7ShIG6i86xPepmitOVo4vWY=

Name: test/classes/ClassTwo.class
SHA1-Digest: xrQem9snnPhLySDiZyclMlsFdtM=

Name: test/images/ImageOne.gif
SHA1-Digest: kdHbE7kL9ZHLgK7akHttYV4XIa0=

Name: test/images/ImageTwo.gif
SHA1-Digest: mF0D5zpk68R4oaxEqoS9Q7nhm60=

- Signature Block File

In addition to the signature file, a signature block file is automatically placed in
the META-INF directory when a JAR file is signed. Unlike the manifest file or the
signature file, signature block files are not human-readable.
The signature block file contains two elements essential for verification:

- The digital signature for the JAR file that was generated with the signer's private key
- The certificate containing the signer's public key, to be used by anyone wanting
to verify the signed JAR file

Signature block filenames typically will have a .DSA extension indicating that
they were created by the default Digital Signature Algorithm. Other filename
extensions are possible if keys associated with some other standard algorithm
are used for signing.

2) Android
--------------

The .apk is much like .jar; its whole content can be found here
<https://en.wikipedia.org/wiki/Android_application_package>. Its manifest file
includes the interesting concept of permissions.

Android has .apk for apps and .aar for modules.

The developer's website states:
"An Android library is structurally the same as an Android app module. It can
include everything needed to build an app, including source code, resource
files, and an Android manifest. However, instead of compiling into an APK that
runs on a device, an Android library compiles into an Android Archive (AAR) file
that you can use as a dependency for an Android app module. Unlike JAR files,
AAR files can contain Android resources and a manifest file, which allows you
to bundle in shared resources like layouts and drawables in addition to Java
classes and methods.

A library module is useful in the following situations:

- When you're building multiple apps that use some of the same components,
such as activities, services, or UI layouts.
- When you're building an app that exists in multiple APK variations, such as a
free and paid version and you need the same core components in both."

- Signing

Since we already covered .jar signing in this mail, we'll include only relevant
links. Android has 3 signing schemes: one based on .jar
<https://source.android.com/security/apksigning>, scheme v2
<https://source.android.com/security/apksigning/v2>, and v3
<https://source.android.com/security/apksigning/v3>.

3rd Party Packages
===============

Before we go on to discuss recommended specifications for our solution, we'll
discuss a bit about including 3rd party modules in the folder itself.

One of the strengths of Python is the extensive range of 3rd party modules
available and the swift addition of new libraries.

A normal Python distribution includes 3rd party packages under
Lib/site-packages/

Virtual environments include 3rd party packages under
env-name/Lib/site-packages/

For the archive too, the proposal is to include 3rd party packages under a
site-packages folder.

It can be created at bundling time, with the packages included.
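
One way this could work with today's zipapp, sketched under assumptions: the
archive's __main__.py puts the bundled site-packages folder on sys.path before
importing anything from it. The module greeter and the helper build_and_run
are invented for the example, and this only covers pure-Python packages,
since zipimport cannot load C extension modules from inside an archive.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path

# Source of the archive's __main__.py (written into the project below).
MAIN_SOURCE = '''\
import os
import sys

# __file__ is "<archive>/__main__.py", so its directory is the archive itself.
archive_root = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, os.path.join(archive_root, "site-packages"))

import greeter  # resolved from <archive>/site-packages/greeter.py
print(greeter.greet())
'''


def build_and_run() -> str:
    """Bundle a project with an inner site-packages folder and run it."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"
        deps = project / "site-packages"
        deps.mkdir(parents=True)
        # Stand-in for a 3rd party package (hypothetical module).
        (deps / "greeter.py").write_text(
            "def greet():\n    return 'hello from site-packages'\n"
        )
        (project / "__main__.py").write_text(MAIN_SOURCE)

        archive = Path(tmp) / "project.pyz"
        zipapp.create_archive(project, target=archive)
        result = subprocess.run(
            [sys.executable, str(archive)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(build_and_run())
```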


Dealing with 3rd party C-based packages
================================

The above handling of 3rd party packages would have sufficed if Python always
executed pure-Python code. However, Python provides an effective way of
speeding up apps by writing C code, which poses the problem of OS-specific files.

Instead of including the libs themselves, we can include wheels:

project/
    wheels/ # included upon building of archive
    site-packages/ # included when running app

Those are only ideas
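
To make the wheels/ to site-packages/ idea slightly more concrete, here is a
rough sketch of the unpacking step at run time. It relies only on the fact
that a wheel is a zip archive; everything a real installer such as pip does
on top of that (choosing the wheel matching the platform, metadata checks,
script generation) is left out. The file demo_pkg-1.0-py3-none-any.whl built
in the demo is a stand-in zip with a wheel-like name, not a valid wheel.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def unpack_wheels(wheels_dir: Path, site_packages: Path) -> list:
    """Unpack every .whl in wheels_dir into site_packages; return their names."""
    site_packages.mkdir(parents=True, exist_ok=True)
    unpacked = []
    for wheel in sorted(wheels_dir.glob("*.whl")):
        with zipfile.ZipFile(wheel) as zf:
            zf.extractall(site_packages)
        unpacked.append(wheel.name)
    return unpacked


def demo() -> int:
    """Build a stand-in wheel, unpack it, and import the module it holds."""
    with tempfile.TemporaryDirectory() as tmp:
        wheels = Path(tmp) / "wheels"
        wheels.mkdir()
        fake_wheel = wheels / "demo_pkg-1.0-py3-none-any.whl"
        with zipfile.ZipFile(fake_wheel, "w") as zf:
            zf.writestr("demo_pkg.py", "VALUE = 42\n")

        target = Path(tmp) / "site-packages"
        unpack_wheels(wheels, target)

        sys.path.insert(0, str(target))
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("demo_pkg")
            return module.VALUE
        finally:
            sys.path.remove(str(target))


if __name__ == "__main__":
    print(demo())
```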

Zipapp as a practical solution for 3rd party bundling
========================================

Here <https://gist.github.com/lukassup/cf289fdd39124d5394513a169206631c>
user lukassup shows a way of bundling and running a Flask app via
zipapp. He did not install packages in a venv but rather in a separate
folder, which for our purposes can be named site-packages.
The final structure looks like this:

├── app/
│   ├── click/
│   ├── flask/
│   ├── gunicorn/
│   ├── jinja2/
│   ├── markupsafe/
│   ├── werkzeug/
│   ├── itsdangerous.py
│   └── requirements.txt
├── flaskr/
│   ├── flaskr/
│   ├── flaskr.egg-info/
│   ├── venv/
│   ├── LICENSE
│   ├── MANIFEST.in
│   ├── README.rst
│   ├── requirements.txt
│   └── setup.py
└── venv/
    ├── bin/
    ├── include/
    ├── lib/
    └── pip-selfcheck.json


Entry points
=========

An entry point tells the runtime which file to launch on archive execution.

For .jar it can be specified at the command line or in the manifest:
Main-Class: MyPackage.MyClass

PyInstaller generates a .spec file based on your entry point file.

Zipapp uses a __main__.py file, or you can set your entry point via the command line:
$ python3 -m zipapp APP_DIR -m ENTRYPOINT_MODULE:ENTRYPOINT_FUNCTION -p PYTHON_INTERPRETER

pynsist lets you specify an entry point in your installer.cfg file.
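
The zipapp command line has a programmatic equivalent, sketched here: the
main and interpreter arguments of zipapp.create_archive correspond to the -m
and -p options. The module cli and its function run are placeholders for
this example. Note that zipapp refuses main= when the source folder already
contains a __main__.py, because it generates that file itself.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_with_entry_point() -> str:
    """Create an archive whose entry point is cli:run and execute it."""
    with tempfile.TemporaryDirectory() as tmp:
        app_dir = Path(tmp) / "app"
        app_dir.mkdir()
        # Placeholder module holding the entry point function.
        (app_dir / "cli.py").write_text(
            "def run():\n    print('started via cli:run')\n"
        )

        archive = Path(tmp) / "app.pyz"
        zipapp.create_archive(
            app_dir,
            target=archive,
            main="cli:run",                      # same as -m cli:run
            interpreter="/usr/bin/env python3",  # same as -p; written as a shebang line
        )
        result = subprocess.run(
            [sys.executable, str(archive)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(build_with_entry_point())
```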


Now, after going through .jar, Android and zipapp, we can formulate some
specifications.

Specifications for a Python-specific executable
====================================

- Manifest

A user manifest, where the user includes relevant archiving info.
A generated manifest, with info added while archiving, such as signing info.

- Signing

To check whether file contents are the same as at the time of archiving.

- Entry point

Which file to launch.

- Bundling 3rd party packages

Including wheels seems to be the answer.

- Exec mode

Do temporary files offer advantages over in-memory execution? Can they be
an advantage for 3rd party packages?
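
As a starting point for discussion, the content check described under the
signing item could look like the sketch below: digests are recorded in a
generated manifest at archiving time and compared again before running. The
member name MANIFEST.json, the JSON layout and the use of SHA-256 are all
assumptions of this example, not part of any existing format. It also stops
short of real signing: without a signature over the manifest made with a
private key, anyone who can edit the archive can edit the manifest too.

```python
import hashlib
import io
import json
import zipfile

MANIFEST_NAME = "MANIFEST.json"  # hypothetical name for the generated manifest


def add_manifest(members: dict) -> bytes:
    """Return zip bytes holding the members plus a manifest of their digests."""
    digests = {
        name: hashlib.sha256(data).hexdigest() for name, data in members.items()
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name, data in members.items():
            zf.writestr(name, data)
        zf.writestr(MANIFEST_NAME, json.dumps(digests, indent=2))
    return buffer.getvalue()


def verify(archive_bytes: bytes) -> list:
    """Return the names of members that do not match the manifest."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        digests = json.loads(zf.read(MANIFEST_NAME))
        names = set(zf.namelist()) - {MANIFEST_NAME}
        mismatched = []
        for name in sorted(names | set(digests)):
            if name not in names:
                mismatched.append(name)  # listed in the manifest but missing
            elif digests.get(name) != hashlib.sha256(zf.read(name)).hexdigest():
                mismatched.append(name)  # changed, or never listed
    return mismatched


if __name__ == "__main__":
    archive = add_manifest({"__main__.py": b"print('hi')\n"})
    print("mismatched members:", verify(archive))
```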

This draft does not propose a fixed solution but expects proposals from the
community (such as which hash to use) or corrections of wrong assumptions.

Yours,

Abdur-Rahmaan Janhangeer
pythonmembers.club | github
Mauritius

