[Distutils] [Python-Dev] accept the wheel PEPs 425, 426, 427

Ronald Oussoren ronaldoussoren at mac.com
Fri Oct 26 09:54:00 CEST 2012


On 24 Oct, 2012, at 14:48, Daniel Holth <dholth at gmail.com> wrote:

> On Wed, Oct 24, 2012 at 7:28 AM, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
>> 
>> On 18 Oct, 2012, at 19:29, Daniel Holth <dholth at gmail.com> wrote:
>> 
>>> I'd like to submit the Wheel PEPs 425 (filename metadata), 426
>>> (Metadata 1.3), and 427 (wheel itself) for acceptance. The format has
>>> been stable since May and we are preparing a patch to support it in
>>> pip, but we need to earn consensus before including it in the most
>>> widely used installer.
>> 
>> PEP 427:
>> 
>> * The installation section mentions that .py files should be compiled to .pyc/.pyo files, and that "Uninstallers should be smart enough to remove .pyc even if it is not mentioned in RECORD."
>> 
>>   Wouldn't it be better to add the compiled files to the RECORD file? That would break the digital signature, but I'm not sure if verifying the signature post-installation is useful (or if it's even intended to work).
> 
> The trouble with mentioning .pyc files in RECORD is that someone can
> install Python 3.4, and suddenly you have additional .pyc files,
> approximately __pycache__/pyfile.cp34.pyc. So you should remove more
> than what you installed anyway.
> 
> You can't verify the signature post-installation. #!python and RECORD
> have been rewritten at this point.
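To illustrate the point about removing more than what was installed, here is a rough sketch of how an uninstaller might find bytecode files that RECORD never listed. The helper name compiled_siblings and the file names in the demo are made up for illustration; this is not what pip or the PEP prescribes, just one way to match both the old same-directory .pyc/.pyo files and anything under __pycache__ regardless of which interpreter wrote it.

```python
import os
import tempfile


def compiled_siblings(py_path):
    """Return existing bytecode files that belong to the module at py_path.

    Hypothetical helper: covers legacy same-directory .pyc/.pyo files and
    any __pycache__/<module>.<tag>.pyc file, whatever the interpreter tag.
    """
    directory, filename = os.path.split(py_path)
    stem = os.path.splitext(filename)[0]
    found = []
    # Legacy layout: pyfile.py -> pyfile.pyc / pyfile.pyo next to the source.
    for suffix in ("c", "o"):
        legacy = py_path + suffix
        if os.path.exists(legacy):
            found.append(legacy)
    # __pycache__ layout: one file per interpreter that imported the module.
    cache_dir = os.path.join(directory, "__pycache__")
    if os.path.isdir(cache_dir):
        for name in sorted(os.listdir(cache_dir)):
            if name.startswith(stem + ".") and name.endswith((".pyc", ".pyo")):
                found.append(os.path.join(cache_dir, name))
    return found


# Demo with invented file names: two interpreters have compiled pyfile.py.
with tempfile.TemporaryDirectory() as root:
    module = os.path.join(root, "pyfile.py")
    cache = os.path.join(root, "__pycache__")
    os.mkdir(cache)
    created = [
        module,
        module + "c",
        os.path.join(cache, "pyfile.cpython-33.pyc"),
        os.path.join(cache, "pyfile.cpython-34.pyc"),
        os.path.join(cache, "other.cpython-34.pyc"),
    ]
    for path in created:
        open(path, "w").close()
    leftovers = [os.path.relpath(p, root) for p in compiled_siblings(module)]

print(leftovers)
```

The bytecode for other.py is left alone because it belongs to a different module.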
> 
>> * Why is urlsafe_b64encode_nopad used to encode the hash in the record file, instead of the normal hex encoding that's directly supported by the hash module and system tools?
> 
> It's nice and small. The encoder is just
> base64.urlsafe_b64encode(digest).rstrip(b'=')

But is the size difference really important? The wheel file itself is compressed, and the additional amount of space needed on installation shouldn't be a problem.  The advantage of using hexdigest is that both the "classic" MD5 checksum and the new tagged checksums you propose then use the same encoding for the signature.
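For a sense of scale, here is a small sketch that puts the two encodings side by side for the same digest. The function name encodings and the sample input are mine, and SHA-256 is only an assumed choice of algorithm for the comparison.

```python
import base64
import hashlib


def encodings(data, algorithm="sha256"):
    """Return (hex form, unpadded urlsafe base64 form) of the same digest."""
    hasher = hashlib.new(algorithm, data)
    hex_form = hasher.hexdigest()
    # Strip the "=" padding, as in the encoder quoted above.
    b64_form = base64.urlsafe_b64encode(hasher.digest()).rstrip(b"=").decode("ascii")
    return hex_form, b64_form


hex_form, b64_form = encodings(b"example file contents")
print(len(hex_form), hex_form)
print(len(b64_form), b64_form)
```

For a 32-byte SHA-256 digest that is 64 characters of hex against 43 characters of unpadded base64, so the saving is 21 characters per RECORD line.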

> 
>> * The way to specify the required public key in package requirements is ugly (it looks like an abuse of setuptools' extras mechanism). Is there really no nicer way to specify this?
>> 
>> * As was noted before, there is no threat model for the signature feature, which makes it hard to evaluate the feature. In particular, what is the advantage of this over PGP signatures of wheels? (PyPI already supports detached signatures, and such signatures are used more widely in the OSS world.)
>> 
>> * RECORD.p7s is not described at all. I'm assuming this is intended to be an X.509 signature of RECORD in PKCS7 format. Why PKCS7 and not PEM? The latter seems to be easier to work with.
> 
> I am very confused about the idea that
> not-downloading-the-archive-you-expected (pypi accounts getting
> hacked, man-in-the-middle attacks, simply using the wrong index) is an
> unrealistic threat.

The PEP doesn't mention the threats you are trying to protect against. Users are still somewhat vulnerable to the attacks you mention when they download new software, because they still have to get the public key from somewhere. In the example of a requirements.txt file with public keys, I'd still have to get that file from somewhere, and maybe that location was attacked.

> 
> It might help to think of the wheel signing scheme as a more powerful
> version of the current #md5=digest instead of comparing it to PGP or
> TLS. An md5 sum verifies the integrity of a single archive, the wheel
> signing key verifies the integrity of any number of archives. Like the
> archive digest, wheel just explains how to attach the signature to the
> archive. A system for [automatically] trusting any particular key
> could be built separately.
> 
> Wheel's signing scheme is similar to jarsigner. The big advantage over
> PGP is that they are attached and less likely to get lost. PyPI still
> supports detached signatures, even on wheel files, but they are
> unpopular. Wheel gives you an additional different option.

RPM uses embedded PGP signatures, and those are easy enough to use. PGP signatures on PyPI require a PGP installation on the user's machine; your scheme at least has the advantage of not needing additional software.

> 
> Since the signature is over the unpacked contents, you can also change
> the compression algorithm in the zipfile or append another signature
> without invalidating the existing signature.
> 
> The simplified certificate model is inspired by SPKI/SDSI
> (http://world.std.com/~cme/html/spki.html), Convergence
> (http://convergence.io/), TACK (http://tack.io), and the general
> discussion about the brokenness of the certificate authority system.
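The earlier remark that the signature is over the unpacked contents can be illustrated with a short sketch: hash each member after decompression, and the result is the same whichever compression method the zip uses. The helper names and the sample file are invented, and this covers only the per-file digest step, not the signing itself.

```python
import hashlib
import io
import zipfile


def content_digests(archive_bytes):
    """Map each member name to the SHA-256 hex digest of its unpacked bytes."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return {name: hashlib.sha256(zf.read(name)).hexdigest()
                for name in sorted(zf.namelist())}


def build_archive(files, compression):
    """Build an in-memory zip from {name: bytes} with the given compression."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buffer.getvalue()


# Invented sample content, packed twice with different compression methods.
files = {"pkg/__init__.py": b"VALUE = 1\n" * 50}
stored = build_archive(files, zipfile.ZIP_STORED)
deflated = build_archive(files, zipfile.ZIP_DEFLATED)

archives_differ = stored != deflated
digests_match = content_digests(stored) == content_digests(deflated)
print(archives_differ, digests_match)
```

A digest of the whole archive, like the #md5= fragment, would differ between the two files, while the per-file digests stay the same.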

I've added these to my reading list. I know just enough of crypto/signatures to be dangerous, which might explain why I worry about something that isn't old and used a lot.

> You get the raw public key without a claim that it represents anything
> or anyone.

Similarly to the CA system :-)

> 
> PKCS7 is the format that a US government user would be required to use
> with their smartcard-based system.
> 
> I like the packagename[algorithm=key] syntax even though it started as
> a hack. It fits into the existing pip requirements.txt syntax
> perfectly, unlike packagename[extra]#algorithm=key, and it reads like
> array indexing.

I don't like the extras hack, but don't have a better solution. 
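For reference, a minimal sketch of how a tool might pull the pieces out of a packagename[algorithm=key] line. The regular expression, the function name, the allowed character sets, and the sample algorithm name and key are all assumptions for illustration; the real grammar would be whatever the PEP and pip settle on.

```python
import re

# Hypothetical pattern: name, then [algorithm=key] with a urlsafe-base64-like key.
_KEYED_REQUIREMENT = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)"
    r"\[(?P<algorithm>[A-Za-z0-9]+)=(?P<key>[A-Za-z0-9_-]+)\]$"
)


def parse_keyed_requirement(line):
    """Return (name, algorithm, key), or None if the line has no key part."""
    match = _KEYED_REQUIREMENT.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("algorithm"), match.group("key")


print(parse_keyed_requirement("example[ed25519=abc_DEF-123]"))
print(parse_keyed_requirement("example[extra]"))
```

An ordinary extra such as example[extra] contains no "=", so it is not mistaken for a key.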

Ronald

