[Catalog-sig] A modest proposal for securing PyPI with TUF

Trishank Karthik Kuppusamy tk47 at students.poly.edu
Thu Mar 14 01:11:04 CET 2013


On 03/13/2013 02:15 PM, Daniel Holth wrote:
>
> With all the different kinds of metadata, It's interesting to note
> that currently TUF seems to only be concerned with the available file
> names and their integrity. (Some of us will think of PEP 426
> "PKG-INFO" first when we hear the word metadata.)

Yes, you are right that the many different kinds of metadata in this
discussion (TUF metadata, PyPI metadata) make things a little confusing
sometimes! :))

My understanding of PEP 426 is that the distribution metadata is 
specified by the developer with the setup.py script.

To take the running Django example: the D role delegates to the Django
role and names the Django developers' keys, and the Django developers
sign everything under the Django role with those keys. That covers the
source distribution, which contains setup.py as well as the generated
"PKG-INFO". This means that pip + TUF will be able to verify this
distribution metadata indirectly via the source distribution package.
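
As a rough illustration of that indirect check, here is a minimal sketch. It assumes the signatures on the Django targets metadata have already been verified and the result reduced to a mapping from target path to expected length and SHA-256 hash. The names `verify_target` and `trusted_targets`, the file name, and the mapping layout are hypothetical; this is not TUF's actual API or file format.

```python
import hashlib


def verify_target(path, data, trusted_targets):
    """Return True if data matches the length and hash recorded for path."""
    expected = trusted_targets.get(path)
    if expected is None:
        # The signed metadata does not mention this target at all.
        return False
    if len(data) != expected["length"]:
        return False
    return hashlib.sha256(data).hexdigest() == expected["sha256"]


# Stand-in bytes for a source distribution (hypothetical file name).
# setup.py and PKG-INFO would live inside this archive.
sdist_path = "packages/source/D/Django/Django-1.5.tar.gz"
sdist = b"pretend these are the bytes of the sdist archive"

# What the (already verified) Django role metadata would assert.
trusted_targets = {
    sdist_path: {
        "length": len(sdist),
        "sha256": hashlib.sha256(sdist).hexdigest(),
    }
}

ok = verify_target(sdist_path, sdist, trusted_targets)
tampered = verify_target(sdist_path, sdist + b"!", trusted_targets)
```

Because the archive as a whole is checked, any change to the setup.py or PKG-INFO inside it changes the hash and the check fails.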

Does this answer your question?

> It looks like the D metadata lists all the filenames for Django, and
> then Django lists them again with hashes and signatures. Why all the
> lists? Does every Django release re-assert all the versions of Django
> that are available on the index?

Good observation. For D, you are talking about the "paths" attribute here:

https://updateframework.com/pypi/repository/metadata/targets/packages/source/D.txt

For Django, you are talking about the "targets" attribute here:

https://updateframework.com/pypi/repository/metadata/targets/packages/source/D/Django.txt

Why is "paths" in D listing all the "targets" that Django already talks 
about? Presently, this is because our target delegation tool 
(signercli.py) is being paranoid and making sure that D is explicitly 
delegating only targets matching these "paths".

However, the TUF specification allows for D to simply say, "I delegate
any target whatsoever under Django", by setting "paths" to
"packages/source/D/Django/**":

https://www.updateframework.com/browser/specs/tuf-spec.txt#L525
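
To make the difference between the two styles of "paths" concrete, here is a small sketch that checks whether a target falls under a delegation. It uses Python's `fnmatch` as a stand-in for the pattern matching; in `fnmatch` a `*` also matches "/", which may not be exactly how the TUF specification defines its patterns. The function name `is_delegated` and the file names are hypothetical.

```python
from fnmatch import fnmatchcase


def is_delegated(target_path, delegated_paths):
    """Return True if target_path matches any of the delegated patterns."""
    return any(fnmatchcase(target_path, pattern) for pattern in delegated_paths)


# Explicit style: every delegated target is listed one by one.
explicit_paths = [
    "packages/source/D/Django/Django-1.4.tar.gz",
    "packages/source/D/Django/Django-1.5.tar.gz",
]

# Wildcard style: anything under the Django directory is delegated.
wildcard_paths = ["packages/source/D/Django/**"]

new_release = "packages/source/D/Django/Django-1.6.tar.gz"
other_project = "packages/source/D/Djangox/evil-1.0.tar.gz"

explicit_covers_new = is_delegated(new_release, explicit_paths)
wildcard_covers_new = is_delegated(new_release, wildcard_paths)
wildcard_covers_other = is_delegated(other_project, wildcard_paths)
```

With the explicit list, D has to be re-signed whenever Django publishes a new file; with the wildcard, only the Django role changes.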

> How might I deal with producing the official source distribution
> myself and having a friend produce the official Windows build of a
> package?

There are a few solutions. You could have your friend produce the 
official Windows build for a package, and then you could sign it, 
implicitly trusting your friend but not publishing that trust.

A more secure solution would have you delegate that target to your friend.

> As an aside PyPI has been doubling in size every 1.5 - 2 years.

Exponential growth strikes again! We have anticipated this, and we have 
a few solutions to curb the growth of TUF metadata. Since TUF metadata 
is simply text, GZIP compression would go a long way. Alternatively, we 
could implement delta updates of TUF metadata.
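
As a quick sketch of the compression idea only (delta updates are not shown), the following gzips a made-up targets listing and confirms that it round-trips. The metadata layout and file names are invented for illustration and are not TUF's real format; the savings on real metadata would need to be measured.

```python
import gzip
import hashlib
import json

# Build a made-up targets listing with many similar-looking entries.
targets = {}
for minor in range(50):
    path = "packages/source/D/Django/Django-1.%d.tar.gz" % minor
    targets[path] = {
        "length": 1000 + minor,
        "sha256": hashlib.sha256(path.encode("utf-8")).hexdigest(),
    }

text = json.dumps({"targets": targets}, sort_keys=True).encode("utf-8")
compressed = gzip.compress(text)

# The client would decompress and then verify signatures as usual.
restored = gzip.decompress(compressed)

print(len(text), "bytes uncompressed,", len(compressed), "bytes gzipped")
```

The repeated path prefixes and key names compress very well; the hashes themselves compress less, since they are close to random data.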

The more difficult problem is how to ensure that the target delegation
structure scales with PyPI's growth. A good design will keep this in mind
and plan accordingly.

Speaking of which, our design document for integrating PyPI with TUF
may not be terribly easy to understand. (After all, you do need to
understand TUF first, but TUF is fairly easy once you understand its
main ideas.) I plan to publish a friendlier document which introduces
TUF at a very high level and instead discusses more pragmatic issues
(such as workflows).


