[Distutils] Malicious packages on PyPI

Sat Jun 3 01:02:40 EDT 2017

On 3 June 2017 at 03:43, Donald Stufft <donald at stufft.io> wrote:
> That would get us to the point we can start collecting data and storing it.
> The next step would be to start processing that data to implement a black
> list, which would require work to be done in both Warehouse and legacy PyPI.
> In Warehouse you’d want to implement the thing that actually periodically
> processes the BigQuery data to generate the block list, and then in both
> Warehouse and Legacy PyPI you’d want to implement the block list support in
> the upload/register routines.

I was wondering whether it might make sense to store the
blocklist as a separate table in the database, and enforce it using a
PostgreSQL constraint trigger when a project name is used for the
first time (i.e. when there are no previously recorded releases using
that name).

You'd still want explicit support at least in the Warehouse frontend
to provide a better UX when someone attempts to register a blocked
name and to provide admins with the ability to override the
auto-generated blocklist settings, but actually *enforcing* the
blocklist would become the database's job.
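
As a rough sketch of the idea (not a proposal for the real schema): the
snippet below uses SQLite from Python purely as a runnable stand-in for
PostgreSQL, and the table names, trigger name, and the make_db/register
helpers are all hypothetical rather than anything in Warehouse or legacy
PyPI. In PostgreSQL the equivalent check would presumably live in a trigger
function attached with CREATE CONSTRAINT TRIGGER instead.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not Warehouse's actual tables.
SCHEMA = """
CREATE TABLE blocked_names (name TEXT PRIMARY KEY);
CREATE TABLE projects (name TEXT PRIMARY KEY);

-- Abort the insert when the new project name appears on the blocklist.
CREATE TRIGGER enforce_blocklist
BEFORE INSERT ON projects
FOR EACH ROW
BEGIN
    SELECT RAISE(ABORT, 'project name is blocked')
    WHERE EXISTS (SELECT 1 FROM blocked_names WHERE name = NEW.name);
END;
"""


def make_db():
    """Create an in-memory database with the sketch schema installed."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def register(conn, name):
    """Try to record a project name for the first time.

    Returns True if the database accepted the row, False if it was
    rejected (for example by the blocklist trigger).
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO projects (name) VALUES (?)", (name,))
    except sqlite3.DatabaseError:
        # A frontend could catch this and show a friendlier message.
        return False
    return True
```

The point is only that the upload/register code never consults the
blocklist itself: any client writing to the projects table gets the same
enforcement, and the frontend just translates the database error into
something more helpful.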

Not an urgent concern (since it's only a hypothetical design question
until the blocklist generation is implemented), but I figured it was
worth mentioning as a potential way to avoid having to update the
legacy PyPI codebase with blocklist support.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia