Signing and Verifying Python Packages with PGP

When I first showed Pip, the Python package installer, to a coworker a few years ago, his first reaction was that running code downloaded from the Internet as root, without looking at it first, did not seem like a good idea. He’s got a point. Paul McMillan dedicated part of his PyCon talk to this subject.

Python package management vs. Linux package management

To illustrate the security concerns, it helps to contrast how Python modules are usually installed with how Apt or Yum do it for Linux distributions. Debian and Red Hat distros usually ship the PGP keys for their packages with the distribution. Provided you installed a legitimate Linux distribution, you get the right PGP keys and every package downloaded through Apt/Yum is PGP checked. This means that each package is signed using the private key for that distribution, and you can verify that the exact package you received was signed and has not been modified. The package manager checks this and warns you when the signature does not match.

Pip and Easy Install don’t do any of that. They download packages in plaintext (which would be fine if every package were PGP signed and checked) and they download the package checksums in plaintext too. If you manually point Pip at a PyPI repository over HTTPS (say crate.io), it does not check the certificate. If you are on an untrusted network, it would not be hard for an attacker to intercept requests to PyPI, download the package, add malicious code to setup.py, recalculate the checksum, and return the new malicious package to you.
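To make the checksum problem concrete, here is a minimal Python sketch. The package contents and the helper name `matches_checksum` are made up for illustration, and SHA-256 stands in for whatever digest the index publishes; the point holds for any checksum delivered over the same untrusted channel as the package.

```python
import hashlib


def matches_checksum(data: bytes, expected_hex: str) -> bool:
    """Return True when the SHA-256 digest of data equals expected_hex."""
    return hashlib.sha256(data).hexdigest() == expected_hex


# Hypothetical package contents and the checksum the index would publish.
original = b"print('hello from setup.py')\n"
published = hashlib.sha256(original).hexdigest()

# An attacker who controls the connection alters the package...
tampered = original + b"import os  # malicious payload would go here\n"
# ...and simply recomputes the checksum that is served alongside it.
forged = hashlib.sha256(tampered).hexdigest()

# The tampered package fails against the genuine checksum,
# but passes against the forged one the victim actually receives.
caught_with_genuine = not matches_checksum(tampered, published)
passes_with_forged = matches_checksum(tampered, forged)
```

A checksum only detects accidental corruption here. Unlike a PGP signature, anyone can recompute it, so it offers no protection against a deliberate modification in transit.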

I think the big users of Python, like the Mozillas of the world, run their own PyPI servers and only load a subset of packages into them. I’ve heard of other shops making RPMs or DEBs out of Python packages. That’s what I often do. It lets you leverage your distribution’s infrastructure, where signing and checking are already in place. However, if you don’t want to do that, you can always PGP sign and verify your packages, which is what the rest of this post is about.

Verifying a package

There are relatively few packages on the cheeseshop (PyPI) that are PGP signed. For this example, I’ll use rpc4django, a package I release, and GNU Privacy Guard (GPG), a PGP implementation. The PGP signature of the package (rpc4django-0.1.12.tar.gz.asc) can be downloaded along with the package (rpc4django-0.1.12.tar.gz). If you simply attempt to verify it, you’ll probably get a message like this:

This message lets you know that the signature was made using PGP at the given date, but without the public key there is no way to verify that this package has not been modified since the author (me) signed it. So the next step is to get the public key for the package:

If you hit “1”, you will import the key. Re-running the verify command will now properly verify the package:

The fact that ten different Python modules will probably be signed by ten different PGP keys is a problem, and I’m not sure there’s a way to make that easier. In addition, my key is probably not in your web of trust; nobody you trust has signed my public key. So when you verify the signature, you will probably also see a message like this.

This means that I need to get my key signed by more people and you need to expand your web of trust.
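The verification step in this section can also be scripted. The following is a minimal sketch, assuming `gpg` is installed and on the PATH; the function names are my own for illustration and this is not something Pip does for you. It relies only on `gpg --verify`, which takes a detached signature followed by the signed file.

```python
import subprocess
from typing import List


def build_verify_command(signature_path: str, package_path: str) -> List[str]:
    """Build the gpg command that checks a detached signature against a file."""
    return ["gpg", "--verify", signature_path, package_path]


def verify_package(signature_path: str, package_path: str) -> bool:
    """Run gpg and report whether it accepted the signature.

    gpg exits with status 0 only when the signature is good and the
    signer's public key is in the local keyring. Raises FileNotFoundError
    if the gpg executable is not installed.
    """
    result = subprocess.run(
        build_verify_command(signature_path, package_path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

For the example above you would call `verify_package("rpc4django-0.1.12.tar.gz.asc", "rpc4django-0.1.12.tar.gz")`. Note that a successful exit status says the signature matches a key in your keyring. It does not say that key is in your web of trust.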

Signing a package

Signing a package is easy, and it is done as part of the upload process to PyPI. This assumes you already have PGP all set up. I haven’t done this in about a month so I hope the command is right.

There are additional options, such as choosing which key signs the package, but the signing part is easy.
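If you want to see what the signing step amounts to outside of the upload command, here is a rough sketch that produces a detached, ASCII-armored signature by calling `gpg` directly. This is a stand-in for illustration rather than the upload workflow itself; the function names and the example key ID are hypothetical, and it assumes `gpg` is installed with a usable secret key.

```python
import subprocess
from typing import List, Optional


def build_sign_command(package_path: str, key_id: Optional[str] = None) -> List[str]:
    """Build a gpg command that writes package_path + '.asc' next to the file."""
    command = ["gpg", "--armor"]
    if key_id is not None:
        # --local-user selects which secret key signs the file.
        command += ["--local-user", key_id]
    command += ["--detach-sign", package_path]
    return command


def sign_package(package_path: str, key_id: Optional[str] = None) -> bool:
    """Run gpg and return True if it reported success.

    gpg may prompt for a passphrase. Raises FileNotFoundError if gpg is
    not installed.
    """
    result = subprocess.run(build_sign_command(package_path, key_id))
    return result.returncode == 0
```

Calling `sign_package("rpc4django-0.1.12.tar.gz")` would leave a `.asc` file beside the tarball, which is the file a user downloads to verify the package.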

However, how many people actually verify the signature? Almost nobody. The package managers (Pip/EasyInstall) don’t and you probably just use one of them.

The future of Python packaging

So what can we do? I tried to work on this at the PythonSD meetup but I didn’t get very far, partly because it is a tough problem and partly because there was more chatting than coding. As a concrete proposal, I think we need to get PGP verification into Pip and solve issue #425. This probably means making Python-gnupg a prerequisite for Pip (at least for PGP verification). Step two is to add certificate verification. Python 3 already supports certificate checking through OpenSSL. Python 2 might have to use something like the Requests library. Step three is to get a proper certificate on PyPI.
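As a rough illustration of step two, this is what certificate checking looks like with the Python 3 standard library. It is a sketch of the general technique, not how Pip implements it, and it assumes a Python 3 recent enough to provide `ssl.create_default_context`. The function names are hypothetical.

```python
import ssl
import urllib.request


def make_verifying_context() -> ssl.SSLContext:
    """Return a TLS context that validates the server certificate and hostname."""
    # The default context loads the system's trusted CA certificates,
    # requires a valid certificate, and checks that it matches the hostname.
    return ssl.create_default_context()


def fetch(url: str) -> bytes:
    """Download url over HTTPS, failing if the certificate does not verify."""
    with urllib.request.urlopen(url, context=make_verifying_context()) as response:
        return response.read()
```

With a context like this, a connection to a server presenting an invalid or mismatched certificate raises an error instead of silently downloading the package.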

Edit: Updated command to upload signed package

Edit (January 2018): This five-year-old post is massively outdated. I recommend taking a look at the Python packaging and distributing docs, which are much better now. The commands I typically run to distribute a package are:

8 thoughts on “Signing and Verifying Python Packages with PGP”

  1. We were just talking about this at work today because one of our sysadmins brought up the issue of package trust. Obviously doing all of this manually right now would be a huge hassle, and even having to mirror all the packages we need is a fair bit of work compared to the convenience of pip/easy_install.

    What if a service such as crate.io or even PyPI itself signed the packages it hosts when the authors didn’t sign them themselves, and the tools then supported checking those signatures at install time? At least that would prevent a MITM attack during your download from the mirror.

  2. I think if crate.io or PyPI (the module repositories) signed the package it would give an illusion of trust. Having them sign the package would only allow you to verify that you downloaded the package from the module repository without a MITM. With proper certificate checking and working certificates, you could be sure of the same thing. This still doesn’t prevent a rogue malicious package though.

    Thus far my simple solution is this:

    Once proper certificate checking is in place, this will offer some level of protection.

  3. Hi,
    We indeed do something like that here with a mirror site, and we actually also rpm-ize the packages.

    However, it would make things a LOT easier if:

    – packages were always signed by the developer (indeed if they’re signed by the host, it doesn’t really help much)

    – pip would keep a database of installed packages and output things like the version number, package name, install date, pathnames of the files installed (i.e., something slightly more similar to what rpm and friends do in that regard)

    This is required for trusting that the package always comes from the same source, and to be able to check which package/lib is installed at which version at any point in time.

    The last item that would be interesting, although this one is terribly hard, is a centralized way for package maintainers to hint that they’re doing a security update. It’s terribly hard because most maintainers don’t want to deal with any overhead (unlike distribution packages, where they understand the need for it)

  4. Pip actually does keep track of some useful things, but it doesn’t always expose them. It does keep track of installed files, package name and version in site-packages/*.egg-info. However, this isn’t accessible from the command line interface as far as I know.

    I’ve heard a number of notable Python developers express strong resistance to using system package managers for Python modules. For some Python projects, the system package manager is overkill (and it goes against the idea of virtualenvs), but there are a lot of benefits such as managing cross-language dependencies like headers for C extensions. Sadly, I think there’s going to be a lack of consensus for some time.

  5. Awesome. I’ll track using that then. (It’d still be nice to have it in the pip tool directly.)

    I know they like non-rpm because “you don’t need root” and “it’s easy”. Makes sense, actually. From the management side, however, it’s the opposite, as explained above.

  6. It looks like a new “show” command will be in the next pip release. I haven’t tested it out yet, but it looks like it will display details of installed packages. Here’s the changelog.

  7. There’s definitely been some movement on packaging security in the last couple of weeks. It looks like pip issue #425 is going to get resolved, and PyPI is also going to get a proper certificate very shortly.
