Packaging Python Libraries with PyPI20 Jan 2016
Package deployment is something that is so completely necessary, but such a pain in the butt that I avoid it a little bit. However to reuse code in Python and to do awesome things like
pip install mycode, you need to package it up and stick it on to PyPI (pronounced /pīˈpēˈī/ according to one site I read, though I still prefer /pīˈpī/). This process should be easy, but it’s detail oriented and there are only two good walk throughs (see links below).
The Python Package Index or PyPI is the official third-party software repository for the Python programming language. Python developers intend it to be a comprehensive catalog of all open source Python packages. — Wikipedia
I’ve outlined my process for publishing libraries to PyPI in this post. It is mostly for my own future reference, but I am writing an upcoming post about publishing data projects to PyPI on District Data Labs.
Before you can publish a package to PyPI, you need to make sure that you’re doing Python right. Mostly this means to ensure that you’ve structured your Python package according to the guide: How to Develop Quality Python Code. You should also have several files already part of your project, see Basic Python Project Files for those.
However, there are some things you probably haven’t done yet, so here is my checklist of stuff to take care of:
- There are at least some tests and continuous integration with Travis-CI
- There is documentation and it's hosted at Read the Docs
- You have a versioning scheme setup (see below) and the development branching model setup on GitHub.
You must create accounts on both the PyPI Test and PyPI Live sites in order to upload code. So do that now and log in to your PyPI account. Once you’ve done that, create a
.pypirc configuration file with your account information. Mine looks like this:
[distutils] index-servers = pypi pypitest [pypi] repository = https://pypi.python.org/pypi username = bbengfort password = theeaglefliesatmidnight [pypitest] repository = https://testpypi.python.org/pypi username = bbengfort password = shadowofthedawnawaits
Make sure this file is in your home directory; whenever you work with pip or a setup.py file, it will use this configuration for interactions with the remote index servers. As a side note, you can also build your own internal index servers using S3 or other tools!
Ok for the purposes of this post, we’re going to assume that we’re working on a library called “foo” and that the directory structure looks like this:
$ project . ├── .gitignore ├── .travis.yml ├── DESCRIPTION.rst ├── LICENSE.txt ├── Makefile ├── MANIFEST.in ├── mkdocs.yml ├── README.md ├── requirements.txt ├── setup.py ├── setup.cfg ├── bin | └── app.py ├── docs | ├── images | | └── banner.jpg | └── index.md ├── fixtures ├── foo | ├── __init__.py | └── version.py └── tests └── __init__.py
Honestly, I hate that these repos grow to such massive sizes, but honestly, this is a minimal setup for a normal Python project. Or at least, a minimal one the way I do it. Needless to say, I’ll be discussing many of these files, in particular,
version.py in this post. Most of the other files are either self explanatory or contained in another post.
Setup and Meta
The first step is to configure your project with the necessary setup and meta data files. The first and most important of these is the
setup.py file which will use the other meta files in the project. Basically, I just copy and paste the following file into all my projects and modify as needed. Apparently this is just a thing Python developers do.
So there is a lot going on here, but you can see that the basic meta information is right at the top. I hoped to top load this file so that copy and paste would be as easy as possible. A couple of notes:
- The license can just be the name of the license like “MIT” or “Apache” — the
LICENSE.txtfile will spell everything out.
- The GitHub repository is important; particularly because the download url is formed from a tag, v + the version number.
- The classifiers must be selected from Python Classifiers.
get_versionfunction must be stored in a file called
version.pysuch that the
setup.pyscript can read the file and
execit without accidentally importing any dependencies.
- Unfortunately, PyPI doesn’t display Markdown, so for the long description (which is displayed on the PyPI project page) I have created a file called
DESCRIPTION.rstwhich is in reStructuredTxt.
- The setup script uses the
find_packagesfunction to discover the contained packages (which allows you to easily create packages with multiple top level modules). Therefore you need to tell it which directories not to look in, as specified by
- The script,
bin/app.pywill be installed to the
$PATHof the user installing the program, but is not included as a module.
I probably do need to break down these notes a bit more, but they are for reference here since I tend to speed write these posts. Check back later, maybe I’ll have updated them!
Configuration and Manifest
setup.cfg file allows you to specify other configurations. In my case it looks like this (assuming a Python 2 and 3 compatible package):
[metadata] description-file = README.md [wheel] universal = 1
Basically the metadata tag is an attempt to get the Markdown README into the package, but it doesn’t really work (sadly). The manifest lists all the other files that should be included in the package when uploading to PyPI. Mine looks like this:
include *.md include *.txt include *.yml include Makefile recursive-include docs *.md recursive-include docs *.jpg recursive-include tests *.py recursive-include bin *.py
Final Notes on Configuration
I find it really annoying that you have to create an extra description file for PyPI. Everywhere I read says that you should just put a reStructuredTxt file in as your README, but then of course GitHub doesn’t work. I prefer GitHub working, so I go with Markdown. You could write a script to do a conversion with Pandoc, but is it really worth the effort? In the future I’ll find a way to manage this a bit better.
If you want the files and directories from
MANIFEST.in to also be installed (e.g. fixtures or data for machine learning or database setup), you will have to set
include_package_data=True in your
Building and Submitting
Basically there are two phases to submitting a project to PyPI: build and upload. During the upload phase you first send to PyPI Test to make sure everything is good, then send to PyPI live.
First build the package for distribution along with the binary wheel distribution:
$ python setup.py sdist bdist_wheel
This will create a
build directory with the binary distribution, a
foo.egg-info directory with packaging information, and finally a
dist directory with two packages, the versioned distribution (
foo-0.1.tar.gz) and the wheel (
foo-0.1-py2-none-any.whl). Note that if you’re using my
make clean will clean up all of this extra stuff, but it should be ignored in your
At this point you can (and should) test both the wheel and the sdist package by creating a virtual environment and attempting to install the package with
pip directly as follows:
$ virtualenv venv ... $ source venv/bin/activate $ pip install dist/foo-0.1.tar.gz $ python >>> import foo >>> print foo.__version__ 0.1 >>> exit() $ deactivate $ rm -rf venv
The first step to submitting your package to an index server is to register it.
$ python setup.py register -r pypitest
-r flag here specifies which index server you wish to use as listed by the
.pypirc file. We can then upload the package with twine, which is the currently preferred method of uploading due to its security (TLS) and ability to prebuild and test. If you don’t have twine setup, simply
pip install it.
$ twine upload -r pypitest dist/foo-0.1*
Note that you can also sign the package with a GnuPG key with the
-s option, but we will skip that for now. Once again, we should test our packages with a virtual environment as above, but this time downloading them from PyPI Test directly:
$ pip install -i https://testpypi.python.org/pypi foo
Once this is done and everything is ready to rock, you can repeat the process for uploading to the package to PyPI, simplified here as follows:
$ python setup.py register $ twine upload dist/foo-0.1*
Assuming you’re using MkDocs as recommended then you can upload this documentation as follows:
$ mkdocs build --clean $ python setup.py upload_docs --upload-dir=site
You’ll probably want to clean up after yourself, which is as simple as
make clean if you’re using my Makefile. If you’d like to do it with bash it’s as follows:
$ find . -name "*.pyc" -print0 | xargs -0 rm -rf $ rm -rf htmlcov $ rm -rf .coverage $ rm -rf build $ rm -rf dist $ rm -rf foo.egg-info
Also you should probably remove that
site folder created by the documentation build.
Hopefully this post makes your life easier by giving you a simple guide to push new packages to PyPI. I know I shoot fast and loose with some of the stuff, but the post was super long anyway. If you’re really looking for awesome integrations, checkout How to Travis-CI Deploy for automatic deployment after testing.