Remarkably Restrained


the title is misleading


Python setuptools' MANIFEST.in explained

TL;DR: don’t use MANIFEST.in when packaging in Python using setuptools; use the setuptools_scm package instead.


In the mess that is Python packaging using setuptools, some things are actually best understood in their historical context. One of those things is the file MANIFEST.in.

A world without VCS

Remember the time before the general adoption of version control systems? That is the world MANIFEST.in was devised in.

In that world, imagine you want to distribute your Python project as a package of source files for others to use. That is, you want to make a source distribution, a.k.a. an sdist.

Such a distribution will be based on the directory you’re developing the project in. However, because this is your local working directory, it will also contain all kinds of files that are not part of the project’s source but of your own development process, such as your IDE’s configuration or files left behind by your text editor. Which of those files should be included in your distribution?

Of course, distutils can guess, and in fact it does. Among its guesses are “all Python source files implied by the py_modules and packages options”, README.txt and setup.py. However, at some point guessing will be insufficient.

This is where MANIFEST.in comes in: simply specify which files to include using various inclusion and exclusion patterns.
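As a rough illustration of how such patterns carve a file list out of a working directory, here is a minimal Python sketch. It is a toy model, not the distutils implementation: it only understands the `include` and `exclude` commands, it ignores the default guesses described above, and the file names and the `matches` and `select_files` helpers are made up for this example. Real MANIFEST.in files also support commands such as `recursive-include`, `graft` and `prune`.

```python
from fnmatch import fnmatchcase

# Hypothetical MANIFEST.in contents.
MANIFEST_IN = """\
include README.txt
include docs/*.rst
exclude docs/draft-*.rst
"""

# Hypothetical contents of a developer's working directory.
WORKING_DIR = [
    "README.txt",
    "setup.py",
    "docs/index.rst",
    "docs/draft-ideas.rst",
    ".idea/workspace.xml",
    "notes.txt~",
]


def matches(path, pattern):
    """Match segment by segment, so '*' never crosses a '/'."""
    path_parts, pattern_parts = path.split("/"), pattern.split("/")
    return len(path_parts) == len(pattern_parts) and all(
        fnmatchcase(part, pat) for part, pat in zip(path_parts, pattern_parts)
    )


def select_files(manifest_text, candidates):
    """Apply include/exclude lines in order and return the selected files."""
    selected = []
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        command, *patterns = line.split()
        for pattern in patterns:
            hits = [path for path in candidates if matches(path, pattern)]
            if command == "include":
                selected += [path for path in hits if path not in selected]
            elif command == "exclude":
                selected = [path for path in selected if path not in hits]
            else:
                raise ValueError("command not modelled in this sketch: " + command)
    return sorted(selected)


print(select_files(MANIFEST_IN, WORKING_DIR))
```

The editor backup and the IDE configuration never make it into the result because no pattern names them; the draft document is first included by `docs/*.rst` and then removed again by the later `exclude` line.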

In the context of building (using setup.py build or some derivative thereof such as setup.py bdist_wheel) we are faced with more or less the same problem: which data files in the project should be included in the build? Thus, MANIFEST.in was used as the answer to that question too – as long as you set the parameter include_package_data to True.
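For illustration, the keyword arguments of such a setup() call might look like the sketch below. The project name `mypkg`, its version, the data file and the MANIFEST.in line in the comment are all hypothetical; the point is only where `include_package_data` goes.

```python
# Hypothetical layout: a package "mypkg" with a data file mypkg/data/defaults.json,
# and a MANIFEST.in next to setup.py containing the line:
#     recursive-include mypkg/data *.json
SETUP_KWARGS = dict(
    name="mypkg",               # hypothetical project name
    version="0.1",              # hypothetical version
    packages=["mypkg"],
    include_package_data=True,  # also build the data files selected by MANIFEST.in
)

# In a real setup.py this would end with:
#     from setuptools import setup
#     setup(**SETUP_KWARGS)
```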

Enter the VCS

In 2019, we no longer live in that world. If you have your project under source control, the question “which files constitute the source code of this project?” (as opposed to “which files are local to a particular developer’s environment?”) is already answered: the files under source control constitute the source code.

This is also the position setuptools seems to take, since it introduced automatic inclusion of files under source control in the then-popular SVN in 2005. This behavior was later generalized to support arbitrary version control systems through plugins, and the SVN-specific implementation was even removed at some point.
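To give an idea of what such a plugin amounts to, here is a minimal sketch of a git-based file finder. It assumes the plugin interface is a function that takes a directory name and returns the paths under source control, registered under the `setuptools.file_finders` entry point group. The function name `find_git_files`, the `run` parameter (added here only so the command runner can be swapped out) and the module name in the comment are hypothetical, and this is not how setuptools_scm itself is implemented.

```python
import subprocess


def find_git_files(dirname="", run=subprocess.check_output):
    """Return the paths git tracks below dirname (hypothetical file finder)."""
    # "git ls-files -z" prints tracked paths relative to the working directory,
    # separated by NUL bytes so that unusual file names survive intact.
    output = run(["git", "ls-files", "-z"], cwd=dirname or ".")
    return [name for name in output.decode("utf-8").split("\0") if name]


# A plugin package would register the finder roughly like this in its setup():
#     entry_points={
#         "setuptools.file_finders": [
#             "git_sketch = my_plugin_module:find_git_files",  # hypothetical names
#         ],
#     }
```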

If you read this in 2019, your VCS is most likely either git or hg, in which case the package setuptools_scm provides the plugins you need. Add the following incantation to setup.py to ensure that files you have under source control are packaged:

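#!python
from setuptools import setup

setup(
    # ... your other setup() arguments ...
    use_scm_version=True,  # derive the version number from SCM tags
    setup_requires=['setuptools_scm'],  # make setuptools_scm available at build time
    # ... your other setup() arguments ...
)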

And no more need for MANIFEST.in!

Caveats

Does this really mean we can ditch MANIFEST.in altogether? Almost… a few final caveats:

  • If you don’t use an scm, you’ll still need MANIFEST.in. Counterpoint: you should really-really-really use an scm, even for the smallest of projects.

  • setuptools_scm has more functionality than determining which files to include in your package. It also derives the version number for the package from scm’s tags. There is no package that provides the former functionality, but not the latter. Counterpoint: you should really tag versions in your version management tool, and you should really not wish to duplicate this behavior manually elsewhere, so there is no need for a package which provides just one of these behaviors.

  • There may be cases in which your concept of a “source distribution” differs from “all the stuff under source control”. Common examples are tests, debugging tools, shell scripts, documentation, etc. The counterpoint is that you really shouldn’t try to make such a distinction. This is the position Jason R. Coombs takes:

In my opinion the sdist is meant to be more than just a copy of the Python functionality, but is meant to be a distributable copy of the source code. I would expect someone to be able to download the sdist, extract it, and develop on the project much like they would if cloning the repo.

Note that this final “problem” applies exclusively to source distributions: builds (and wheels) are automatically limited to files living under the packages as specified by the packages parameter.
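One way to see the difference for yourself is to list what actually ended up in each archive. The sketch below relies only on the fact that an sdist is a tar archive and a wheel is a zip archive; the helper name `archive_members` and the paths in the usage comment are hypothetical.

```python
import tarfile
import zipfile


def archive_members(path):
    """List the file names inside an sdist (.tar.gz) or a wheel (.whl)."""
    if zipfile.is_zipfile(path):
        # Wheels are plain zip archives.
        with zipfile.ZipFile(path) as archive:
            return sorted(archive.namelist())
    # Otherwise assume a (possibly compressed) tar archive, as used for sdists.
    with tarfile.open(path) as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


# Hypothetical usage after building both distributions:
#     archive_members("dist/example-1.0.tar.gz")
#     archive_members("dist/example-1.0-py3-none-any.whl")
```

Comparing the two listings shows which files under source control made it into the sdist only.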