Tools for maintaining clean Python projects

Published: 20 Sep 2020

A few months ago I started a new job and as part of my role I am expected to encourage better Python coding standards among the other programmers across teams in two countries. Most of them have backgrounds writing software quickly and without much regard for the conventions of the technologies that they use.

This has made me think quite a lot about what high quality code actually looks like, but also how the writing of good code can be encouraged within an existing team that is unfamiliar with the conventions of Python.

The argument for why high quality code is important is easy to make. Code that has been proven to work, is paired with accurate documentation and is written in a clear, concise and consistent style is easier to maintain. Code that is easier to maintain saves time and thus resources that can be better spent elsewhere. What is more difficult is actually fostering the culture to develop software to this standard.

Python has an advantage over many other languages when it comes to readability. The Python grammar is familiar to speakers of English and there is very little in the way of verbose boilerplate. Unfortunately some of the language's most popular qualities also provide the conditions to spawn some truly horrifying codebases.

I think the first step when working in teams is to introduce code review. This forces programmers, at the very least, to look at the work of their colleagues and hopefully think about how they would do it differently. The review step should also help catch any major issues that may not have been apparent to the author or that were missed. Code review should focus on the way problems have been solved and the implementation details.

The problem with code review is that reviewers apply their own heuristics to the code that they are evaluating. Any analysis by a human is going to be inconsistent and hard to repeat.

I have now come around to the idea that using automated tooling to evaluate code quality is a necessary minimum requirement for ensuring that standards are maintained.

In the rest of this post I will discuss some of the tools that I feel are most useful for Python projects.

Pytest (Testing)

Showing that a software system actually does what it was intended to do is the most important of all metrics when it comes to evaluating the quality of a software project. I won't go into all of the benefits of automated testing here, but I will say that tests should be mandatory and test coverage should be as close to universal as feasible.

Pytest is my preferred testing framework for Python. I also like to pair it with the pytest-cov plugin, which measures the level of test coverage in your project.

# Install
pip install pytest pytest-cov

# Run
pytest --cov
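For anyone new to pytest, here is a minimal sketch of what a test file might look like. The slugify helper is hypothetical and only exists to give the tests something to exercise; in a real project it would live in your package and the tests in a file named something like test_slugify.py. Pytest collects functions whose names start with test_ and treats a bare assert as the check.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical helper under test: turn a title into a URL slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def test_slugify_joins_words_with_hyphens():
    assert slugify("Clean Python Projects") == "clean-python-projects"


def test_slugify_drops_punctuation():
    assert slugify("Hello, World!") == "hello-world"


def test_slugify_of_empty_string_is_empty():
    assert slugify("") == ""
```

Each test covers one behaviour, so a failure points directly at what broke.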

Create a .coveragerc file and add the following config:

[run]
source = my_project

[report]
exclude_lines =
    # Have to re-enable the standard pragma
    pragma: no cover

    # Don't complain about missing debug-only code:
    def __repr__
    if self\.debug

    # Don't complain if tests don't hit defensive assertion code:
    raise AssertionError
    raise NotImplementedError

    # Don't complain if non-runnable code isn't run:
    if 0:
    if __name__ == .__main__.:

fail_under=95
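Each entry under exclude_lines is a regular expression, and coverage skips any source line that contains a match. The sketch below is only my approximation of that matching, not coverage's own implementation. It shows why the __main__ pattern uses dots: a dot matches either style of quote.

```python
import re

# The patterns from the [report] section above, one regex per entry.
EXCLUDE_PATTERNS = [
    r"pragma: no cover",
    r"def __repr__",
    r"if self\.debug",
    r"raise AssertionError",
    r"raise NotImplementedError",
    r"if 0:",
    r"if __name__ == .__main__.:",
]


def is_excluded(line: str) -> bool:
    """Return True if any exclusion regex matches somewhere in the line."""
    return any(re.search(pattern, line) for pattern in EXCLUDE_PATTERNS)
```

With these patterns, is_excluded returns True for the __main__ guard written with either single or double quotes, and for a line ending in "# pragma: no cover". It returns False for an ordinary statement such as "return total".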

Black (Formatting)

Black claims to be an "uncompromising" code formatter. It has very few configuration options by design. Black takes Python code as input and outputs code formatted to a strict subset of the PEP8 specification. The advantage of using Black is that there can be no disagreements over coding styles.
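To give a feel for the kind of change Black makes, here is an illustrative before and after. The function is made up, and the two versions only have different names so that both fit in one runnable snippet; Black itself never renames anything.

```python
# Before: valid Python, but inconsistently spaced and quoted.
def total_before(items,tax = 0.2) :
    return sum( [i['price'] for i in items] )*(1+tax)


# After: the same function in the style Black enforces.
def total_after(items, tax=0.2):
    return sum([i["price"] for i in items]) * (1 + tax)
```

Only whitespace and quoting change, so the behaviour of the code stays the same.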

# Install
pip install black

# Format
black --line-length=79 /project/path

# Check code formatting
black --check --line-length=79 /project/path

I personally set the maximum line-length to 79 as this follows PEP8.

You can configure black to do this by default by creating a pyproject.toml file and adding the following.

[tool.black]
line-length = 79
target-version = ['py36', 'py37', 'py38']
include = '\.pyi?$'
exclude = '''
/(
    \.eggs
  | \.git
  | \.hg
  | \.mypy_cache
  | \.tox
  | \.venv
  | _build
  | buck-out
  | build
  | dist
)/
'''

Isort (Import Sorting)

One shortcoming of black is that it doesn't sort imports. Isort is a standalone import sorting tool that pairs nicely with black.

# Install
pip install isort

# Format
isort -m3 --tc --line-length=79 /project/path

# Check formatting
isort -m3 --tc --line-length=79 -c /project/path

In order for isort to work with black it's important to use the m3 (multi-line output mode 3) and tc (trailing comma) flags.

Alternatively you can add the following configuration to setup.cfg.

[isort]
multi_line_output=3
include_trailing_comma=True
line_length=79
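As a rough illustration of the layout this configuration produces: plain imports come first, and a from-import that would run past 79 characters is wrapped in parentheses with one name per line and a trailing comma. The modules here are arbitrary standard library picks, and the exact ordering can vary with other isort settings.

```python
import os
import sys
from collections.abc import (
    Iterable,
    Iterator,
    Mapping,
    MutableMapping,
    Sequence,
)
```

This vertical layout with a trailing comma is the one that black also uses for long imports, which is why the two tools stop fighting each other once these options are set.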

Flake8 (Linting)

Flake8 is a tool that performs two useful checks. It checks that your source code complies with the PEP8 Style Guide, and it can also calculate the cyclomatic complexity of each function.

# Install
pip install flake8

# Check PEP8
flake8 /project/path

# Check Cyclomatic Complexity
flake8 --max-complexity=10 /project/path

It's necessary to ignore warnings for E203, E266, E501 and W503 as Black-formatted code doesn't always comply with these.

Put this into setup.cfg:

[flake8]
ignore = E203, E266, E501, W503
max-line-length = 79
max-complexity = 10
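To make the complexity check more concrete, here is a small made-up example. By the usual McCabe count a function starts at 1 and each if or elif adds one more path, so the first version below scores 5 while the lookup-table version scores 1. Both functions are hypothetical; the point is the shape of the refactoring, not the specific status codes.

```python
def describe_status_branchy(code: int) -> str:
    """Four decision points, so a cyclomatic complexity of 5."""
    if code == 200:
        return "ok"
    elif code == 301:
        return "moved"
    elif code == 404:
        return "not found"
    elif code == 500:
        return "server error"
    else:
        return "unknown"


STATUS_DESCRIPTIONS = {
    200: "ok",
    301: "moved",
    404: "not found",
    500: "server error",
}


def describe_status(code: int) -> str:
    """No branches at all: the mapping carries the decisions."""
    return STATUS_DESCRIPTIONS.get(code, "unknown")
```

A limit of 10 leaves plenty of room for ordinary branching. It mostly catches functions that have grown into long chains of special cases.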

Mypy (Static Type Checking)

One of my favourite features in Python 3.x is the inclusion of type hints. Type hints are helpful for documenting the behaviour of code, but they also allow a whole class of errors to be identified before the code is ever run. Mypy is a static type-checker for Python that does exactly that.

# Install
pip install mypy

# Run
mypy /project/path

An example of type hints:

def is_pos(x: int) -> bool:
    return x > 0
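A slightly longer sketch shows the sort of mistake mypy catches. The function names and price table are invented for illustration, and the error text in the comment is approximate rather than copied from a real run.

```python
from typing import Dict, Optional

PRICES: Dict[str, float] = {"apple": 0.5, "pear": 0.75}


def price_of(item: str) -> Optional[float]:
    """Return the price of an item, or None if it is unknown."""
    return PRICES.get(item)


def total(item: str, quantity: int) -> float:
    """Return the cost of buying `quantity` of `item`."""
    price = price_of(item)
    if price is None:
        # Without this check mypy complains that price may be None.
        raise KeyError(item)
    return price * quantity


# mypy rejects the call below without running it, with a message along
# the lines of: Argument 2 to "total" has incompatible type "str";
# expected "int"
# total("apple", "3")
```

The Optional return type is the useful part: it forces every caller to decide what happens when the item is missing.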

Add the following config to setup.cfg (replace my_project with the name of your code's directory):

[mypy]
files=my_project,tests
ignore_missing_imports=true

Interrogate (Docstring Coverage)

Documentation is important and documentation inside of the code base is far more likely to be kept current than documentation stored anywhere else. Interrogate is a tool that "interrogates" code for docstring coverage.

# Install
pip install interrogate

# Run verbose
interrogate -v

# Run with configuration
interrogate -c pyproject.toml
# or
interrogate -c setup.cfg

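Add the following to your pyproject.toml file (if you use setup.cfg instead, the section is named [tool:interrogate] and the values need to be written in INI syntax):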

[tool.interrogate]
ignore-init-method = true
ignore-init-module = false
ignore-magic = false
ignore-semiprivate = false
ignore-private = false
ignore-property-decorators = false
ignore-module = false
fail-under = 95
exclude = ["setup.py", "docs", "build"]
ignore-regex = ["^get$", "^mock_.*", ".*BaseClass.*"]
verbose = 0
quiet = false
whitelist-regex = []
color = true
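If the idea of docstring coverage is unfamiliar, the sketch below approximates it with the standard library's ast module. It is not how interrogate is implemented, and it ignores all of the options above. It simply counts the module, classes and functions in a piece of source and reports what share of them have a docstring.

```python
import ast

SAMPLE = '''"""Utilities for prices."""


def documented():
    """Return a fixed price."""
    return 1.0


def undocumented():
    return 2.0


class Basket:
    pass
'''


def docstring_coverage(source: str) -> float:
    """Return the percentage of definitions that have a docstring."""
    tree = ast.parse(source)
    definitions = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    # The module itself counts, plus every class and function inside it.
    nodes = [tree] + [
        node for node in ast.walk(tree) if isinstance(node, definitions)
    ]
    documented = sum(1 for node in nodes if ast.get_docstring(node))
    return 100.0 * documented / len(nodes)
```

For SAMPLE, two of the four definitions (the module and one function) are documented, so the function returns 50.0, well short of the fail-under threshold of 95 set above.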

Pre-commit (Git Hooks)

With so many different tools it can be easy to forget to run one of them, or to run any of them at all. I find it helpful to trigger all of the tools at once with a git pre-commit hook.

The "pre-commit" package makes configuring pre-commit hooks easy.

# Install
pip install pre-commit

To configure pre-commit create a .pre-commit-config.yaml file in the root of the git repository and populate it with the following:

default_language_version:
    python: python3.7
default_stages: [commit, push]
repos:
  - repo: local
    hooks:
      - id: pytest
        name: pytest
        language: system
        entry: pytest
        types: [python]
        args: [--cov]
        pass_filenames: false

      - id: black
        name: black
        language: system
        entry: black
        types: [python]
        args: [--check]

      - id: isort
        name: isort
        language: system
        entry: isort
        types: [python]
        args: [-c, .]

      - id: flake8
        name: flake8
        language: system
        entry: flake8
        types: [python]
        exclude: setup.py

      - id: mypy
        name: mypy
        language: system
        entry: mypy
        types: [python]
        pass_filenames: false

      - id: interrogate
        name: interrogate
        language: system
        entry: interrogate
        types: [python]
        args: [-c, pyproject.toml]

Then install the git hook by running pre-commit install (add --hook-type pre-push if you also want the hooks to run on push).

Adding to your build scripts

In order to gain the full benefit of these tools it's important to integrate them with your build pipelines. Any code committed should pass each of the above steps before it even reaches code review, let alone a production server.
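How you wire this in depends on your build system, so the following is only a sketch of one approach: a small Python script that runs each tool in turn and reports which ones failed. The command list is my assumption. It mirrors the commands used earlier in this post and relies on the configuration files described above, and the function names are made up.

```python
import subprocess
import sys
from typing import List, Sequence

# Assumed commands: adjust paths and flags to suit your project.
CHECKS: List[List[str]] = [
    ["pytest", "--cov"],
    ["black", "--check", "."],
    ["isort", "-c", "."],
    ["flake8", "."],
    ["mypy"],
    ["interrogate", "-c", "pyproject.toml"],
]


def run_checks(checks: Sequence[Sequence[str]]) -> List[str]:
    """Run every command and return the ones that failed."""
    failed = []
    for command in checks:
        try:
            returncode = subprocess.run(list(command)).returncode
        except FileNotFoundError:
            # The tool is not installed; treat that as a failure too.
            returncode = 1
        if returncode != 0:
            failed.append(" ".join(command))
    return failed


def main() -> int:
    """Return a process exit code: 0 if every check passed, else 1."""
    failed = run_checks(CHECKS)
    for command in failed:
        print(f"FAILED: {command}", file=sys.stderr)
    return 1 if failed else 0
```

Calling sys.exit(main()) from the pipeline's entry point makes the build fail whenever any tool does. Running every check rather than stopping at the first failure means a single build reports all of the problems at once.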

Example

An example repository can be found on GitHub here.