Pre-commit

Using pre-commit Git hooks to run checks before committing.

Goku Mohandas


Intuition

Before committing to our local repository, there are a lot of items on our mental to-do list: styling, formatting, testing, etc. It's very easy to forget some of these steps, especially when we just want to push a quick fix. To help us manage all these steps, we can use pre-commit hooks, which are triggered automatically when we try to make a commit. These hooks can ensure that certain rules are followed or that specific actions execute successfully; if any of them fail, the commit is aborted.
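To make the mechanism concrete, here is a minimal sketch of a hand-written Git hook, without the pre-commit framework. It assumes only that git is installed; the temporary repository, the notes.txt file and the TODO check are made up for illustration.

```shell
#!/bin/sh
# Sketch: a hand-written Git hook (no framework) that aborts a commit.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
mkdir -p .git/hooks

# Git runs .git/hooks/pre-commit before it creates a commit.
# A non-zero exit status from the script aborts the commit.
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Illustrative check: reject staged additions that contain "TODO".
if git diff --cached | grep -q '^+.*TODO'; then
    echo "TODO found in staged changes, aborting commit." >&2
    exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

echo "TODO: fix me" > notes.txt
git add notes.txt
if git -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "add notes"; then
    hook_result="committed"
else
    hook_result="aborted"
fi
echo "$hook_result"
```

The pre-commit framework manages this same hook script for us and runs every hook listed in our configuration from it.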

Installation

We'll be using the Pre-commit framework to help us automatically perform important checks via hooks when we make a commit.

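We'll start by installing pre-commit's hook script into our repository and updating our hook versions (we only have to do this once). This assumes the pre-commit package itself is already available in our environment (e.g. via pip install pre-commit).
pre-commit install     # writes the hook script to .git/hooks/pre-commit
pre-commit autoupdate  # bumps the rev of each repo in .pre-commit-config.yaml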

Config

We define our pre-commit hooks in a .pre-commit-config.yaml configuration file. We can either write this YAML file from scratch or use the pre-commit CLI to generate a sample configuration that we can then add to.

# Sample config
pre-commit sample-config > .pre-commit-config.yaml
cat .pre-commit-config.yaml
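
For the from-scratch route, a hand-written config might look like the sketch below. The repository URL and hook ids belong to pre-commit's own hooks repository, but the rev shown is just an example pin (an assumption), and the scratch directory is only there to keep the demo self-contained; in a real project the file lives at the repository root.

```shell
#!/bin/sh
# Sketch: write a minimal .pre-commit-config.yaml by hand.
workdir=$(mktemp -d) && cd "$workdir" || exit 1   # scratch dir for the demo

cat > .pre-commit-config.yaml <<'EOF'
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.4.0   # example pin (assumption); use the tag you actually want
    hooks:
    -   id: trailing-whitespace
    -   id: end-of-file-fixer
    -   id: check-yaml
    -   id: check-added-large-files
EOF

# Each "- id:" line enables one hook from the repo above.
hook_count=$(grep -c -- '-   id:' .pre-commit-config.yaml)
echo "hooks configured: $hook_count"
```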

Hooks

When it comes to creating and using hooks, we have several options to choose from.

Built-in

Inside the sample configuration, we can see that pre-commit has added some default hooks from its repository. The configuration specifies the location of the repository, its version, and the specific hook ids to use. We can read about what these hooks do, and add more, by exploring pre-commit's built-in hooks. Many of them also accept additional arguments that we can use to customize the hook.

# Inside .pre-commit-config.yaml
...
-   id: check-added-large-files
    args: ['--maxkb=1000']
    exclude: "notebooks"
...

Be sure to explore the many other built-in hooks because there are some really useful ones that we use in our project. For example, check-merge-conflict checks for lingering merge conflict strings and detect-aws-credentials checks whether we accidentally left our credentials exposed in a file.

We can also exclude certain files from being processed by a hook with the optional exclude key, which takes a regular expression that is matched against file paths. There are many other optional keys we can configure for each hook id.

# Inside .pre-commit-config.yaml
...
-   id: check-yaml
    exclude: "mkdocs.yml"
...
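
Because an exclude pattern is searched for anywhere in each file path, it can match more than we intended. The sketch below approximates this with grep -E on a made-up list of paths; pre-commit itself uses Python regular expressions, so treat this as an approximation rather than the exact matching logic.

```shell
#!/bin/sh
# Sketch: approximate how an `exclude` pattern filters file paths.
# The paths are hypothetical; grep -E stands in for Python's regex search.
paths='notebooks/eda.ipynb
src/train.py
docs/notebooks.md
mkdocs.yml
docs/mkdocs-yml-notes.md'

# exclude: "notebooks" drops every path that contains "notebooks".
kept_loose=$(printf '%s\n' "$paths" | grep -Ev 'notebooks')

# exclude: "mkdocs.yml" also drops mkdocs-yml-notes.md ("." matches any character).
kept_unescaped=$(printf '%s\n' "$paths" | grep -Ev 'mkdocs.yml')

# An anchored, escaped pattern drops only the file mkdocs.yml itself.
kept_strict=$(printf '%s\n' "$paths" | grep -Ev '^mkdocs\.yml$')

echo "--- kept with notebooks:";     echo "$kept_loose"
echo "--- kept with mkdocs.yml:";    echo "$kept_unescaped"
echo "--- kept with anchored form:"; echo "$kept_strict"
```

If we only want to skip one specific file, anchoring and escaping the pattern is the safer choice.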

Custom

Besides pre-commit's built-in hooks, there are also many popular third-party hooks that we can choose from. For example, if we want to apply formatting checks with Black as a hook, we can leverage Black's pre-commit hook.

# Inside .pre-commit-config.yaml
...
-   repo: https://github.com/psf/black
    rev: 20.8b1
    hooks:
    -   id: black
        args: []
        files: .
...

This specific hook is defined in a .pre-commit-hooks.yaml file inside Black's repository, and other third-party hooks are defined the same way in their respective package repositories.
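
To give a sense of what such a file contains, here is a sketch of a .pre-commit-hooks.yaml for a hypothetical tool called my-formatter. This is not Black's actual definition; only the key names (id, name, entry, language, types) come from pre-commit's hook format.

```shell
#!/bin/sh
# Sketch: the kind of .pre-commit-hooks.yaml a tool's repository might ship.
# "my-formatter" is a hypothetical tool, not Black's actual definition.
workdir=$(mktemp -d) && cd "$workdir" || exit 1   # scratch dir for the demo

cat > .pre-commit-hooks.yaml <<'EOF'
-   id: my-formatter          # the id that users reference in their config
    name: my-formatter
    entry: my-formatter       # executable that pre-commit runs
    language: python          # pre-commit sets up a Python env for the hook
    types: [python]           # only run on Python files
EOF

cat .pre-commit-hooks.yaml
```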

Local

We can also create our own local hooks without configuring a separate .pre-commit-hooks.yaml. In our project we define two such hooks, test-non-training and clean, which run commands that we've defined in our Makefile; only clean is shown here. In the same way, we can turn any entry command and its arguments into a hook very quickly.

# Inside .pre-commit-config.yaml
...
-   repo: local
    hooks:
    -   id: clean
        name: clean
        entry: make
        args: ["clean"]
        language: system
        pass_filenames: false
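
The pass_filenames key matters because pre-commit normally appends the staged file names to the hook's command. A rough sketch of how the command line is put together, using the clean hook's values and hypothetical staged files:

```shell
#!/bin/sh
# Sketch: how a hook's command line is assembled from entry, args, and
# (optionally) the staged file names. Values mirror the `clean` hook;
# the staged file names are hypothetical.
entry="make"
args="clean"
staged_files="app/train.py app/utils.py"

# pass_filenames: true (the default) appends the staged files.
with_filenames="$entry $args $staged_files"

# pass_filenames: false runs the command exactly as configured.
without_filenames="$entry $args"

echo "pass_filenames: true  -> $with_filenames"
echo "pass_filenames: false -> $without_filenames"
```

With the default, make would treat each file name as an extra target, which is why the hook sets pass_filenames: false.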

Commit

Our pre-commit hooks will automatically execute when we try to make a commit, and we'll see whether each hook passed or failed. If a hook fails, we either have to fix the errors ourselves or, in many instances (formatting hooks, for example), the hook will already have rewritten the offending files for us.

check yaml..............................................PASSED
clean...................................................FAILED

If any of the hooks failed, we need to add our changes and commit again so that all hooks pass.

git add .
git commit -m <MESSAGE>
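
The fail, re-add, commit cycle can be sketched with a hand-written Git hook standing in for a formatting hook. Everything here (the temporary repository, greeting.txt and the whitespace-stripping hook) is made up for illustration and only assumes git is installed.

```shell
#!/bin/sh
# Sketch: a failing hook fixes a file, then re-adding and committing succeeds.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
mkdir -p .git/hooks

# Stand-in for a formatting hook: strip trailing whitespace from staged
# files and fail if anything had to be changed.
# (Simplified: does not handle file names that contain spaces.)
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
status=0
for f in $(git diff --cached --name-only); do
    if grep -q '[[:space:]]$' "$f"; then
        sed 's/[[:space:]]*$//' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
        echo "fixed trailing whitespace in $f" >&2
        status=1
    fi
done
exit $status
EOF
chmod +x .git/hooks/pre-commit

commit() {
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"
}

printf 'hello   \n' > greeting.txt
git add .
commit "add greeting" && first="committed" || first="aborted"

# The hook rewrote greeting.txt, so stage the fix and try again.
git add .
commit "add greeting" && second="committed" || second="aborted"

echo "first attempt: $first, second attempt: $second"
```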

Run

Though pre-commit hooks are meant to run before (pre) a commit, we can manually trigger all or individual hooks on all or a set of files.

# Run
pre-commit run --all-files  # run all hooks on all files
pre-commit run <HOOK_ID> --all-files # run one hook on all files
pre-commit run --files <PATH_TO_FILE>  # run all hooks on a file
pre-commit run <HOOK_ID> --files <PATH_TO_FILE> # run one hook on a file

Skip

Skipping the pre-commit hooks is strongly discouraged because they are there for a reason. But for some highly urgent, world-saving commits, we can use the --no-verify flag.

# Commit without hooks
git commit -m <MESSAGE> --no-verify

We highly recommend not doing this because no commit deserves to bypass its checks, no matter how "small" the change was. If you skipped the hooks by accident, run pre-commit run --all-files to apply the checks after the fact, fix anything that fails, and commit again.
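
To see what the flag actually does, the sketch below uses a hand-written hook that always fails (a stand-in for a failing check) in a throwaway repository; it assumes only that git is installed.

```shell
#!/bin/sh
# Sketch: --no-verify makes Git skip the pre-commit hook entirely.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q .
mkdir -p .git/hooks

# A hook that always fails, standing in for a failing check.
printf '#!/bin/sh\necho "check failed" >&2\nexit 1\n' > .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit

commit() {
    git -c user.name=demo -c user.email=demo@example.com commit -q "$@"
}

echo "data" > file.txt
git add file.txt

commit -m "with hooks" && verified="committed" || verified="aborted"
commit -m "without hooks" --no-verify && skipped="committed" || skipped="aborted"

echo "with hooks: $verified, with --no-verify: $skipped"
```

The second commit goes through without the hook ever running, so nothing has checked its contents.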

Update

In our .pre-commit-config.yaml configuration file, we've had to pin a version (rev) for each repository whose hooks we use. Pre-commit has an autoupdate CLI command that updates these pins to the latest available versions.

# Autoupdate
pre-commit autoupdate


To cite this content, please use:

@article{madewithml,
    author       = {Goku Mohandas},
    title        = { Pre-commit - Made With ML },
    howpublished = {\url{https://madewithml.com/}},
    year         = {2023}
}