Pre-commit
Intuition
Before committing to our local repository, we have a long mental to-do list: styling, formatting, testing, and so on. It's very easy to forget some of these steps, especially when we just want to push a quick fix. To help us manage all these important steps, we can use pre-commit hooks, which are triggered automatically when we try to commit.
Though we can add these checks directly in our CI/CD pipeline (e.g. via GitHub Actions), it's significantly faster to validate our commits locally than to push to our remote host, wait to see what needs to be fixed, and then submit yet another PR.
Installation
We'll be using the Pre-commit framework to help us automatically perform important checks via hooks when we make a commit.
# Install pre-commit
pip install pre-commit==2.19.0
pre-commit install
We'll also add this to our setup.py script instead of our requirements.txt file because it's not core to the machine learning operations.
Config
We define our pre-commit hooks via a .pre-commit-config.yaml configuration file. We can either create our YAML configuration from scratch or use the pre-commit CLI to create a sample configuration that we can add to.
# Simple config
pre-commit sample-config > .pre-commit-config.yaml
cat .pre-commit-config.yaml
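For orientation, the generated file has roughly the shape sketched below. The rev value is an assumption borrowed from the complete configuration later in this lesson; the command prints whichever version ships with the installed pre-commit release, so expect a different value on your machine.

```yaml
# Approximate shape of the output of `pre-commit sample-config`
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.3.0  # assumed; your sample will likely show a different rev
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
```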
Hooks
When it comes to creating and using hooks, we have several options to choose from.
Built-in
Inside the sample configuration, we can see that pre-commit has added some default hooks from its repository. It specifies the location of the repository, the version, and the specific hook ids to use. We can read about the function of these hooks and add even more by exploring pre-commit's built-in hooks. Many of them also accept additional arguments that we can configure to customize the hook.
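As a minimal sketch of those three pieces (repository location, version, hook ids) plus a hook argument, here is a fragment assembled from entries in the complete configuration shown later in this lesson. It illustrates the structure and is not necessarily the exact snippet the lesson originally displayed here.

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks  # where the hooks live
    rev: v4.3.0                                            # pinned version (git tag)
    hooks:
      - id: trailing-whitespace      # strip trailing whitespace
      - id: check-added-large-files  # block large files from being committed
        args: ["--maxkb=1000"]       # set the size limit to 1000 KB
```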
Be sure to explore the many other built-in hooks because there are some really useful ones that we use in our project. For example, check-merge-conflict checks for any lingering merge conflict strings and detect-aws-credentials checks whether we accidentally left our credentials exposed in a file, and there are many more.
And we can also exclude certain files from being processed by the hooks by using the optional exclude key. There are many other optional keys we can configure for each hook ID.
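A short sketch of the exclude key, again using an entry from the complete configuration later in this lesson. The value is treated as a regular expression matched against file paths, so a bare file name is enough to skip that file.

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.3.0
    hooks:
      - id: check-yaml
        exclude: "mkdocs.yml"  # regular expression; matching paths are skipped by this hook
```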
Custom
Besides pre-commit's built-in hooks, there are also many popular third-party hooks that we can choose from. For example, if we want to apply formatting checks with Black as a hook, we can leverage Black's pre-commit hook.
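The Black entry from the complete configuration later in this lesson gives a sense of what this looks like. The comments are our own reading of the keys, offered as a sketch and not as Black's official guidance.

```yaml
repos:
  - repo: https://github.com/psf/black
    rev: 22.3.0
    hooks:
      - id: black
        args: []  # extra command-line arguments for black (none here)
        files: .  # regular expression; "." matches every file path
```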
This specific hook is defined under a .pre-commit-hooks.yaml inside Black's repository, as are other custom hooks under their respective package repositories.
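To give a feel for what such a definition file contains, here is a sketch for a hypothetical tool called my-formatter. The tool name and description are made up for illustration and this is not a copy of Black's actual file; only the keys (id, name, description, entry, language, types) come from pre-commit's hook schema.

```yaml
# .pre-commit-hooks.yaml in a hypothetical tool's repository (not Black's actual file)
- id: my-formatter      # the id we reference from .pre-commit-config.yaml
  name: my-formatter    # label shown in the pre-commit output
  description: Format Python files with my-formatter
  entry: my-formatter   # executable that pre-commit runs
  language: python      # pre-commit installs the tool into its own Python environment
  types: [python]       # only pass Python files to the hook
```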
Local
We can also create our own local hooks without configuring a separate .pre-commit-hooks.yaml. Here we're defining two pre-commit hooks, test-non-training and clean, to run some commands that we've defined in our Makefile. Similarly, we can run any entry command with arguments to create hooks very quickly.
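A sketch of two local hooks, following the ids used in the complete configuration later in this lesson (where the test hook is simply named test). It assumes the Makefile defines test and clean targets.

```yaml
repos:
  - repo: local
    hooks:
      - id: test
        name: test
        entry: make            # command to run
        args: ["test"]         # i.e. `make test`
        language: system       # use executables already available on the machine
        pass_filenames: false  # do not append staged file names to the command
      - id: clean
        name: clean
        entry: make
        args: ["clean"]        # i.e. `make clean`
        language: system
        pass_filenames: false
```

Setting pass_filenames to false matters here because make would otherwise receive the staged file paths as extra targets.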
View our complete .pre-commit-config.yaml
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.3.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
        exclude: "config/run_id.txt"
      - id: check-yaml
        exclude: "mkdocs.yml"
      - id: check-added-large-files
        args: ['--maxkb=1000']
        exclude: "notebooks"
      - id: check-ast
      - id: check-json
      - id: check-merge-conflict
      - id: detect-aws-credentials
      - id: detect-private-key
  - repo: https://github.com/psf/black
    rev: 22.3.0
    hooks:
      - id: black
        args: []
        files: .
  - repo: https://github.com/PyCQA/flake8
    rev: 3.9.2
    hooks:
      - id: flake8
  - repo: https://github.com/PyCQA/isort
    rev: 5.10.1
    hooks:
      - id: isort
        args: []
        files: .
  - repo: https://github.com/asottile/pyupgrade  # update python syntax
    rev: v2.34.0
    hooks:
      - id: pyupgrade
        args: [--py36-plus]
  - repo: local
    hooks:
      - id: test
        name: test
        entry: make
        args: ["test"]
        language: system
        pass_filenames: false
      - id: clean
        name: clean
        entry: make
        args: ["clean"]
        language: system
        pass_filenames: false
Commit
Our pre-commit hooks will automatically execute when we try to make a commit. We'll be able to see whether each hook passed or failed and make any necessary changes. If a hook fails, we have to fix the corresponding file ourselves, though in many instances (formatters, for example) the hook reformats the file automatically.
...
detect private key.....................................PASSED
black..................................................FAILED
...
If any of the hooks failed, we need to add the changed files and commit again until all hooks pass.
git add .
git commit -m <MESSAGE>

Run
Though pre-commit hooks are meant to run before (pre) a commit, we can manually trigger all or individual hooks on all or a set of files.
# Run
pre-commit run --all-files # run all hooks on all files
pre-commit run <HOOK_ID> --all-files # run one hook on all files
pre-commit run --files <PATH_TO_FILE> # run all hooks on a file
pre-commit run <HOOK_ID> --files <PATH_TO_FILE> # run one hook on a file
Skip
We strongly recommend against skipping any of the pre-commit hooks because they are there for a reason. But for some highly urgent, world-saving commits, we can use the --no-verify flag.
# Commit without hooks
git commit -m <MESSAGE> --no-verify
We highly recommend not doing this: no commit deserves to bypass the checks, no matter how "small" the change is. If you did skip the hooks, run
pre-commit run --all-files
to check everything that slipped through, fix whatever fails, and commit again.
Update
In our .pre-commit-config.yaml configuration file, we've had to pin a version (rev) for each of the repositories. Pre-commit has an autoupdate CLI command that updates these versions as newer ones become available, so we can use the latest hooks.
# Autoupdate
pre-commit autoupdate
We can also add this command to our Makefile to execute when a development environment is created so everything is up-to-date.
# Makefile
# `source` is a bash builtin, so run recipes with bash
SHELL = /bin/bash

.ONESHELL:
venv:
	python3 -m venv venv
	source venv/bin/activate && \
	python3 -m pip install --upgrade pip setuptools wheel && \
	python3 -m pip install -e ".[dev]" && \
	pre-commit install && \
	pre-commit autoupdate