About pre-commit when working with monorepos

David Danier
5 min read · Sep 4, 2022

Working with monorepos can get complicated, as the different parts of your application may need different tools or different settings for the same tools. This is particularly true when the monorepo contains parts written in different languages, but it is not limited to this case.

This all might be OK when you write your own scripts to manage the different parts of your monorepo. But when it comes to pre-commit hooks, we all like to use tools like pre-commit (see https://pre-commit.com/). Sadly, pre-commit has no support for monorepos and has decided not to support them.

There are alternatives like Mookme (https://mookme.org/), but those fail to provide the main benefit of pre-commit: automatically creating and managing different environments to run the necessary tools. Without this, the hooks get coupled to the packages you install inside the repository or depend on packages installed on your system. Neither should be the preferred solution. So back to pre-commit.

The issue with pre-commit

First, let’s look at what the issue is in the first place. Let’s say we have this folder structure (JS frontend + Python backend):

src/
    frontend/
        package.json
    backend/
        pyproject.toml

Now each of the different parts of the project will want to do things like linting in the pre-commit hooks. This means we need to add config files for those things.

src/
    frontend/
        package.json
        .eslintrc.js
        tsconfig.json
        stylelint.config.js

    backend/
        pyproject.toml
        .flake8
        pytest.ini

It makes sense to put those files into the folders where the different parts of your app live, too. This is just to keep things clearly separated.

Now let’s say you want your .flake8 config to ignore some of the rules for certain files. You could, for example, put database migrations into a folder named migrations/. These files are generated automatically by some tool (for example alembic) and by default violate some of your normal rules (for example, they lack type annotations). Let’s add this to the .flake8 config:

per-file-ignores =
    migrations/*.py: ANN

This is all fine when you run flake8 inside of src/backend/, but pre-commit just won’t let you do that. Instead, it will ALWAYS run all commands from your repository root. This is intended behaviour for normal repositories, as it ensures a clean run of all the hooks and matches how Git pre-commit hooks normally behave. But in this particular case (and in general every time you want to run hooks with pre-commit in a subfolder) it breaks how the hooks work, because relative paths in the tool configuration no longer match. The ignored rules will no longer be ignored when using pre-commit, which is a bad thing.
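
To make the mismatch concrete, here is a minimal Python sketch. It is a simplified model and not flake8's actual matching code: it assumes the ignore pattern is resolved against the current working directory and then glob-matched against the file path. The function name pattern_applies, the /repo root and the file 0001_init.py are made up for illustration.

```python
import fnmatch
import posixpath


def pattern_applies(pattern, file_path, cwd):
    """Simplified model: resolve pattern and file against cwd, then glob-match."""
    abs_pattern = posixpath.normpath(posixpath.join(cwd, pattern))
    abs_file = posixpath.normpath(posixpath.join(cwd, file_path))
    return fnmatch.fnmatchcase(abs_file, abs_pattern)


# flake8 started inside src/backend/: the pattern lines up with the file.
in_backend = pattern_applies(
    "migrations/*.py", "migrations/0001_init.py", cwd="/repo/src/backend"
)

# flake8 started from the repository root, as pre-commit does: the same
# pattern now points at /repo/migrations/, where nothing lives.
in_root = pattern_applies(
    "migrations/*.py", "src/backend/migrations/0001_init.py", cwd="/repo"
)

print(in_backend, in_root)  # True False
```

Under this model the only thing that changed between the two calls is the working directory, which is exactly the part pre-commit does not let you control.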

You could easily fix this by changing the configuration to something like this:

per-file-ignores =
    migrations/*.py: ANN
    src/backend/migrations/*.py: ANN

But seriously, who wants that?

What can we do?

I tried different methods to solve this issue. You could, for example, define local hooks and use shell scripts to do something like cd src/backend && flake8 "$@". But all of these “solutions” boil down to not using the hooks that already exist for pre-commit, which means doing a lot of work again just for your own repository. Also, if the solution involves adding shell scripts, all solutions I came up with required additional setup inside your repository.
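
For illustration, such a wrapper could also be sketched in Python instead of shell. This is only a rough sketch of the workaround, not something pre-commit ships: the names to_subdir_paths and run_in_subdir are hypothetical. Since pre-commit passes file names relative to the repository root, the wrapper has to strip the subfolder prefix before handing the files to the tool.

```python
import subprocess


def to_subdir_paths(files, subdir):
    """Rewrite root-relative paths so they are relative to subdir.

    Files that do not live inside subdir are dropped.
    """
    prefix = subdir.rstrip("/") + "/"
    return [name[len(prefix):] for name in files if name.startswith(prefix)]


def run_in_subdir(command, files, subdir):
    """Run command inside subdir on the files that belong to it."""
    local_files = to_subdir_paths(files, subdir)
    if not local_files:
        return 0  # nothing to check in this part of the repository
    completed = subprocess.run([*command, *local_files], cwd=subdir)
    return completed.returncode
```

A local hook script would then call something like run_in_subdir(["flake8"], sys.argv[1:], "src/backend") and exit with the returned code. It works, but it also shows the drawback: you end up maintaining this glue code yourself, and flake8 has to be installed where the script can find it.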

In addition I don’t really like the way pre-commit forces you to use one central config file, at least when it comes to monorepos. What I would like is something like this:

src/
    frontend/
        package.json
        .pre-commit-config.yaml
    backend/
        pyproject.toml
        .pre-commit-config.yaml
.pre-commit-config.yaml

Meaning each part of your software manages its own hooks, while the repository root focuses on global hooks like commit message linting (for example using commitlint).

So the workarounds just did not feel right, at least to me.

What if…?

But what if we could just run pre-commit in subfolders and it would then only handle the files inside this folder? This would allow a setup like the one shown above.

Let’s just try this. Let’s say we have this src/backend/.pre-commit-config.yaml:

repos:
  - repo: https://github.com/PyCQA/flake8
    rev: 5.0.4
    hooks:
      - id: flake8
        args: ["--config=.flake8"]
  - repo: meta
    hooks:
      - id: identity

The identity hook is just for debugging purposes, I’ll come back to this in a bit.

If you run pre-commit run --files some_file.py inside src/backend/, you will get this output:

$ pre-commit run --files some_file.py
flake8...................................................................Failed
- hook id: flake8
- exit code: 1
There was a critical error during execution of Flake8:
The specified config file does not exist: .flake8
identity.................................................................Passed
- hook id: identity
- duration: 0.02s
src/backend/some_file.py

flake8 failed because it couldn’t find the config file, which is strange, as it is right there (trust me). But the identity hook tells us what happened: although we passed some_file.py as a parameter, pre-commit converted this into src/backend/some_file.py. pre-commit rewrites all paths so that the command can be run from the repository root.

This is a (necessary) feature of pre-commit, see the source for the implementation details: https://github.com/pre-commit/pre-commit/blob/db51d3009f5cbeee6aafdc3e7c0cbbd2627a1a78/pre_commit/main.py#L152
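
The path conversion can be modeled roughly like this. It is a simplified sketch and not pre-commit's actual code: the function name adjust_for_repo_root and the /repo root are assumptions for illustration, and the real implementation also changes the working directory.

```python
import posixpath


def adjust_for_repo_root(files, cwd, repo_root):
    """Return the given files (relative to cwd) as paths relative to repo_root."""
    return [
        posixpath.relpath(posixpath.join(cwd, name), start=repo_root)
        for name in files
    ]


adjusted = adjust_for_repo_root(
    ["some_file.py"], cwd="/repo/src/backend", repo_root="/repo"
)
print(adjusted)  # ['src/backend/some_file.py']
```

Only the file names get rewritten. The hook's own args, such as --config=.flake8, are passed on unchanged, so flake8 looks for .flake8 in the repository root and fails.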

So we cannot use this schema?

How about using a special hook to solve this?

pre-commit hooks are pretty easy to write and use. So why not create a hook that makes this schema work? The hook would be included in the .pre-commit-config.yaml in the repository root and would then call pre-commit inside the subfolders in a way that doesn’t break this use case.

That’s why I created sub-pre-commit (https://github.com/ddanier/sub-pre-commit). It provides exactly such a hook. Your root configuration could look like this:

repos:
  - repo: https://github.com/ddanier/sub-pre-commit.git
    rev: v2.20.0-3  # MUST match your pre-commit version
    hooks:
      - id: sub-pre-commit
        alias: frontend
        name: "pre-commit for src/frontend/"
        args: ["-p", "src/frontend"]
        files: "^src/frontend/.*"
        stages: ["commit"]
      - id: sub-pre-commit
        alias: backend
        name: "pre-commit for src/backend/"
        args: ["-p", "src/backend"]
        files: "^src/backend/.*"
        stages: ["commit"]

The hook will now cd into the folder passed to sub-pre-commit via -p and call pre-commit there in a way that keeps it from rewriting the file paths and from changing back into the repository root.
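
To see how the files patterns split a commit between the two hook entries, here is a small Python sketch. It assumes the patterns are applied as Python regular expressions with search-style matching; the function route_files and the sample file names are made up for illustration and are not part of sub-pre-commit.

```python
import re

# The "files" patterns from the root configuration, keyed by hook alias.
HOOK_PATTERNS = {
    "frontend": r"^src/frontend/.*",
    "backend": r"^src/backend/.*",
}


def route_files(staged_files, patterns=HOOK_PATTERNS):
    """Group staged files by the hook alias whose pattern matches them."""
    routed = {alias: [] for alias in patterns}
    for path in staged_files:
        for alias, pattern in patterns.items():
            if re.search(pattern, path):
                routed[alias].append(path)
    return routed


staged = ["src/frontend/app.ts", "src/backend/main.py", "README.md"]
print(route_files(staged))
# {'frontend': ['src/frontend/app.ts'], 'backend': ['src/backend/main.py']}
```

Files outside both folders, like README.md here, reach neither entry and stay the responsibility of the hooks defined directly in the root configuration.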

See the README for additional details on how this works.

Summary

pre-commit is really not designed to work with monorepos, although the issues might not be visible from the start. I also think being able to manage a separate .pre-commit-config.yaml file for each part of the repository is a mandatory feature.

sub-pre-commit just tries to fill the gap pre-commit left us with. I hope this helps somebody, and I would like to receive any feedback, positive or negative.

David Danier

Python Web Developer from Germany, focusing on FastAPI and Django.