About pre-commit when working with monorepos
Working with monorepos can get complicated, as the different parts of your application may need different tools or different settings for the same tools. This is particularly true when the monorepo contains parts written in different languages, but it is not limited to this case.
This might all be OK when you write your own scripts to manage the different parts of your monorepo. But when it comes to pre-commit hooks, we all like to use tools like pre-commit (see https://pre-commit.com/). Sadly, pre-commit has no support for monorepos and has decided not to support them.
There are alternatives like Mookme (https://mookme.org/), but those fail to provide the main benefit of pre-commit: automatically creating and managing different environments to run the necessary tools. Without this, the hooks get coupled to the packages you install inside the repository or depend on packages installed on your system. Neither should be the preferred solution. So back to pre-commit…
The issue with pre-commit
First, let’s look at what the issue is in the first place. Let’s say we have this folder structure (JS frontend + Python backend):
src/
    frontend/
        package.json
    backend/
        pyproject.toml
Now each part of the project will want to do things like linting in the pre-commit hooks. This means we need to add files for those things:
src/
    frontend/
        package.json
        .eslintrc.js
        tsconfig.json
        stylelint.config.js
    backend/
        pyproject.toml
        .flake8
        pytest.ini
It makes sense to put those files into the folders where the different parts of your app live, too. This keeps things clearly separated.
Now let’s say you want your .flake8 config to ignore some rules for certain files. For example, you could put database migrations into a folder named migrations/. These files are generated automatically by some tool (for example alembic) and by default ignore some of your normal rules (like missing type annotations). Let’s add this to the .flake8 config:
per-file-ignores =
    migrations/*.py: ANN
This is all fine when you run flake8 inside of src/backend/… but pre-commit just won’t let you do that. Instead, it will ALWAYS run all commands inside your repository root. This is a wanted feature for normal repositories, as it ensures a clean run of all the hooks and follows the normal pre-commit hook behaviour. But in this particular case (and in general every time you want to run hooks with pre-commit in a subfolder) it breaks how the hooks work: the relative pattern no longer matches, so the rules you meant to ignore will no longer be ignored when using pre-commit, which is a bad thing.
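To make the failure concrete, here is a minimal sketch of the matching problem. It uses Python's fnmatch as a simplified stand-in for flake8's pattern matching (an assumption for illustration, not flake8's actual code), and the migration file name is hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern as written in src/backend/.flake8
PATTERN = "migrations/*.py"


def is_ignored(path: str, pattern: str = PATTERN) -> bool:
    """Return True if the path matches the per-file-ignores pattern.

    Simplified model: the pattern is compared against the path exactly as
    the tool receives it, relative to the current working directory.
    """
    return fnmatchcase(path, pattern)


# Run from src/backend/: the tool sees a folder-relative path.
print(is_ignored("migrations/0001_initial.py"))  # True
# Run from the repository root (what pre-commit does): the prefix breaks the match.
print(is_ignored("src/backend/migrations/0001_initial.py"))  # False
```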
You could easily fix this by changing the configuration to something like this:
per-file-ignores =
    migrations/*.py: ANN
    src/backend/migrations/*.py: ANN
But seriously, who wants that?
What can we do?
I tried different methods to solve this issue. You could, for example, define local hooks and use shell scripts to do something like cd src/backend && flake8 "$@". But all of these “solutions” boil down to not using the hooks that already exist for pre-commit, so you end up doing a lot of work again just for your own repository. Also, if the solution involves adding shell scripts, every solution I came up with required additional setup inside your repository.
In addition, I don’t really like the way pre-commit forces you to use one central config file, at least when it comes to monorepos. What I would like is something like this:
src/
    frontend/
        package.json
        .pre-commit-config.yaml
    backend/
        pyproject.toml
        .pre-commit-config.yaml
.pre-commit-config.yaml
This means each part of your software manages its own hooks, while the repository root focuses on global hooks like commit message linting (for example using commitlint).
So the workarounds just did not feel right, at least to me.
What if…?
But what if we could just run pre-commit in subfolders, and it would then only handle the files inside that folder? This would allow a setup like the one shown above.
Let’s just try this. Let’s say we have this src/backend/.pre-commit-config.yaml:
repos:
  - repo: https://github.com/PyCQA/flake8
    rev: 5.0.4
    hooks:
      - id: flake8
        args: ["--config=.flake8"]
  - repo: meta
    hooks:
      - id: identity
The identity hook is just for debugging purposes; I’ll come back to this in a bit.
If you run pre-commit run --files some_file.py inside src/backend/, you will get this output:
$ pre-commit run --files some_file.py
flake8...................................................................Failed
- hook id: flake8
- exit code: 1

There was a critical error during execution of Flake8:
The specified config file does not exist: .flake8

identity.................................................................Passed
- hook id: identity
- duration: 0.02s

src/backend/some_file.py
flake8 failed because it couldn’t find the config file, which is strange, as it is right there (trust me). But the identity hook tells us what happened: although we passed some_file.py as a parameter, pre-commit converted this into src/backend/some_file.py. pre-commit converts everything so that the command can be run inside the repository root.
This is a (necessary) feature of pre-commit; see the source for the implementation details: https://github.com/pre-commit/pre-commit/blob/db51d3009f5cbeee6aafdc3e7c0cbbd2627a1a78/pre_commit/main.py#L152
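The effect of this conversion can be modelled in a few lines. The sketch below is not pre-commit's code; it only illustrates, assuming a hypothetical checkout at /repo, how a file name given relative to the current directory turns into a path relative to the repository root.

```python
import posixpath


def to_root_relative(filename: str, cwd: str, repo_root: str) -> str:
    """Express a cwd-relative file name relative to the repository root."""
    absolute = posixpath.normpath(posixpath.join(cwd, filename))
    return posixpath.relpath(absolute, repo_root)


# Hypothetical paths: repository at /repo, command started in src/backend/.
print(to_root_relative("some_file.py", cwd="/repo/src/backend", repo_root="/repo"))
# -> src/backend/some_file.py

# From the repository root, the config file would have to be addressed like
# this, but the hook argument in the config above still says just ".flake8".
print(to_root_relative(".flake8", cwd="/repo/src/backend", repo_root="/repo"))
# -> src/backend/.flake8
```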
So we cannot use this schema?
How about using a special hook to solve this?
pre-commit hooks are pretty easy to write and use. So why not create a hook that allows this schema to work? The hook would be included in the repository root .pre-commit-config.yaml and then call pre-commit inside the subfolders in a way that won’t mess up your special use case.
That’s why I created sub-pre-commit (https://github.com/ddanier/sub-pre-commit). It provides just such a hook. Your root configuration could look like this:
repos:
  - repo: https://github.com/ddanier/sub-pre-commit.git
    rev: v2.20.0-3  # MUST match your pre-commit version
    hooks:
      - id: sub-pre-commit
        alias: frontend
        name: "pre-commit for src/frontend/"
        args: ["-p", "src/frontend"]
        files: "^src/frontend/.*"
        stages: ["commit"]
      - id: sub-pre-commit
        alias: backend
        name: "pre-commit for src/backend/"
        args: ["-p", "src/backend"]
        files: "^src/backend/.*"
        stages: ["commit"]
It will now cd into the folder passed to sub-pre-commit using -p and then call pre-commit in a way that keeps it from changing the parameters and from cd-ing back into the repository root.
See the README for additional details on how this works.
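The README describes the real mechanism. Purely as an illustration of the effect the hook is after, and not of how sub-pre-commit is implemented, the following sketch turns root-relative file names back into folder-relative ones and runs a command inside that folder. The helper names and the list of staged files are hypothetical.

```python
import subprocess
from pathlib import PurePosixPath
from typing import List, Sequence


def to_subdir_relative(files: Sequence[str], subdir: str) -> List[str]:
    """Keep only files below subdir and strip the subdir prefix from them."""
    base = PurePosixPath(subdir)
    result = []
    for name in files:
        try:
            result.append(str(PurePosixPath(name).relative_to(base)))
        except ValueError:
            # File lives outside subdir, so this folder's hooks skip it.
            continue
    return result


def run_in_subdir(command: Sequence[str], files: Sequence[str], subdir: str) -> int:
    """Run command inside subdir with folder-relative file names appended."""
    relative = to_subdir_relative(files, subdir)
    if not relative:
        return 0  # nothing to check in this folder
    return subprocess.run([*command, *relative], cwd=subdir).returncode


staged = ["src/backend/some_file.py", "src/frontend/index.ts", "README.md"]
print(to_subdir_relative(staged, "src/backend"))  # ['some_file.py']
```

With a tool such as flake8 installed, a call like run_in_subdir(["flake8", "--config=.flake8"], staged, "src/backend") would start it inside src/backend/ with some_file.py as its argument, so relative paths in the config resolve as expected.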
Summary
pre-commit is really not designed to work with monorepos, although the issues might not be visible from the start. I also think managing a separate .pre-commit-config.yaml file for each part of the repository is a mandatory feature.
sub-pre-commit just tries to fill the gap pre-commit left us with. I hope this helps somebody, and I would like to receive any feedback, positive or negative.