Working with Git Submodules in CodePipeline

We recently ran into an issue when deploying a repository that contains submodules through CodePipeline. First, let’s understand how CodePipeline pulls data from GitHub.

Let’s say we have a very simple deployment process like this:

1- CodePipeline pulls a specific branch from a repository
2- CodeBuild receives the files and runs the build/compile process defined in a buildspec.yml file.

OK, so ideally we would simply run git submodule init and git submodule update --recursive in our buildspec.yml file and be all set, right?

Not really. When CodePipeline pulls the files from your repository and passes them to CodeBuild, it doesn’t include the .git directory. As a result, we can’t perform any Git actions, because the directory is no longer a Git repository.
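A quick way to see this for yourself is to ask Git whether a directory is a work tree. The sketch below simulates the CodeBuild input with a temporary directory that holds files but no .git folder; the directory and file names are made up for illustration.

```shell
# Simulate what CodeBuild receives from CodePipeline: plain files, no .git folder.
workdir="$(mktemp -d)"
echo "hello" > "$workdir/README.md"

# "git rev-parse --is-inside-work-tree" exits non-zero outside a repository.
if git -C "$workdir" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  in_repo="yes"
else
  in_repo="no"
fi
echo "inside a Git work tree: $in_repo"
```

Any Git command that needs repository data, submodule commands included, fails in such a directory.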

After spending some time researching we came across this Stack Overflow answer, which helped to solve our problem.


We set up a “Machine User” account in GitHub

This “Machine User” is just another GitHub account that has an SSH key attached. The SSH key is used to grant access to the repository and to pull the submodules within CodeBuild.

So the first thing to do is generate a new SSH key pair and store those keys in a safe place in AWS. For that, we used the AWS Systems Manager Parameter Store.
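Generating the key pair can be sketched as follows. The key type, size, comment, and output directory are assumptions chosen for illustration; the empty passphrase is also an assumption, made because nobody is around to type one during a build.

```shell
# Scratch directory for the new key pair (illustrative location).
keydir="$(mktemp -d)"

# -t/-b: key type and size, -C: a label for the key, -f: output file,
# -N "": empty passphrase, -q: no progress output.
ssh-keygen -q -t rsa -b 4096 -C "machine-user@example.com" -f "$keydir/id_rsa" -N ""

# ssh-keygen writes the private key to the -f path and the public key
# to the same path with ".pub" appended.
ls -l "$keydir"
```

The public half is the one to add to the machine user’s GitHub account; the private half never needs to leave AWS once it is stored.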

Open the AWS Systems Manager service and create new parameters in the Parameter Store. It lets us store Strings, and that is where you store the contents of your id_rsa file and of the matching public key file.
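If you prefer the AWS CLI to the console, storing a key could look like the sketch below. The function name and the parameter names are hypothetical, and SecureString is an assumption made here: plain Strings, as described above, also work but leave the private key unencrypted. The function is only defined, so nothing is sent to AWS until you call it.

```shell
# Hypothetical helper: store the contents of a local file as a Parameter Store entry.
#   $1 = parameter name, $2 = path to the file holding the value
store_key_param() {
  aws ssm put-parameter \
    --name "$1" \
    --type SecureString \
    --value "file://$2" \
    --overwrite
}

# Example calls (names and paths are placeholders, not taken from the original setup):
# store_key_param "/codebuild/machine-user/id_rsa"     "$HOME/keys/id_rsa"
# store_key_param "/codebuild/machine-user/id_rsa.pub" "$HOME/keys/id_rsa.pub"
```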

Now that we have the SSH keys stored in AWS, we can have a buildspec.yml file to initialize the Git repo and pull the submodules.

First, use version 0.2 for your buildspec file. In that version CodeBuild runs all build commands in the same shell instance, so whatever one command sets up (the ssh-agent environment, for example) is still there for the next one.
We also need to declare a few environment variables. In our case, git_url is the remote origin of the repository, and the parameter-store section has two variables, one for each SSH key that we stored in AWS Systems Manager earlier.
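Since the buildspec itself is YAML, the sketch below uses a shell heredoc to write a minimal version of it into the current directory. The repository URL and both Parameter Store names are placeholders; only the variable names git_url, ssh_key, and ssh_pub come from this article. The pre_build entry is a stub.

```shell
# Write a minimal buildspec.yml; the quoted 'EOF' keeps the shell from expanding anything inside.
cat > buildspec.yml <<'EOF'
version: 0.2

env:
  variables:
    # Placeholder: SSH remote of the repository that owns the submodules.
    git_url: "git@github.com:example-org/example-repo.git"
  parameter-store:
    # Left: variable name visible to the build commands.
    # Right: name of the parameter in Parameter Store (hypothetical names).
    ssh_key: "/codebuild/machine-user/id_rsa"
    ssh_pub: "/codebuild/machine-user/id_rsa.pub"

phases:
  pre_build:
    commands:
      - echo "SSH setup and git commands go here"
EOF
```

The CodeBuild service role also needs permission to read these parameters (ssm:GetParameters), otherwise the build cannot resolve them.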

Now we’ll be able to access those variables in the pre_build phase.

Here are the steps in order:

1- mkdir -p ~/.ssh creates the .ssh directory.
2- echo "$ssh_key" > ~/.ssh/id_rsa writes the private key from the Parameter Store to the id_rsa file.
3- echo "$ssh_pub" > ~/.ssh/ writes the public key from the Parameter Store to the public key file.
4- chmod 600 ~/.ssh/id_rsa restricts the private key to the current user, which SSH requires.
5- eval "$(ssh-agent -s)" starts ssh-agent and exports its environment variables into the current shell.
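Steps 1 to 5 can be sketched as two small shell functions. This is an illustration under a few assumptions: the public key file name id_rsa.pub is the ssh-keygen default rather than something stated in this article, the chmod 700 on the directory and the ssh-add call are additions, and the target directory is a parameter so the functions can be tried outside CodeBuild.

```shell
# Write the key material from the ssh_key and ssh_pub variables into an SSH directory.
#   $1 = target directory (defaults to ~/.ssh, as in the steps above)
write_ssh_keys() {
  ssh_dir="${1:-$HOME/.ssh}"
  mkdir -p "$ssh_dir"
  chmod 700 "$ssh_dir"
  # printf writes the value unchanged and ends the file with a newline.
  printf '%s\n' "$ssh_key" > "$ssh_dir/id_rsa"
  printf '%s\n' "$ssh_pub" > "$ssh_dir/id_rsa.pub"   # assumed file name
  chmod 600 "$ssh_dir/id_rsa"
}

# Start an agent for this shell and load the private key into it.
#   $1 = directory that holds id_rsa (defaults to ~/.ssh)
load_ssh_key() {
  ssh_dir="${1:-$HOME/.ssh}"
  eval "$(ssh-agent -s)"
  ssh-add "$ssh_dir/id_rsa"
}
```

In a buildspec these would run back to back in pre_build, for example write_ssh_keys followed by load_ssh_key, each without arguments.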

From step 6 onward, the commands initialize the Git repo, add the remote origin defined in the git_url variable, and check out the commit that CodePipeline pulled, using the built-in CODEBUILD_RESOLVED_SOURCE_VERSION variable.
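Those remaining steps might look like the function below. It assumes git_url and CODEBUILD_RESOLVED_SOURCE_VERSION are set as described, that the SSH key can already reach the remote, and that github.com is already a known host for SSH; the function name is made up for this sketch.

```shell
# Hypothetical helper: turn the current directory back into a Git checkout.
#   $1 = remote URL (git_url), $2 = commit to check out
restore_git_repo() {
  git init
  git remote add origin "$1"
  # Download the history so the commit CodePipeline used is available locally.
  git fetch origin
  # -f lets Git overwrite the plain files that CodePipeline already placed here.
  git checkout -f "$2"
}

# In the buildspec this would be called as:
# restore_git_repo "$git_url" "$CODEBUILD_RESOLVED_SOURCE_VERSION"
```

Checking out a commit ID leaves the repository in a detached HEAD state, which is fine for a build.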

After that, git submodule init and git submodule update --recursive work as expected, and you can use the build phase to compile, test or build your code.
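For the submodules themselves, a minimal sketch is shown below. The function name is hypothetical, and folding the init step into the update command through --init is a shorthand chosen here, not necessarily what the original buildspec did.

```shell
# Hypothetical helper: fetch every submodule of the repository in the current directory.
pull_submodules() {
  # --init registers the submodules listed in .gitmodules (the job of "git submodule init");
  # --recursive also descends into nested submodules.
  git submodule update --init --recursive
  # List each submodule together with the commit that is now checked out.
  git submodule status --recursive
}
```

In a repository without submodules both commands simply do nothing, so the call is safe to leave in a shared buildspec.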

That’s pretty much it. Until CodePipeline supports submodules natively, this workaround helped us.

