We recently ran into an issue when deploying a repository containing submodules with CodePipeline. First, let’s understand how CodePipeline pulls data from GitHub.
Let’s say we have a very simple deployment process like this:
1- CodePipeline pulls a specific branch from a repository
2- CodeBuild receives the files and runs the build/compile process defined in a buildspec.yml file.
OK, so ideally we would simply run
git submodule init and
git submodule update --recursive in our buildspec.yml file and be all set, right?
Not really. When CodePipeline pulls the files from your repository and passes them to CodeBuild, it doesn’t include the
.git directory. That means we can’t perform any Git actions, because the source is no longer a Git repository.
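A quick way to confirm this from inside a build is to check for the .git entry. This is only a minimal sketch; the function name is_git_repo is ours and not part of CodeBuild.

```shell
# Hypothetical helper: succeeds only if the given directory
# (default: the current directory) contains a .git entry.
is_git_repo() {
  [ -e "${1:-.}/.git" ]
}
```

In a buildspec command, something like `is_git_repo "$CODEBUILD_SRC_DIR" || echo "not a Git repository"` would print the message when the source was delivered by CodePipeline without its .git directory.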
After spending some time researching we came across this Stack Overflow answer https://stackoverflow.com/a/54318204/4241974, which helped to solve our problem.
We set up a “Machine User” account in GitHub https://developer.github.com/v3/guides/managing-deploy-keys/#machine-users.
This “Machine User” is just another GitHub account that has an SSH key. The SSH key is used to grant access to the repository and to pull the submodules within CodeBuild.
So the first thing to do is generate a new SSH key (https://help.github.com/en/github/authenticating-to-github/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent) and store the key pair in a safe place in AWS. For that, we used the AWS Systems Manager Parameter Store.
Open the AWS Systems Manager service and create new parameters in the Parameter Store. It lets us store strings, and that’s where you would store your id_rsa and id_rsa.pub.
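If you prefer the command line to the console, the upload could look roughly like the sketch below. It assumes the AWS CLI is installed and configured. The helper name store_key and the DRY_RUN switch are ours, and the SecureString type is our choice (an encrypted alternative to a plain String), not something the original setup requires.

```shell
# Hypothetical helper: uploads one key file to Parameter Store.
# Assumption: parameters are stored as SecureString (encrypted).
# Set DRY_RUN=1 to print the command instead of calling AWS.
store_key() {
  param_name="$1"
  key_file="$2"
  # Build the AWS CLI call; file:// makes the CLI read the value from disk.
  set -- aws ssm put-parameter \
    --name "$param_name" \
    --type SecureString \
    --value "file://$key_file"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

For example, `store_key id_rsa ~/.ssh/id_rsa` followed by `store_key id_rsa.pub ~/.ssh/id_rsa.pub` would create one parameter per key file.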
Now that we have the SSH keys stored in AWS, we can have a buildspec.yml file to initialize the Git repo and pull the submodules.
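First thing, use version 0.2 for your buildspec file. In that version, CodeBuild runs all commands in the same shell instance, so the environment set up by one command (such as the variables exported by ssh-agent) is still there for the next one.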
We also need to declare a few environment variables. In our case,
git_url is the remote origin of the repository, and in the
parameter-store section, we have two variables, one for each SSH key that we previously stored in AWS Systems Manager. Like this:
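# buildspec.yml
version: 0.2
env:
  variables:
    git_url: "YOUR_GIT_URL (using git@github.com)"
  parameter-store:
    # the values are the names of the parameters you created
    ssh_key: id_rsa
    ssh_pub: id_rsa.pub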
Now we’ll be able to access those variables in the commands of our buildspec.yml.
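Since everything that follows depends on these variables, it can help to fail the build early when one of them did not resolve. The guard below is a minimal sketch; require_vars is a hypothetical helper name of ours.

```shell
# Hypothetical helper: fails if any of the named variables is unset or empty.
require_vars() {
  for name in "$@"; do
    # Indirect lookup: copy the value of the variable called $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
}
```

Calling `require_vars git_url ssh_key ssh_pub` as the first command would stop the build with a clear message instead of a confusing SSH error later on.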
Here are the steps in order:
1- mkdir -p ~/.ssh creates the .ssh directory
2- echo "$ssh_key" > ~/.ssh/id_rsa writes the private key from the Parameter Store to the private key file
3- echo "$ssh_pub" > ~/.ssh/id_rsa.pub writes the public key from the Parameter Store to the public key file
4- chmod 600 ~/.ssh/id_rsa gives the private key the appropriate permissions
5- eval "$(ssh-agent -s)" starts ssh-agent in the build shell; SSH then picks up ~/.ssh/id_rsa as its default key
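The key-setup steps above could also be wrapped in a small function, which makes them easier to try outside CodeBuild. This is a sketch under our own assumptions: the name install_ssh_keys and the optional third argument (a home directory to write into) are hypothetical, and the chmod 700 on the directory is an extra precaution not present in the original commands.

```shell
# Hypothetical helper: writes the key pair into <home>/.ssh.
# Arguments: private key text, public key text, home dir (default: $HOME).
install_ssh_keys() {
  ssh_dir="${3:-$HOME}/.ssh"
  mkdir -p "$ssh_dir"
  chmod 700 "$ssh_dir"
  # printf avoids echo's shell-dependent handling of backslashes.
  printf '%s\n' "$1" > "$ssh_dir/id_rsa"
  printf '%s\n' "$2" > "$ssh_dir/id_rsa.pub"
  # SSH refuses private keys that are readable by other users.
  chmod 600 "$ssh_dir/id_rsa"
}
```

In the buildspec this would be called as `install_ssh_keys "$ssh_key" "$ssh_pub"`.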
From step 6 onward, the commands initialize the Git repo, add the remote origin defined in the
git_url variable, and check out the version that CodePipeline pulled, using the internal CODEBUILD_RESOLVED_SOURCE_VERSION variable.
After that, you’re good to get the submodules from your repository and use the
build phase to compile, test or build your code.
- mkdir -p ~/.ssh
- echo "$ssh_key" > ~/.ssh/id_rsa
- echo "$ssh_pub" > ~/.ssh/id_rsa.pub
- chmod 600 ~/.ssh/id_rsa
- eval "$(ssh-agent -s)"
- git init
- git remote add origin "$git_url"
- git fetch origin
- git branch
- git checkout -f "$CODEBUILD_RESOLVED_SOURCE_VERSION"
- git submodule init
- git submodule update --recursive
# compile and build here...
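Before moving on to the build, it may be worth checking that the submodules were really populated. git submodule status prefixes uninitialized submodules with a "-", so a check could be sketched as follows; all_submodules_initialized is a hypothetical helper name of ours.

```shell
# Hypothetical helper: reads `git submodule status` output on stdin and
# fails if any submodule is uninitialized (its line starts with "-").
all_submodules_initialized() {
  ! grep -q '^-'
}
```

It would be used as `git submodule status --recursive | all_submodules_initialized`, which makes the build fail if a submodule is still empty.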
That’s pretty much it. Until CodePipeline supports submodules natively, this workaround helped us.