GitLab CI/CD and GitLab PyPI Repository

Benjamin Berriot
Published in Maisons du Monde
Apr 30, 2020

At Maisons du Monde, we use GitLab as our main and only code management platform. On the Data team we have two main types of repositories: applications (APIs or data pipelines) and components (generic code that can be reused by multiple applications). Our components are private, so we can’t upload them to PyPI. Luckily, a few days ago, on April 22nd, GitLab released version 12.10 (https://about.gitlab.com/releases/2020/04/22/gitlab-12-10-released/), which introduces the PyPI Repository. In this article, we will see how to create a package, upload it to GitLab, and automate the process with GitLab CI/CD.

How to create and publish a Python package on the GitLab PyPI Repository?

The first step is to create a repository containing a valid Python package. At the root of our repository, we need to add a setup.py that looks like this:
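A minimal setup.py could look like the sketch below. The name mypackage and the version 0.0.1 are the placeholder values used in the examples later in this article; the description and the remaining arguments are assumptions to adapt to your own project.

```python
from setuptools import find_packages, setup

setup(
    name="mypackage",  # placeholder: becomes the first part of the wheel file name
    version="0.0.1",  # placeholder: must match the Git tag used later in the pipeline
    description="Reusable data component",  # hypothetical description
    packages=find_packages(exclude=("tests", "tests.*")),  # assumes tests live in tests/
    python_requires=">=3.6",  # assumption: a Python 3 only package
    install_requires=[],  # list the runtime dependencies of the component here
)
```

Building a wheel from this file also requires the wheel package to be installed (pip install wheel).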

You can check if the package is valid by running this command at the root of your project:
python setup.py bdist_wheel

This command will create a dist folder, containing a .whl file:
<package-name>-<version>-py3-none-any.whl

You can now install the package by running:
pip install <package-name>-<version>-py3-none-any.whl

Now that we have a functional package, we want to share it with our users. We can achieve that by publishing the package on GitLab.

To help us publish the package on GitLab, we use Twine (https://pypi.org/project/twine/), a tool for uploading packages to PyPI or other repositories. Let’s install Twine:
pip install twine

Let’s create a file ~/.pypirc. This file is used to declare additional repositories for Twine. By default, Twine uploads packages to PyPI, but in our case we want to upload our packages to our GitLab project. To achieve that, the file ~/.pypirc needs the following content:

Your project_id can be found on the home page of your project, or in Settings > General. To create a personal access token you can follow this guide: https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html. The personal access token needs the API scope.
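As a rough sketch of what this file can contain, the Python snippet below renders a .pypirc with a single gitlab entry. The upload URL pattern (https://gitlab.com/api/v4/projects/<project_id>/packages/pypi) and the __token__ username are assumptions derived from the install URL shown later in this article; render_pypirc, its parameters and the sample values are hypothetical names used only for illustration.

```python
import configparser
import io


def render_pypirc(project_id, token, host="gitlab.com"):
    """Return the text of a .pypirc declaring one repository named 'gitlab'."""
    # interpolation=None keeps any '%' in the token from being read as a placeholder.
    config = configparser.ConfigParser(interpolation=None)
    config["distutils"] = {"index-servers": "gitlab"}
    config["gitlab"] = {
        # Assumed upload endpoint of the project's PyPI repository.
        "repository": f"https://{host}/api/v4/projects/{project_id}/packages/pypi",
        "username": "__token__",
        "password": token,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Print the file content; copy it to ~/.pypirc once the values are filled in.
print(render_pypirc(project_id=1234, token="<personal_access_token>"))
```

The section name gitlab is what the --repository option of Twine refers to.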

Now it’s time to upload our package on Gitlab, with the following command:
python -m twine upload --repository gitlab dist/<package-name>-<version>-py3-none-any.whl

As a side note, you can manage multiple repositories in your .pypirc file, like this:
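For illustration, a .pypirc declaring two repositories might look like the text embedded in the sketch below, which parses it with Python's configparser to list the names that can be passed to --repository. The GitLab URL pattern, the __token__ usernames and the placeholder credentials are assumptions; https://test.pypi.org/legacy/ is the upload endpoint of TestPyPI. The helper list_repositories is a hypothetical name.

```python
import configparser

# Assumed content of ~/.pypirc with two repositories (placeholders left unfilled).
PYPIRC_TEXT = """\
[distutils]
index-servers =
    gitlab
    testpypi

[gitlab]
repository = https://gitlab.com/api/v4/projects/<your_project_id>/packages/pypi
username = __token__
password = <personal_access_token>

[testpypi]
repository = https://test.pypi.org/legacy/
username = __token__
password = <testpypi_api_token>
"""


def list_repositories(pypirc_text):
    """Map each declared repository name to its upload URL."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(pypirc_text)
    # index-servers is a whitespace-separated list spread over several lines.
    names = parser.get("distutils", "index-servers").split()
    return {name: parser.get(name, "repository") for name in names}


for name, url in list_repositories(PYPIRC_TEXT).items():
    print(name, "->", url)
```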

You can for example upload your package on testpypi with the command:

python -m twine upload --repository testpypi dist/<package-name>-<version>-py3-none-any.whl

Getting back to GitLab: if you open the Packages tab of your project, you will now find your freshly created package, ready to use.

Now it’s time to automate the process with Gitlab CI/CD

First, go to Settings > CI/CD > Variables and add a new variable:

The value of the variable PYPIRC is the content of the .pypirc file created above. Note: the type of the variable is File.

Now create a file .gitlab-ci.yml at the root of your project as follows:

The goal of this file is to define our GitLab pipeline. In this case we create a step named build-package in our pipeline. It builds our Python package and uploads it to GitLab with Twine. You can find more information about GitLab CI here: https://docs.gitlab.com/ee/ci/

This pipeline will be triggered as soon as a tag is created, and we will use the tag value in our pipeline as a parameter with ${CI_COMMIT_TAG}.

For example, if the tag is 0.0.1, then in the command python -m twine upload --repository gitlab dist/mypackage-${CI_COMMIT_TAG}-py3-none-any.whl --config-file /tmp/.pypirc, the value of ${CI_COMMIT_TAG} is 0.0.1, which means we are looking for the file mypackage-0.0.1-py3-none-any.whl.

Warning: if you use the tag value in your pipeline (as in this article), the tag name must match the version declared in setup.py. This means that when you release a new version of your package, you must update setup.py before tagging.
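One way to catch a mismatch early is to compare the tag with the version before uploading. The sketch below is not part of the pipeline described in this article: it assumes the version is written as a plain string literal in setup.py, and read_version, expected_wheel and check_tag are hypothetical helpers.

```python
import re


def read_version(setup_py_text):
    """Extract the version string from the text of a setup.py."""
    match = re.search(r"""\bversion\s*=\s*["']([^"']+)["']""", setup_py_text)
    if match is None:
        raise ValueError("no version literal found in setup.py")
    return match.group(1)


def expected_wheel(package, version):
    """File name produced by bdist_wheel for a pure-Python package."""
    return f"{package}-{version}-py3-none-any.whl"


def check_tag(tag, setup_py_text, package):
    """Return the wheel name to upload, or raise if tag and version differ."""
    version = read_version(setup_py_text)
    if tag != version:
        raise ValueError(f"tag {tag!r} does not match setup.py version {version!r}")
    return expected_wheel(package, version)


sample = 'setup(name="mypackage", version="0.0.1")'
print(check_tag("0.0.1", sample, "mypackage"))
# mypackage-0.0.1-py3-none-any.whl
```

In a CI job, the tag would come from the CI_COMMIT_TAG environment variable and the text from reading setup.py.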

How to use our new package?

We have an application running inside a container, and we want to install our freshly created package. Update your requirements.txt file like this:

The --extra-index-url option of pip install makes pip search our GitLab repository in addition to the official PyPI package repository. This is useful when you have multiple packages in your requirements.txt, some on PyPI and others in your private repository. Another option exists in pip, --index-url, but it replaces the default index and forces pip to look in one specific repository. If you don’t use a requirements.txt file (or if all the packages in it come from one repository), you can install your package with the command:
pip install --index-url https://__token__:<personal_access_token>@gitlab.com/api/v4/projects/<your_project_id>/packages/pypi/simple your_package_name
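If the token may contain characters that are not safe in a URL, it can help to build the index URL programmatically. A minimal sketch, assuming the URL pattern from the command above; gitlab_index_url and its parameters are hypothetical names.

```python
from urllib.parse import quote


def gitlab_index_url(project_id, token, host="gitlab.com"):
    """Build the 'simple' index URL of a GitLab project's PyPI repository."""
    # quote(..., safe="") percent-encodes characters such as '@' or '/' in the token.
    credentials = f"__token__:{quote(token, safe='')}"
    return f"https://{credentials}@{host}/api/v4/projects/{project_id}/packages/pypi/simple"


print(gitlab_index_url(project_id=1234, token="mysupertoken"))
# https://__token__:mysupertoken@gitlab.com/api/v4/projects/1234/packages/pypi/simple
```

The resulting string can be passed to either --index-url or --extra-index-url.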

If we go back to our application, the Dockerfile looks like this:

We can build the container by running:
docker build --build-arg GITLAB_PIP_TOKEN=mysupertoken .

In this case, the variable GITLAB_PIP_TOKEN is replaced by the value mysupertoken, so when pip install runs, the --extra-index-url line in our requirements.txt file becomes:
--extra-index-url https://__token__:mysupertoken@gitlab.com/api/[...]
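The effect of this substitution can be reproduced in a few lines of Python. This is only an illustration using string.Template, not pip's own code, and it assumes the requirements.txt line references the token as ${GITLAB_PIP_TOKEN}; the project id 1234 and the helper name expand are placeholders.

```python
from string import Template

# Assumed form of the line in requirements.txt (project id 1234 is a placeholder).
REQUIREMENT_LINE = (
    "--extra-index-url "
    "https://__token__:${GITLAB_PIP_TOKEN}@gitlab.com/api/v4/projects/1234/packages/pypi/simple"
)


def expand(line, environment):
    """Replace ${NAME} placeholders with values from the given mapping."""
    return Template(line).substitute(environment)


print(expand(REQUIREMENT_LINE, {"GITLAB_PIP_TOKEN": "mysupertoken"}))
```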

Final words

We are really happy that this feature is now implemented in GitLab. It’s really simple to use and close to what we do when publishing a package on PyPI. This feature also works if you host your own GitLab instance: you simply need to replace “gitlab.com” with your custom GitLab URL in the different files.

To build our container, the Data team at Maisons du Monde uses Google Cloud Build. We create a container-builder.yml:

And we submit the build using this command:
gcloud builds submit --config=container-builder.yml \
--substitutions=_GITLAB_TOKEN="mysupertoken"

You can find more information about Google Cloud Build here: https://cloud.google.com/cloud-build/docs/configuring-builds/substitute-variable-values
