Creating a pipeline in Azure DevOps to build and publish Python packages/artifacts

Anders Gill · Raa Labs · Oct 17, 2019

With Azure DevOps you can easily create sophisticated pipelines for your projects to keep your code quality and development process consistent. In this guide, we will look at how to develop a CI/CD pipeline that integrates package creation and sharing in a simple and scalable manner.

Very often, Microsoft publishes fragmented steps on how to conduct specific tasks without providing the surrounding information beginners need to get things up and running. Assuming that the reader is an expert often leads to articles that are hard to rely on as online resources for working examples. In this article, we distill everything into digestible information that is easy to grasp, even for beginners.

During this article, we will dive into these topics:

  • Preparing your code for a Python project
  • Creating a Python project locally
  • Difference between build and release pipelines
  • How to set up a build pipeline
  • Setting up unit tests in a build pipeline
  • Package creation in a build pipeline
  • Publishing artifacts in a build pipeline
  • Setting up a release pipeline to publish new artifacts with new versions
  • How to connect to the feed and use your new package with Python

Preparing your code for a Python project

If you want to upload your package to PyPI, I recommend this Medium article:

A lot of the steps are the same, and in the next topic we will go through how to prepare our repository locally with the necessary files to create the package.

The first thing we need is a repository. Raa Labs (the company I work for) uses GitHub and has a Time Series Insights client package (TSIClient) available on GitHub. We will use this example going forward. Once you have created an account on GitHub and uploaded your package code there, we can check whether you have all the necessary files.

When you have your code in a repository, make sure to put your class files into a folder in your root directory with the same name as your class file. That folder will contain your classes while other necessary files will be listed in the root together with the class folder.

Screenshot: the class file inside its folder in the GitHub repository

Example with needed files:

TSIClient (Folder — name of my application)
→ TSIClient.py (my class file)
→ __init__.py
setup.py
setup.cfg
LICENSE.txt
README.md

__init__.py

As you can see, there is an __init__.py file (two underscores on each side). This file should contain nothing other than an import statement for your class.

from packagename.Filename import Classname

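For the TSIClient package, assuming the class defined inside TSIClient.py is also named TSIClient, the init file would simply contain:

from TSIClient.TSIClient import TSIClient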

The init file exposes your classes as the entry point of the package interface.

Setup.py

Setup.py contains information about the package itself, and you are free to put in whatever information suits your package. When creating new releases, this is the file you use to update the package version. There are tons of examples you can copy from directly. My setup.py for the TSIClient looks like this:

https://gist.github.com/readyforchaos/bb367b1c1a291431d37998a212d06ca1

Update the fields to suit your package.

In the install_requires field, you list all the external Python dependencies your package requires to run successfully. In my case, I use pandas and requests as imports in my code, so I need to add them under install_requires within setup.py. Even though I also import json in my code, I don't need to list it, as it comes built in with Python.
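To make this concrete, a minimal setup.py following this pattern could look like the sketch below. The metadata values are illustrative; the gist above is the authoritative version for the TSIClient:

from setuptools import setup

# Read the README so the package index can render it as the long description
with open("README.md", "r") as f:
    long_description = f.read()

setup(
    name="TSIClient",                        # package name on the index
    version="1.0.2",                         # bump this for every new release
    description="Client for Azure Time Series Insights",  # illustrative one-liner
    long_description=long_description,
    long_description_content_type="text/markdown",
    packages=["TSIClient"],                  # the folder containing your classes
    install_requires=["pandas", "requests"], # external dependencies only; json is built in
)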

Setup.cfg

Setup.cfg is a simple file with some short metadata that links to your README.md file for the description.

[metadata]
description-file = README.md

LICENSE.txt

Please insert your license type for the package. In most cases you will be using the MIT license, but I recommend you read up on which license suits your project. For the MIT type, use the standard skeleton below, swapping in the current year and your name:
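MIT License

Copyright (c) [year] [fullname]

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OF OR OTHER DEALINGS IN THE SOFTWARE.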

README.md

Last but not least is the markdown README file. This file should contain all your usage instructions in markdown format and is referenced from the long_description variable in the setup.py file. The long_description is also the description PyPI shows on the package's front page if you ever decide to upstream your package.

Adam Pritchard has created a good markdown cheatsheet which you can use as a basis if you are new to markdown.

Creating a Python project locally

At this point, you have successfully prepared your code to be packaged into a source distribution and shared on whatever site you prefer. Congratulations! Now we will go ahead and create the actual source distribution that you can share.

Start by navigating in CMD to the folder containing all your files.

cd c:/path_to_your_folder

From here, we will use python and sdist to create a source distribution from the setup.py file that we created earlier on.

python setup.py sdist

If this does not work, you can use the Anaconda prompt or any other distribution to do the same. If you want to use CMD, just make sure that the appropriate Python PATH is set in your environment variables on Windows.

There should now be a new folder named dist which contains your new source distribution (for the TSIClient example, something like TSIClient-1.0.2.tar.gz)! 🎁

Uploading the package to PyPI

If you want to upload it to PyPI, simply run two more commands and follow the on-screen instructions, and you will have your package uploaded to PyPI:

pip install twine

Twine is used for the upload process. Now that we have it installed, we can go ahead and upload the distribution with twine.

twine upload dist/*

PyPI does not allow re-uploading a distribution file that already exists, however, so you should use the following command to avoid a “file already exists” error when updating your package.

twine upload --skip-existing dist/*

Difference between build and release pipelines

Now that we have created a package and know how to deal with it, we are going to dive into the world of DevOps. Build and release are two commonly used terms in DevOps that we should understand before going forward, and knowing how to differentiate them is beneficial.

The user garchangle gave a good summarized explanation on Reddit of the difference between a build and a release pipeline:

Build is taking your source and compiling/testing it. This should be done with some regularity and you aren’t necessarily expected to keep these builds. Its sort of a dev or team of devs checking to make sure the project they are building works like they expect. These can sometimes be called snapshot builds. You might deploy these builds to an environment, but only to test.

Releasing is the process of creating a permanent set of your code, compiled or noncompiled. This is when you have hit a milestone of some sort, time or feature wise. These releases are versioned and kept forever and once you start building more complex systems, you can refer to applications by this version for inter-dependencies.

And if you are wondering how to segregate Continuous Integration and Continuous Deployment in a build-and-release setting, the user by the name of myxored also added wisely to the Reddit thread:

Build:

CI: The build-pipeline produces snapshot builds (and does e2e/it/unit tests ) of your code into artifacts which can be

CD: deployed on staging systems, test system or whatever you like

Release

CI: usually promotes a already existing build artifact to a release. This usually adds a semantic version, puts that release into a nexus / artifactory of your kind to be consumed later

CD: deploys a release on a set / all production systems

As you see, CI/CD are very different from Release/Build, CD can be described as “ship it to an external, permanent server” ( or sort alikes, hard to define ).

How to set up a build pipeline

The first thing you need is an account on Azure DevOps.

After you have set up your account, you need to create a new project within your organization.

After setting up your project, go to Pipelines and then Builds.

Click on New build and select GitHub.

As you can see from the top row, there are a couple of steps we need to go through. One of them is Configure, and we will look into what the YAML configuration looks like. Once you have selected your repository, you need to grant some permissions to the Azure Pipelines app from the GitHub Marketplace. Click Approve and install.

.YML file

Now we need to configure our YAML file. This file works as the configuration file for our pipeline. The pipeline reads the YAML file from top to bottom, so whatever code snippets we put in the YAML, the pipeline will run consecutively.

This is what my YAML file looks like. We will distill the file in a moment:
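Assembled from the individual steps discussed below, the whole file looks roughly like this (a sketch; the trigger branch and the Python version matrix follow the descriptions in the next sections):

trigger:
- master

pool:
  vmImage: 'ubuntu-latest'
strategy:
  matrix:
    Python36:
      python.version: '3.6'

steps:
- task: UsePythonVersion@0
  inputs:
    versionSpec: '$(python.version)'
  displayName: 'Use Python $(python.version)'

- script: |
    python -m pip install --upgrade pip
  displayName: 'Install dependencies'

- script: |
    pip install pytest pytest-azurepipelines
    pytest
  displayName: 'pytest'

- script: |
    python setup.py sdist
  displayName: 'Artifact creation'

- task: CopyFiles@2
  inputs:
    targetFolder: $(Build.ArtifactStagingDirectory)

- task: PublishBuildArtifacts@1
  inputs:
    PathtoPublish: '$(Build.ArtifactStagingDirectory)'
    ArtifactName: 'dist'
    publishLocation: 'Container'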

Distilling and understanding the YAML file

Trigger

Trigger sets which branch to trigger on from the repository. You can set this branch to be whatever you like. Mine is set to trigger the pipeline on the master branch.

Pool

The pool states which VM image and strategy the pipeline should run on. You can set this up to run multiple Python versions on specific images. Mine is set to run on ubuntu-latest with only Python version 3.6.
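For example, a matrix that exercises two Python versions would look roughly like this (a sketch; Python36 and Python37 are just arbitrary job labels):

pool:
  vmImage: 'ubuntu-latest'
strategy:
  matrix:
    Python36:
      python.version: '3.6'
    Python37:
      python.version: '3.7'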

Steps

Steps are the actual tasks that the pipeline should run. Here, you are free to set up as many steps as you like. The steps can involve any task from the tasks list, or you can write the YAML yourself if you are comfortable. In my YAML, I have the following steps:

  1. Use specific python version
  2. Install dependencies
  3. Install pytest and run pytest
  4. Artifact creation
  5. Copy artifacts
  6. Publish artifact

1. Use specific python version

steps:
- task: UsePythonVersion@0
  inputs:
    versionSpec: '$(python.version)'
  displayName: 'Use Python $(python.version)'

The code above tells the pipeline to use the Python version from the variable defined in the pool section; in my case, Python 3.6.

2. Install dependencies

- script: |
    python -m pip install --upgrade pip
  displayName: 'Install dependencies'

The script above makes sure that pip is up to date. If your package needs additional dependencies at build time, for example from a requirements.txt file, this is the natural step in which to install them as well.

3. Install pytest and run pytest

- script: |
    pip install pytest pytest-azurepipelines
    pytest
  displayName: 'pytest'

The first command above installs the external pytest library. The second command simply runs the pytest module. We will go through how to set up unit tests for our code in the next section. By adding the pytest command to your YAML, you will run the tests you have written.

4. Artifact creation

- script: |
    python setup.py sdist
  displayName: 'Artifact creation'

The code above is similar to what we wrote initially when creating a package from the source code. The artifact gets created in this stage.

5. Copy artifacts

- task: CopyFiles@2
  inputs:
    targetFolder: $(Build.ArtifactStagingDirectory)

The Copy Files task takes the newly created artifact and copies it to a target directory, the build artifact staging directory variable. This is important because it gives us control over where the artifact is stored, so that we can publish the artifact from that same directory variable.

6. Publish artifact

- task: PublishBuildArtifacts@1
  inputs:
    PathtoPublish: '$(Build.ArtifactStagingDirectory)'
    ArtifactName: 'dist'
    publishLocation: 'Container'

Last but not least, we take the artifact from the staging directory and publish it under the artifact name dist to the Container publish location. By doing this, we can pick it up from the release pipeline and proceed further.

Congratulations! You have now created a fully functional build pipeline!

In the end, this is how the job should look after running it successfully:

You can even see that a source distribution by the name of “dist” was created.

By clicking on a step from the job, you can see further details from the log.

By clicking on the “test” tab, you can see deeper details about the tests that were run.

But how do we add tests to the code repository? We will answer this question in the next section.

Setting up unit tests in a build pipeline

Setting up tests with pytest is super simple. The only thing you need to do is make the filename of the file that holds your tests start with the word “test”.

Eg.

test_collection.py

Then commit this file into your repository.

In step 3 of the build pipeline, we install pytest and run it. With its integrated test discovery mechanism, pytest will automatically pick up the test file when the word “test” is the filename's prefix. Make sure to put all of your test files in the root directory. If you put them in a folder instead, you need to point the pipeline in the right direction, as shown in the variant after the YAML snippet below.

Eg. a simple assertion that the square root of 25 is 5:

import pytest
import math

def test_sqrt():
    num = 25
    assert math.sqrt(num) == 5

When you have your test file, you are free to add the following to your existing YAML file, which will be committed to your code repository and run by the pipeline:

- script: |
    pip install pytest pytest-azurepipelines
    pytest
  displayName: 'pytest'
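If your tests live in a subfolder such as tests/ rather than the root, point pytest at the folder explicitly (the folder name here is just an example):

- script: |
    pip install pytest pytest-azurepipelines
    pytest tests/
  displayName: 'pytest'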

Setting up a release pipeline to publish new artifacts with new package versions

In order to create your first release pipeline, go to Pipelines and Release, then click on “new release”.

Start by giving your release pipeline a proper name. The pipeline will contain zero stages initially and cannot be run. We will fix this in a moment.

Go into the Options tab and decide how you want the version naming convention to look.

The following code:

Release-$(rev:r)

will yield sequential release names like:

eg.

Release-19

You can edit the naming scheme if you want. Here is a list from Microsoft with possible masks:

Click on the blank artifact section to add a connection from the build artifact to the release pipeline. Pick the Build source and pay attention to the artifact name.

You can at this point decide whether the release pipeline should be triggered each and every time the build pipeline has been run, or you can choose to schedule it for specific timings.

Now we are ready to create a new stage. Click on “new stage” and give your stage an appropriate name; I named mine “Publish new artifact”. You can create multiple stages, for example one for production, another for development, and a third for staging. Within your stages, you can have multiple tasks, just like in the build pipeline. In my stage, I have three tasks:

Before we dive into the tasks of publishing the package, we need to create a “feed”. Go to the “artifacts” section and click on “add new feed”.

Give your feed a name and choose the access control. Go back to the steps section in the release pipeline and click on tasks.

Click on “Add a task to Agent job”.

Search for “Python twine upload authentication” and “Command Line”. Make sure to add the tasks to the list so that it matches what I have done below:

Select the newly created feed.

In the second task (a Command Line task), write a short script to install twine through Python.
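As when working locally, that script is simply:

pip install twine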

The last command line script is a bit tricky. This is the actual publication script: it takes the package from the path where the build source distribution is located and uploads it to the particular feed, using the PYPIRC_PATH environment variable that contains the twine authentication from the first task you created.

twine upload -r {NAME_OF_FEED} --config-file $(PYPIRC_PATH) d:\a\r1\a\_{NAME_OF_ORGANIZATION}.{PACKAGE_NAME}\dist\dist\*

You are now ready to create a release! 🎉

Based on how you triggered your release pipeline, either make a commit to the code repository on Github or deploy the release manually.

By the end of the release, this is how it should look if all went well:

You can go to the artifacts section and find your newly published package. 📦

You can click on the package and read more about how to use it if you wrote a README.md file. Click on the “connect to feed” button if you want to connect your environment to the current feed and install the package.

NB! Only after you have connected to the feed will you be able to pip install your package. Unless the package is upstreamed to a public index like PyPI, you need to connect your environment to the feed before you can pip install.

How to connect to the feed and use your new package with Python and pip.ini

After clicking on “connect to feed”, a popup appears that tells you how to connect to the feed. My personal impression was that it was a bit lacking in information. I had never used a pip.ini file myself, so after tons of research I finally found the best way of creating a pip.ini file to hold the generated Python credentials.

Installation guide for an Anaconda virtual environment:

  1. Open Notepad or another text editor.
  2. In Azure DevOps, click on Generate Python credentials for a pip.ini file.
  3. Copy the Python credentials into the text editor.
  4. Save the file as pip.ini (not as a text file).
  5. Go to C:\Users\User.Name\.conda\envs\Your-env (example: C:\Users\Anders.Gill\.conda\envs\DevOpsEnv).
  6. Save the pip.ini file in the aforementioned path.
  7. Open the command terminal for your virtual environment.
  8. Run pip install NAME_OF_PACKAGE==current-version (1.0.2, for example).
  9. Installation successful!
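For reference, the generated credentials typically take a shape like the sketch below. Treat this purely as an illustration and paste exactly what Azure DevOps generates for you, since the index URL and the personal access token are specific to your organization and feed:

[global]
extra-index-url=https://{FEED_NAME}:{PERSONAL_ACCESS_TOKEN}@pkgs.dev.azure.com/{ORGANIZATION}/_packaging/{FEED_NAME}/pypi/simple/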

Congratulations, you have now successfully created a fully functional CI/CD pipeline for your project! I hope you enjoyed this guide and will recommend it to others. 🎉

Feel free to ask any questions in the comments below.
