Documenting Your Data-Science Project — A Guide To Publish Your Sphinx Code Documentation

Lydia Nemec
7 min read · Aug 15, 2019


Figure 1 Leonardo da Vinci’s visionary notebook, the Codex Arundel. It is available at the British Library. [1]

In this article, we would like to give you a step-by-step guide on how to document your Python Data Science project effectively as part of your machine learning model development. You can find our example code on GitHub. The solution we propose ensures that your documentation is version controlled, shipped with the source code that executes your machine learning experiment, and made available to your users or co-workers using generally available tools: Sphinx, GitHub, Azure DevOps, and Azure Web App.

Before we start with the technical implementation, we would like to share our thoughts on the necessity and peculiarities of Data Science project documentation. Data Science projects include writing code and using machine learning libraries such as TensorFlow, scikit-learn, and PyTorch. In Data Science, we develop software for data preparation, for machine learning model training and testing, and for result evaluation. At least in part, we can treat Data Science projects as if they were software development projects with an additional dynamic component: the data. Therefore, adopting best practices from the software development community in our Data Science projects is a good start. Documentation paradigms in software development aim to support continuous code development, maintenance, and knowledge transfer between developers. For more thoughts about documenting software, we recommend reading the blog post by Andrew Goldis [2].

The Workflow: An Overview

In this tutorial, we will connect four different tools or services, namely Sphinx, GitHub, Azure DevOps, and Azure Web App.

Figure 2 Workflow demonstrating how the various tools and services connect to one automated pipeline

In Figure 2, we sketched the automated workflow from the local Sphinx installation to the published documentation. One of the major advantages of the solution provided in this blog is the ability to control access to your documentation using Azure Active Directory sign-in. In a corporate environment in particular, this can be a crucial point. The code used in this example can be found on GitHub.

The Workflow

  • ensures that your documentation is automatically updated whenever new code is committed to GitHub or Azure DevOps.
  • publishes the documentation to the Microsoft Azure cloud with the option of full access control.
  • supports user management, granting access to the documentation to either anonymous or authenticated users.


Getting Sphinx up and running

In case you do not have a working Sphinx environment, we recommend the Sphinx documentation and tutorial. In a new Python project, we use the Sphinx command-line tools in combination.

Remember to run your Python setup script before you build the documentation (python setup.py install).
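
As a rough illustration of such a command sequence, the sketch below assembles one plausible combination: installing the package, generating a full Sphinx project with sphinx-apidoc, and rendering HTML with sphinx-build. The package name my_package and the docs directory are hypothetical placeholders, and this is not necessarily the exact sequence used in the example repository. The script only prints the commands, so you can review them before running anything.

```python
"""Sketch: assemble a typical Sphinx command sequence (names are placeholders)."""
import shlex
from typing import List


def sphinx_commands(package_dir: str, docs_dir: str = "docs") -> List[List[str]]:
    """Return the commands for a basic Sphinx build as argument lists."""
    return [
        # Install the package first so that autodoc can import its modules.
        ["python", "setup.py", "install"],
        # --full creates conf.py, index.rst and a Makefile next to the module stubs.
        ["sphinx-apidoc", "--full", "-o", docs_dir, package_dir],
        # Render the reStructuredText sources in docs_dir to HTML.
        ["sphinx-build", "-b", "html", docs_dir, f"{docs_dir}/_build/html"],
    ]


if __name__ == "__main__":
    # "my_package" is a hypothetical package directory.
    for command in sphinx_commands("my_package"):
        print(shlex.join(command))
```

To execute the commands instead of printing them, each argument list could be passed to subprocess.run(command, check=True).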

Connecting GitHub and Azure DevOps

Connecting GitHub with Azure DevOps is easy to set up:

  1. Sign in to Azure Boards
  2. Choose Project Settings
  3. Choose GitHub connections
  4. Click on “Connect your GitHub account”
  5. Sign in using your GitHub credentials
  6. Select the GitHub repository you would like to connect
  7. Choose Save
  8. Review the GitHub page and then choose “Approve, Install, & Authorise”
  9. Confirm with your GitHub password

A comprehensive and detailed description of how to connect Azure DevOps to GitHub is given here.

Setting up the Documentation Build Pipeline

Following best practices and the experience of the software development community, we will keep the build and deploy/release pipelines separate. An overview of Azure DevOps pipelines can be found here. In this section, we focus on the build pipeline. Our pipeline will go through 5 steps (see Figure 3).

Figure 3 A graphical representation of the build pipeline. The pipeline is defined in a YAML file (see GitHub), which lists the steps necessary to automatically build the Sphinx-generated documentation.

First, the Python packages are installed and upgraded. Next, the Python package we would like to document is set up. Then the Sphinx documentation is built, and finally the generated HTML files are copied into the artifact directory.
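
To make the last of these steps concrete, here is a minimal local sketch of what copying the generated HTML into an artifact directory amounts to. It is not the pipeline's actual YAML task; the function name stage_html and the directory layout are assumptions made for illustration only.

```python
"""Sketch: copy Sphinx HTML output into an artifact directory (paths are assumptions)."""
import shutil
import tempfile
from pathlib import Path


def stage_html(build_dir: Path, artifact_dir: Path) -> int:
    """Copy the built HTML tree into artifact_dir and return the number of files staged."""
    # dirs_exist_ok (Python 3.8+) allows repeated runs to overwrite an existing tree.
    shutil.copytree(build_dir, artifact_dir, dirs_exist_ok=True)
    return sum(1 for path in artifact_dir.rglob("*") if path.is_file())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Simulate a Sphinx HTML build with a single page.
        html_dir = Path(tmp) / "docs" / "_build" / "html"
        html_dir.mkdir(parents=True)
        (html_dir / "index.html").write_text("<h1>Docs</h1>", encoding="utf-8")
        # "drop" mirrors the artifact name mentioned in this article.
        staged = stage_html(html_dir, Path(tmp) / "artifacts" / "drop")
        print(f"staged {staged} file(s)")
```

In the hosted pipeline, the equivalent work is done by the copy and publish steps defined in the YAML file; the sketch only shows the effect on the file system.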

Let’s get started:

  • Sign in to Azure DevOps and navigate to your project
  • Navigate to the Pipelines page (left column)
  • Choose the action to create a new pipeline
  • Walk through the steps of the wizard and select GitHub as the location of your source code.
  • You might be redirected to GitHub to sign in. If so, enter your GitHub credentials
  • Select the git repository from the list
  • Configure your pipeline by selecting the YAML file from the repository (see Fig. 4)
Figure 4 Screenshot of Azure DevOps selecting the YAML file from the Git repository.
  • Review the pipeline defined in the YAML file and click run (see Fig. 5). The full YAML file can be found in the GitHub repository.
Figure 5 Screenshot of Azure DevOps showing the imported YAML file for review and the option to run it.
  • To see your build select Azure DevOps ➢ Pipelines ➢ Builds ➢ select the latest build (Fig. 6)
Figure 6 Screenshot of Azure DevOps showing the history of executed build pipelines. The top CI build is still running as indicated by the blue icon.
  • The details of the pipeline are listed and the generated HTML files can be found under artifacts ➢ drop (Fig. 7 top right)
Figure 7 Screenshot of Azure DevOps showing the details of the executed build pipeline. The detailed logs of each build step can be accessed through this interface.

Setting up Microsoft Azure Web App

Let’s use the Azure Cloud Shell, so that we do not need to install the Azure CLI locally.
Create a resource group, then the Azure App Service plan, and finally the Azure Web App.
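
The three creation steps map onto three Azure CLI calls. The sketch below assembles them as argument lists and prints them for review. All resource names, the region, and the F1 (free) pricing tier are hypothetical choices for illustration and should be replaced with your own values.

```python
"""Sketch: Azure CLI calls for a resource group, App Service plan and Web App."""
import shlex
from typing import List


def azure_webapp_commands(
    resource_group: str, location: str, plan: str, app_name: str, sku: str = "F1"
) -> List[List[str]]:
    """Return the az commands, in order, as argument lists."""
    return [
        # 1. The resource group that holds everything else.
        ["az", "group", "create", "--name", resource_group, "--location", location],
        # 2. The App Service plan (pricing tier given by sku).
        ["az", "appservice", "plan", "create", "--name", plan,
         "--resource-group", resource_group, "--sku", sku],
        # 3. The Web App itself, hosted on the plan created above.
        ["az", "webapp", "create", "--name", app_name,
         "--resource-group", resource_group, "--plan", plan],
    ]


if __name__ == "__main__":
    # All names below are placeholders, not the ones used in the example project.
    for command in azure_webapp_commands(
        "sphinx-docs-rg", "westeurope", "sphinx-docs-plan", "my-sphinx-docs"
    ):
        print(shlex.join(command))
```

The printed lines can be pasted into the Cloud Shell one by one. Note that the Web App name has to be globally unique, because it becomes part of the site URL.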

Now, you can visit your created web service with the browser of your choice.

Figure 8 Screenshot of the empty Microsoft Azure App Service.

If you would like to control access to the web app, and thereby to your documentation, you can use Microsoft Azure Active Directory sign-in. This is often necessary in a corporate environment. Good documentation can be found here. The ability to use Azure Active Directory is a major advantage of the solution presented in this blog.

Connecting Azure DevOps and Azure Web App Service

To be able to set up the release pipeline, we first need to connect Azure DevOps with Azure Web App Service. In short, you need to go through the following steps:
a) Project settings
b) Service connection
c) Choose “+ New Service Connection”
d) Select the connection type: Azure Resource Manager

A detailed description can be found here. If you are working with your company subscription, you may not have the necessary rights to set up a new service principal. In that case, ask your local administrator or IT department to do it for you.

Setting up the Documentation Release Pipeline

In the build pipeline, Sphinx generated the HTML files containing the documentation. In the final step, we will deploy the web app via a release pipeline (doing it in the build pipeline is dirty; remember that build != deployment). Let’s get started with setting up the release pipeline:

  • Sign in to Azure DevOps and navigate to your project
  • Navigate to the Releases page (left column)
  • Choose the action “+ New release pipeline”
  • You can select Featured templates or an empty job
  • Choose the pipeline “Azure App Service deployment” and click “apply”
  • You can rename “stage 1” e.g. to “DeploySphinxDocu” and close the window (Fig. 9)
Figure 9 Screenshot of Azure DevOps showing the details of the release pipeline with the option to rename the stage.
  • We renamed the pipeline to “Release Latest Sphinx Documentation”
  • Click on view stage tasks “1 job, 1 task”
  • Set up the task (see Fig. 10):
    a) Connection type: Azure Resource Manager
    b) Azure subscription: Use the connection you set up in the previous section (it should show up in the drop-down menu)
    c) App Service type: Web App on Windows
    d) App Service name: use the name you defined in the section “Setting up Microsoft Azure Web App”
    e) Adjust the field “Package or folder” to point to the artifact folder (remember that in the build pipeline we defined it as drop)
Figure 10 Screenshot of Azure DevOps release pipeline defining the deployment of the HTML files to the Azure App Service
  • Select “Pre-deployment conditions” and choose “After release” as your selected trigger (Fig. 11)
Figure 11 Screenshot of Azure DevOps release pipeline setting the release trigger and adding a filter for the git master branch.
  • Set up Artifact filters. Choose Type: Include & Build branch: master
Figure 12 Screenshot of Azure DevOps release pipeline setting the continuous deployment trigger
  • Edit Artifacts
  • Set continuous deployment trigger
  • You can manually trigger a release by choosing “create release” (top right)
    a) Click create release
    b) You can see your release in the Release overview
    c) Select the latest release to monitor your progress

Finally, you can visit the web app and see your Data Science documentation.

Figure 13 Screenshot of the automatically generated and released Data Science project documentation. (Link)

Every time you push new code to the master branch, new documentation will be built using Sphinx and then deployed to the Azure Web App. That way, your documentation will always be up to date.

Special Thanks

Special thanks to Timo Klimmer for his support. You can find him on LinkedIn and Twitter.



Lydia Nemec

I am the Head of ZEISS AI Accelerator with a background in computational physics, numerics and machine learning bridging the way from research to innovation.
