Automate a full end-to-end CI/CD Pipeline with Microsoft Azure and Talend

Talend
Microsoft Azure
Oct 14, 2019

Talend continues to evolve and simplify the implementation of CI/CD. In this blog, I will show how, now that Talend Cloud is available on Microsoft Azure, we can offer full automation, zero installation and end-to-end delivery of your Continuous Integration and Delivery pipelines to make all your DevOps goals achievable.

A Brief History

Before I get into the details of building an end-to-end CI/CD Pipeline on Microsoft Azure, let’s look back at our CI/CD journey. Back when Talend rolled out our first Continuous Integration capabilities in the Summer of 2016, it was just the beginning of an evolution to quicken the time to delivery for integration projects. By combining unit testing capabilities and Maven compatibility standards into Talend Studio, data integrators could utilize familiar CI Orchestration tools such as Jenkins to automatically build, unit test and package their Talend jobs into an artifact repository. It was a great first step but didn’t quite address the aspects of Continuous Delivery. Further, it required significant configuration within Jenkins as well as additional, manual software installations.

Fast-forward to today where data-driven companies are constantly striving for more data insights and pressuring the development teams for quick and reliable data processes. In order to meet these growing demands, IT organizations are embracing the DevOps methodologies, including a full, end-to-end CI/CD pipeline with zero downtime.

What’s the difference?

Integrating Talend into the CI/CD process is nothing new. In fact, earlier this year, I wrote about leveraging Microsoft Azure DevOps and Talend to build a CI/CD Pipeline. But, as described in this article, that approach required some heavy lifting to build and configure an agent on the Azure platform to host the Talend CI Builder services. Just a few months later, Talend offered a zero-install CI feature that eliminated the need for the heavy lifting and made Continuous Integration in the cloud easier than ever. Now, with the latest capabilities of Talend Cloud on Microsoft Azure, we can combine the best of both worlds: the Azure DevOps orchestration engine with Talend’s zero-install CI feature and new native cloud services, all on the Azure platform. Together they achieve the ultimate goal of Continuous Integration, Continuous Deployment and Continuous Delivery in a fully automated way.

The Technical Details

Let’s look at a concrete example. We are going to create a CI pipeline in Azure Pipelines, leveraging the zero-install CI feature in order to use Microsoft-hosted agents. The steps are essentially the same as those described in my previous article, Building a CI/CD pipeline with Talend and Azure DevOps, except for setting up the agent. In this article we don’t have to set up an agent; we use one provided by the Azure DevOps platform. Our goal is to publish our jobs to Talend Cloud through our CI pipeline.

We won’t go into full detail here, as the first steps are already covered in the previous article. Here we only go into the specifics of using the zero-install CI feature.

Files can be found on GitHub here.

1. Pre-requisites

We have two main pre-requisites here:

  • We still need to set up a Nexus repository to sync our third-party libraries between Talend Studio and Nexus to make the CI work. This has been detailed here, so please refer to Configure the Nexus for third-party libraries within that blog.
  • The second requirement is to host what we call the P2 repository. I explained what it is and what it is for in this article. Once you have downloaded it from your license email or from Talend Cloud, you just need to host it on a regular HTTP server or, for instance, in Azure Blob Storage.

If you want to host it in Azure Blob Storage check out the following steps, otherwise skip this part:

a) Create a Storage Account and enable static website hosting.

b) Copy the Primary endpoint. This is going to be the URL that points to the P2 repository. Enabling static website hosting creates a $web container. Upload all the unzipped P2 repository contents into this newly created container. Within a few seconds your P2 repository is hosted and available.

2. Azure Pipelines setup

First, we need to specify variables in Azure Pipelines (Pipelines -> Library -> + Variable Group):

You will need to set:

  • CLOUD_PASSWORD -> Talend Cloud user password you use to log in
  • CLOUD_URL -> Talend Cloud URL depending on your datacenter location
  • CLOUD_USERNAME -> Talend Cloud user email (with domain if you use one) you use to log in
  • NEXUS_URL -> Nexus URL used to sync the third-party libraries
  • NEXUS_PASSWORD
  • NEXUS_USERNAME
  • UPDATESITE_PATH -> URL to the hosted P2 repository (needs to be accessible from outside)

Remark: You can also use a personal access token instead of your username/password:

Use the parameter: -Pcloud.token in the pipeline below and only set a CLOUD_TOKEN variable.

Then add two secure files: your 7.2.1+ platform license file and a settings.xml configuration file for Maven. The settings.xml file uses parameters that come from the variables we just set in Azure Pipelines. Don’t forget to authorize the secure files for use in the pipelines.

3. CI Pipeline

We will use a YAML file to create our CI pipeline in Azure Pipelines. It is the only file we need to configure to make our pipeline work. You can find it here in the GitHub repository.

Let’s go through it step by step.

trigger: none

The trigger attribute defines when the pipeline should run. In our case we decided to disable automatic triggering, but you could just as well trigger the pipeline on each commit to your repository.
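As a rough sketch of that alternative, the trigger block below runs the pipeline on every commit to one branch, limited to changes under the Talend project folder. The branch name master and the path filter are assumptions for illustration only; adjust them to your own repository layout.

```yaml
# Assumed example: run on each commit to master that touches the Talend project folder.
trigger:
  branches:
    include:
      - master                  # assumption: name of your main branch
  paths:
    include:
      - PUT_YOUR_PROJECT_NAME   # assumption: project folder sits at the repository root
```

This block would take the place of trigger: none at the top of the YAML file.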

pool:
  vmImage: 'ubuntu-latest'

The pool describes where your pipeline will run. Here we use the ‘ubuntu-latest’ virtual machine image, a hosted agent provided by Azure to run your build. It comes with Java, Maven and Docker installed, among many other tools.

variables:
- group: Talend Configuration
- name: project_name
  value: 'PUT_YOUR_PROJECT_NAME'
- name: job_name
  value: 'put_your_job_name'
- name: job_version
  value: '0.1'

These are the variables we have set for the pipeline. The first is a group and corresponds to the variable group we created above. The rest are specific to the job you want to build and deploy.

steps:
- task: DownloadSecureFile@1
  name: settings_xml
  inputs:
    secureFile: settings.xml
- task: DownloadSecureFile@1
  name: license
  inputs:
    secureFile: license
- task: Maven@3
  inputs:
    mavenPomFile: '$(project_name)/poms/pom.xml'
    mavenOptions: |
      -Dlicense.path=$(license.secureFilePath)
      -Dupdatesite.path=$(UPDATESITE_PATH)
      -Dservice.url=$(CLOUD_URL)
      -Dcloud.publisher.screenshot=true
      -Xms1024m -Xmx3096m
    options: '--settings $(settings_xml.secureFilePath) -Pcloud-publisher -pl jobs/process/$(job_name)_$(job_version) -am'
    goals: 'deploy'

Now comes the real deal. The pipeline starts by downloading the two secure files we uploaded before: the settings.xml file for Maven configuration and the license file.

The last task is the Maven command. As you can see, we specified the UPDATESITE_PATH URL. This means that when the task runs on our hosted agent, it downloads and installs the CommandLine automatically, then builds and publishes our job to Talend Cloud, as requested by the -Pcloud-publisher option. To keep it simple, we did not add options to publish to a particular environment or workspace. By default, your artifacts are published to the default environment defined for your account.
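If you do want to target a specific environment and workspace, the Maven task could be extended along the following lines. This is only a sketch: it assumes the cloud.publisher.environment and cloud.publisher.workspace parameters of the Talend CI Builder, and the talend_environment and talend_workspace pipeline variables are hypothetical names you would have to define yourself. Check the Talend documentation for your version before relying on it.

```yaml
steps:
  - task: Maven@3
    inputs:
      mavenPomFile: '$(project_name)/poms/pom.xml'
      # Same options as in the pipeline above (JVM memory flags left out for brevity),
      # plus two assumed parameters selecting the publication target.
      mavenOptions: |
        -Dlicense.path=$(license.secureFilePath)
        -Dupdatesite.path=$(UPDATESITE_PATH)
        -Dservice.url=$(CLOUD_URL)
        -Dcloud.publisher.screenshot=true
        -Dcloud.publisher.environment=$(talend_environment)
        -Dcloud.publisher.workspace=$(talend_workspace)
      options: '--settings $(settings_xml.secureFilePath) -Pcloud-publisher -pl jobs/process/$(job_name)_$(job_version) -am'
      goals: 'deploy'
```

Keeping the target in variables rather than hard-coding it makes it easier to reuse the same YAML file for several environments.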

At this point your CI pipeline on Azure Pipelines is ready to be run. If everything goes right, you should see your artifact in your Talend Cloud account published in your default environment.

4. Optimization

The current pipeline downloads the CommandLine at each build. If you only run builds from time to time, this can be acceptable. But if you run a lot of builds, you might want to avoid it and speed up the process. One solution is to cache the CommandLine in the Azure DevOps workspace. Fortunately, the Azure DevOps team recently released pipeline caching (in beta). It can be used for the CommandLine as well as for the Maven local repository (.m2) if you regularly build the same type of jobs.
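A minimal sketch of caching the Maven local repository could look like this. It uses the Cache@2 task, the name under which pipeline caching is available today, and assumes a MAVEN_CACHE_FOLDER variable (a name chosen here for illustration) pointing to a folder inside the pipeline workspace.

```yaml
variables:
  - group: Talend Configuration
  # Assumption: keep the Maven local repository inside the pipeline workspace so it can be cached.
  - name: MAVEN_CACHE_FOLDER
    value: $(Pipeline.Workspace)/.m2/repository

steps:
  # Restores the folder when a matching key exists and saves it at the end of a successful run.
  - task: Cache@2
    displayName: 'Cache Maven local repository'
    inputs:
      key: 'maven | "$(Agent.OS)" | **/pom.xml'
      restoreKeys: |
        maven | "$(Agent.OS)"
        maven
      path: $(MAVEN_CACHE_FOLDER)
```

For the cache to be used, Maven has to be pointed at that folder, for example by adding -Dmaven.repo.local=$(MAVEN_CACHE_FOLDER) to the options of the Maven task. Caching the CommandLine would follow the same pattern, with path set to the folder where it gets installed.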

Conclusion

Not too long ago, implementing a CI/CD pipeline seemed like an arduous task with a heavy stack of software installations and complex configurations. Recently, Talend considerably simplified the effort by offering a zero-install CI feature. Today, through our partnership with Microsoft, we can now offer our Talend Cloud integration application on the Azure Cloud Platform and take full advantage of both solutions to provide a fully automated, end-to-end CI/CD Pipeline.

To learn more, visit the Azure solution page

Thibaut Gourdel has been Technical Product Marketing Jr at Talend since 2017. His areas of interest include cloud technologies, containerization, serverless computing and data stream processing.

Originally published at https://www.talend.com on October 14, 2019.
