Cloud-Native Development: Chemical & Consumer Goods Company

Adin Poprzanovic · Published in RUBICON Stories · Jul 27, 2020 · 10 min read

Digital Transformation: RUBICON Develops a Secure Cloud-Native Platform for a Multinational Chemical & Consumer Goods Company

In November 2019, we partnered with a Multinational Chemical & Consumer Goods Company to help them digitally transform their business. With over 50,000 employees, the client was ready to move away from traditional, manual operations and automate their business activities. The first step was to give the company's employees and partners a single platform on which to easily store, access, and share large amounts of data, instead of juggling multiple tools. Our team delivered a complete software solution for the client's needs: a secure, cloud-native web platform for storing and accessing data efficiently.

The Client

The Chemical & Consumer Goods Company is headquartered in western Europe, with offices around the world. Operating in markets ranging from beauty care to laundry and home care, the company has been in business for over 100 years.

The Project

Ready to digitally transform their business and automate everyday internal tasks, the client needed a secure cloud-based web platform on which data could be easily stored and exchanged within the company.

What was RUBICON’s task?

RUBICON's goal was to develop a generic, highly modular web-based portal that lets users access and exchange relevant data located in the Azure cloud (from the Data Lake, the Data Warehouse, and other sources) in a secure and well-managed way.

The platform we created is called the Data Lake Management Portal, where all company data is managed in one place: the cloud.

The Challenges

While developing the portal, our team overcame the following challenges:

  • Developing a cloud-based web explorer that gives users quick and easy access to data and the ability to share and store data in the Data Lake;
  • Implementing a scalable cloud architecture to serve at least 1,000 concurrent users at peak and more than 10 million files, with 30,000 new files added per day;
  • Implementing Single Sign-On (SSO) authentication and authorization using Azure Active Directory, integrated with other services and APIs within the organization;
  • Integrating with the organization's big data pipelines, APIs, and other services;
  • Designing an intuitive and modern web application;
  • Developing a single-page application (SPA) for a fast and smooth user experience.

The Software Development Process

In November 2019, after a brief introduction to the client's project, our team organized a Lean Inception Workshop with both our team and the client's team at their main office. The goal of the workshop was to shape the very first version of the product for the first phase of development. Over four days, the two teams collaborated to understand the client's needs: we defined the product vision and goals, explored features and user journeys, and reviewed the technical, business, and UX components. By the fourth day of the workshop, we had release version 01 ready for development, with a set of clear guidelines and features.

Once we created the product roadmap and introduced the idea to the RUBICON team, we began onboarding the team members dedicated to the project. The roles that made up the team were:

  • Frontend Developer
  • Backend Developer
  • UX/UI Designer
  • DevOps Engineer
  • Product Owner
  • Solution Architect

Working within the Agile framework, our team followed Scrum, splitting the workload into two-week sprints.

Once release version 01 was complete, we launched a beta with the client's team, giving them access to the portal so they could incorporate it into their work routine.

User Testing

After launching release version 01, RUBICON organized a set of user testing interviews. Drawing on our previous user testing experience, we subscribe to Jakob Nielsen's finding that you only need 5 users to identify approximately 80% of all usability problems: as you add more and more users, you learn less and less, because you keep seeing the same things again and again.
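
This rule of thumb traces back to Nielsen and Landauer's problem-discovery model. Assuming the commonly cited average probability of about 0.31 that a single test user uncovers any given problem, the expected share of problems found by n users is:

P(n) = 1 − (1 − 0.31)^n, so P(5) = 1 − 0.69^5 ≈ 0.84

which is where the "five users find roughly 80% of the problems" figure comes from.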

We performed user interviews with our candidates, one at a time. The interviews were done remotely, with the following participants:

  1. Interviewer — RUBICON
  2. User testing candidate — client
  3. Silent observers — the client and RUBICON

The interviewer guided the candidate through release version 01 with a series of essential questions: non-leading, open-ended questions that encourage the candidate to think critically. The candidate, who had never seen the product before, logged into the system, worked through the product, and voiced their concerns and suggestions out loud. Users came from different departments, so we had input from all of the personas we had defined at the beginning of the project.

As a result, we had a list of improvements and additional features that needed to be developed for the product to deliver more value to all users. We also knew we were on the right path, as 80% of our candidates asked for features that were already in development at the time.

After analyzing the feedback, we hosted another Lean Inception Workshop for release version 02; this time, the workshop was held remotely due to the COVID-19 pandemic. Its purpose was to extend the feature set and define the second phase of the project. We took the user testing results into account when defining our goals and features for release version 02, and with those results in hand, our team started working on the second phase of development.

Once the features for release version 02 are implemented, we will run a second wave of user testing, define the improvements we need to make, organize a new workshop, and start the process all over again.

Data Lake Management Portal

The Data Lake Management Portal is a unified, user-friendly web portal and service designed so that any number of users can quickly and easily access and share data in the Data Lake. The portal enables users to browse, download, upload, request, share, and view available data within the company.

The portal contains the following features:

Functional Features

  • Users can connect to the portal using Azure AD Single Sign-On
  • Search available data assets in the Data Lake
  • Support for applications and workspaces needed for specific users' data use cases
  • Request data from data owners
  • Approve/reject requests for data access
  • Web-based file explorer that lets users upload, download, and create files and folders in the underlying Data Lake
  • Portal features exposed through an API, with a secure connection via an Azure AD service principal (see the sketch after this list)
  • Various event notifications: email and in-app user notifications
  • Add collaborators to the application workspace
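
To make the service-principal path above concrete, here is a minimal sketch of how another system could call the portal's API using the OAuth 2.0 client-credentials flow with MSAL for Node. The IDs, the endpoint, and the URL are hypothetical placeholders, not the client's actual configuration:

    import { ConfidentialClientApplication } from "@azure/msal-node";

    // Service principal credentials; all values below are placeholders.
    const cca = new ConfidentialClientApplication({
      auth: {
        clientId: "<client-id-of-the-calling-system>",
        authority: "https://login.microsoftonline.com/<tenant-id>",
        clientSecret: process.env.CLIENT_SECRET!,
      },
    });

    async function listAssets(): Promise<unknown> {
      // Acquire an app-only token for the portal API's default scope.
      const token = await cca.acquireTokenByClientCredential({
        scopes: ["api://<portal-api-app-id>/.default"],
      });
      // Call a hypothetical portal endpoint with the bearer token.
      const res = await fetch("https://portal.example.com/api/assets", {
        headers: { Authorization: `Bearer ${token!.accessToken}` },
      });
      return res.json();
    }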

Non-Functional Requirements

  • REST API documentation
  • Daily backups
  • System security: access is restricted to authorized users; users connect to the portal via Azure AD, and other systems and APIs connect via a service principal
  • Automation: CI/CD pipelines covering testing, deployment, and infrastructure
  • Testability — unit and integration tests
  • Usability — simple, clean, and usable UX/UI
  • Scalability — on-demand scalability
  • Audit logs — for all user access and actions
  • Code quality standards — clean code, static code analysis

The Cloud Solution

The client requested a scalable cloud solution with the following requirements:

  • To be run on Azure cloud
  • Easy long-term maintenance
  • Stability
  • On-demand scalability
  • To use an existing Azure Active Directory

The goal was to create a cloud-native solution to support the requested features and requirements. The focus was on Azure's Platform as a Service (PaaS) offering, which keeps operational effort to a minimum so the client wouldn't have to maintain and manage resources such as storage, servers, applications, and services.

Following industry-proven practices for scalable and performant cloud solutions, RUBICON implemented the solution shown in the diagram below:

This architecture pairs a JavaScript single-page application on the front end with back-end REST APIs. The solution serves static content from Azure Blob Storage and implements the back-end APIs with Azure App Service and Azure Functions. Users sign in to the web application with their Azure AD credentials, and Azure AD returns an access token, which the application uses to authenticate API requests. The back-end APIs, in turn, authenticate against and invoke other company services and APIs using the OAuth 2.0 On-Behalf-Of flow.
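
On the SPA side, the sign-in and token exchange look roughly like the following sketch using MSAL for the browser; the client ID, tenant, scope, and endpoint are placeholders rather than the client's actual configuration:

    import { PublicClientApplication } from "@azure/msal-browser";

    // Placeholder Azure AD app registration values.
    const msal = new PublicClientApplication({
      auth: {
        clientId: "<spa-client-id>",
        authority: "https://login.microsoftonline.com/<tenant-id>",
      },
    });

    const apiScopes = ["api://<backend-api-app-id>/.default"];

    async function fetchFiles(): Promise<unknown> {
      await msal.initialize();
      // Interactive sign-in, then a silent token request for the backend API.
      const login = await msal.loginPopup({ scopes: apiScopes });
      const token = await msal.acquireTokenSilent({
        account: login.account!,
        scopes: apiScopes,
      });
      // Every API request carries the access token as a bearer credential.
      const res = await fetch("/api/files", {
        headers: { Authorization: `Bearer ${token.accessToken}` },
      });
      return res.json();
    }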

Architecture

The architecture includes the following components:

  • Blob Storage. Static web content, such as HTML, CSS, and JavaScript files, is stored in Azure Blob Storage and served to clients using static website hosting. All dynamic interaction happens through JavaScript code calling the back-end APIs.
  • CDN. A CDN caches content for lower latency and faster delivery.
  • Azure AD. Users sign into the Data Lake Management Portal using their Azure AD credentials. Azure AD returns an access token for the backend API, which the Data Lake Management Portal uses to authenticate API requests.
  • App Service. App Service builds and hosts the RESTful API. The back-end API is responsible for calling the client's downstream APIs and uses the OAuth 2.0 On-Behalf-Of flow to propagate the delegated user identity and permissions. App Service provides autoscaling and high availability without our having to manage infrastructure.
  • Queue. Background tasks, events, and messages are placed onto an Azure Queue Storage queue; each message can then trigger a function app (see the sketch after this list).
  • Function App. A Function App runs background tasks in response to events or messages placed in the queue.
  • Data Storage. Azure SQL Database holds relational data, Cosmos DB holds non-relational data, and Azure Cache for Redis covers caching requirements.
  • Azure DevOps Pipelines. Pipelines provide the continuous integration (CI) and continuous delivery (CD) that build, test, and deploy the solution.
  • Monitoring. We use the monitoring services readily available with the Azure PaaS offering: Application Insights, Azure Monitor, and Log Analytics.
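
As an illustration of the queue-and-function pattern, a minimal queue-triggered Azure Function in TypeScript might look like the sketch below. The function name and message shape are assumptions; the actual queue binding (queue name, storage connection) lives in the function's function.json:

    import { AzureFunction, Context } from "@azure/functions";

    // Runs whenever a message lands on the storage queue
    // (queue name and connection are configured in function.json).
    const processEvent: AzureFunction = async function (
      context: Context,
      queueItem: { userId: string; event: string } // assumed message shape
    ): Promise<void> {
      context.log(`Processing '${queueItem.event}' for user ${queueItem.userId}`);
      // Here the app would send an email or an in-app notification; on success
      // the platform deletes the message, on failure it retries automatically.
    };

    export default processEvent;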

DevOps

For provisioning Azure resources, we used Terraform and Terragrunt. These templates make it easier to automate deployments and provision different environments in minutes; for example, load-testing environments can be replicated only when they are needed, which saves costs.

Our DevOps team implemented continuous integration (CI) and continuous deployment (CD) pipelines using Azure DevOps Pipelines services that automatically build, test, and deploy every source code change. Azure DevOps provides Repos for source code control, Pipelines for CI/CD, Artifacts to host build artifacts, and Boards for developer collaboration and coordination. With these processes in place, we focus on the development of the applications rather than the management of the supporting infrastructure.

Frontend CI is triggered on every git push to feat, fix, and chore branches. The build stage includes jobs and tasks that clone the repo, install the npm tooling, build the solution, run unit tests, and then package and publish artifacts to Azure Artifacts. When the source branch is merged, the build pipeline triggers the release pipeline.

Backend CI/CD for containers is a single pipeline with a build stage and a deploy stage. The build stage installs the .NET Core SDK, builds the solution, runs integration tests using Key Vault secrets, and publishes the test results and build artifacts. The deploy stage is skipped unless the source branch is develop or master; it contains two tasks: building and pushing the Docker image to the Container Registry, and deploying the container image to Azure App Service for Containers.

The initialization stage of the IaC pipeline initializes a resource group with a storage account and a container that serve as the remote state. In the plan stage, we validate and create the plan for the infrastructure; before we can do that, we first download Terragrunt and the secrets file stored in Azure DevOps. Finally, in the apply stage, after a manual approval check, we apply our changes to the infrastructure.

Results

Seven months later, we had developed a secure web portal with a simple, modern UX/UI design that lets users easily read and write files in the underlying data lake. We successfully helped the client say goodbye to their traditional ways of working and transition to the new software, making their operations more efficient and automated.

Technology Stack

Applications & Data: Microsoft Azure Cloud

  • Azure SQL Database
  • Azure CosmosDB
  • Azure Queue Storage
  • Azure Table Storage
  • Azure Blob Storage
  • Azure Functions
  • Azure Active Directory, Microsoft identity platform
  • Azure App Service
  • Azure Container Registry
  • Application Insights, Log Analytics

DevOps

  • Git
  • Azure DevOps: Boards, Repos, Pipelines
  • Docker
  • Terraform, Terragrunt

Project Screenshots

Information

Company: Multinational Chemical & Consumer Goods company
Region: Western Europe
Industry: Consumer Goods
Project Duration: 2019 — ongoing
Project: Cloud Native & Web Development

Originally published at https://www.rubicon-world.com.
