Distributed RPM Package Management and CI/CD Deployment for Real-Time Recharge Payments Across 500+ Servers

Published in Airtel Digital, Jan 15, 2024

Introduction

Real-Time Recharge Payment runs on a network of more than 500 Red Hat-based Linux servers, and managing software deployments at that scale demanded a robust, scalable approach to distributed RPM package delivery.

Real-Time Recharge Payment therefore uses distributed RPM package management to handle application software packages consistently across all of these systems.

The core application runs as a high-availability (HA) module within a distributed architecture and must be available 24/7, because it handles recharge fulfilment for 120 million customers. Its transaction management must ensure the atomicity and consistency of recharge transactions across instances. We therefore designed the deployment pipeline so that the remaining application instances keep running at all times while updates are rolled out.
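One common way to keep a service available during package updates is to upgrade servers in small batches, so that most instances keep serving traffic at any moment. The following is a minimal Python sketch of that idea, not the pipeline's actual implementation; the `rolling_batches` helper, the `max_unavailable_percent` parameter, and the host names are assumptions made for illustration.

```python
from typing import Iterator, List, Sequence


def rolling_batches(hosts: Sequence[str],
                    max_unavailable_percent: int = 10) -> Iterator[List[str]]:
    """Yield batches of hosts so that at most the given percentage
    of the fleet is being updated at the same time."""
    if not 1 <= max_unavailable_percent <= 100:
        raise ValueError("max_unavailable_percent must be between 1 and 100")
    # Integer arithmetic avoids rounding surprises; always update at
    # least one host per batch, even for very small fleets.
    batch_size = max(1, len(hosts) * max_unavailable_percent // 100)
    for start in range(0, len(hosts), batch_size):
        yield list(hosts[start:start + batch_size])


# Hypothetical fleet of 500 hosts, updated 5% at a time.
fleet = [f"app-{i:03d}.example.internal" for i in range(500)]
batches = list(rolling_batches(fleet, max_unavailable_percent=5))
```

For a 500-server fleet with `max_unavailable_percent=5`, this yields 20 batches of 25 hosts, so at least 95% of the instances are left untouched at any given time.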

Background

There are several benefits of using a distributed RPM package manager. First, it can help to save time and effort, as you can manage software packages on all of your servers from a single location. Second, it can help to improve consistency, as you can ensure that all of your servers are running the same versions of the same software packages. Third, it can help to improve security, as you can centrally apply security patches to all of your servers.

Who should read this document?

Distributed RPM package managers can be a valuable tool for organisations that need to manage RPM package deployments across multiple servers.

Design

Scalability: The distributed architecture of Real-Time Recharge Payment adapts to expanding server deployments beyond 500 servers.

Reduced Costs and Complexity: Streamline software distribution and maintenance, minimising manual intervention and saving time.

Large Enterprises: Efficiently manage software across data centers and geographically dispersed server deployments.

Compatibility: Ensure the chosen package manager is compatible with your Red Hat-based Linux distribution.

State Synchronisation: Maintain consistency of data and user sessions across instances for a seamless user experience.

Validation: Thoroughly test the HA setup to ensure it functions as expected under various failure scenarios.

Failover: Automatically switch to healthy instances when a failure is detected, ensuring seamless service continuity.

Features and Functionality: Evaluate the specific features and functionalities offered by different package managers to align with your specific needs.
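The failover item above can be sketched as a health-checked instance selection. This is an illustrative Python sketch under stated assumptions, not the application's actual HA logic: the `Instance` type, the `pick_healthy` helper, and the `is_healthy` callback are hypothetical, and a real setup would typically delegate this decision to a load balancer or cluster manager.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Instance:
    name: str      # hypothetical instance label
    address: str   # hypothetical host:port


def pick_healthy(instances: Iterable[Instance],
                 is_healthy: Callable[[Instance], bool]) -> Optional[Instance]:
    """Return the first instance that passes the health check, or None
    when no instance is currently healthy."""
    for instance in instances:
        try:
            if is_healthy(instance):
                return instance
        except Exception:
            # A probe that raises is treated the same as an unhealthy instance.
            continue
    return None
```

Passing the health check as a callback keeps the selection logic independent of how health is actually probed (HTTP endpoint, TCP connect, service status), which also makes the failure scenarios mentioned under Validation easy to simulate in tests.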

Deployment Principles

Version Control: Track package versions and roll back if necessary.

Nexus Repository Manager: Stores packages securely, maintains version history, and organises software artifacts, including RPM packages and application configuration. Its data is backed up regularly to enable recovery in case of disaster. Package metadata includes dependencies, descriptions, and version information.

YUM Repositories: Centralised locations for storing RPM packages, allowing easy access and updates. They gather all required software in one place, with versions tracked. Client systems are configured to access the Nexus YUM repository.

Deployment to Staging Environment: Incremental automated deployments push updates to a staging environment for initial testing and validation.

Production Rollout: Upon successful validation, the pipeline automatically deploys the package updates to the production servers.

Deployment Validation: Thoroughly test the HA setup to ensure it functions as expected across multiple servers, and verify the deployed service status on all nodes.
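The staging-then-production flow described above can be sketched as a simple promotion gate. This is a minimal Python sketch under assumptions, not the actual pipeline: the `promote` function and the `deploy`, `validate`, and `rollback` callables are hypothetical hooks standing in for whatever the CI/CD tool runs on each host (for example, installing the RPM and checking the service status).

```python
from typing import Callable, Dict, List, Sequence

HostAction = Callable[[str, str], None]  # (host, version) -> None
HostCheck = Callable[[str, str], bool]   # (host, version) -> True if healthy


def promote(version: str, previous_version: str,
            staging: Sequence[str], production: Sequence[str],
            deploy: HostAction, validate: HostCheck,
            rollback: HostAction) -> Dict[str, List[str]]:
    """Deploy to staging first; touch production only if staging validates."""
    result: Dict[str, List[str]] = {"deployed": [], "rolled_back": []}

    def roll_out(hosts: Sequence[str]) -> bool:
        for host in hosts:
            deploy(host, version)
            if validate(host, version):
                result["deployed"].append(host)
            else:
                # Restore the last known-good version on the failing
                # host and stop the rollout for this environment.
                rollback(host, previous_version)
                result["rolled_back"].append(host)
                return False
        return True

    # Production is only attempted after every staging host validated.
    if roll_out(staging):
        roll_out(production)
    return result
```

Stopping at the first failed validation limits the blast radius: hosts later in the list never receive a version that has already failed elsewhere.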

Observability & Logging

Collect and analyse system metrics, logs, and application performance data in ELK and Grafana to identify potential issues, troubleshoot failures, and optimise performance. Continuously monitor and fine-tune the system to meet evolving demands and maintain performance and reliability.
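Deployment outcomes are generally easier to search in ELK and chart in Grafana when they are emitted as structured records. The following is a minimal Python sketch, assuming JSON lines are shipped to the log stack by some collector; the `deployment_event` helper, the field names, and the example package name are illustrative assumptions, not the format used in this deployment.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def deployment_event(host: str, package: str, version: str, status: str,
                     timestamp: Optional[datetime] = None) -> str:
    """Serialise one deployment outcome as a single JSON line."""
    ts = timestamp or datetime.now(timezone.utc)
    record = {
        "@timestamp": ts.isoformat(),
        "host": host,
        "package": package,
        "version": version,
        "status": status,  # for example "success" or "failed"
    }
    return json.dumps(record, sort_keys=True)


# Hypothetical host and package names, for illustration only.
line = deployment_event("app-001.example.internal", "recharge-core",
                        "1.4.2", "success")
```

Emitting one record per host and per package version makes it straightforward to build a dashboard that shows which nodes are running which version, and to alert on failed deployments.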