Re-architecting data pipelines to the cloud
By Rajesh Dongre, Lead Data Engineer, and team
Re-architecting data for sustainability: the opportunity
As part of Macquarie Group’s digitalisation agenda, the Banking and Financial Services (BFS) division simplified its data pipelines by renovating the solution, moving it to the cloud and decommissioning the legacy platforms, all within two years.
Re-architecting established data pipelines, especially in data warehouses, can be complex and difficult. Here, as part of the data team in BFS, we share our experience and insights from the journey.
We decided to begin by simplifying our data pipelines across our three warehouses. This was achieved by reducing the number of data hops, simplifying complex logic and using a standard approach to store and access data. This simplification enabled us to migrate and decommission two of our on-premises data warehouses within 12 months, before taking on the largest of the three.
Decommissioning one data warehouse and switching over to another is like upgrading a city’s central water purification and distribution system without disrupting supply to its population. It is one of the hardest technology programs to undertake, because data warehouses have a huge number of upstream and downstream integrations and a significant number of use cases running on them.
Our largest data warehouse was the primary data platform servicing the division’s reporting needs. The warehouse was a complex solution with multiple data stores that were dependent on each other, with thousands of ‘extract, transform and load’ (ETL) jobs populating these data stores. These data stores served multiple reporting requirements accounting for 300+ interfaces and 500+ reports.
In terms of technical complexity, the numbers were mind-boggling. After all, the legacy warehouse had evolved over a decade and a few generations of engineering teams had enhanced it with different coding standards.
Suffice to say, we knew the transformation would be a challenge. However, we also knew that with the right planning and persistence, we would achieve this milestone.
We had to face it head-on, because the benefits were going to be huge.
Fast-forward two years, and we would see:
The great untangle
The key thing with decommissioning, as with anything of this size and complexity, is knowing where to start.
We established a team of lead data engineers, product owners and transformation leads with expertise in technology and data, stakeholder management and large deliveries.
Our first challenge was to untangle the complex data pipelines and understand the dependencies. Instead of the typical top-down approach, we adopted a two-step approach. First, we reviewed the data estate to understand the interfaces, their dependencies and different patterns. This helped us to gauge the size of the challenge and identify the templates for automation. In the second step, we did a detailed refinement, from the bottom up. This allowed us to understand the data landscape much better.
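To illustrate the kind of dependency analysis described above, here is a minimal sketch that groups jobs into migration waves from a dependency map. The job names and the `migration_waves` helper are hypothetical and are not the tooling used in the program; the sketch relies only on Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical ETL jobs: each key depends on the jobs listed in its set.
dependencies = {
    "stg_accounts": set(),
    "stg_transactions": set(),
    "dim_customer": {"stg_accounts"},
    "fact_balances": {"stg_accounts", "stg_transactions"},
    "rpt_daily_balances": {"fact_balances", "dim_customer"},
}

def migration_waves(deps):
    """Group jobs into waves; each job depends only on jobs in earlier waves."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose dependencies are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves

waves = migration_waves(dependencies)
for number, jobs in enumerate(waves, start=1):
    print(f"Wave {number}: {', '.join(jobs)}")
```

Jobs in the same wave have no dependencies on each other, so in principle they could be analysed or migrated independently.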
This analysis gave us our first ‘light bulb’ moment: the components used in the processes were repeatable. They could be standardised and modularised, which would enable automation of the repeatable processes.
Our second challenge was to ensure correct estimation of the project. We explored two approaches to estimation — the traditional data migration estimation technique and the t-shirt sizing technique.
We found that the program of work could be delivered on time using the latter approach. T-shirt sizing is an Agile estimation technique that assigns relative sizes (such as small, medium and large) to work items. It helped us predict workloads and set realistic deadlines based on patterns and interfaces, and it brought the milestone dates within plausible timeframes.
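As a rough illustration of how t-shirt sizes can roll up into an estimate, the sketch below maps each size to an effort figure and sums across interfaces. The size-to-days values and interface names are assumptions made up for the example, not figures from the program.

```python
from collections import Counter

# Hypothetical effort per size, in team-days; real values would be
# calibrated against completed work of each size.
SIZE_TO_DAYS = {"S": 2, "M": 5, "L": 10, "XL": 20}

def estimate_days(interface_sizes):
    """Sum the estimated effort for a mapping of interface name to size."""
    counts = Counter(interface_sizes.values())
    return sum(SIZE_TO_DAYS[size] * count for size, count in counts.items())

# Hypothetical interfaces, sized by their pattern rather than estimated one by one.
sizes = {
    "interface_a": "S",
    "interface_b": "M",
    "interface_c": "M",
    "interface_d": "XL",
}

total = estimate_days(sizes)  # 2 + 5 + 5 + 20
print(f"Estimated effort: {total} team-days")
```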
Guiding principles of automation, scalability and re-usability
We defined guiding principles for the strategic data and engineering architecture and standardised the data flow and consumption patterns. A pattern-based strategy was employed to implement the new solution. We also created a thorough physical implementation design which included coding and data modelling standards that established a robust approach.
We adopted the smart approach of ‘compartmentalisation’. This meant that each module could be delivered end-to-end, from build to go-live, in a safe, isolated user space. Every component of the data flow could be containerised as a unit of work and built independently. This made each feature team self-sufficient and self-driven.
We made sure that new code was configuration-driven. This would enable lines of code to be developed and reused to deliver value across other processes in the same ecosystem. Several frameworks and automations were developed to improve engineering productivity significantly. An automated regression testing application was created. The Continuous Integration and Continuous Deployment (CI/CD) pipelines were uplifted to accelerate environment readiness. We also organised mob programming and mob modelling sessions to solve pressing problems quickly and cross-train the team.
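The idea of configuration-driven code can be sketched as follows: a small set of reusable steps is written once, and each pipeline is described as data. The step names, parameters and sample rows are hypothetical and only illustrate the pattern, not the frameworks built in the program.

```python
def rename_columns(rows, mapping):
    """Rename keys in every row according to `mapping`."""
    return [{mapping.get(key, key): value for key, value in row.items()} for row in rows]

def filter_rows(rows, column, equals):
    """Keep only rows where `column` has the value `equals`."""
    return [row for row in rows if row.get(column) == equals]

# Registry of reusable steps; a new pipeline needs new config, not new code.
STEPS = {"rename": rename_columns, "filter": filter_rows}

def run_pipeline(rows, config):
    """Apply each configured step, in order, to the rows."""
    for step in config:
        rows = STEPS[step["step"]](rows, **step["params"])
    return rows

# Hypothetical configuration; in practice this could live in a YAML or JSON file.
config = [
    {"step": "rename", "params": {"mapping": {"acct_no": "account_id"}}},
    {"step": "filter", "params": {"column": "status", "equals": "OPEN"}},
]

rows = [
    {"acct_no": 1, "status": "OPEN"},
    {"acct_no": 2, "status": "CLOSED"},
]
result = run_pipeline(rows, config)
print(result)
```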
Next, it was time to roll up our sleeves and start building the new solution. Each feature team owned an interface and built components based on the strategic data architecture principles, working in an agile way. Compartmentalisation, automation and configurable frameworks enabled multiple teams to work effectively and independently. The components were built under the guidance of the engineering team, integrated to establish the data pipeline and thoroughly tested using a battery of automated test cases and parallel runs. The tested code was then shipped to production using the uplifted CI/CD pipeline.
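A parallel run boils down to comparing what the legacy and new pipelines produce for the same input. The sketch below shows one simple way to reconcile two outputs by key; the `reconcile` function and the sample rows are hypothetical and much simpler than a production regression suite.

```python
def reconcile(legacy_rows, cloud_rows, key):
    """Compare two outputs row by row, matching rows on `key`."""
    legacy = {row[key]: row for row in legacy_rows}
    cloud = {row[key]: row for row in cloud_rows}
    shared = legacy.keys() & cloud.keys()
    return {
        "missing_in_cloud": sorted(legacy.keys() - cloud.keys()),
        "unexpected_in_cloud": sorted(cloud.keys() - legacy.keys()),
        "mismatched": sorted(k for k in shared if legacy[k] != cloud[k]),
    }

# Hypothetical outputs from the same batch on both platforms.
legacy_output = [
    {"id": 1, "balance": 100},
    {"id": 2, "balance": 250},
    {"id": 3, "balance": 75},
]
cloud_output = [
    {"id": 1, "balance": 100},
    {"id": 2, "balance": 255},
    {"id": 4, "balance": 10},
]

report = reconcile(legacy_output, cloud_output, key="id")
print(report)
```

An empty report for every interface over several consecutive runs is the kind of evidence that supports redirecting consumers to the new platform.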
The team also designed an advanced, highly optimised and automated process that was more than 100 times faster than the default manual process. This was achieved by approaching the data at source through a smart block-level split in Oracle, repartitioning and reformatting the data at scale, and finally consolidating the files in Amazon Simple Storage Service (Amazon S3), an object storage service on AWS. It was a big win for innovation.
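The general split, process in parallel, then consolidate pattern can be sketched in a few lines. This is an assumption-laden illustration only: it splits a plain numeric range as a stand-in for the Oracle block-level split, `extract_chunk` is a placeholder that reads nothing from a real database, and no AWS calls are made.

```python
from concurrent.futures import ThreadPoolExecutor

def split_range(start, end, parts):
    """Split the half-open range [start, end) into contiguous, near-equal chunks."""
    size, extra = divmod(end - start, parts)
    chunks, low = [], start
    for index in range(parts):
        high = low + size + (1 if index < extra else 0)
        if high > low:  # skip empty chunks when parts exceeds the range length
            chunks.append((low, high))
        low = high
    return chunks

def extract_chunk(chunk):
    """Placeholder for reading one slice of the source table."""
    low, high = chunk
    return list(range(low, high))

def parallel_extract(start, end, parts, workers=4):
    """Extract every chunk concurrently, then consolidate the results in order."""
    chunks = split_range(start, end, parts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(extract_chunk, chunks))  # map preserves chunk order
    return [row for part in results for row in part]

print(split_range(0, 10, 3))
print(parallel_extract(0, 10, 3))
```

Because the chunks do not overlap and together cover the whole range, the consolidated output contains every row exactly once.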
Batch processes were orchestrated to ensure the legacy and new cloud platform ran in parallel, smoothly. The consumption pipelines were redirected to consume the output from the new solution, enabling a decrease in the use of the on-premises solution, while gradually switching over to the new platform on the cloud.
Yes, we faced last-minute surprises as well. For example, a few anomalous processes surfaced, and we exposed bugs in the code that previous test runs had not caught. We arranged hackathons that brought teams together to solve problems in ‘war room’ type situations. This focused collaboration delivered efficient and effective solutions.
The work we did to decommission the data warehouse gave us the opportunity to simplify several processes. We streamlined operational data processes and ensured ownership was assigned to appropriate teams. It also enabled us to significantly improve our analytics, controls and performance while reducing copies of data and idle processes.
After the transformation, the data pipelines were simpler and strategically designed to enable future development.
Outcomes of persistence
The work we did was complex and exciting, and it formed a stepping stone towards more trust in data, better insights and a strengthened culture of at-scale problem-solving and data-led decision-making. The cascade of our efforts meant that we could:
- Design and implement scalable solutions on cloud-agnostic platforms and leverage best-of-breed technology with a flexible architecture
- Effectively manage costs and enable growth
- Leverage data to enable automation of internal processes
- Deliver deeper insights
- Anticipate and respond to increasing customer and regulatory expectations
Setting ourselves up for success
The success of this program was underpinned by people who were collaborative, deeply customer-aligned and proud of the work being done, with a strong risk mindset. Our key success factors included:
High-performing team culture
- Diversity and inclusion: We consciously brought a team of diverse people together to bring a range of experience, perspectives, ideas and insights to the program. We created an environment where people could express their opinions freely. Teams felt psychologically safe to raise their concerns and deliver constructive feedback.
- Trust your team: When you engage with the people who are close to the action, they are well-positioned to present you with an accurate picture.
- Focus on wellbeing: Throughout the incredibly challenging program, we prioritised mental wellbeing, family time and non-work-related social connections, such as trivia quizzes, Friday virtual get-togethers, wellness challenges and virtual fitness sessions.
Executive engagement and support
- Leadership engagement: Having an open-door culture and flat structure helped enable quick decisions and progress. We are agile from top to bottom, and in practice; our Best Agile Place to Work Award is a testament to this culture.
- Executive support: We had the consistent support of executive leaders, who encouraged us throughout the journey, held us to account and ensured we reached our destination on time.
Test-and-learn solution approach
- Fail fast: Do not be afraid to make mistakes. Learn from them. To quote Sir Richard Branson, “Do not be embarrassed by your failures; learn from them and start again.” This mindset enabled us to conduct effective blameless post-mortems (BPMs) and pivot quickly.
- Solve once: Operate smartly. Always attempt to fix the problem permanently and implement reusable, modular solutions. Identify bottlenecks early in the process and alleviate them.
- Focus on the ‘big rocks’: Invest in a team of diverse expertise. Follow the 80–20 principle. Focus on the rocks and boulders, not on stones and pebbles.
- Be flexible: A sustainable risk mindset is important — be open to calibrating decisions as situations evolve.
We did it!
This was an extraordinary year and a half for many of us. It felt good to work arm-in-arm with a committed group of people, all aligned to a single goal and sharing the same enthusiasm. But nothing can describe what it felt like when this team became like a family, where everyone stood up for each other, sharing challenges and moments of glory together.
It’s true, large data warehouses are complex and difficult to transform — and we did it! With painstaking planning, diligent design, top-notch team collaboration and formidable fortitude, you can do it too.