From Monolith to Microservices: Reshaping FinTech with Confluent

Antonio Di Turi
DataReply
12 min read · Jan 31, 2024

Setting the Scene

In the dynamic world of FinTech, innovation is not just a buzzword; it’s the lifeline that drives companies ahead of the curve. This is the story of a startup, which we will call Y from now on (X was already taken), that has made significant strides in cryptocurrency exchange, commodities trading, securities, and Exchange-Traded Funds (ETFs).

Originally built on a monolithic architecture, Y rose rapidly in the FinTech market, but with growth came new challenges and opportunities. As their initial system architecture began to show its limits, they started looking toward a future where agility, scalability, and innovation were everyday realities.

Thus, Y contacted Data Reply to begin an ambitious project to migrate from a monolithic to a micro-services architecture, a move that promised to revolutionize their data handling capabilities and set new standards.

Y aimed to develop a modern data architecture that would not only support their current operations but also provide a robust framework for future growth and innovation.

Their journey, however, was paved with complexities typical of such technological transitions. The need to modernize their data platforms and pipelines was evident, and the challenges were multifaceted.

In the following sections, we will dive deep into the specific challenges Y encountered, the technical and business impacts of these challenges, and how Confluent’s solutions played a crucial role in the journey towards a more flexible, scalable, and efficient data architecture.

A creative representation of the stressed databases

Technical Pain Points — Navigating the Complexity of Architectural Transformation

The transition from a monolithic to a micro-services architecture presented a series of technical challenges that needed to be meticulously addressed to realize the desired business benefits.

Shared Database Access: A Bottleneck to Efficiency

The central MySQL RDS database had become a significant bottleneck. As the sole data repository for various teams and services, it caused serious scalability and performance issues.

The database’s capacity to handle concurrent requests from different departments was being stretched to its limits, resulting in frequent downtimes and latency issues, particularly affecting applications requiring real-time access.

The Struggle for Performance in a Shared Environment

In a shared database environment, the startup faced the challenge of managing heavy and sometimes conflicting data queries. Without a way to efficiently allocate resources, the performance of critical applications was compromised. This situation was further exacerbated during peak trading times when rapid data processing was crucial.

Additionally, it led to point-to-point data pipelines, where each team independently copied parts of the source database to its own target system. The downside of this architecture was that building a lineage of data consumption became close to impossible.

The Balancing Act: Access versus Control

Granting access to the central database also posed a significant risk. Every new access point potentially introduced performance degradation and security vulnerabilities. The startup needed a solution that could balance the need for data access with the need for control and security.

Data pipelines serving the database in a more organized way

From Point-to-Point Data Pipelines to Streaming — Paving the Way for Scalable Architecture

As Y grappled with the challenges of its monolithic architecture, the need for a scalable, efficient solution became clear.

The answer lay in replacing the point-to-point data pipelines with real-time streaming pipelines, which would fundamentally alter how data was managed, accessed, and utilized across the organization.

Transitioning to Asynchronous Communication

The move to streaming data pipelines marked a significant shift towards asynchronous communication.

This method allowed for the decoupling of services, enabling each to operate independently. Such an approach was instrumental in improving overall system latency, as it eliminated the need for services to wait for responses from the central database.
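As a minimal sketch of what this looks like in practice (not Y’s actual code), a service can publish an event with the plain Kafka Java producer and move on, instead of writing to the shared database and waiting on it; the broker address, topic name, and payload below are purely illustrative:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class TradeEventPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical broker address; Y's real cluster endpoints are not public
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker.internal:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Durability settings typical for financial workloads
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The service publishes the event asynchronously; it does not wait
            // for downstream consumers or for the central database
            ProducerRecord<String, String> record =
                new ProducerRecord<>("trade.events", "order-42", "{\"symbol\":\"BTC-EUR\",\"qty\":0.5}");
            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    exception.printStackTrace();
                } else {
                    System.out.printf("Published to %s-%d@%d%n",
                        metadata.topic(), metadata.partition(), metadata.offset());
                }
            });
        }
    }
}
```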

Leveraging Confluent’s Data Streaming Technology

Confluent’s data streaming technology emerged as a key component in this architectural transformation. By implementing Confluent’s solutions, the startup could establish real-time data pipelines that connected various micro-services and applications directly to the data they needed, without the bottleneck of a centralized database.

More specifically, Confluent’s Stream Lineage offering helped the client establish a solid foundation for governing data consumption.

This system provided a more robust and flexible infrastructure, where data could flow seamlessly between services, ensuring that each component had access to the most current and relevant data. It also allowed for a more granular level of control over data access and usage, aligning with the startup’s goal for improved data governance.

Integrating a Diverse Ecosystem with Confluent

In Y’s journey to build and define its new data architecture, the integration of a diverse array of systems with Confluent was critical.

Debezium: The Heartbeat of Change Data Capture

Debezium, an open-source distributed platform for change data capture, was at the forefront of this integration. Serving as the bridge between the startup’s databases and Kafka MSK clusters, Debezium enabled the real-time capture and streaming of database changes. This ensured that all microservices and applications had access to the latest data, essential for the fast-paced decision-making required in financial trading.
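As a rough illustration (not the client’s actual configuration), registering a Debezium MySQL connector against a Kafka Connect cluster can look like the sketch below. The endpoint, credentials, and table list are invented, and the exact configuration keys vary between Debezium versions (these roughly follow the more recent releases):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterDebeziumConnector {
    public static void main(String[] args) throws Exception {
        // Hypothetical Kafka Connect REST endpoint and database details
        String connectUrl = "http://connect.internal:8083/connectors";
        String payload = """
            {
              "name": "mysql-cdc-orders",
              "config": {
                "connector.class": "io.debezium.connector.mysql.MySqlConnector",
                "database.hostname": "rds.internal",
                "database.port": "3306",
                "database.user": "cdc_user",
                "database.password": "REPLACE_ME",
                "database.server.id": "184054",
                "topic.prefix": "mysql",
                "table.include.list": "trading.orders,trading.positions",
                "schema.history.internal.kafka.bootstrap.servers": "broker.internal:9092",
                "schema.history.internal.kafka.topic": "schema-changes.trading"
              }
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(connectUrl))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Once registered, the connector streams every row-level change from the listed tables into Kafka topics named after the configured prefix.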

ksqlDB and Kafka Streams: Streamlining Data Processing

ksqlDB, a database purpose-built for stream processing applications, played a vital role in processing the data streams within the Kafka MSK ecosystem. Together with Kafka Streams, it allowed the team to perform the necessary cleanup on the CDC topics so that they were ready to be consumed in Confluent Cloud by the microservices that needed that data as input.
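A cleanup step like this can be expressed either in ksqlDB or with the Kafka Streams API. The following is a minimal Kafka Streams sketch of the idea, with hypothetical topic names, that simply drops Debezium tombstone records before republishing the stream for downstream consumers:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class CdcCleanupApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "cdc-cleanup");
        // Hypothetical MSK bootstrap address
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker.internal:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Raw Debezium topic (hypothetical name following the <prefix>.<db>.<table> convention)
        KStream<String, String> rawCdc = builder.stream("mysql.trading.orders");

        rawCdc
            // Drop tombstone records (null values emitted by Debezium after deletes)
            .filter((key, value) -> value != null)
            // In the real pipeline this step would also unwrap the Debezium envelope
            // (e.g. keep only the "after" state) before handing data to consumers
            .to("trading.orders.cleaned");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```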

Microservices Architecture: A Synergistic Relationship with Kafka

The shift to a microservices architecture brought with it the need for a robust messaging system. Kafka, as a distributed event streaming platform, was perfectly suited to meet this need. It facilitated efficient communication between the various microservices, allowing for a decoupled, scalable, and resilient system architecture.
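On the consuming side, each microservice can subscribe to the streams it needs with its own consumer group and read at its own pace, fully decoupled from the producers. The snippet below is a hedged illustration rather than Y’s code; the group id and topic name are invented:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class PositionServiceConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical broker and group id; each microservice uses its own consumer group
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker.internal:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "position-service");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("trading.orders.cleaned"));
            while (true) {
                // The service processes events at its own pace, independent of the producer
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("order %s -> %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```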

S3 and Snowflake Sinks: Facilitating Data Storage and Analysis

The integration of S3 and Snowflake sinks was another critical aspect of the system. S3 sinks enabled the efficient routing of data into Amazon S3 for storage and archival, while Snowflake sinks facilitated the seamless transfer of data to the Snowflake Data Warehouse for more complex analysis and reporting. This setup provided a versatile and scalable solution for managing the large volumes of financial data the startup dealt with daily.
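For illustration only, the configuration of such a sink roughly looks like the sketch below. The keys follow the self-managed Confluent S3 Sink Connector (the fully managed Confluent Cloud connector exposes a slightly different, UI-driven configuration), and the bucket, region, and topic names are invented:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class S3SinkConnectorConfig {
    public static void main(String[] args) {
        // Hypothetical bucket, region, and topic names
        Map<String, String> config = new LinkedHashMap<>();
        config.put("connector.class", "io.confluent.connect.s3.S3SinkConnector");
        config.put("topics", "trading.orders.cleaned,trading.positions.cleaned");
        config.put("s3.bucket.name", "y-financial-data-lake");
        config.put("s3.region", "eu-central-1");
        config.put("storage.class", "io.confluent.connect.s3.storage.S3Storage");
        config.put("format.class", "io.confluent.connect.s3.format.json.JsonFormat");
        // Number of records per S3 object; tuned for throughput vs. latency
        config.put("flush.size", "1000");
        config.put("tasks.max", "4");

        // In practice this map would be wrapped in {"name": ..., "config": ...}
        // and POSTed to the Kafka Connect REST API, as in the Debezium sketch above
        config.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```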

A Comprehensive Data Ecosystem

Through these integrations, the startup established a comprehensive data ecosystem that was not only more efficient but also more adaptable to their rapidly evolving business needs. The use of Debezium, coupled with the power of Kafka, ksqlDB, and the strategic use of S3 and Snowflake sinks, created a robust infrastructure capable of handling the complexities of modern data processing.

Navigating the Transition from Monolith to Microservices

Transitioning from a monolithic architecture to a microservices-based system is like reengineering the DNA of a company’s IT infrastructure. For Y, this journey was both necessary and challenging. As we mentioned already (the Latins used to say “repetita iuvant”, literally “repeating helps”), the core of this challenge was to decouple consumer services and various internal applications from the central MySQL Relational Database Service (RDS), a critical move for enhancing scalability and flexibility.

Data Reply’s proposed architecture

A First View of the Architecture

The first hurdle in this transition was the technical complexity of setting up real-time data pipelines. Key to this was the implementation of Debezium for change data capture (CDC), which would enable the startup to stream database changes into Kafka in real-time. This was a two-step process:

1. From RDS to Kafka MSK through Debezium

2. From MSK to Confluent Cloud, after the necessary cleanup of the topics using Kafka Streams and ksqlDB

Another significant component was establishing S3 and Snowflake sinks within the Confluent ecosystem. Roughly 90% of the data landed on S3, while a few smaller use cases landed directly on Snowflake through a dedicated Kafka connector.

These sinks were crucial for efficiently routing data into the appropriate storage and analysis platforms (e.g. Jupyter notebooks hosted on Databricks), ensuring that data was not just moved but also made ready for immediate use. The focus was on leveraging Confluent’s robust capabilities to set up a Snowpipe-based ingestion flow, integrating S3 events, Simple Queue Service (SQS), and listeners to create a seamless flow of data.

Governance, Scalability and Lineage: Redefining Data Access

With multiple services and teams needing access to various data streams, governance became a critical concern. Y needed a system that could answer critical questions like:

  • Who owns what data?
  • Who can access, discover, and use a data stream?

Confluent Cloud emerged as a key player in this aspect, offering robust governance capabilities. Its out-of-the-box data lineage feature was not just about managing connections but also about providing clear visibility and control over data flows.
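In Confluent Cloud, most of this governance is handled through RBAC, service accounts, and the Stream Lineage UI rather than hand-written code. Purely as an illustration of how topic-level access control can be expressed programmatically in the Kafka ecosystem, here is a sketch using the Kafka AdminClient ACL API; the cluster address, principal, and topic are hypothetical:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.acl.AccessControlEntry;
import org.apache.kafka.common.acl.AclBinding;
import org.apache.kafka.common.acl.AclOperation;
import org.apache.kafka.common.acl.AclPermissionType;
import org.apache.kafka.common.resource.PatternType;
import org.apache.kafka.common.resource.ResourcePattern;
import org.apache.kafka.common.resource.ResourceType;

import java.util.Collections;
import java.util.Properties;

public class GrantAnalyticsReadAccess {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Hypothetical cluster address; on Confluent Cloud you would also set
        // the SASL/SSL properties derived from an API key
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker.internal:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Allow the analytics team's service account to read one data stream,
            // instead of handing out blanket access to a shared database
            AclBinding readOrders = new AclBinding(
                new ResourcePattern(ResourceType.TOPIC, "trading.orders.cleaned", PatternType.LITERAL),
                new AccessControlEntry("User:analytics-service", "*",
                        AclOperation.READ, AclPermissionType.ALLOW));

            admin.createAcls(Collections.singletonList(readOrders)).all().get();
            System.out.println("ACL created for analytics-service");
        }
    }
}
```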

The transition also raised questions about scalability and performance. The existing centralized RDS database was a bottleneck, limiting the ability to scale services independently. Teams experienced downtime and latency issues, as the database had to juggle diverse and sometimes conflicting demands from different departments. Applications requiring real-time database access were particularly affected, slowed down by heavy analytics requests.

Handling Significant Data Volumes and Throughput

In the fast-paced world of financial transactions, the startup dealt with substantial data volumes, with peaks of 2,000 transactions per second. This high throughput demanded a robust infrastructure capable of handling such an intensive data flow without lag or downtime.

The source databases, which contained multiple terabytes of financial data, were a testament to the sheer scale of information the startup managed daily. This data included a myriad of transaction records, market trends, and customer information, all critical to their operations.

Evaluating Alternatives — The Decision for Confluent over Traditional Tools

In their pursuit of a robust data architecture, Y carefully evaluated various technologies before deciding on Confluent’s data streaming solutions.

Considering Traditional Messaging and Pipeline Tools

Initially, the startup explored several traditional messaging and data pipeline tools. One such tool was Apache Spark with its PySpark interface, commonly used for big data processing.

While Spark offered powerful batch processing capabilities, it fell short in providing the real-time processing speed essential for the startup’s high-frequency trading environment.

Another technology under consideration was AWS Glue, a serverless data integration service that simplifies ETL (Extract, Transform, Load) workloads. While AWS Glue was adept at handling snapshot mechanisms and batch processing, it lacked the capability for efficient change data capture (CDC), crucial for real-time data synchronization.

Comparing with Other Message Brokers

Other message brokers like RabbitMQ and Amazon Kinesis were also considered.

However, the ecosystem around Kafka, particularly in terms of maturity, community support, and integration capabilities, stood out as more advanced and complete.

Kafka’s robust performance in high-throughput environments and its scalability were key factors in its favor.

Why Confluent Emerged as the Optimal Choice

After a thorough evaluation, Confluent emerged as the clear winner for several reasons:

  • Real-Time Processing: Confluent’s Kafka-based platform excelled in handling real-time data streams, which was a game-changer for the startup’s need to process financial transactions instantaneously.
  • Scalability and Reliability: Kafka’s proven scalability and reliability were crucial for handling the high volumes of data generated in the financial sector.
  • Comprehensive Ecosystem: Confluent provided a rich ecosystem, including ksqlDB for stream processing, and connectors for integrating with various data sources and sinks, offering a one-stop solution for the startup’s diverse needs.
  • Advanced Governance and Monitoring: Confluent’s advanced governance tools and monitoring capabilities ensured better data management and compliance, a must-have in the heavily regulated financial industry.

Realizing Transformation — The Business Impact of Data Streaming with Confluent

The adoption of Confluent’s data streaming technology marked a significant turning point for the FinTech startup. This section explores the tangible impacts that data streaming had on the startup’s operations, highlighting the profound changes in efficiency, cost, and scalability.

Reduced Database Size and Instance Costs

One of the immediate benefits was the reduction in the size and cost of database instances. By moving away from a centralized database system and adopting data streaming, the startup significantly reduced the load on its primary database. This shift not only improved performance but also lowered the costs associated with maintaining large database instances.

Minimizing Production Downtime

The previous architecture’s frequent downtimes, caused by the overburdened central database, were dramatically reduced. CDC combined with the stream lineage solution enabled a more distributed approach to data handling, which effectively eliminated the single point of failure that the monolithic database represented. This improvement was crucial for maintaining continuous operations in a high-stakes trading environment.

Additionally, it helped the team discover redundant data loads (caused by the old point-to-point pipelines) and eliminate them.

Speeding Up Data Processing and Analytics

The transformation also had a profound impact on the startup’s data processing capabilities. Daily data snapshots, which previously took a day to complete, were now accomplished in less than 10 minutes. The end-to-end latency from event capture to business dashboard availability was reduced to under 30 seconds, a significant enhancement over batch pipelines.

Operational Maintenance, Data Lineage and Visibility

By providing a single point of entry for all data sources, Confluent Cloud has significantly reduced the complexity that typically surrounds the management of varied data streams.

This streamlined approach not only eases the process of tracking, updating, and troubleshooting but also leads to a considerable reduction in time and resources expended on routine maintenance tasks. Furthermore, the unified platform enhances coordination among different teams, leading to more effective and efficient operational processes.

When it comes to data lineage and visibility, Confluent Cloud offers an unrivaled solution. Its ability to trace each dataset from its origin, through various transformations, to its ultimate destination provides a comprehensive view of the data journey. This visibility is vital for understanding how data is manipulated and utilized across different stages, offering valuable insights into the overall lifecycle of the data. Such clear data flow mapping is critical when identifying and resolving issues, ensuring the integrity and reliability of the data.

Improving data quality is another significant advantage offered by Confluent Cloud. With a detailed understanding of data origins and transformations, it becomes easier to identify and correct inconsistencies and errors. This improvement in data quality leads to more accurate analytics and decision-making processes, thereby enhancing the overall business outcomes.

Root cause analysis is also greatly enhanced by Confluent Cloud. The ability to track data back to its source enables teams to efficiently diagnose and understand the origins of various issues, whether they are technical glitches, data corruption, or process inefficiencies. This proactive approach to problem-solving helps minimize downtime and improves the reliability of systems.

Lastly, Confluent Cloud plays a crucial role in enhancing governance. The platform’s robust data lineage capabilities support stronger data governance practices. With improved visibility and control over data flows, organizations can more effectively enforce compliance with regulatory requirements and internal data policies.

Cost Reductions in Data Warehousing

By moving some of the pre-transformation ETL jobs from Snowflake to Kafka, the startup managed to reduce the consumption costs on the data warehousing side by 20%. This optimization resulted in direct cost savings and enhanced data warehousing operations performance.

A New Chapter in Data-Driven Innovation

As we reflect on the journey of this pioneering FinTech startup, it’s evident that their bold decision to transition from a monolithic to a microservices architecture, underpinned by Confluent’s data streaming technology, has not just resolved their immediate challenges but has set them on a path of continuous innovation and growth.

A Leap into the Future of Financial Technology

The startup’s transformation is proof of the power of strategic technological advancement in the financial sector. By embracing real-time data processing and analytics, the company has positioned itself at the forefront of the FinTech revolution. They have shown that with the right tools and vision, it is possible to turn data into one of the most valuable assets in the fast-paced world of financial trading.

Empowering Business with Data Streaming

Confluent’s role in this transformation was critical. It provided the technological backbone necessary for the startup to achieve its ambitious goals. The benefits realized — from reduced database costs and minimized downtimes to improved data processing speeds and operational efficiencies — underscore the impact of a well-executed data strategy.

A Blueprint for Success

This story serves as a blueprint for other companies facing similar challenges. It demonstrates that with the right approach and technology, such as that offered by Confluent, businesses can not only overcome their data architecture challenges but also unlock new potentials and opportunities.

Looking Ahead: Continuous Innovation and Growth

As the startup continues its journey, the scalability and flexibility of their new architecture ensure that they are well-prepared to adapt to future market demands and technological advancements. The successful integration of Confluent’s data streaming technology has not just solved immediate operational issues but has laid the foundation for a data-driven future.

In conclusion, this narrative is more than a success story; it is an inspiration and a guide for businesses looking to harness the power of data in the digital age. The FinTech startup’s journey with Confluent reminds us that in the world of technology, the willingness to innovate and adapt is the key to staying relevant and competitive.

If you want to get in contact, or have any feedback, feel free to drop me an email at a.dituri@reply.de or contact me on LinkedIn
