FCC Design pattern for Lambda Architecture

Ananth Durai
Jul 23, 2017 · 3 min read

I discussed Lambda Architecture at length in my previous blog post. Awareness of the architecture is growing, and many production applications have adopted it.

As adoption of Lambda Architecture grows, so does the criticism of it. As Jay Kreps rightly pointed out in his blog post “Questioning the Lambda Architecture,” one of the biggest pain points in adopting it is maintaining the same business logic in both the batch layer and the speed layer.

Speed, Volume, and Correctness tradeoffs:

Similar to the CAP theorem, data processing involves a trade-off among Speed, Volume, and Correctness (let’s call it the SVC theory). A stream processing engine needs access to the full volume of data to guarantee correctness, and that compromises speed. Good approximation algorithms let a stream processing engine stay fast, but at the cost of correctness.
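To make the trade-off concrete, here is a minimal sketch that counts distinct keys two ways. The exact version must remember every key it has seen, while the approximate version keeps only a fixed number of hash values. The K-minimum-values estimator is my choice for illustration, not something the SVC idea prescribes, and the function names and the default `k` are assumptions.

```python
import hashlib
import heapq


def exact_distinct(events):
    """Correct, but memory grows with the number of distinct keys."""
    return len(set(events))


def _unit_hash(value):
    """Map a value to a deterministic pseudo-random float in [0, 1)."""
    digest = hashlib.sha1(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def approx_distinct(events, k=1024):
    """K-minimum-values estimate: remembers only the k smallest hashes."""
    heap = []        # negated hashes, so heap[0] is the largest one kept
    members = set()  # hashes currently kept, used to skip duplicates
    for event in events:
        h = _unit_hash(event)
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # Smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return len(heap)  # fewer than k distinct keys: the count is exact
    kth_smallest = -heap[0]
    return int((k - 1) / kth_smallest)
```

The approximate counter uses bounded memory no matter how much data flows through it, but its answer carries an error of a few percent at this `k`. That is the speed-versus-correctness side of the trade-off in miniature.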

Lambda & SVC pattern:

The SVC principle makes adopting Lambda Architecture even more complicated, since the speed layer chooses speed over correctness whereas the batch layer chooses correctness over speed. These differing characteristics often lead to an intricate design that increases the cost of the system.

FCC pattern:

The challenges in implementing Lambda Architecture are inevitable, but as with other system components, we can follow some simple design patterns to handle the complexity better. The FCC patterns (Fork, Clone, and Contract) are a set of software design philosophies for designing a Lambda Architecture better.

Fork:

In the Fork pattern, the speed layer and the batch layer each follow their own code path. A Fork design is essentially a shared-nothing code path: the speed layer and the batch layer each have their own test coverage and evolve independently.

Pros:

The advantage of the Fork pattern is that the shared-nothing architecture keeps both code bases sane, since each can follow its own structure.
Cons:

The Fork pattern is a good approach if the system is built once and never touched again. But as business requirements change, it’s very hard to evolve both code paths in parallel to solve a single problem.
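A minimal sketch of what a Fork looks like in code, assuming a hypothetical business rule ("revenue per customer counts only completed orders") and hypothetical order fields `customer`, `status`, and `amount`. The two paths share nothing, so the rule is written twice.

```python
# --- batch layer: its own code base and structure -------------------
def batch_revenue(orders):
    """Recompute revenue per customer from the full order history."""
    totals = {}
    for order in orders:
        if order["status"] == "completed":
            customer = order["customer"]
            totals[customer] = totals.get(customer, 0) + order["amount"]
    return totals


# --- speed layer: a separate code base with a different shape -------
class StreamingRevenue:
    """Maintain revenue per customer incrementally, one event at a time."""

    def __init__(self):
        self.totals = {}

    def on_event(self, order):
        # The same business rule, written a second time.
        if order.get("status") != "completed":
            return
        customer = order["customer"]
        self.totals[customer] = self.totals.get(customer, 0) + order["amount"]
```

If the rule later changes, say to also exclude refunded orders, both `batch_revenue` and `StreamingRevenue.on_event` have to be edited and retested separately, which is exactly the cost described above.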

Clone:

The Clone pattern, as the name suggests, clones the code path from one layer to the other. It sounds scary, but a system often starts with a batch capability and later adopts a real-time capability, or vice versa. A common pattern in a data product is that it starts with a longer SLA period to solve a business problem, and as market adoption grows, the SLA shrinks.
Pros:
It provides greater agility, as it’s often less effort to copy over the business logic.
The Clone pattern provides better reliability, as we can reuse the test cases.
Cons:
The batch and real-time layers often use different tools and programming languages. Code cloned from one path into another language may not fit that language’s coding standards.
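As a rough sketch of the Clone pattern, assume a batch job that already contains a hypothetical discount rule (orders of 100 or more get 10% off). The rule is copied as-is into the real-time path, and one table of test cases is run against both copies. All names here are made up for illustration.

```python
# --- batch job (the original) ----------------------------------------
def net_amount_batch(order):
    """Orders of 100 or more get a 10% discount."""
    amount = order["amount"]
    return amount - amount // 10 if amount >= 100 else amount


def run_batch(orders):
    return sum(net_amount_batch(order) for order in orders)


# --- real-time job (the clone) ---------------------------------------
def net_amount_stream(order):
    """Copied from net_amount_batch; must be re-copied on every change."""
    amount = order["amount"]
    return amount - amount // 10 if amount >= 100 else amount


class StreamJob:
    def __init__(self):
        self.total = 0

    def on_event(self, order):
        self.total += net_amount_stream(order)


# --- test cases written once, reused for both copies -----------------
SHARED_CASES = [
    ({"amount": 99}, 99),
    ({"amount": 100}, 90),
    ({"amount": 250}, 225),
]


def passes_shared_cases(rule):
    return all(rule(order) == expected for order, expected in SHARED_CASES)
```

Calling `passes_shared_cases` on each copy gives a cheap guard against the two drifting apart, though it only works this directly when both layers are written in the same language.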

Contract:

The Contract pattern is the ideal solution: we define a clear business interface (the contract), and both the real-time and batch layers implement it.
Pros:
A clean, design-pattern-based approach.
The batch and real-time code paths can evolve independently.
Since both code paths sit behind an interface, it’s often effective to develop test cases against the interface.
Cons:
The Contract pattern may not fit legacy applications well, or cases where the two code paths are developed in different languages that don’t work well together.
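One way to sketch the Contract pattern is with an abstract base class that states what the business needs, leaving each layer free to decide how to compute it. The interface `RevenueView`, its method, and the order fields below are hypothetical names chosen for this example.

```python
from abc import ABC, abstractmethod


class RevenueView(ABC):
    """The contract: what the business asks for, not how it is computed."""

    @abstractmethod
    def revenue_for(self, customer: str) -> int:
        """Total completed-order revenue for one customer."""


class BatchRevenueView(RevenueView):
    """Batch layer: recomputes everything from the full history."""

    def __init__(self, orders):
        self._totals = {}
        for order in orders:
            if order["status"] == "completed":
                customer = order["customer"]
                self._totals[customer] = (
                    self._totals.get(customer, 0) + order["amount"]
                )

    def revenue_for(self, customer: str) -> int:
        return self._totals.get(customer, 0)


class StreamingRevenueView(RevenueView):
    """Real-time layer: updates its totals one event at a time."""

    def __init__(self):
        self._totals = {}

    def on_event(self, order):
        if order["status"] == "completed":
            customer = order["customer"]
            self._totals[customer] = (
                self._totals.get(customer, 0) + order["amount"]
            )

    def revenue_for(self, customer: str) -> int:
        return self._totals.get(customer, 0)


def check_contract(view: RevenueView) -> bool:
    """A test written once against the interface, valid for any layer."""
    return view.revenue_for("alice") == 30 and view.revenue_for("nobody") == 0
```

Because `check_contract` only knows about `RevenueView`, either implementation can be rewritten internally without touching the test, as long as it keeps honoring the interface.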

allthingsdata

All things big data
