System Design lessons learned from Apache Kafka

Bhagwati Malav
Hash#Include
Published in
6 min readSep 22, 2018

This article would focus on various design concepts eg: horizontal scaling, vertical scaling, data sharding, availability, fault tolerance, consistency, cap theorem etc. Whenever you are asked to solve a design problem, you need to come up with solution which can satisfy given requirement eg: high availability, consistency, scale up to given user base etc.There is no best design, we would always design a system based on multiple factors which are critical to us eg: one might want to have a system built which can maintain consistency or availability.

I am sure you would have heard about CAP theorem so lets talk about that. In modern distributed system you can achieve only two out of C-Consistency, Availability, P-partition tolerance as You always need to have partition tolerance in modern distributed systems. Now you always need to choose any one between availability, consistency. So cassandra, couchedb follow P-A, and hbase, cache follow P-C. I would recommend you to read case study of these using CAP theorem. If you want to build a system with high availability, can be done by running multiple replicas. And if we talk about scalability, there are two ways, the first one is vertical scaling where you add resources to single node eg: cpu, memory etc. But you can not add more resources after certain limit. So this is not very great way to scale. The second one is, horizontal scaling, where you have load balance running at front, and you have multiple node running in behind. There are multiple ways to distribute load here. So if application getting huge traffic, it can easily be scaled adding more nodes horizontally. You would not have any restriction in second approach.

Lets talk about data store side, while designing a system you always have multiple choices in terms of choosing data store eg: postgres, mongodb, cassandra, mysql etc. As we discussed this also depends on given system requirement eg: high write, high reads, schema conformity, multiple entity with joins vs single self contained document etc. There are multiple factors which need to be considered while choosing data store. If you designing a social networking, blogging system where you will have user’s post, other can comment, like etc. If you choose any rational database you will have to perform multiple joins to get required data. So designing a self contained document would help here. There are multiple models in db design, will discuss two models eg: master-slave(multiple) eg: mongodb, master-multi-slave eg: cassandra etc. The first one allows you to do write on primary node, and reads can be performed on remaining nodes. So any updates, writes gets executed on master/primary node, and others nodes follows transaction/operation log, and do the same operation. This way other nodes achieve synchronisation to master. The second one, allows you to write/update on 2 or more nodes, and read from remaining nodes. So it helps in getting faster write to the system. And any time if primary node goes down, there is an election algorithm which starts election and choose primary node so that db keeps running properly without any failure. There is concept called data sharding in database. Lets say you need to design a data store which can store huge amount of data. If you do it with single collection/table on single node. It wont perform well. In such scenarios, you need to split data, and need to store it on multiple nodes. There are multiple ways to shard data. I would recommend you to go through various sharding strategies eg : hashing, lookup tables, routing etc.

Now lets talk about designing a distributed messaging queue which can allows faster write, high availability, fault tolerance. It should be able to perform well where multiple publishers writing to it, and need to process these events very faster, so overall throughput need to be better. Let take this as a challenge, and try to come up with a basic solution before you read further. You can think of having a queue data structure on single node, and there are two pointers of it. The first one, current_read, you are reading messages from this, and the second one, where you are performing write operation. Now think of multiple publisher publishing messages, so it would not be able to write faster as it would get locked when one thread writing to it, this will be extremely slow in terms of processing incoming event streams, and if you cant do faster write you can’t do faster reads. So need to think of something else. What if you have multiple queue on single node, and you are sending events to these queues simultaneously, and would not get blocked here. Lets say you need to process millions of events very faster, you can’t do it on single node. So we can setup these queues on multiple nodes, and allow write simultaneously. And can write to these in round-robin/hash based strategy, so single queue node wont get overloaded. And if you want to maintain availability, you need to run replicas as well. And if any node goes down, it should be able to elect new leader out of given replica nodes. So here writes should happen to leader queue node. And replicas would be syncing to leader. So you are good in terms of faster write, availability, fault tolerance. Now if you run multiple consumers, can process these individual queue nodes, and process events to get better throughput. This solution would be somewhere close to kafka.

source: Cloudurable

Now lets understand kafka. Before that lets understand few keywords eg: topic, broker, partition, consumer, publisher, consumer group, cluster.

Topic : You can have multiple topics in given application. Eg: in ecommerce application order events for order data, active products events for new active products, out of stock events for the product which are not available. So we can have 3 different queues/topics here to process given data.

Partition : As discussed earlier, incoming stream can be huge, so we split it, and store on multiple nodes. So kafka allows us to setup multiple partition for given topic so that concurrent write happens faster. And also runs replicas for each partition. There is one leader, and remaining slaves. Kafka writes incoming events to the leader partition, and other partitions sync to leader, once it gets completed, kafka send acknowledgement to the publisher. Publisher follow round robin strategy to send events to various partition by feault. You can also configure hase key based distribution, where event with given hashkey always goes to same partition. And we get events ordering at partition level, not at topic level.

Broker : Broker is a kafka server, which can have multiple partition running on it. And kafka also runs replicas of broker so that if any broker goes down, kafka still keeps running without any failure of data.

Cluster : Kafka runs multiple borkers in kafka cluster.

Consumer group : There can be multiple consumers in given consumer group. And they would be reading from multiple partitions for given topic. But all the given event gets processed by one one consumer in same consumer group whereas if we have multiple consumer groups configured for given topic, then each event gets processed by both consumer groups. This allows us to achieve many critical use cases in real time application eg: Lets say you have order streams of the products which we get from customer, now we can run two consumer group for this, the first one process these order, and the second one does analytics on order data. You need to make sure that kafka runs with multiple partitions to achieve better parallelism.

So thats how kafka is one of best distributed messaging application. And i truly feel if one try understanding kafka, can understand many basic design concepts.Thank you so much for reading my post. I would keep writing about system design, scalability, event driven architecture, microservices in future.

  1. CAP Theorem : http://ksat.me/a-plain-english-introduction-to-cap-theorem/
  2. CAP Theorem Mongodb : https://www.annashipman.co.uk/jfdi/the-cap-theorem-and-mongodb.html
  3. System Design : https://www.hiredintech.com/courses/system-design
  4. System Design : https://www.interviewbit.com/courses/system-design/
  5. Scalability Blog : http://highscalability.com/all-time-favorites/
  6. Kafka : http://cloudurable.com/blog/kafka-tutorial/index.html
  7. Kafka : https://www.youtube.com/watch?v=_RgUxUTuxH4
  8. Event Driven Architecture by Martin Fowler : https://www.youtube.com/watch?v=STKCRSUsyP0
  9. Distributed System : https://www.youtube.com/watch?v=tpspO9K28PM

--

--