Introduction to Topic Log Compaction in Apache Kafka
Apache Kafka can be used to solve a variety of problems. Most systems that use Kafka are distributed and involve real-time processing of messages at large scale. Think of a problem that you can solve using big data, and now ask yourself: “How will this solution be affected if the scale grows by a factor of 100,000?” This question always brings me to the same conclusion: the producer will keep pumping messages in, and eventually the disk will run out of space to store them.