Apache Kafka Guide #51 Kafka Connect: Standalone vs Distributed Mode

Paul Ravvich
Apache Kafka At the Gates of Mastery
2 min read · May 23, 2024

Hi, this is Paul, and welcome to part #51 of my Apache Kafka guide. Today we will discuss Kafka Connect and how its Standalone and Distributed modes work.

Kafka Connect Standalone Mode

Let’s begin with standalone mode, which is essentially a single process: one worker runs all your connectors and tasks. Your configuration is bundled with the process, which makes it very straightforward to start. It’s especially useful for development and testing, particularly when you’re writing your own Kafka connector. However, it is not fault tolerant: if the process crashes or stops, your connectors stop with it. It also doesn’t scale horizontally. The only way to get more performance is to scale vertically, by moving to a more powerful machine. Monitoring is also harder, because everything lives in one standalone process.

  • A single process runs your connectors and tasks.
  • Configuration is bundled with your process.
  • Very easy to get started with, and useful for development and testing.
  • Not fault tolerant, has no scalability, and is hard to monitor.
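
To make "configuration is bundled with your process" concrete, here is a minimal sketch of how a standalone worker is typically started: one properties file for the worker, one for the connector, both passed on the command line. The file names, the `local-file-source` connector name, the broker address, and the helper functions are assumptions for illustration, and it assumes the FileStreamSource example connector is available on your plugin path.

```python
import shlex
import tempfile
from pathlib import Path

# Worker settings (assumed local, single-broker setup).
WORKER_PROPS = {
    "bootstrap.servers": "localhost:9092",
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    # Standalone mode keeps source offsets in a local file.
    "offset.storage.file.filename": "/tmp/connect.offsets",
}

# Connector settings for the FileStreamSource example connector
# (name, file, and topic are hypothetical values).
CONNECTOR_PROPS = {
    "name": "local-file-source",
    "connector.class": "FileStreamSource",
    "tasks.max": "1",
    "file": "test.txt",
    "topic": "connect-test",
}


def to_properties(settings: dict[str, str]) -> str:
    """Render a dict as Java-style key=value properties text."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


def standalone_command(directory: Path,
                       script: str = "bin/connect-standalone.sh") -> list[str]:
    """Write both properties files and return the launch command as argv."""
    worker_file = directory / "worker.properties"
    connector_file = directory / "connector.properties"
    worker_file.write_text(to_properties(WORKER_PROPS))
    connector_file.write_text(to_properties(CONNECTOR_PROPS))
    # The worker config comes first, followed by the connector config.
    return [script, str(worker_file), str(connector_file)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Only prints the command; run it from your Kafka directory to start the worker.
        print(shlex.join(standalone_command(Path(tmp))))
```

Because everything the worker needs sits in those two files, stopping the process stops the connector, which is exactly the lack of fault tolerance described above.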

Kafka Connect Distributed Mode

In distributed mode, you have several workers, essentially servers, that run your connectors and tasks. The configuration isn’t bundled with the workers. Instead, it’s submitted via a REST API, and we’ll explain how to use this REST API in detail. Scaling out is straightforward: add more workers, and the new workers automatically pick up and execute tasks. The system is also fault tolerant. If a worker fails, as we will discuss in the next part of this guide, its tasks are redistributed among the remaining workers, so your connectors keep running. This combination of fault tolerance and horizontal scalability is what makes distributed mode the right choice for production deployment of connectors.

  • Multiple workers run your connectors and tasks.
  • Configuration is submitted using a REST API.
  • Easy to scale, and fault-tolerant (rebalancing in case a worker dies).
  • Useful for production deployment of connectors.
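
As a rough illustration of submitting configuration over REST, the sketch below builds a `POST /connectors` request whose JSON body carries a connector name and its config. The worker address (`localhost:8083`, the default REST port), the connector name, and the `build_create_request` helper are assumptions for this example. It only constructs the request; sending it requires a running distributed worker.

```python
import json
import urllib.request

# Assumed address of any worker in the cluster; 8083 is the default REST port.
CONNECT_URL = "http://localhost:8083"


def build_create_request(name: str, config: dict[str, str],
                         base_url: str = CONNECT_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /connectors request."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_create_request(
        "file-source-demo",  # hypothetical connector name
        {
            "connector.class": "FileStreamSource",
            "tasks.max": "1",
            "file": "/tmp/test.txt",
            "topic": "connect-test",
        },
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # With a worker running, urllib.request.urlopen(request) would submit it.
```

Since the configuration is sent to the cluster rather than stored next to one process, any worker can accept the request and the tasks can then run on whichever workers are available.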

Conclusion

Standalone Mode is made for development and testing, while Distributed Mode is made for production deployment of connectors.
