Apache Kafka: A Comprehensive Guide with PHP Examples

OmidReza Salari
3 min read · Jul 1, 2024


Apache Kafka, an open-source stream-processing platform, has changed how real-time data is managed and processed. Originally developed at LinkedIn and now part of the Apache Software Foundation, Kafka is designed for high throughput, fault tolerance, and horizontal scalability. In this article, we’ll look at Kafka’s core concepts and architecture, then show how to integrate Kafka with PHP.

Table of Contents

  1. Introduction to Kafka
  2. Core Concepts of Kafka
  3. Kafka Architecture
  4. Installing Kafka
  5. Setting Up Kafka with PHP
  6. Producing Messages to Kafka in PHP
  7. Consuming Messages from Kafka in PHP
  8. Conclusion

1. Introduction to Kafka

Apache Kafka is a distributed event streaming platform capable of handling trillions of events a day. It is used for building real-time data pipelines and streaming applications. Kafka’s robust architecture makes it suitable for a variety of use cases, including log aggregation, stream processing, event sourcing, and real-time analytics.

2. Core Concepts of Kafka

Before diving into the technical setup, let’s understand some core concepts:

  • Producer: An application that sends messages to Kafka.
  • Consumer: An application that reads messages from Kafka.
  • Topic: A named category or feed to which records are published and in which they are stored.
  • Partition: An ordered slice of a topic. Topics are divided into partitions to allow for parallel processing.
  • Broker: A Kafka server that stores data and serves clients.
  • Consumer Group: A group of consumers that work together to consume messages from a topic; each partition is read by only one consumer in the group at a time.
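
To make the relationship between topics, partitions, and message keys more concrete, here is a small pure-PHP sketch. The helper partitionForKey() is hypothetical, and the CRC32-modulo scheme is only an illustration: real Kafka clients ship their own partitioners, so the partition numbers they choose will differ.

```php
<?php
// Illustration only: maps a message key to a partition number.
// partitionForKey() is a hypothetical helper, not part of Kafka or php-rdkafka.
function partitionForKey(string $key, int $partitionCount): int
{
    if ($partitionCount < 1) {
        throw new InvalidArgumentException('A topic needs at least one partition.');
    }
    // Mask the sign bit so the result is non-negative on any PHP build.
    return (crc32($key) & 0x7fffffff) % $partitionCount;
}

// The same key is always mapped to the same partition.
foreach (['user-1', 'user-2', 'user-1'] as $key) {
    echo $key . ' -> partition ' . partitionForKey($key, 3) . "\n";
}
```

Because a given key always lands on the same partition, and a partition is read in order, messages that share a key are consumed in the order they were written.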

3. Kafka Architecture

Kafka’s architecture consists of the following key components:

  • Kafka Cluster: A cluster of multiple brokers.
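  • ZooKeeper: Manages and coordinates the Kafka brokers in ZooKeeper-based deployments, such as the Kafka 2.8 setup used in this article. Newer Kafka releases can run without ZooKeeper in KRaft mode.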
  • Producers and Consumers: Applications that produce and consume messages, respectively.

Each topic in Kafka is divided into partitions, and each partition can be replicated across multiple brokers according to the topic’s replication factor. If a broker fails, a replica on another broker can take over, which is how Kafka handles large volumes of data with high reliability and availability.
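
The idea of spreading partition replicas over brokers can be sketched in a few lines of PHP. This is a simplified round-robin placement under stated assumptions: assignReplicas() is a hypothetical helper, the broker IDs are made up, and Kafka’s real placement logic takes more factors into account.

```php
<?php
// Illustration only: places each partition's replicas on consecutive brokers.
// assignReplicas() is a hypothetical helper, not Kafka's actual algorithm.
function assignReplicas(int $partitionCount, array $brokerIds, int $replicationFactor): array
{
    $brokerIds = array_values($brokerIds);
    $brokerCount = count($brokerIds);
    if ($replicationFactor < 1 || $replicationFactor > $brokerCount) {
        throw new InvalidArgumentException(
            'Replication factor must be between 1 and the number of brokers.'
        );
    }

    $assignment = [];
    for ($p = 0; $p < $partitionCount; $p++) {
        $replicas = [];
        for ($r = 0; $r < $replicationFactor; $r++) {
            // Shift the starting broker per partition so load is spread out.
            $replicas[] = $brokerIds[($p + $r) % $brokerCount];
        }
        $assignment[$p] = $replicas;
    }
    return $assignment;
}

foreach (assignReplicas(3, [1, 2, 3], 2) as $partition => $replicas) {
    echo "partition $partition -> brokers " . implode(', ', $replicas) . "\n";
}
```

With a replication factor of 2, every partition lives on two different brokers, so losing any single broker still leaves one copy of each partition.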

4. Installing Kafka

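To get started, you need a running Kafka broker and, for the 2.8 release used here, a ZooKeeper instance. The Kafka download bundles ZooKeeper and its start script, so no separate installation is needed; a Java runtime must be available on the machine. Here’s a brief guide: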

Step 1: Download Kafka

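Download Kafka from the official Apache Kafka website. The commands below assume the kafka_2.13-2.8.0 archive; adjust the file and directory names if you download a different version.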

Step 2: Extract and Configure Kafka

tar -xzf kafka_2.13-2.8.0.tgz
cd kafka_2.13-2.8.0

Step 3: Start ZooKeeper

In this ZooKeeper-based setup, the broker needs ZooKeeper to be running first. The command runs in the foreground, so start it in its own terminal:

bin/zookeeper-server-start.sh config/zookeeper.properties

Step 4: Start Kafka Broker

In a second terminal, start the Kafka broker using the following command:

bin/kafka-server-start.sh config/server.properties
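
Before moving on to PHP clients, it can help to confirm that something is actually listening on the broker port. The sketch below assumes the default listener on localhost:9092; isBrokerReachable() is a hypothetical helper that only checks for an open TCP port, not for a healthy Kafka broker.

```php
<?php
// Illustration only: checks whether a TCP port accepts connections.
// isBrokerReachable() is a hypothetical helper, not part of any Kafka client.
function isBrokerReachable(string $host, int $port, float $timeoutSeconds = 2.0): bool
{
    $errno = 0;
    $errstr = '';
    // Suppress the warning fsockopen() emits when the connection is refused.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeoutSeconds);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

echo isBrokerReachable('localhost', 9092)
    ? "Port 9092 is accepting connections\n"
    : "Nothing is listening on localhost:9092 yet\n";
```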

5. Setting Up Kafka with PHP

To integrate Kafka with PHP, we’ll use the php-rdkafka extension, a PHP binding for librdkafka, the C/C++ Kafka client library.

Step 1: Install librdkafka

Before installing the PHP extension, you need to install librdkafka. On Ubuntu, you can use the following command:

sudo apt-get install librdkafka-dev

Step 2: Install php-rdkafka

Install the php-rdkafka extension using PECL:

pecl install rdkafka

Add the extension to your php.ini file:

extension=rdkafka.so
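
A quick way to confirm the installation is to ask PHP whether the extension is loaded. This minimal sketch uses only built-in PHP functions; rdkafkaStatus() is a hypothetical helper name.

```php
<?php
// Reports whether the rdkafka extension is available to this PHP runtime.
// rdkafkaStatus() is a hypothetical helper name used for this example.
function rdkafkaStatus(): string
{
    if (!extension_loaded('rdkafka')) {
        return 'rdkafka extension is NOT loaded; check the php.ini used by this runtime';
    }
    return 'rdkafka extension loaded, version ' . phpversion('rdkafka');
}

echo rdkafkaStatus() . "\n";
```

The command-line and web server PHP runtimes often read different php.ini files, so it is worth running the check in the environment where your producer or consumer will run.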

6. Producing Messages to Kafka in PHP

Here’s a simple example of a PHP producer that sends messages to a Kafka topic:

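<?php
// Point the client at the local broker started in section 4.
$conf = new RdKafka\Conf();
$conf->set('metadata.broker.list', 'localhost:9092');

$producer = new RdKafka\Producer($conf);
$topic = $producer->newTopic('test_topic');

for ($i = 0; $i < 10; $i++) {
    // RD_KAFKA_PARTITION_UA lets the client choose the partition; 0 means no message flags.
    $topic->produce(RD_KAFKA_PARTITION_UA, 0, "Message $i");
    // Serve queued events (such as delivery reports) without blocking.
    $producer->poll(0);
}

// produce() only enqueues messages; keep polling until the outbound queue is empty.
while ($producer->getOutQLen() > 0) {
    $producer->poll(50);
}

echo "Messages produced successfully!\n";
?>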

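This script configures a producer, sets the broker list, and sends ten messages to the test_topic topic. Because produce() only places messages on an internal queue, the final loop keeps polling until that queue is empty, so the script does not exit with messages still unsent.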
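
The producer can also attach a key to each message and register a delivery report callback to learn whether the broker accepted it. The sketch below assumes the rdkafka extension is loaded, a broker is listening on localhost:9092, and a recent php-rdkafka release that provides flush(). The helper name produceWithReports() and the sample keys and payloads are hypothetical.

```php
<?php
// Sketch only: produceWithReports() is a hypothetical helper, not part of php-rdkafka.
// Returns the number of messages the broker confirmed.
function produceWithReports(string $brokers, string $topicName, array $messagesByKey): int
{
    $delivered = 0;

    $conf = new RdKafka\Conf();
    $conf->set('metadata.broker.list', $brokers);
    // Invoked from poll()/flush() once per message with the delivery outcome.
    $conf->setDrMsgCb(function ($kafka, $message) use (&$delivered) {
        if ($message->err === RD_KAFKA_RESP_ERR_NO_ERROR) {
            $delivered++;
        } else {
            error_log('Delivery failed: ' . $message->errstr());
        }
    });

    $producer = new RdKafka\Producer($conf);
    $topic = $producer->newTopic($topicName);

    foreach ($messagesByKey as $key => $payload) {
        // The fourth argument is the message key used for partitioning.
        $topic->produce(RD_KAFKA_PARTITION_UA, 0, $payload, (string) $key);
        $producer->poll(0);
    }

    // Wait up to 10 seconds for outstanding delivery reports.
    $producer->flush(10000);

    return $delivered;
}

if (extension_loaded('rdkafka')) {
    $count = produceWithReports('localhost:9092', 'test_topic', [
        'user-1' => 'signed up',
        'user-2' => 'logged in',
    ]);
    echo "$count message(s) confirmed by the broker\n";
} else {
    echo "rdkafka extension not loaded; skipping the example\n";
}
```

Counting confirmations in the callback gives a more reliable success signal than an empty outbound queue, since a message can also leave the queue by failing.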

7. Consuming Messages from Kafka in PHP

Here’s how you can consume messages from a Kafka topic using PHP:

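<?php
$conf = new RdKafka\Conf();
// Consumers that share this group.id split the topic's partitions among themselves.
$conf->set('group.id', 'myConsumerGroup');
// Broker address; without it the consumer has nothing to connect to.
$conf->set('metadata.broker.list', 'localhost:9092');
// Assumption: start from the oldest message when the group has no committed offset.
$conf->set('auto.offset.reset', 'earliest');

$consumer = new RdKafka\KafkaConsumer($conf);
$consumer->subscribe(['test_topic']);

echo "Waiting for messages...\n";

while (true) {
    // Block for up to 120 seconds waiting for the next message.
    $message = $consumer->consume(120 * 1000);
    switch ($message->err) {
        case RD_KAFKA_RESP_ERR_NO_ERROR:
            echo "Received message: " . $message->payload . "\n";
            break;
        case RD_KAFKA_RESP_ERR__PARTITION_EOF:
            echo "No more messages; will wait for more\n";
            break;
        case RD_KAFKA_RESP_ERR__TIMED_OUT:
            echo "Timed out\n";
            break;
        default:
            throw new \Exception($message->errstr(), $message->err);
    }
}
?>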

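This consumer script subscribes to the test_topic topic as a member of the myConsumerGroup group and loops forever. Each consume() call waits up to 120 seconds for a message, prints the payload when one arrives, and throws an exception on any unexpected error.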
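
If several copies of this consumer run with the same group.id, Kafka divides the topic’s partitions among them. The pure-PHP sketch below illustrates that idea with a simple round-robin split. assignPartitions() and the consumer names are hypothetical; the real assignment is negotiated by the brokers and clients using a configurable strategy.

```php
<?php
// Illustration only: distributes partition numbers over group members round-robin.
// assignPartitions() is a hypothetical helper, not the strategy Kafka actually runs.
function assignPartitions(array $consumerIds, int $partitionCount): array
{
    $consumerIds = array_values($consumerIds);
    if ($consumerIds === []) {
        throw new InvalidArgumentException('A consumer group needs at least one member.');
    }

    // Start every member with an empty list of partitions.
    $assignment = array_fill_keys($consumerIds, []);
    for ($p = 0; $p < $partitionCount; $p++) {
        $owner = $consumerIds[$p % count($consumerIds)];
        $assignment[$owner][] = $p;
    }
    return $assignment;
}

print_r(assignPartitions(['consumer-a', 'consumer-b'], 3));
```

Each partition has exactly one owner within the group, so adding more consumers than there are partitions leaves the extra consumers idle.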

8. Conclusion

Apache Kafka is a powerful tool for real-time data streaming and processing. Integrating Kafka with PHP is straightforward with the php-rdkafka extension, allowing PHP applications to produce and consume messages efficiently. With its scalable and fault-tolerant architecture, Kafka is an excellent choice for building robust data pipelines and streaming applications.

Whether you’re dealing with log aggregation, real-time analytics, or event sourcing, Kafka gives your application a durable, scalable way to move events between services. With the examples in this article, you can start producing and consuming Kafka messages from your PHP projects today.
