Kafka Topic Naming

Erman Terciyanlı
Published in inavitas
3 min read · Jan 31, 2022

Kafka is a distributed event streaming platform that lets you read, write, store, and process events (also called records or messages in the documentation) across many machines. These events are organized and stored in topics. Very simplified, a topic is similar to a folder in a filesystem, and the events are the files in that folder.

Because every producer and consumer refers to a topic by its name, consistent topic naming is critical to keeping the system coherent across all parties.

Figure: Overview of Kafka

Topic Naming Structure

In Kafka it is important to hold topic names to a standard, and there are several alternatives to choose from. The first thing to decide is the semantics of the name. The naming structure could follow a convention like this:

<field1>.<field2>.<field3>

In this structure, the first thing to define is the meaning of each field and the rules to follow. Keep the following rules in mind when deciding on a naming structure:

  • Use field names that do not change over time
  • Do not tie a topic name to a specific application or service unless the topic is used only internally
  • Define a format for the names. Alternatives are camel case, kebab case, or underscores (snake case).
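The rules above can be checked mechanically. The sketch below assumes one possible format (lowercase kebab-case fields joined by dots, with at least three fields); the pattern and the helper name `is_well_formed` are illustrative and not part of any Kafka API. The 249-character cap reflects Kafka's limit on topic name length.

```python
import re

# Assumed convention for this sketch: lowercase kebab-case fields joined by dots.
FIELD = r"[a-z0-9]+(?:-[a-z0-9]+)*"
# At least three fields: <field1>.<field2>.<field3>, optionally more.
TOPIC_PATTERN = re.compile(FIELD + r"(?:\." + FIELD + r"){2,}")

MAX_TOPIC_LENGTH = 249  # Kafka's limit on topic name length


def is_well_formed(topic: str) -> bool:
    """Return True if the topic name follows the assumed format."""
    if len(topic) > MAX_TOPIC_LENGTH:
        return False
    return TOPIC_PATTERN.fullmatch(topic) is not None
```

A check like this could run in a unit test or a CI step, so that a name such as `Device.cdc.device` (mixed case) or `device.cdc` (too few fields) is rejected before it reaches a broker.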

At Inavitas, we use the following structure and naming convention.

The structure uses three required fields (domain, classification, and description) plus an optional group field.

domain.classification.description.group

Let’s discuss what each part of the name means:

Domain

The domain identifies the main owner of the topic and should describe what the topic is about. Some sample domain names are given below.

  • inv: For general/core purposes. Its use should be limited to topics that are global to the whole system and have no specific owner
  • access: All events related to user access
  • device: All events related to devices in the system

Classification

The classification gives the type of the topic, and all topics with the same classification should carry similar data. The content may differ, but the data formats should be consistent.

  • fct: Fact data is immutable information about something that happened at a specific time. It should not contain any information about a change, a deletion, or a command to other parties. A common example of fact data is data coming from devices.
  • cdc: Change data capture is used to share information about an instance. The topic should include the latest data of the instance, and optionally the pre- and post-change data when there is a change.
  • cmd: Command topics are used to send operations through the system, typically in a request-response pattern. An example is sending a write command to a device, which then returns a response with the result of the command.
  • sys: System topics are internal topics used within a single system or microservice. They are operational topics and do not contain any information outside of the owning system.
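The four classifications above form a closed set, so they can be modeled as an enumeration. This is a minimal sketch: the `Classification` enum and the `classification_of` helper are hypothetical names, and the helper assumes the classification is always the second dot-separated field.

```python
from enum import Enum


class Classification(Enum):
    FACT = "fct"                 # immutable event that happened at a specific time
    CHANGE_DATA_CAPTURE = "cdc"  # latest state of an instance
    COMMAND = "cmd"              # request-response operations
    SYSTEM = "sys"               # internal to one system or microservice


def classification_of(topic: str) -> Classification:
    """Read the classification from the second field of a topic name."""
    fields = topic.split(".")
    if len(fields) < 2:
        raise ValueError(f"topic has no classification field: {topic!r}")
    # Enum lookup by value raises ValueError for unknown classifications.
    return Classification(fields[1])
```

A consumer could use the result to pick a handling strategy, for example treating `cdc` topics as state updates and `fct` topics as an append-only event log.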

Description

The description gives the details of the event. It can be the name of the object for a cdc event, the protocol name for a command, the type of data in a fact event, or the action type in a system topic.

Group

Group is an optional field used to divide similar descriptions into groups. Grouping is a good option when the same service produces the messages but different consumers handle different data types, e.g. industrial protocol drivers that collect data from devices.
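To keep the four parts from being glued together by hand in many places, the structure can be wrapped in a small value type. The following is a sketch, not a description of any existing library: `TopicName` and its `parse` method are hypothetical names, and only the field count is validated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TopicName:
    domain: str
    classification: str
    description: str
    group: Optional[str] = None  # optional fourth field

    def __str__(self) -> str:
        parts = [self.domain, self.classification, self.description]
        if self.group is not None:
            parts.append(self.group)
        return ".".join(parts)

    @classmethod
    def parse(cls, topic: str) -> "TopicName":
        """Split a topic name into its fields; the group may be absent."""
        parts = topic.split(".")
        if len(parts) not in (3, 4) or not all(parts):
            raise ValueError(f"expected 3 or 4 non-empty fields, got {topic!r}")
        return cls(*parts)
```

For example, `str(TopicName("device", "cdc", "device", "modbus"))` yields `device.cdc.device.modbus`, and parsing a three-field name leaves `group` as `None`.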

Examples

Some sample topic names used in our systems are given below.

inventory.cdc.warehouse: This topic is under the inventory domain and contains messages about any changes to a warehouse instance.

device.cdc.device.modbus: This topic is under the device domain and contains messages about changes to devices that communicate over Modbus. The producer is the same as for any other topic with the device description, but the consumer is specific to this group of devices.

Conclusion

Keeping Kafka topic naming consistent is important, and names should be defined and shared with the development team during the initial design of the system. To make sure that all topic names follow the same structure, services can read them from environment variables that are set by a CI pipeline. When the names are set in this automated way, leaving topic auto-creation enabled (the broker's auto.create.topics.enable setting) will not cause consistency problems.
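On the service side, reading a topic name from the environment could look like the sketch below. The variable name `DEVICE_CHANGES_TOPIC` and the helper `topic_from_env` are assumptions made for illustration; the point is that the service fails fast at startup rather than letting a missing or malformed name be auto-created on the broker.

```python
import os
from typing import Mapping


def topic_from_env(var_name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Read a topic name from the environment and check its basic structure."""
    topic = environ.get(var_name)
    if not topic:
        raise RuntimeError(f"environment variable {var_name} is not set")
    fields = topic.split(".")
    # domain.classification.description, with an optional group field
    if len(fields) not in (3, 4) or not all(fields):
        raise RuntimeError(f"{var_name} holds a malformed topic name: {topic!r}")
    return topic


# Hypothetical usage at service startup:
# DEVICE_CHANGES_TOPIC = topic_from_env("DEVICE_CHANGES_TOPIC")
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.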
