15 min readMay 25, 2023

Experience Sharing | Apache EventMesh Community — Liu Wei: Broad Knowledge and Selective Application, Accumulation and Eruption

The project experience sharing this time comes from student Liu Wei, who undertook the project “Apache EventMesh supports Dledger pluggable event storage”.

About Apache EventMesh open source community

Project introduction
Apache EventMesh (Incubating) is a dynamic cloud-native event-driven architecture infrastructure for decoupling applications and backend middleware layers. It supports a wide range of use cases, including complex hybrid clouds and distributed architectures with different technology stacks.

Project basic information

Project name: Apache EventMesh supports Dledger pluggable event storage

Project number: 229020158

Project link: https://summer-ospp.ac.cn/#/org/prodetail/229020158

Project mentor: Chen Guangsheng

Project description
Currently, Apache EventMesh supports RocketMQ as the event storage and Standalone mode for memory storage. The goal is to expand the event storage component Dledger that EventMesh can support based on the existing API. The integration of Dledger event storage can improve the community’s expansion capabilities in the event storage field and explore more application scenarios.

The main content of this topic includes:

1. Implement EventMesh to interface with Dledger for event storage pub/sub
2. Have a certain understanding of Dledger storage implementation and related concepts, understand Dledger’s design philosophy and related application scenarios
3. Support event storage Dledger event pub/sub capability based on cloudevents protocol
4. Add related test cases
5. Improve related usage documentation guidance

Project development process

Design scheme introduction

Development process overview

The original goal of the project was to develop the EventMesh DLedger Connector plugin. After research and exploratory development, it was found that DLedger could not meet the requirements of the EventMesh Connector module very well. Then, the feasibility of BookKeeper, with similar positioning but more mature, was researched as the Connector, but it still could not meet the requirements. Finally, the project was determined to design and develop the Pravega Connector plugin based on the Pravega stream storage system.

EventMesh DLedger Connector research and exploratory development DLedger Introduction

DLedger is a Java library based on the Raft protocol, which can be used to build commitlogs with high availability, high durability, and strong consistency. It can be used as a persistence layer for distributed storage systems, including messages, streams, KV, DB, etc. DLedger is an append-only storage, and its logical storage structure has two layers: Commitlog and DLedger Server. Commitlog is a continuously growing data record, and DLedger Server maintains a set of Commitlogs. In addition, every 3 DLedger Servers can form a group to maintain a set of Commitlogs together, with automatic leader election within the group based on the Raft protocol, ensuring data consistency.

DLedgerClient Introduction

DLedgerClient provides two simple methods to read and write data in the DLedger server: append(byte[] body): This method appends byte array data to the DLedger server and returns its index value; get(long index): This method retrieves byte array data from the DLedger server according to the given index value.

RocketMQ’s file storage and its DLedger cluster mode

RocketMQ uses three types of files to store messages, the first two of which are also append-only storage. They are detailed below.

CommitLog: Message bodies and metadata are stored sequentially in fixed-size files.

ConsumeQueue: Fixed-size files store the offsets, message lengths, and HashCodes of message tags in CommitLog for each Topic.

IndexFile: Provides a method for querying messages by Key or time range.

DLedger’s capabilities correspond to RocketMQ’s CommitLog. Of course, RocketMQ already has an implementation that replaces CommitLog files with DLedger (DLedgerCommitLog class). Its purpose is to implement automatic leader election between nodes based on the Raft protocol and ensure Commitlog consistency in RocketMQ cluster mode, replacing the original primary-slave cluster solution. In addition, it can be noted that ConsumeQueue is the key to implementing an effective publish-subscribe mechanism. The DLedger Connector can refer to RocketMQ’s ConsumeQueue implementation to store subscription metadata grouped by Topic in local files.

EventMesh DLedger Connector Design Plan Overview

Scheme 1: External DLedger Server

Place the DLedger Server externally, running as an independent process outside the application process. Since DLedgerClient only provides append and get methods, if the DLedger Server is placed externally and accessed using the client, the basic storage scheme is as follows:

1. Data storage: Use one DLedger group to store all Topics, and get an increasing index after each write of a Message to a specific Topic.

2. Metadata storage: It is necessary to record each Topic and its corresponding index, which is approximately a Map<String, Queue> type data structure. If DLedger wants to store messages from different Topics separately, it can only have one Topic corresponding to a group of DLedger Servers, which means that each time a new Topic appears, a new group of DLedger Servers needs to be started.

3. Existing issues: Map<String, Queue> is stored in memory, and it will be lost once the system crashes. Further consideration is needed on how to persist it.

Scheme 2: Built-in DLedger Server

Referring to RocketMQ’s DLedgerCommitLog, the DLedger Server is built-in as one of the application process threads. In this way, the DLedgerStore can be directly operated, bringing more interfaces, but the interfaces provided by DLedgerStore are also simple. Compared to append and get, there are a few more functions such as getLedgerEndIndex, and getCommittedPos. Thus, implementing the publish-subscribe mechanism is still required.

EventMesh DLedger Connector’s exploratory implementation

The exploratory implementation was carried out using Scheme 1 (external DLedger Server): https://github.com/apache/incubator-eventmesh/pull/1061. This scheme can complete basic functions in a standalone situation but lacks sufficient testing.

EventMesh BookKeeper Connector research and design

Apache BookKeeper Introduction

BookKeeper is a scalable, fault-tolerant, low-latency storage service optimized for real-time workloads. BookKeeper is also an append-only storage system with a three-layer logical storage structure: Entry, Ledger, and Bookie. An Entry is a record, a Ledger is a collection of Entries, and a Bookie is a collection of Ledgers as well as an independent BookKeeper server.

BookKeeper provides the Ledger API and Advanced Ledger API. The former allows for basic reading and writing of Ledgers, specifying the ledgerId but not the entryId, while the latter allows specifying both ledgerId and entryId.

Apache Pulsar Introduction

Pulsar is a cloud-native, multi-tenant, high-performance message queue solution built on the publish-subscribe (pub-sub) pattern. Pulsar relies on ZooKeeper and BookKeeper for its operation. ZooKeeper is a highly reliable distributed coordination system that provides configuration center, name service, cluster management, and other services. It is responsible for various configuration-related and coordination-related tasks in Pulsar. BookKeeper is a scalable, fault-tolerant, low-latency storage service optimized for real-time workloads, responsible for the persistent storage of message data in Pulsar. Its architecture is shown in the following diagram:

Since BookKeeper has a three-layer logical storage structure, Pulsar can naturally map one Topic to one Ledger (of course, to handle chunked messages, the actual situation is that one Topic corresponds to an encapsulated managed-ledgers, which contains multiple Ledgers). Besides, only one group of Bookie needs to be started.

EventMesh BookKeeper Connector Design

BookKeeper is also an append-only storage system. Compared to DLedger, BookKeeper has a three-layer logical storage structure, so in the publish-subscribe scenario, it can naturally map a Topic to a BookKeeper Ledger. The publish-subscribe metadata can be stored in a fixed Ledger. The details are as follows:

One Topic corresponds to one Ledger, which means that messages belonging to the same Topic are stored in a single Ledger. The publish-subscribe metadata (such as consumption positions of each Topic) is fixedly stored in the Ledger with an ID of 0. In addition, each subscription to a Topic corresponds to a consumer thread, which employs a polling method to obtain messages from BookKeeper.

Publish messages

1. Write messages to the corresponding Ledger. If the Ledger does not exist, create it.

2. Update the current subscription metadata in Ledger0 (topic, ledgerId, offset, maxSize, etc.)

Subscribe to Topic and consume

1. Check if the Topic exists; if not, create the corresponding Ledger and update the metadata in Ledger0.

2. Check if a local consumer thread for subscribing to the Topic has been created; if not, create one.

3. The consumer thread loops over the messages of the topic. Before each consumption, it first retrieves the subscription metadata from Ledger0, mainly to obtain the message consumption position (Offset), and then reads the message at the Offset position in the corresponding Ledger.

4. If the offset needs to be changed after consumption, update the data in Ledger0 again.

Interface usage

1. Data is written through the Ledger API, with each Topic corresponding to a Ledger.

2. Metadata is written through the Advanced Ledger API, specifying the Ledger ID and Entry ID to ensure that the subscription state can be restored after restart.

Performance optimization

This scheme requires frequent updating of metadata, which will affect overall performance. Consider making the writing of metadata asynchronous to ensure eventual consistency.

Version selection

Since the Advanced Ledger API of BookKeeper is supported after version 4.6, the BookKeeper version required is greater than or equal to 4.6.

EventMesh Pravega Connector research and implementation

Pravega Introduction

Pravega is a cloud-native distributed stream storage system. Pravega streams are persistent, consistent, and scalable, and support long-term data retention. Pravega is a streaming storage system, similar to message middleware, and supports the publish-subscribe message pattern. However, by using a two-tier storage system to separate hot and cold data, Pravega can exceed the limitations of single-machine storage capacity by using cloud storage, achieving permanent data storage. The first tier stores hot data persistently, typically using BookKeeper. The second tier uses cloud storage like HDFS or cloud vendor OSS services for the permanent persistence of cold data.

Pravega’s logical storage structure has four layers: Event, Stream, Scope, and Pravega. An Event is a record, a Stream is a collection of Entries, a Scope is used for grouping and isolating Streams, and Pravega is the top-level Pravega service.

The core of Pravega is Stream, which is uniquely identified by its name and the Scope it belongs to. By dividing it into multiple Segments, it supports automatic scaling and concurrent access, and Segments are transparent to users. In Pravega, a Reader must belong to a Reader Group, and it is guaranteed that each Event will be sent to one Reader in the Reader Group.

The complete operation of Pravega requires Zookeeper, BookKeeper, and HDFS. Its architecture diagram is as follows:

EventMesh Pravega Connector Design

Pravega has a 4-layer logical storage structure: Event, Stream, Scope, and Pravega. Naturally, a Topic can be mapped to a Stream in a specified Scope, while messages correspond to Pravega’s Events. Since EventStreamWriter and EventStreamReader need to specify Scope and Stream when created, each Topic also corresponds to an EventStreamWriter and a ReaderGroup (containing only one EventStreamReader). In addition, each Topic’s subscription thread SubscribeTask will continuously read Events from the corresponding Stream. The architecture diagram is roughly as follows:

In the specific implementation, since the message offset in the Stream is not returned after the Pravega Event is successfully published, the Pravega Connector defaults to returning -1 as messageId to the user. Since Pravega’s ReaderGroup only guarantees that messages are delivered to one of its Readers, when subscribing to the cluster message mode, the naming format of the ReaderGroup is String.format(“%s-%s”, consumerGroup, topic), and when subscribing to the broadcast message mode, the ReaderGroup’s name is a random UUID. The naming format of the Reader is fixed to String.format(“%s-reader”, instanceName).replaceAll(“\\(“, “-”).replaceAll(“\\)”, “-”), where instanceName is a variable from the EventMesh upper layer, composed of consumerGroup, configuration eventMesh.server.cluster, EventMesh version, and PID.

In addition, since Pravega’s names and IDs must only consist of 0–9, A-Z, a-z, ., and -, users need to pay attention to the character composition of parameters such as Topics during usage.

EventMesh-Pravega Connector Implementation

The related PRs are as follows:

1) [ISSUE #270] Pravega connector #1203: https://github.com/apache/incubator-eventmesh/pull/1203

2) [ISSUE #1246] Add README.md for pravega connector #1242: https://github.com/apache/incubator-eventmesh/pull/1242

3) [ISSUE #1243] Upgrade pravega connector #1237: https://github.com/apache/incubator-eventmesh/pull/1237

4) [ISSUE #1345] Add pravega connector related license files #1346: https://github.com/apache/incubator-eventmesh/pull/1346

Follow-up

Participating in the Summer of Open Source

ospp: Please briefly introduce yourself and your open-source experience.
Liu Wei: My first exposure to open source was the Summer of Open Source in 2021. At that time, the project was the SkyWalking IoTDB adapter (storage plugin), which is similar in development idea to this year’s EventMesh Connector (event storage plugin), both based on the Java SPI mechanism. After the Summer of Open Source last year, I continued to maintain the plugin in SkyWalking for a while, and later participated in the development projects of InfluxDB protocol compatibility, MQTT protocol new cluster version upgrade, and built-in data forwarding function ForwardTrigger in the IoTDB community. In this year’s Summer of Open Source, I researched two or three message middleware and their underlying storage, and finally implemented the EventMesh Pravega Connector plugin.

ospp: Through understanding, this is your second time participating in the Summer of Open Source. What different experiences do you have this year compared to the first time?
Liu Wei: This year’s Summer of Open Source, I can clearly feel that my efficiency has improved a lot, thanks to the growth of the past year. On the one hand, I am more familiar with the open-source collaboration model, and on the other hand, my technical level and understanding ability have improved. Although both projects in the Summer of Open Source are “adapter” type projects, the difficulty of last year’s project was higher, requiring me to

ospp: As an open-source veteran, what do you think are the essential preparations for participating in the event?
Liu Wei: First of all, you need to have a certain professional foundation, and evaluate and choose the projects that suit you based on your own foundation. Secondly, you need to have an open mindset and be willing to put in the effort. On the one hand, the Summer of Open Source is an excellent platform for students to get in touch with the industry and improve themselves. Seizing opportunities and striving hard will enable you to gain knowledge that you cannot learn in school. On the other hand, the projects in the Summer of Open Source are often more challenging for students who only have school project experience. It is necessary to learn about the open-source collaboration model and various professional skills in practice. Being unprepared to put in a lot of effort won’t work. Furthermore, pay attention to norms and politeness, adhere to the community’s development and communication norms, and be polite, fluent, and grammatically correct in both English and Chinese. These details show your respect for the project and the community.

ospp: How do you view the Summer of Open Source and open-source education?
Liu Wei: In open-source education, or student participation in open-source projects, I think it’s essential to cultivate and improve one’s engineering thinking. Most of the projects in the Summer of Open Source are suitable for a student (part-time with a non-strong foundation) to complete the development of an independent module in two to three months. This mode is different from enterprise development internships. I think the Summer of Open Source is an excellent platform to cultivate students’ engineering thinking and improve themselves in various aspects, such as requirement analysis, software design, software testing, system deployment, and team collaboration.

Participating in the open-source community

ospp: Introduce the Apache EventMesh community in your eyes.

Liu Wei: Apache EventMesh is the infrastructure for cloud-native event-driven architecture and a project donated by WeBank. It employs a plug-in approach in design, with storage, monitoring, service discovery, and communication protocols all modularized. Currently, it is also promoting the development and improvement of the Go language version and workflow module and is in a stage of rapid development. The community is also very open, offering Good First Issues that are very welcoming for everyone to participate in.

ospp: Did you have any memorable experiences during the project? How did the community and mentors help you?

Liu Wei: In July, I basically started the exploratory development of the DLedger Connector, but found myself stuck in implementing the publish-subscribe mechanism. At that time, I had just got to know Boss Hu from the community on WeChat, and we talked for a whole night about the project, open-source education, and the purpose of participating in open-source, and I learned a lot. Participating in open-source can often lead to unexpected surprises, and you can meet many mentor-friend partners. People from different backgrounds join the open-source community with different purposes, and together they design, develop, and maintain the same project, which is a very interesting experience. For this project, my mentor, Sheng Ge, mainly guided the overall direction of the development and, together with other community partners, listened to my explanation of the project design at the biweekly meeting, giving me some suggestions and guidance, and finally reviewed my code. Once again, I am grateful to Sheng Ge, Wei Ming Ge, Boss Hu, and other community partners.

ospp: After the conclusion of this project, will you continue to actively participate in the open-source community and contribute to the development and maintenance of open-source software?

Liu Wei: Yes, I will. After the project is completed, I shared open-source culture and this project with my classmates in the research lab, fostering a sense of open-source evangelism. Then, I invited them to join the EventMesh community and started by getting familiar with the collaboration style of the open-source community through Good First Issues. I will continue to pay attention to open source and participate in open-source projects in the future.

Gains and Messages

ospp: The Summer of Open Source aims to cultivate and discover more outstanding developers. Did you gain any unexpected growth or experience through this event?

Liu Wei: Initially, when evaluating this DLedger Connector project, I felt it belonged to the “adapter” type of project and was of moderate difficulty since it didn’t involve architectural design optimization. However, I didn’t expect that when I actually started working on it, I realized that DLedger had difficulty meeting the requirements of the EventMesh Connector module. It required implementing the publish-subscribe mechanism by myself, and its metadata storage and management were extremely complex, gradually exceeding the design and positioning of the Connector module. So I later researched BookKeeper and Pravega and eventually implemented the Pravega Connector. As a result, I studied two or three message middleware and their underlying storage, and I felt the pattern expanding.

ospp: What experiences can you share with open-source-loving juniors?

Liu Wei: Now in China, more and more emphasis is placed on the development of open-source software, and events like the Summer of Open Source are organized. In addition to the Summer of Open Source, there are many similar events, and of course, I think the Summer of Open Source is the best. For the first time, you can participate in open-source through such events in a project-driven way. After the project is completed, you will feel that you have contributed and created value based on your professional abilities, and you can enjoy the long-term and profound happiness. You can make it one of the driving forces for your progress.

Summary

During the Summer of Open Source, we conducted research on two append-only storages, DLedger and BookKeeper, as well as their upper-layer message middleware, RocketMQ and Pulsar.

Since DLedger and BookKeeper only support append-only storage, we needed to implement a publish-subscribe mechanism on top of their interfaces to meet the requirements of the EventMesh Connector module. However, persisting the metadata generated by the publish-subscribe mechanism introduced new storage requirements. DLedger, which can only maintain a single queue, can hardly meet the simultaneous data storage and metadata storage requirements. Although BookKeeper can maintain multiple queues, multiple queries are still required to implement the publish-subscribe mechanism, resulting in higher complexity of the solution. The final implementation of the code also faced challenges in terms of reliability and maintainability. Therefore, both DLedger and BookKeeper are not well-suited for the use case of the EventMesh Connector.

Subsequently, we researched Pravega, a streaming storage system whose functional interface is similar to message middleware products. It supports the publish-subscribe message pattern and can meet the requirements of the EventMesh Connector module. Finally, we implemented the Pravega Connector plugin.

Overall, although the Summer of Open Source had some twists and turns, we learned about the characteristics and designs of several message middleware and their underlying append-only storages, gaining valuable knowledge. Additionally, we made progress in code implementation and understanding of system architecture. We would like to express our gratitude to project mentor Sheng and other community partners for their guidance and assistance.

About Us

We are WeBank CoreTech, the core technology team of WeBank. Our team has incubated top-tier open-source projects such as Apache Linkis and Apache EventMesh. For any inquiries or collaborations, please contact us.
Email: webankcoretech@gmail.com
Twitter:

WeBank CoreTech