Getting started with Apache Pulsar Functions written in Kotlin

Pierre-Yves Lebecq
Jul 6, 2020

Apache Pulsar is an open-source distributed pub-sub messaging system packed with features such as multi-tenancy, high performance, low latency, geo-replication, and many more. If you want to learn more about Apache Pulsar features, concepts, and architecture, head over to the official documentation.

One of the most interesting features is Pulsar Functions. They are lightweight compute processes that consume messages from one or multiple Pulsar topics, apply a user-supplied processing logic to each message, and optionally publish the result of the computation to another topic.

You can use them to create complex processing logic without deploying a separate system. They are heavily inspired by stream processing engines such as Apache Flink, Apache Heron, or Apache Storm, and by “Function as a Service” (FaaS) cloud platforms like AWS Lambda, Google Cloud Functions, and Azure Functions.

Apache Pulsar and Pulsar Functions are what allowed us to build Infinitic, a high-performance, highly scalable, and low-latency workflow engine. But getting started with Pulsar Functions when you know nothing about the Java development ecosystem can be quite challenging. In this post, I'll try to get you started by explaining which tools we need, why we need them, and how to use them. And to make things even more fun, we will use Kotlin to write our functions!

Kotlin is a modern, statically typed, general-purpose programming language with type inference. It is designed to interoperate fully with Java. Although Pulsar Functions can be written in several programming languages, Java is the only language that can use all of the features available to Pulsar Functions at the time of writing. Using Kotlin is a great way to have access to all the features of the Pulsar Java SDK, with a much more modern programming language. Let's get started!

I'm using a fresh Ubuntu 20.04 installation, so you may need to adapt some of the commands to match your operating system. You will most likely find installation instructions for your operating system in the documentation of the various tools we will use.

The first thing we will need to install is Docker. Docker allows you to run containers on your computer. A container is a way to package software in a standardized unit containing the code and all the dependencies it needs, so the application runs quickly and reliably on different computers. There is a Docker image for Apache Pulsar, which will help us get started with Pulsar without any trouble. I’m using Docker version 19.03.8 at the time of writing.

We are also going to use docker-compose. It's a tool that reads a YAML file describing different containers and their configuration, and starts all of them. We're not going to use a lot of containers, but docker-compose makes it easier to start our containers from the command line with properly set up ports and volumes.

We will use apt to install Docker. I also install curl at the same time, as it will be needed later:

$ sudo apt install docker.io docker-compose curl
[...]
$ docker --version
Docker version 19.03.8, build afacb8b7f0
$ docker-compose --version
docker-compose version 1.25.0, build unknown
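To give an idea of the kind of YAML file docker-compose reads, here is a minimal sketch that writes one into a temporary directory. The service name, the volume name, and the image tag `2.6.0` are assumptions made for illustration, not something this post prescribes; the two ports are the ones Pulsar uses by default for its binary protocol and its HTTP admin API.

```shell
# Write an illustrative docker-compose.yml into a scratch directory.
compose_dir=$(mktemp -d)

cat > "$compose_dir/docker-compose.yml" <<'EOF'
version: "3"
services:
  pulsar:                              # hypothetical service name
    image: apachepulsar/pulsar:2.6.0   # assumed tag, pick the one you need
    command: bin/pulsar standalone     # single-node Pulsar for local development
    ports:
      - "6650:6650"                    # Pulsar binary protocol
      - "8080:8080"                    # HTTP admin API
    volumes:
      - pulsardata:/pulsar/data        # keep data between restarts
volumes:
  pulsardata:
EOF

# If docker-compose is installed, ask it to parse and validate the file.
if command -v docker-compose >/dev/null 2>&1; then
  if docker-compose -f "$compose_dir/docker-compose.yml" config >/dev/null; then
    echo "compose file is valid"
  fi
fi
```

With a file like this, a single `docker-compose up` in that directory would start the container with its ports and volume already configured.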

Optional Step: By default, you have to run Docker commands using the sudo command. To avoid having to use sudo for every docker command you use, you can add your user to the docker group using the following command:

$ sudo usermod -aG docker $USER

Also, by default, the Docker daemon does not start automatically. I use Docker a lot, so I prefer to have it start automatically. You can achieve this with the following command:

$ sudo systemctl enable docker.service

Restart your computer to have your Docker daemon properly started and your group membership re-evaluated.
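After the restart, you may want to confirm that both changes took effect. The following is a small sketch rather than an official procedure; the helper name `in_group` is made up for this example.

```shell
# Succeeds when the current user belongs to the group given as first argument.
in_group() {
  id -nG | tr ' ' '\n' | grep -qx "$1"
}

if in_group docker; then
  echo "docker group: ok"
else
  echo "docker group: missing (log out and back in, or reboot)"
fi

# "systemctl is-enabled" prints "enabled" when the unit starts at boot.
if command -v systemctl >/dev/null 2>&1; then
  state=$(systemctl is-enabled docker.service 2>/dev/null || true)
  echo "docker.service: ${state:-unknown}"
fi
```

If the group check passes, commands such as `docker ps` should work without sudo.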

We also need a JDK (Java Development Kit) to build and run our code. To install it, we will use SDKMAN!. It is a tool able to install multiple versions of the JDK as well as other software and tools related to Java development.

Installing SDKMAN! can be done using their installation script:

$ curl -s "https://get.sdkman.io" | bash
[...]

The default Ubuntu 20.04 installation should already contain everything SDKMAN! needs to run. At the end of the installation, you need to either open a new terminal window or use the source command to reload your shell so it can properly load the SDKMAN! scripts that were added during installation.

$ source $HOME/.sdkman/bin/sdkman-init.sh

Now you can run the following command to make sure SDKMAN! runs properly:

$ sdk version
SDKMAN 5.8.3+506

Now we will install a JDK using SDKMAN!. The Docker image of Apache Pulsar we will use runs Java 8, and because Pulsar itself will run our functions, we’re going to build them with Java 8 as well. You can run the following command to list all available Java versions:

$ sdk list java
The output of the "sdk list java" command.

I'll use the identifier "8.0.252-open", which corresponds to OpenJDK 8, a free and open-source implementation of the Java platform.

$ sdk install java 8.0.252-open
Downloading: java 8.0.252-open
In progress...
############################################################# 100,0%
############################################################# 100,0%
Repackaging Java 8.0.252-open...
Done repackaging...
Installing: java 8.0.252-open
Done installing!
Setting java 8.0.252-open as default.

When the installation is done, you can verify that it's properly installed by running the following commands:

$ javac -version
javac 1.8.0_252
$ java -version
openjdk version "1.8.0_252"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_252-b09)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.252-b09, mixed mode)

Next, we will need the Kotlin compiler to compile our Kotlin code. SDKMAN! can also install it. I'll use version 1.3.72, which is the most recent at the time of writing, but feel free to find the most recent version using the command "sdk list kotlin".

$ sdk install kotlin 1.3.72

We can verify it's installed properly by running the following command:

$ kotlinc -version
info: kotlinc-jvm 1.3.72 (JRE 1.8.0_252-b09)
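To go one step further than checking the version, you can compile and run a tiny Kotlin file. The sketch below is only a smoke test: the class name `Exclaim` and the file name are made up, and the class merely implements the standard `java.util.function.Function` interface to mimic the "message in, result out" shape of a function. It is not a complete Pulsar Function.

```shell
# Create a scratch directory holding a single Kotlin source file.
workdir=$(mktemp -d)

cat > "$workdir/Exclaim.kt" <<'EOF'
import java.util.function.Function

// Hypothetical example: takes a message and returns it with "!" appended.
class Exclaim : Function<String, String> {
    override fun apply(input: String): String = "$input!"
}

fun main() {
    println(Exclaim().apply("hello"))
}
EOF

# Compile to a runnable jar and execute it, if the tools are on the PATH.
if command -v kotlinc >/dev/null 2>&1 && command -v java >/dev/null 2>&1; then
  kotlinc "$workdir/Exclaim.kt" -include-runtime -d "$workdir/exclaim.jar" \
    && java -jar "$workdir/exclaim.jar"
else
  echo "kotlinc or java not found, skipping compilation"
fi
```

The `-include-runtime` option bundles the Kotlin runtime into the jar so that `java -jar` can run it on its own. If everything is installed correctly, this should print `hello!`.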

Most of the resources I found when I was getting started were using an IDE to create your Java / Kotlin project and run it. Even though it's probably the easiest way to get started, I didn't want to get started that way because I knew at some point I would need to build my project and run unit tests using only the command line on a CI server, which will not use any IDE.

To be able to do just that, we will use Gradle. Gradle is an open-source build automation tool. Gradle is powered by plugins, which allow it to build projects for many different programming languages, targeting many different platforms, though I think it's mostly used for JVM applications. Gradle will also handle the Java/Kotlin dependencies our code will need. You can see it as the supercharged equivalent of npm/Yarn for JavaScript projects, Composer for PHP projects, Bundler for Ruby projects, etc. Because Gradle can do much more than that, it can be quite complex to use sometimes, so I encourage you to take the time to read the documentation if you plan to do a real project.

SDKMAN! to the rescue once again! We can use it to install Gradle. I'll use version 6.5 but, as always, you can find a more recent version using the command "sdk list gradle".

$ sdk install gradle 6.5
Downloading: gradle 6.5
In progress...
############################################################# 100,0%
Installing: gradle 6.5
Done installing!
Setting gradle 6.5 as default.

And for peace of mind, we will as always make sure it's properly installed:

$ gradle --version

Welcome to Gradle 6.5!
Here are the highlights of this release:
- Experimental file-system watching
- Improved version ordering
- New samples
For more details see https://docs.gradle.org/6.5/release-notes.html
------------------------------------------------------------
Gradle 6.5
------------------------------------------------------------
Build time: 2020-06-02 20:46:21 UTC
Revision: a27f41e4ae5e8a41ab9b19f8dd6d86d7b384dad4
Kotlin: 1.3.72
Groovy: 2.5.11
Ant: Apache Ant(TM) version 1.10.7 compiled on September 1 2019
JVM: 1.8.0_252 (Oracle Corporation 25.252-b09)
OS: Linux 5.4.0-39-generic amd64
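To illustrate what a command-line-only Kotlin build looks like with Gradle, here is a minimal sketch that writes the two files Gradle expects into a scratch directory. The project name is hypothetical, and the build file only declares the Kotlin JVM plugin (matching the 1.3.72 compiler installed earlier), the Kotlin standard library, and a Java 8 bytecode target; a real project will need more than this.

```shell
# Scratch directory for an illustrative Gradle project.
project_dir=$(mktemp -d)

# settings.gradle.kts names the project (hypothetical name).
cat > "$project_dir/settings.gradle.kts" <<'EOF'
rootProject.name = "pulsar-functions-kotlin"
EOF

# build.gradle.kts uses the Gradle Kotlin DSL.
cat > "$project_dir/build.gradle.kts" <<'EOF'
plugins {
    // Kotlin JVM plugin, same version as the compiler installed above.
    kotlin("jvm") version "1.3.72"
}

repositories {
    mavenCentral()
}

dependencies {
    // Kotlin standard library for Java 8.
    implementation(kotlin("stdlib-jdk8"))
}

tasks.withType<org.jetbrains.kotlin.gradle.tasks.KotlinCompile> {
    // Produce Java 8 bytecode, since Pulsar will run the functions on Java 8.
    kotlinOptions.jvmTarget = "1.8"
}
EOF

echo "Gradle project skeleton written to $project_dir"
```

Running `gradle build` in that directory downloads the plugin and compiles anything placed under `src/main/kotlin`, with no IDE involved, which is exactly what a CI server needs.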

Last, but not least, we will need a proper text editor to write our code. While it's possible to use Visual Studio Code or Sublime Text to write the code, Kotlin support in IntelliJ IDEA is amazing. I highly encourage you to use it. You can even use the "IntelliJ IDEA Community Edition" which is free and contains everything we will need.

The most straightforward way to install IntelliJ IDEA is to find it in the Ubuntu Software Center and click the green install button.

The IntelliJ IDEA Community Edition software page in the Ubuntu Software Center.

We're now ready to start writing functions!
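As a final sanity check, a short script can confirm that every command-line tool installed in this post is reachable from your shell. This is just a convenience sketch; the helper name `missing_tools` is made up for the example.

```shell
# Prints, one per line, each tool from the argument list that is not on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools docker docker-compose curl java javac kotlinc gradle)

if [ -z "$missing" ]; then
  echo "All tools found on PATH."
else
  echo "Missing tools:"
  echo "$missing"
fi
```

Keep in mind that the tools installed through SDKMAN! are only on the PATH in shells where its init script has been loaded, so run this from a new terminal window.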

That was quite a lot of things to install, so we will end this post here.

But I believe with all this you will have everything needed to work on your Pulsar Functions in a proper way.

In the next post, we will run Apache Pulsar, write a basic function, and run it. This will give you an overview of how to use Pulsar Functions before we explore more features and build something bigger!
