Consume Slack’s Events API with Cloudera Flow Management

Ferenc Kis
Cloudera
Jan 11, 2023

Without writing a single letter of code.

What is Cloudera Flow Management

Cloudera Flow Management (CFM) is a no-code data ingestion and management solution powered by Apache NiFi. With NiFi’s intuitive graphical interface and processors, CFM delivers highly scalable data movement, transformation, and management capabilities to the enterprise.
More information can be found at https://docs.cloudera.com/cfm/latest/index.html.

What is Slack

Slack is an instant messaging program owned by Salesforce. Originally it was developed for professional communications, but it has been widely adopted as a community platform. Users can communicate with text messaging, voice and video calls.
More information can be found at https://slack.com/.

Goal

Slack users can exchange text messages, make voice calls, and send file attachments. Slack also offers a service called the Events API, which pushes several types of events (channel creations, direct messages, etc.) to an external system for further analysis. However, someone still has to write the application that consumes, processes, or forwards those events. Fortunately, Cloudera Flow Management comes to the rescue. In this blog post I will show how to create a sandbox Slack account and how to define a flow in CFM that consumes events from Slack without writing any code.

Create a Slack sandbox account

This step can be skipped if you already have a Slack environment. However, it is highly recommended to try changes in a dev/test environment before moving them to production. Fortunately, Slack offers a sandbox environment where such experiments can be carried out.
Please follow the steps in the official guide.
You will need to submit a Google Form with some mandatory data. Your request will be reviewed, and if everything is in order you will receive an email to finish the registration.
This can take some time; I got mine after 6 hours.

Finish the registration by clicking on the Finish Setup button.
Complete the org setup step by registering the Primary Org Owner account, setting your organization name, and finally accepting the terms of service.

Create a Slack workspace

You can also skip this if you already have a Slack organization with a workspace.
Click on the Manage Organization button.

Click on the Create Workspace button.

Enter the workspace name and the workspace domain. They can be the same: I named mine pusheventpoc and pusheventpoc.slack.com respectively. The workspace description can be left empty.

Create a Slack app

Next you will need to create a Slack app in the workspace created in the previous step. This application will use Slack’s Events API and will push events to NiFi. More information about Slack’s Events API can be found in the Events API Documentation.
Go to https://api.slack.com/apps, and click on the Create New App button.

Choose From Scratch. Set an app name and choose the workspace you have created in the previous step.

Next we will set up a CFM flow to receive events from Slack. Do not close this browser window; it will be needed in a later step.

Installing and running CFM

If you already have a working CFM installation you can skip this step.

CFM can be installed as a service on a CDP cluster. For details and use cases, please see the official guide.

CFM can also be downloaded and installed separately. The latest CFM binaries can be downloaded from https://docs.cloudera.com/cfm/latest/release-notes/topics/cfm-download-locations.html. Download the NiFi binary and extract it.
Then start it with the following command from the NiFi directory:

bin/nifi.sh start

Note: NiFi starts in secure mode out of the box and asks for a username and password when you log in to the UI. If you started NiFi with the default settings, you can find the generated username and password in logs/nifi-app.log.

Create the CFM flow

In this step we will design a NiFi flow in CFM which is able to consume events from Slack.
We will add HandleHttpRequest and HandleHttpResponse processors, which together act as a web server that Slack can push events to. The reason we can’t simply use a ListenHTTP processor is that when we register the webhook URL in Slack, Slack sends a special message containing a challenge token, which needs to be sent back to prove that the registered endpoint is under the user’s direct control.
More about the handshake process can be found under this link.
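
To make the handshake concrete, here is a minimal Python sketch of what any endpoint has to do with the verification request. It only illustrates the protocol and is not part of the CFM flow. The function name handle_slack_request is hypothetical, and the payload shape (a JSON body whose type is url_verification and which carries a challenge field) is assumed from Slack's documented verification request.

```python
import json


def handle_slack_request(body: bytes) -> tuple[int, str, str]:
    """Return (status, content_type, response_body) for a Slack POST body."""
    payload = json.loads(body)
    if payload.get("type") == "url_verification":
        # Echo the challenge back so that Slack accepts the endpoint.
        return 200, "text/plain", payload["challenge"]
    # Any other event only needs to be acknowledged with a 200.
    return 200, "text/plain", ""
```

In the flow described in this post, the same two branches are covered by HandleHttpRequest, EvaluateJsonPath and HandleHttpResponse rather than by hand-written code.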

Note: we added the LogAttribute processors just for debugging purposes.

In the next section we will configure these processors and finish configuring our Slack app, so we can finally try the whole thing together.

Configure HandleHttpRequest processor

This processor is responsible for acting as a web server and receiving the events pushed by Slack. Only the mandatory properties for this use case will be discussed here.

Listening Port: Port to listen on for HTTP requests. This port needs to be accessible from the public internet, so consider firewalls and other security settings when setting this value. Example: I set this to 9876

HTTP Context Map: A controller service for caching the HTTP request information. Create a new StandardHttpContextMap with default parameters. More details about controller services can be found via this link.

Allowed Paths: A regular expression that specifies which HTTP paths are allowed in a request. The path becomes part of the callback URL given to Slack. Example: I used /events.*

Allow POST: Allows HTTP POST requests. This must be set to true.

This is what the processor configuration looks like:

All other parameters are left as default values.
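
If you are unsure which request paths a given Allowed Paths value lets through, a quick check outside NiFi can help. The sketch below assumes the expression has to match the whole request path (emulated here with Python's re.fullmatch) and that Python and Java agree on a pattern this simple. The sample paths are made up for illustration.

```python
import re

# The Allowed Paths value used in this post.
ALLOWED_PATHS = re.compile(r"/events.*")


def is_allowed(path: str) -> bool:
    """Check whether the whole request path matches the Allowed Paths regex."""
    return ALLOWED_PATHS.fullmatch(path) is not None


for path in ("/events", "/events/slack", "/status", "/api/events"):
    print(path, is_allowed(path))
```

Under that assumption, /events and /events/slack are accepted, while /status and /api/events are rejected.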

Configure EvaluateJsonPath processor

The EvaluateJsonPath processor’s role is to extract the previously mentioned challenge token, if any, from the event pushed by Slack. By adding the value as a flowfile attribute we can send the challenge back in the response, thus making Slack happy and validating our callback URL.

Destination: Choose flowfile-attribute. By adding the extracted value as a flowfile attribute, we can pass extra information to the downstream processors while leaving the original content intact. We will reference the extracted value later to include it in the response.

Return Type: Choose scalar. We need only the plain text value.

Path Not Found Behavior: Choose ignore. In normal Slack events the challenge token won’t be present but we don’t consider this an error.

Null Value Representation: Choose empty string. If the challenge token is not found in the request, the extracted attribute will be an empty string.

slack.challenge: This is a dynamic property and needs to be added manually. The name can be chosen arbitrarily. The value should be $.challenge, which is the JSON path of the challenge token in the request payload.
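
The effect of these settings can be pictured with a few lines of Python. This is only an emulation of the behavior described above, not NiFi's implementation: the hypothetical function extract_challenge returns the attribute that would be added to the flowfile, and it never modifies the content.

```python
import json


def extract_challenge(content: bytes) -> dict[str, str]:
    """Emulate the settings above: return the attribute to add to the flowfile."""
    payload = json.loads(content)
    value = payload.get("challenge")  # plays the role of the $.challenge path
    # A missing token is not an error; it is represented as an empty string.
    return {"slack.challenge": "" if value is None else str(value)}


verification = b'{"type": "url_verification", "challenge": "abc123"}'
regular_event = b'{"type": "event_callback", "event": {"type": "message"}}'
print(extract_challenge(verification))
print(extract_challenge(regular_event))
```

The first call yields the token as the attribute value, the second an empty string, which is what the downstream HandleHttpResponse processor will see.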

Here is what the configured processor looks like:

Configure HandleHttpResponse processor

Slack expects every request to be answered with an HTTP 200 OK response within 3 seconds; otherwise the delivery is considered failed and Slack will retry it. Additionally, at least 5% of the requests need to be acknowledged over a 60-minute window, otherwise event pushing will be disabled.
The HandleHttpResponse processor will do all of the above for us. Furthermore, if the challenge token is present, it will add it to the response as required by Slack.
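
The 5% rule is easy to misread, so here is a small arithmetic sketch of it. It is a simplified model of the rule as stated above, not Slack's actual accounting. The function name and parameters are hypothetical.

```python
def subscription_at_risk(delivered: int, acknowledged: int, min_ack_percent: int = 5) -> bool:
    """True if fewer than min_ack_percent of the events in the window were acknowledged."""
    if delivered == 0:
        return False  # nothing was sent, so nothing could fail
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return acknowledged * 100 < delivered * min_ack_percent


print(subscription_at_risk(delivered=1000, acknowledged=40))
print(subscription_at_risk(delivered=1000, acknowledged=50))
```

With 1000 events in the 60-minute window, 40 acknowledgements (4%) would put the subscription at risk, while 50 (exactly 5%) would not.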

HTTP Status Code: Set it to 200.

HTTP Context Map: Choose the same context map created for HandleHttpRequest processor.

Attributes to add to the HTTP Response (Regex): Use a regex that matches the attribute name you chose in the previous step when configuring EvaluateJsonPath. In my case I chose slack.challenge, so here I used the regex slack.*

Content-Type: Use text/plain

Here is what the configuration looks like:

Once we are done, let’s start the whole flow by clicking on the Play button in the Operate menu on the left side. After this point the HandleHttpRequest processor will be available and ready to receive events from Slack.

Note: if you don’t auto-terminate the LogAttribute processors’ success relationship, you can still start the flow; however, the flowfiles will be queued up in the connection (the arrow) between the source processor and the particular LogAttribute processor. This is extremely useful for debugging: you can see which route your flowfiles take, and you can also check their content by listing the particular queue.

Configure the Slack app

Go back to the browser tab where you created the Slack app previously or visit https://api.slack.com/apps/ and select the application if you have closed the tab.
Click on Event Subscriptions in the left side menu and click on the toggle to enable events.
Once enabled, a text box will appear where a request URL has to be given. This is the URL the Slack events will be pushed to.
The URL consists of the NiFi node’s IP address plus the port you provided when configuring the HandleHttpRequest processor. Additionally, if you filled in Allowed Paths, add a matching path to the URL.
In my case the IP was 5.187.170.35, the port I set was 9876, and the Allowed Paths regex was /events.*, so my URL looks like this:

It is very important that the given URL is publicly visible and accessible. If you have done everything correctly up to this step, the URL verification will be successful.
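
If the verification fails, it can help to simulate Slack's verification request yourself before retrying in the browser. The sketch below only builds such a request with Python's standard library. The address http://203.0.113.10:9876/events is a placeholder (the scheme, IP, port and path are assumptions to be replaced with your own values), and the payload is a reduced version of what Slack sends.

```python
import json
import urllib.request


def build_verification_request(url: str, challenge: str) -> urllib.request.Request:
    """Build a POST request that imitates Slack's URL verification call."""
    body = json.dumps({"type": "url_verification", "challenge": challenge}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder address: replace it with your own node IP, port and path.
request = build_verification_request("http://203.0.113.10:9876/events", "test-challenge")
print(request.get_method(), request.full_url)
```

Sending it with urllib.request.urlopen(request, timeout=3) from a machine outside your network should return HTTP 200 if the flow is running and the port is reachable.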

On the CFM side you should see one flowfile queued in the connection attached to the success relationship of the HandleHttpResponse processor.

The flowfile should have similar content.

On the same page you can configure which events you want to subscribe to. Only the events configured here will be pushed to CFM.

Here is how my settings look.

Install Slack Application

Before we can move on we need to install the Slack app, so it can start collecting events and push them to CFM.

Click on Install App on the left side then click on Install to Workspace.

Putting it together

Now that everything is configured and set up, let’s open Slack and send a direct message to someone. If you have a newly created workspace and you are the only user, don’t worry. You can send direct messages to yourself as well.

For example, I sent three messages to myself in Slack.

And on the CFM side I can see three flowfiles piling up in the queue.
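
For readers who later want to inspect such flowfiles outside NiFi, here is a hedged Python sketch of how the content could be summarized. It assumes Slack's event envelope, where the outer type is event_callback and the actual event sits under the event key. The function name and the sample values are made up for illustration.

```python
import json


def summarize_event(content: bytes) -> str:
    """Produce a one-line summary of an event payload pushed by Slack."""
    payload = json.loads(content)
    if payload.get("type") != "event_callback":
        return f"non-event payload: {payload.get('type')}"
    event = payload.get("event", {})
    return f"{event.get('type')} in {event.get('channel_type')}: {event.get('text')}"


sample = json.dumps({
    "type": "event_callback",
    "event": {"type": "message", "channel_type": "im", "text": "hello"},
}).encode("utf-8")
print(summarize_event(sample))
```

For a direct message like the sample above, the summary contains the event type, the channel type and the message text.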

The integration is set up and working as expected.

Happiness and joy! :)

Conclusion

In this blog post we have seen how to integrate Slack and Cloudera Flow Management (CFM). We created a Slack sandbox account, then created a basic flow which is able to consume events from Slack in a push manner.
Hope you’ve found this post interesting and useful.
In the next blog post we’ll see how the previously created flow can be deployed to Cloudera DataFlow for the Public Cloud. Stay tuned.
