Building an Integration for 1000+ Apps in 4 Months

Jan Carsten Lohmüller
Published in doodle-tech · 7 min read · Mar 22, 2019

Imagine the following: You create an awesome Doodle to invite all your friends to your birthday party. You get very excited and would like to call your mum every time someone participates in the Doodle.

That would be a lot of work, right?

Plus: You would probably get overexcited. Well, we got you covered.

Doodle now integrates with 1000+ apps to make that happen. You can simply hook up a trigger every time one of your friends participates in your Doodle and let another service call your mum — via Zapier.

Zapier is a workflow automation system that basically allows users to chain events (“triggers”) in one app to features (“actions”) in another app.

At Doodle, we have the grand plan to “Build an ecosystem of tools that integrate Doodle”, to “Cater to large deployments at enterprises” (think SSO and automation), and to “Make meetings more effective by adding features and integrations to help with the full meeting lifecycle”.

While we have a general understanding that this will be needed to help Doodle grow, we have not yet discovered users’ concrete use cases or their order of importance. Zapier shall help us (partially) answer the first question and give us a user base interested in integrations that we can talk to (to answer the second question).

Why a blog post?

We would like to share our story of building a Zapier integration with you. Why? Because all the Stack Overflow discussions, Medium blog posts and GitHub example repositories helped us a lot, and we would like to contribute to that.

Come join us.

Here comes a quick roundup of our Zapier integration architecture and the very exciting paradigm shift we’re currently experiencing in our engineering team.

Let’s crack things up

Doodle is currently transitioning from a traditional monolithic backend towards a more modern, sustainable and event-driven microservice approach.

To get a little more detailed: We are running mostly Java 10 services with Spring Boot 2.1.2 and Apache Kafka 2.0.1. From time to time you will also find a MongoDB and a little pinch of Python here and there. However, our monolith is still operating on Java 8. Infrastructure-wise we are relying on AWS with Kubernetes, with everything wrapped nicely into Helm charts.

The first step towards a sustainable microservice architecture for us was to raise business events to Kafka every time a domain object changes in the monolith. To achieve consistency we went for Avro schemas for our events. We keep all our schemas in one repository, which is carefully maintained by every backend developer, because we aim for full schema compatibility.
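To make that a bit more tangible, a business event schema in that repository could look roughly like this. This is a minimal sketch with made-up names for this post, not our actual schema:

{
  "type": "record",
  "name": "PollCreated",
  "namespace": "com.doodle.events",
  "doc": "Hypothetical business event raised when a poll is created",
  "fields": [
    { "name": "pollId", "type": "string", "doc": "Id of the created poll" },
    { "name": "initiatorEmail", "type": "string", "doc": "Email of the poll creator" },
    {
      "name": "title",
      "type": ["null", "string"],
      "default": null,
      "doc": "Optional title; nullable with a null default to keep full compatibility"
    }
  ]
}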

The first big challenge we had to tackle was our legacy authentication mechanism. Well, we abstracted it away: in order to see results quickly we chose Keycloak for that purpose. The pros of Keycloak are that it runs on standards and does what it’s supposed to do: authentication and authorisation. Plus, you can easily implement your own so-called Service Provider Interface (SPI) in order to hook up your legacy authentication endpoint to it. Keycloak will then simply use that legacy endpoint to validate credentials.
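To give you a rough idea, a heavily simplified sketch of such an SPI could look like the following. LegacyAuthClient is a made-up placeholder for a client of our legacy endpoint, and the user lookup methods are omitted:

import org.keycloak.component.ComponentModel;
import org.keycloak.credential.CredentialInput;
import org.keycloak.credential.CredentialInputValidator;
import org.keycloak.credential.CredentialModel;
import org.keycloak.models.KeycloakSession;
import org.keycloak.models.RealmModel;
import org.keycloak.models.UserCredentialModel;
import org.keycloak.models.UserModel;
import org.keycloak.storage.UserStorageProvider;

// Sketch of a user storage provider that delegates password validation
// to a legacy authentication endpoint (LegacyAuthClient is hypothetical).
public class LegacyAuthProvider implements UserStorageProvider, CredentialInputValidator {

    private final KeycloakSession session;
    private final ComponentModel model;
    private final LegacyAuthClient legacyAuth;

    public LegacyAuthProvider(KeycloakSession session, ComponentModel model, LegacyAuthClient legacyAuth) {
        this.session = session;
        this.model = model;
        this.legacyAuth = legacyAuth;
    }

    @Override
    public boolean supportsCredentialType(String credentialType) {
        return CredentialModel.PASSWORD.equals(credentialType);
    }

    @Override
    public boolean isConfiguredFor(RealmModel realm, UserModel user, String credentialType) {
        return supportsCredentialType(credentialType);
    }

    @Override
    public boolean isValid(RealmModel realm, UserModel user, CredentialInput input) {
        if (!(input instanceof UserCredentialModel)) {
            return false;
        }
        // Let the legacy endpoint decide whether the credentials are valid
        String password = ((UserCredentialModel) input).getValue();
        return legacyAuth.validate(user.getUsername(), password);
    }

    @Override
    public void close() {
        // nothing to clean up in this sketch
    }
}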

After mastering the authentication flow we needed to integrate Zapier’s webhooks into our eventing ecosystem. Zapier registers a webhook in our app in order to point our trigger to the correct Zap. The solution is simple and beautiful: Zapier subscribes webhooks at our hooks service like so:

curl -X POST <subscribe_endpoint> \
  -H 'Authorization: <somehow authenticated>' \
  -H 'Content-Type: application/json' \
  -d '{"target_url": "https://hooks.zapier.com/<unique_path>",
       "event": "poll_created"}'

This subscription gets transformed into a Kafka event and directly emitted to our hook subscription topic — that’s the easy part.
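A minimal sketch of that receiving side, assuming Spring Kafka and made-up names for the endpoint, topic and payload class, might look like this:

import java.security.Principal;

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical controller: takes Zapier's subscribe call and emits it as an
// event to a hook subscription topic. Names are illustrative, not our real ones.
@RestController
public class HookSubscriptionController {

    private final KafkaTemplate<String, HookSubscription> kafkaTemplate;

    public HookSubscriptionController(KafkaTemplate<String, HookSubscription> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    @PostMapping("/hooks/subscribe")
    public ResponseEntity<Void> subscribe(@RequestBody HookSubscription subscription, Principal principal) {
        // Key the record by the subscribing user so it can later be joined
        // with other events belonging to that user
        kafkaTemplate.send("hook-subscriptions", principal.getName(), subscription);
        return ResponseEntity.status(HttpStatus.CREATED).build();
    }
}

// Mirrors Zapier's payload: {"target_url": "...", "event": "poll_created"}
class HookSubscription {
    public String target_url;
    public String event;
}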

[Figure: Zapier service architecture overview at Doodle]

Fasten your seatbelts

Let’s talk about the most interesting part: Kafka Streams. We currently offer the following triggers for our users in Zapier: when a new Doodle gets created, when someone participates in your Doodle, and when you close your Doodle and pick a final option. Sounds easy, right? Well, take this:

At Doodle there are 30 million active users per day, creating Doodles, participating in them and closing Doodles. That’s a lot of data!

Let’s have a look at one particular example flow: You, as a user, would like to be notified in, let’s say, Slack whenever someone participates in your Doodle. What our streams application does (sketched in code below) is:

  1. Consume the topic of subscribed hooks
  2. Consume the topic of participations in all Doodles
  3. Join them by key (the user’s email)
  4. Trigger the correct hook as soon as a participation happens
[Figure: Joined streams]
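In code, the topology boils down to something roughly like this. Topic names, the event classes and the ZapierClient are placeholders for this post, and serde/config setup is omitted:

StreamsBuilder builder = new StreamsBuilder();

// 1. The subscribed hooks, materialised as a table keyed by the poll owner's email
KTable<String, HookSubscription> hooksByEmail = builder.table("hook-subscriptions-by-email");

// 2. All participations, originally keyed by poll id
KStream<String, Participation> participations = builder.stream("poll-participations");

participations
        // 3. Re-key by the poll owner's email; Kafka Streams repartitions the
        //    stream before the join so both sides share the same partitions
        .selectKey((pollId, participation) -> participation.getPollOwnerEmail())
        .join(hooksByEmail, (participation, hook) -> new TriggerPayload(hook.getTargetUrl(), participation))
        // 4. Fire the registered Zapier webhook for every matching participation
        .foreach((email, payload) -> zapierClient.send(payload));

KafkaStreams streams = new KafkaStreams(builder.build(), streamsConfig);
streams.start();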

That sounds easy in theory, but it was a long way to get there, and along that way we learned a lot about building and operating Kafka Streams applications.

Learnings

Do not do interactive queries inside Kafka Streams iterators ☝️

Interactive queries are very useful when you want to query the current state of your application.

We got that wrong.

We used interactive queries inside the forEach of a Kafka stream, calling another instance of our application to fetch a specific value whenever that value was not present on the instance doing the processing. We thought this might be a good idea because we had read about it on the Confluent blog.

It was not.

This implementation ended in high latency inside the streams threads, which then resulted in the stream transitioning into the ERROR state. So we scaled our application down to only one instance to make it work, but that’s not what we aimed for.

Never ever mix interactive queries with stream operations.

When you want to process corresponding values from different Kafka topics in one instance of your application, you have to repartition them so that they use the same key.

Same key, same partition.

After repartitioning the topics to the same key, they can get joined and voilà: no interactive queries inside of streams needed, ergo: no crashing application.
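For our case, the re-keying part looks roughly like this (again with placeholder names), so that participations land on the same partitions as the hook subscriptions keyed by email:

// Re-key participations by the poll owner's email. selectKey marks the stream
// for repartitioning; the subsequent join (or an explicit through()) writes it
// to a topic keyed by email. "participations-by-email" must exist and be
// co-partitioned with the hooks topic.
KStream<String, Participation> participationsByEmail = participations
        .selectKey((pollId, participation) -> participation.getPollOwnerEmail())
        .through("participations-by-email");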

We were able to scale up again.

Don’t let schema evolutions ruin your day

While developing our streams applications, we had to experiment with Avro schemas. More and more often we would deploy a new version of our service to our staging environment and it would simply not work, because the Schema Registry would tell us

409 : Incompatible Schema

Which ultimately ended in a more or less painful search for the field, type, or default value we had messed up. As you might remember: we aim for full compatibility of our schemas. Here are the top two mistakes we made:

  • A nullable field of type ["null", "string"] must have a default value of null:
{
  "name": "description",
  "type": [
    "null",
    "string"
  ],
  "default": null,
  "doc": "Poll's description. Maximum of 512 chars"
}
  • A field of type "array" must have a default value of [] (not "null" or null):
{
  "name": "hooks",
  "default": [],
  "type": {
    "type": "array",
    "items": {
      ...
    }
  }
}

But if you have already messed up your Schema Registry and there is no other way than deleting the schema, you can do as stated in the Confluent docs:

# Deletes all schema versions registered under the subject "Kafka-value"
curl -X DELETE http://localhost:8081/subjects/Kafka-value

# Deletes version 1 of the schema registered under subject "Kafka-value"
curl -X DELETE http://localhost:8081/subjects/Kafka-value/versions/1

# Deletes the most recently registered schema under subject "Kafka-value"
curl -X DELETE http://localhost:8081/subjects/Kafka-value/versions/latest
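A less destructive option is to check compatibility up front: the Schema Registry also offers a compatibility endpoint that you can hit with your new schema before deploying, instead of finding out via a 409 (subject name as in the docs example):

# Tests a new schema against the latest version registered under "Kafka-value";
# returns {"is_compatible": true} or false
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\": \"string\"}"}' \
  http://localhost:8081/compatibility/subjects/Kafka-value/versions/latest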

Keep development, staging, and production as similar as possible — in theory

As said before, at Doodle there are 30 million active users per day. Imagine you test your app in a very protected staging environment where you basically control the number of events that get emitted to Kafka.

You feel safe.

You decide to move your application to preproduction. Huge step! This environment should be the same as production. Very exciting, but still a limited number of Kafka events. Next up:

Production.

You enter the stage and fall. Fall over. Directly. Why? Well, did I just say 30 million active users? Try to emulate that in staging!

This is a challenge we are still trying to figure out at this very moment. If you have an idea how to tackle it, feel free to reach out to us.

What’s next?

Building a Zapier integration is just the first step towards rapidly getting rid of our monolith. Kafka Streams was arguably one of the reasons we were able to release this integration within 4 months (almost hassle-free, of course). So much for architectural choices; I bet there is going to be another post about that soon.

But what about the Zapier integration?

We will add a few things:

  • Social Login, probably the most wanted feature. Currently you’re only able to log in via username and password. We want to give all users the possibility to use their Google/Facebook/Microsoft account to log in.
  • Actions! The first action we will add is, obviously, to create a Doodle based on a trigger from another app.
