Webhooks Integration Best Practices

Exploring Various Potential Patterns

Alex Dorand
4 min read · Oct 2, 2023

Suppose we are writing an app to integrate with a system that exposes events through webhooks.

Throughout my career, across many projects, I have seen developers implement webhooks mostly as in the following diagram:

The implementation above is technically correct, but it lacks a few key properties that we adopt event-driven architecture to get.

Let’s explore.

  1. Developers of System 1 implement their events and document them for potential consumers.
  2. Developers of System 2 then subscribe to System 1 webhooks. For now, let’s not focus on the security aspect of the webhooks.
  3. Developers of System 2 implement an anti-corruption layer (ACL) so that changes to System 1 events do not adversely affect System 2, and so that any such changes can be captured and reported on the live system.
  4. The ACL translates the request to an internal model.
  5. The internal model is then passed to a function in System 2 that handles it.
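To make steps 3 to 5 concrete, here is a minimal sketch of an anti-corruption layer. The payload fields (`id`, `total`, `created_at`), the `OrderPlaced` model, and the status codes are assumptions made for illustration; System 1's real schema will differ.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


class UnexpectedEventShape(Exception):
    """Raised when an incoming payload no longer matches what we expect."""


@dataclass(frozen=True)
class OrderPlaced:
    # Internal model owned by System 2 (hypothetical fields).
    order_id: str
    total_cents: int
    placed_at: datetime


def translate(payload: dict[str, Any]) -> OrderPlaced:
    """Anti-corruption layer: map System 1's payload to the internal model."""
    try:
        return OrderPlaced(
            order_id=str(payload["id"]),
            total_cents=round(float(payload["total"]) * 100),
            placed_at=datetime.fromisoformat(payload["created_at"]),
        )
    except (KeyError, TypeError, ValueError) as exc:
        # Surface schema drift here instead of letting it leak into System 2.
        raise UnexpectedEventShape(f"cannot translate payload: {exc!r}") from exc


def handle_webhook(payload: dict[str, Any], process) -> int:
    """Return an HTTP-like status; `process` stands in for System 2's function."""
    try:
        event = translate(payload)
    except UnexpectedEventShape:
        return 422  # report the change; do not process a half-understood event
    process(event)
    return 200
```

The point of the design is that only `translate` knows System 1's field names, so a schema change shows up as one reported error in one place rather than as scattered failures inside System 2.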

So what is the problem? Suppose we add another system, call it System 3, that should also benefit from the events produced by System 1.

Another team implements System 3 and doesn’t know that System 2 already consumes the events. The solution would look like:

If System 3 doesn’t have the same SLA as System 1, we have a problem. If System 1 changes the event schema, then we have two issues. If security fails, then we have three challenges.

I can see your eyebrows rising, and rightfully so.
The next course of action is to decouple the events from the systems. So the solution would look something like:

This is much better, since a queue lets us control the flow of events. But we still need to solve the SLA problem. If System 3 is 2x slower than System 1, the backlog keeps growing, and we could miss events or even bring System 3 down.
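A small simulation can make the SLA problem tangible. This is a toy model under stated assumptions, not a benchmark: the producer emits one event per tick, the consumer handles one event every `consume_every` ticks, and a full queue drops the new event. The function name and parameters are hypothetical.

```python
import queue


def simulate(events: int, queue_size: int, consume_every: int) -> tuple[int, int]:
    """Return (processed, dropped) after `events` ticks of the toy model."""
    buffer = queue.Queue(maxsize=queue_size)
    processed = dropped = 0
    for tick in range(1, events + 1):
        try:
            buffer.put_nowait(tick)   # System 1 publishes one event per tick
        except queue.Full:
            dropped += 1              # no room left: the event is lost
        if tick % consume_every == 0:
            buffer.get_nowait()       # System 3 handles one buffered event
            processed += 1
    return processed, dropped


# A consumer that is 2x slower starts losing events once the buffer fills up.
print(simulate(events=10, queue_size=3, consume_every=2))  # (5, 3)
# A consumer that keeps pace loses nothing.
print(simulate(events=10, queue_size=3, consume_every=1))  # (10, 0)
```

A larger queue only delays the loss. As long as the consumer stays slower than the producer, the backlog grows without bound.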

I’d suggest using an event bus to handle the events instead of passing them to a queue. If we are required to preserve the order of a flood of events, we must address things differently.

The major cloud providers offer an event bus. For example, AWS has Amazon EventBridge, and Azure has Event Grid. It is best to use them to handle the dispatching and routing of messages. So the solution would look something like:

There are several advantages to this solution. System 1 can pass events at any speed, and the API gateway enforces an agreed-upon schema so that no random messages get in. Messages are then streamed to an event bus, which filters the events by target and delivers them. The next step is to make sure System 3 can receive events in a throttled way. So the solution would look like:
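The gateway check and the bus routing described above can be sketched in a few lines of plain Python. This is an in-process illustration only, not how a managed event bus is implemented; the required fields, the `EventBus` class, and the event type names are all assumptions.

```python
from collections import defaultdict
from typing import Any, Callable

Event = dict[str, Any]
Handler = Callable[[Event], None]

# Assumed agreed-upon schema: field name -> expected type.
REQUIRED_FIELDS = {"type": str, "source": str, "data": dict}


def is_valid(event: Event) -> bool:
    """Gateway check: every required field is present with the right type."""
    return all(
        isinstance(event.get(name), kind) for name, kind in REQUIRED_FIELDS.items()
    )


class EventBus:
    """Hypothetical in-memory bus that routes events by their `type` field."""

    def __init__(self) -> None:
        self._rules: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._rules[event_type].append(handler)

    def publish(self, event: Event) -> int:
        """Deliver to every subscriber of event['type']; return delivery count."""
        if not is_valid(event):
            raise ValueError("event rejected at the gateway")
        handlers = self._rules.get(event["type"], [])
        for handler in handlers:
            handler(event)
        return len(handlers)
```

With this shape, System 2 and System 3 each register only the event types they care about, and System 1 never needs to know who is listening.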

Let’s now add a legacy System 4, which cannot accept code changes, to the mix. Here is how we throttle and integrate with System 4.

As you can see, building the event integration on existing products, such as the event bus offered by your cloud provider, makes a lot of sense. This way, we also add a level of decoupling to the whole solution.
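One way to picture the throttling is as an adapter that sits in front of the legacy system, buffers incoming events, and forwards them at a fixed pace. The sketch below uses simple fixed-interval pacing, which is one possible technique and an assumption on my part; the `ThrottledForwarder` class, its parameters, and the injectable clock are hypothetical.

```python
import time
from collections import deque


class ThrottledForwarder:
    """Buffers events and forwards at most `rate_per_sec` of them to a target."""

    def __init__(self, send, rate_per_sec: float, clock=time.monotonic):
        self._send = send                    # callable that delivers to System 4
        self._interval = 1.0 / rate_per_sec  # minimum gap between deliveries
        self._clock = clock                  # injectable so it can be tested
        self._next_allowed = clock()
        self._pending = deque()

    def accept(self, event) -> None:
        """Called by the bus; never blocks the producer."""
        self._pending.append(event)

    def drain(self) -> int:
        """Forward what the rate allows right now; return how many were sent."""
        sent = 0
        while self._pending and self._clock() >= self._next_allowed:
            self._send(self._pending.popleft())
            self._next_allowed = self._clock() + self._interval
            sent += 1
        return sent
```

Because System 4 only ever sees calls to `send`, it needs no code change. The adapter would be driven by a timer or worker loop that calls `drain` periodically.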

Conclusion

Depending on the situation, there are different ways of using event-driven integration. I explained some patterns by which you can integrate your systems with webhooks. There are many more.

The right webhook integration solution depends on:

  • Development timelines
  • Development expertise in event-driven design and implementation concepts
  • Security limitations
  • Number of downstream systems
  • Types of downstream systems

Please leave in the comments how you have achieved event-driven architecture differently.

Alex Dorand

Solutions Architect with a great passion for everything that enhances our life's quality. Coffee addict, food lover and a travel junkie.