Combining the Best of AWS EventBridge and AWS Kinesis
Streams & Bridges: Better Together
I have been using AWS Kinesis extensively and effectively, with AWS Lambda, since 2015. I am a big fan. Now we have another great alternative, AWS EventBridge. Each has its strengths and weaknesses. In this post, I will introduce how I use them together to get the best of both worlds.
The Good and Not So Good
Kinesis is a great tool for creating autonomous services following the system-wide Event Sourcing and CQRS patterns. I discuss how Kinesis lends itself to treating events as first-class citizens here, I discuss optimizing stream topology here, and I cover my aws-lambda-stream library here. All of this is important because it gives teams the confidence to accelerate their pace of innovation, knowing they have created fortified boundaries between services, with outbound and inbound bulkheads that limit the blast radius when things go wrong, as I point out here.
The major strengths of the Kinesis streaming approach are batch size, partition keys, and back-pressure. Batch size helps control costs at high volumes. It also allows for various processing optimizations over related events in a batch that are correlated by the partition key. And the natural back-pressure provided by the functional reactive programming model and streams helps avoid overrunning sink points that are throttled or have limited resources. It is pretty cool how easily we can implement robust and complex logic with this model.
However, Kinesis does not support routing. Thus, in a pure Kinesis streaming approach, there is some coupling between producers and consumers, to the extent that the producer must choose which stream to send to, and some consumers may have to ignore a significant number of event types if they consume from a dense stream. And as the number of consumers increases, so does the likelihood of Kinesis throttling, which increases latency. Creating an optimal stream topology can be a balancing act, with some purposeful inefficiencies that are weighed against the overall cost. This dumb-pipe, smart-endpoints approach has made it easier to scale, because it reduces the responsibility of the communications middleware and spreads the routing load across the many smart endpoints.
AWS EventBridge, on the other hand, is a fully managed, serverless, event routing service with implicit scalability. Routing allows producers to be completely decoupled from consumers, because producers can publish to a bus and consumers can subscribe to exactly the messages they want. So at first glance this service might seem to be the better alternative. However, events are only delivered to Lambda functions one at a time. Without batching, this can have a significant impact on cost under heavy event volumes. We also lose the processing efficiencies and conveniences of batching, partitioning, and back-pressure.
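A back-of-envelope calculation shows why per-event delivery matters at volume. The numbers here are illustrative assumptions, not measurements:

```javascript
// Illustrative comparison of daily Lambda invocations with and
// without batching, at an assumed sustained event rate.
const eventsPerSecond = 1000; // assumed sustained volume
const secondsPerDay = 86400;
const batchSize = 100;        // an assumed Kinesis batch size

// one-event-at-a-time delivery: one invocation per event
const perEventInvocations = eventsPerSecond * secondsPerDay; // 86,400,000/day

// batched delivery from a stream
const batchedInvocations = perEventInvocations / batchSize;  // 864,000/day
```

Two orders of magnitude fewer invocations, before even counting the per-batch processing optimizations.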
Ultimately, there is an exact inverse relationship between the strengths and weaknesses of Kinesis and EventBridge. Where Kinesis excels, Eventbridge does not, and vise versa. So let’s see how we can use them together for an optimal solution.
Publish to a Bus
As depicted in the diagram above, producers publish their domain events to a single AWS EventBridge bus. The events use a standard event format which is specified here.
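A producer's publish call can be as small as the following sketch, assuming the AWS SDK v2 in the Lambda runtime and a hypothetical bus name. The envelope fields (`type`, `partitionKey`) follow the standard event format mentioned above; the domain event itself rides in `Detail`.

```javascript
// Wrap a domain event in an EventBridge entry. Setting Source to a
// consistent value ('custom' here) gives downstream rules a field to
// match on. Bus name and event shape are illustrative.
const toEntry = (event, busName) => ({
  EventBusName: busName,
  Source: 'custom',
  DetailType: event.type,
  Detail: JSON.stringify(event),
});

// Fire-and-forget publish; aws-sdk v2 is provided in the Lambda runtime.
const publish = (event, busName) => {
  const AWS = require('aws-sdk');
  return new AWS.EventBridge()
    .putEvents({ Entries: [toEntry(event, busName)] })
    .promise();
};
```

The producer neither knows nor cares which streams or consumers the event ultimately reaches.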
Consume from a Stream
Consumers subscribe to a single stream. There may be many streams in the topology. Each stream defines its own routing rules to bridge the events between the bus and the streams. I typically start with a single stream and refine the rules as the system evolves. Some infrequent consumers may end up consuming straight from the bus. All events are routed to the event-lake via AWS Firehose.
Example: A Stream Routing Rule
The following is an example of an EventBridge routing rule for a Kinesis stream. A full event-hub kick-start service is available here. The EventPattern specifies that all events, except for fault events, are routed to this default stream. The EventBridge $.detail field contains the domain event and is passed along to Kinesis via the target's InputPath. The partitionKey is also set. Consumers manage their own batch size, and they filter out unwanted events using the aws-lambda-stream library.

```yaml
# CloudFormation resource; the rule name and bus reference are
# illustrative, and the event-type and partition-key paths assume the
# standard event format described above.
Stream1Rule:
  Type: AWS::Events::Rule
  Properties:
    EventBusName:
      Ref: Bus   # the event-hub bus (illustrative)
    EventPattern:
      detail:
        type:
          - anything-but: fault
    Targets:
      - Id: Stream1
        Arn:
          Fn::GetAtt: [ Stream1, Arn ]
        RoleArn:
          Fn::GetAtt: [ StreamRole, Arn ]
        InputPath: $.detail
        KinesisParameters:
          PartitionKeyPath: $.detail.partitionKey
```
Example: The Event Lake Routing Rule
Next is an example of an EventBridge routing rule to an AWS Kinesis Firehose delivery stream that creates an event-lake in S3. A full event-lake-s3 kick-start service is available here. All events are routed to the event-lake to create a perpetual audit trail. An EventPattern is required, so the source field of all events is set to a consistent value, such as custom. The events are saved to S3 in batch files, so an EOF delimiter is appended using an InputTransformer. A separate, but similar, service is created to index all the events in Elasticsearch. Having separate rules for these cross-cutting concerns ensures that they are not competing with other consumers and increasing throttling.

```yaml
# CloudFormation resource; the rule name and bus reference are
# illustrative. The block-scalar InputTemplate appends a newline as
# the EOF delimiter between events in each batch file.
EventLakeRule:
  Type: AWS::Events::Rule
  Properties:
    EventBusName:
      Ref: Bus   # the event-hub bus (illustrative)
    EventPattern:
      source:
        - custom
    Targets:
      - Id: EventLake
        Arn:
          Fn::GetAtt: [ DeliveryStream, Arn ]
        RoleArn:
          Fn::GetAtt: [ EventBridgeRole, Arn ]
        InputTransformer:
          InputPathsMap:
            detail: $.detail
          InputTemplate: |
            <detail>
```
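Because each event in a batch file ends with that newline delimiter, recovering the individual events later (for replay or reindexing, say) is a matter of splitting on it. A minimal sketch, assuming the file body has already been read from S3:

```javascript
// Parse an event-lake batch object: JSON events separated by the
// newline EOF delimiter appended by the InputTransformer.
const parseBatch = (body) =>
  body
    .split('\n')
    .filter((line) => line.trim().length > 0) // drop the trailing empty line
    .map((line) => JSON.parse(line));
```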
Using AWS EventBridge and AWS Kinesis together, we get the complete decoupling provided by routing and the efficiencies of streaming. Publishers fire-and-forget, and consumers decide how best to receive the events of interest. As such, events can be routed to multiple channels, making it easier to define an optimal stream topology that maximizes flexibility and throughput.
In a future post I will cover using EventBridge to create an external service gateway (ESG) between egress and ingress Kinesis streams in different accounts to integrate autonomous subsystems.