Kinesis vs SNS/SQS

John Gilbert
Nov 13 · 3 min read

i.e. first-class events vs ephemeral messages

I like to use Kinesis and SNS/SQS for entirely different purposes. There are plenty of posts available that compare and contrast the features, ilities and limits of these AWS services. To some extent you can use these services interchangeably. But I like to think of SNS/SQS as messaging services, whereas I think of Kinesis as a temporal persistence service. More specifically, I think of SNS/SQS as ephemeral messaging and I think of Kinesis as a tool for creating and processing events as first-class citizens. I use SNS/SQS for intra-service messaging as needed and I use Kinesis for all inter-service communication, as I discuss here and here.

All this is a matter of practice. As I mentioned, you can pretty much achieve the same messaging results with any of these services. But SNS and SQS delete messages once they have been acknowledged by a consumer, whereas Kinesis does not bother with acknowledgements and persists the records until they expire. This creates a totally different mental picture that is critical for creating event-first systems.
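To make that mental picture concrete, here is a minimal in-memory sketch of the two behaviours. It is a toy model, not the AWS APIs: the class names `EphemeralQueue` and `RetainedStream`, their methods and the event names are all hypothetical, and real SQS and Kinesis have many more moving parts (visibility timeouts, shards, iterators).

```python
from dataclasses import dataclass, field


@dataclass
class EphemeralQueue:
    """Toy model of queue semantics: a message is gone once it is acknowledged."""
    messages: list = field(default_factory=list)

    def send(self, body):
        self.messages.append(body)

    def receive(self):
        # Return the oldest message without removing it, or None if empty.
        return self.messages[0] if self.messages else None

    def ack(self):
        # Acknowledging deletes the message for every future reader.
        self.messages.pop(0)


@dataclass
class RetainedStream:
    """Toy model of stream semantics: records stay readable until they expire."""
    retention_seconds: int
    records: list = field(default_factory=list)  # (seq, timestamp, body) tuples

    def put(self, body, now):
        # Records are never removed here, so the list length is the next sequence number.
        self.records.append((len(self.records), now, body))

    def read(self, after_seq, now):
        # Each consumer tracks its own position; reading deletes nothing.
        return [
            (seq, body)
            for seq, ts, body in self.records
            if seq > after_seq and now - ts < self.retention_seconds
        ]


queue = EphemeralQueue()
queue.send("order-submitted")
queue.receive()
queue.ack()  # the message no longer exists for anyone

stream = RetainedStream(retention_seconds=24 * 60 * 60)
stream.put("order-submitted", now=0)
first_read = stream.read(after_seq=-1, now=10)
second_read = stream.read(after_seq=-1, now=20)  # a replay sees the same record again
```

In this sketch a second consumer, or the same consumer starting over, can read the stream from the beginning at any time within the retention window, which is the property the rest of this post builds on.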

In event-first systems we treat events as first-class citizens. Autonomous services produce events as their state changes. These events are the ultimate source of truth. They live on forever in the data lake. We can replay them to seed new services and repair troubled services. We can use them as an audit trail and perform unforeseen historical analysis. In other words, events are not ephemeral.

I think using SNS/SQS within a service is great. For example, when I need an S3 notification I may hook it up to SNS then SQS then Lambda to publish a domain event to Kinesis to communicate the event to the system at large. But on the receiving side, here are a few more reasons why I prefer Kinesis for inter-service communication.
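As a rough sketch of the last hop in that S3 to SNS to SQS to Lambda chain, the function below unwraps the envelopes and builds entries shaped for the Kinesis PutRecords API. Several things here are my assumptions for illustration: the function name `to_kinesis_entries`, the `file-uploaded` event type, the shape of the domain event, the choice of the object key as partition key, and that raw message delivery is disabled on the SNS subscription (so the SQS body is the SNS JSON envelope).

```python
import json
from urllib.parse import unquote_plus


def to_kinesis_entries(sqs_event):
    """Turn an SQS-triggered Lambda event carrying S3 notifications into Kinesis entries."""
    entries = []
    for sqs_record in sqs_event["Records"]:
        sns_envelope = json.loads(sqs_record["body"])   # SNS notification delivered to SQS
        s3_event = json.loads(sns_envelope["Message"])  # the original S3 notification
        for index, s3_record in enumerate(s3_event.get("Records", [])):
            # Object keys arrive URL-encoded in S3 notifications.
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            domain_event = {
                # Hypothetical id scheme: SQS message id plus position in the notification.
                "id": f"{sqs_record['messageId']}-{index}",
                "type": "file-uploaded",  # hypothetical domain event type
                "timestamp": s3_record["eventTime"],
                "file": {"bucket": s3_record["s3"]["bucket"]["name"], "key": key},
            }
            entries.append({
                "Data": json.dumps(domain_event).encode("utf-8"),
                "PartitionKey": key,  # assumption: related events share the object key
            })
    return entries
```

In a Lambda function the result could then be handed to boto3 with something like `kinesis.put_records(StreamName=..., Records=entries)`. The stream name is left out here because it depends on your own setup.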

Stream processing is different because we have great control over the batch size. This allows us to optimize processing across related events. We can further optimize throughput with purposeful creation of partition keys. We can leverage the retry-until-expire nature of the stream to self-heal more easily, and we can alert when the iterator age keeps increasing. This in turn forces us to design for idempotency. We can also utilize the stream-circuit-breaker pattern to set aside poison events by raising fault events, alerting on these faults and providing for later resubmission. This in turn forces us to design for order-tolerance. Idempotency and order-tolerance are crucial for maintaining data integrity in these eventually consistent systems.
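Here is a minimal sketch of how those ideas might fit together in a Lambda function that consumes a Kinesis batch. It is one possible reading of the approach, not a reference implementation. The names `process_batch`, `apply` and `seen_ids`, the `id` field on events and the shape of the fault event are all hypothetical, and the in-memory set stands in for whatever durable store you would really use to track processed events.

```python
import base64
import json


def process_batch(kinesis_event, apply, seen_ids):
    """Apply each event in a Kinesis batch once; return fault events for poison records."""
    faults = []
    for record in kinesis_event["Records"]:
        try:
            # Lambda delivers the Kinesis record payload base64-encoded.
            event = json.loads(base64.b64decode(record["kinesis"]["data"]))
            if event["id"] in seen_ids:
                continue  # idempotency: a retried or replayed event is skipped
            apply(event)
            seen_ids.add(event["id"])
        except (ValueError, KeyError) as err:
            # Assumption: malformed records are unrecoverable (poison), so set them
            # aside as fault events instead of blocking the rest of the stream.
            faults.append({
                "type": "fault",
                "error": repr(err),
                "partitionKey": record["kinesis"]["partitionKey"],
                "sequenceNumber": record["kinesis"]["sequenceNumber"],
                "data": record["kinesis"]["data"],  # original payload, kept for resubmission
            })
        # Any other exception propagates, so the whole batch is retried until it
        # succeeds or the records expire. That retry is why apply() must be idempotent.
    return faults
```

The returned fault events would be published and alerted on so that they can be resubmitted later. Because a set-aside event is resubmitted after its neighbours have already been processed, the consumer sees it out of order, which is where order-tolerance comes in.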

If you want to do more than build an event-driven system, if you want to build an event-first system, then your events cannot be ephemeral messages. I will dive further into related topics, such as stream topology and the shape of events, in separate posts.

For more thoughts on serverless and cloud-native, check out the other posts in this series and my books: Cloud Native Development Patterns and Best Practices and JavaScript Cloud Native Development Cookbook.
