There's no doubt that Kinesis's maximum retention period of 7 days is a shortcoming. However, by using Firehose to deliver the events to S3, we can store them indefinitely. It's not hard to imagine a tool that lets us replay events from S3 starting at any point in time (https://github.com/cludden/s3-kinesis-replay seems to accomplish this, and we have a ticket in our backlog to hammer out a process for it). That said, with Amazon's recent announcement of managed Kafka, we will most likely be re-evaluating our choice of Kinesis over Kafka soon.
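To make the replay idea a bit more concrete, here is a rough sketch of the first step such a tool might take: working out which S3 prefixes to read for a given time window. It assumes the Firehose delivery stream uses the default UTC `YYYY/MM/DD/HH/` key prefix. The function name `hourly_prefixes` and the `base` parameter are made up for illustration and are not taken from s3-kinesis-replay. Listing the objects and putting the records back onto a stream is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator


def hourly_prefixes(start: datetime, end: datetime, base: str = "") -> Iterator[str]:
    """Yield one S3 key prefix per UTC hour that overlaps [start, end].

    Assumes Firehose's default YYYY/MM/DD/HH/ prefix layout; `base` is a
    hypothetical custom prefix configured on the delivery stream.
    Both datetimes are expected to be timezone-aware.
    """
    # Truncate to the top of the hour so the partial first hour is included.
    current = start.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    end_utc = end.astimezone(timezone.utc)
    while current <= end_utc:
        yield f"{base}{current:%Y/%m/%d/%H}/"
        current += timedelta(hours=1)
```

A replay job could then list the objects under each prefix in order and drop any records older than the exact start time, since objects in the first hour may contain events from before the requested point.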
We did consider using Debezium, but originally opted for pg2kinesis because it seemed to require little to no glue code. Once we were acquainted with pg2kinesis, porting that library to Java seemed less risky than exploring an entirely new solution. That said, Debezium has been crucial to the development of pg2k4j, because we use the Debezium Postgres Docker image in our integration tests: https://github.com/disneystreaming/pg2k4j/blob/master/src/test/java/com/disneystreaming/pg2k4j/containers/Postgres.java#L39 . So major thanks for helping us out there, and we'll definitely keep Debezium in mind if we ever explore other sources and sinks in our CDC pipeline!