Halloween Serverless Stories — Part 2

Marin Radjenovic
3 min read · Oct 26, 2022

This time we have another story, one that has stuck in my mind for two reasons. The first is the feeling of knowing that there is nothing smart you can do in the moment and that you have to let the disaster happen. The second is that the team and I learned a lot from these errors about how to handle serverless applications.

If you haven’t read the two stories in the previous article, here is the link

There is one more now!! 🎃

Photo by Kenny Eliason on Unsplash

Story 3 — Big Bang

This story happened a few years ago…
There was a SaaS marketing solution that a particular mobile phone producer used for its marketing campaigns. However, because the company had multiple marketing channels managed by different applications, an enormous amount of material had to be copied from one application to another. That SaaS application was also costly.
We had a bright idea: reuse the content of CMS publications and turn it into an email channel. The marketer would then use just one application, which would significantly reduce the amount of work and frustration.

So we modified the CMS, which was a monolithic application, and extended it with new features such as the creation of email content, setting targets, and selecting segmentation filters.

The email campaigns were planned to range from a few hundred thousand emails to several million. The acceptable throughput was 1 million emails per day. We tested the solution and confirmed that it met the required NFRs (non-functional requirements) or did better, so we were very happy with it.
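To put the throughput requirement in perspective, here is a minimal sketch of the arithmetic behind it. The function names and the sample campaign sizes are illustrative assumptions, not values taken from the real system.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_rate_per_second(emails_per_day: int) -> float:
    """Sustained send rate needed to reach a daily volume."""
    return emails_per_day / SECONDS_PER_DAY


def days_to_send(campaign_size: int, emails_per_day: int = 1_000_000) -> float:
    """How long one campaign takes if it gets the whole daily throughput."""
    return campaign_size / emails_per_day


# 1 million emails per day is the accepted throughput from the story.
print(f"{required_rate_per_second(1_000_000):.1f} emails/s")

# Hypothetical campaign sizes within the planned range.
for size in (300_000, 3_000_000):
    print(f"{size} emails -> {days_to_send(size):.1f} days")
```

Even at the accepted throughput, a campaign of several million emails occupies the pipeline for days, which matters once several campaigns share it.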

Everything worked perfectly for quite some time. We were able to send more than 5 million emails per day.
However, one day, while a few long-running global campaigns were running in parallel, the marketers wanted to send an urgent one-day Valentine’s Day offer.
So they published it…
Unfortunately, the stats showed no deliveries for several hours. No emails going out for a whole day? The marketing managers were in a panic: more than half a day had already passed and not a single email had gone out.
When we checked the logs, everything was functioning perfectly. Emails were going out, just not for the Valentine’s Day campaign.
We checked the campaign and everything seemed fine, except for the iterator age on the Kinesis stream that triggered the email-sending Lambda. The iterator age was 156960405 ms, or almost 2 days. So if you published a campaign at that moment, the first emails would go out 2 days later :)
There was nothing smart we could do at that time. You may say “you could just re-shard it and scale it up”, but Kinesis re-sharding is something that we avoided for a good reason…
So we had to let the campaign fail and break so many hearts on Valentine’s Day :(
Customers received the emails two days after Valentine’s Day had passed.
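The iterator age quoted above is easier to reason about in hours and days than in milliseconds. Below is a minimal sketch of that conversion, plus a simple check against a delay tolerance. The helper names and the 1-hour tolerance are assumptions made for illustration; they are not part of the original system.

```python
from datetime import timedelta


def describe_iterator_age(age_ms: int) -> str:
    """Convert an iterator age in milliseconds to hours and days."""
    age = timedelta(milliseconds=age_ms)
    hours = age.total_seconds() / 3600
    days = hours / 24
    return f"{hours:.1f} hours ({days:.2f} days)"


def is_too_late(age_ms: int, max_delay_hours: float) -> bool:
    """True when records wait longer than the campaign can tolerate."""
    return age_ms > max_delay_hours * 3600 * 1000


# Iterator age reported in the story.
ITERATOR_AGE_MS = 156_960_405

print(describe_iterator_age(ITERATOR_AGE_MS))

# Hypothetical tolerance: a one-day offer cannot wait more than an hour.
print(is_too_late(ITERATOR_AGE_MS, max_delay_hours=1))
```

A check like this, run before publishing, would have shown that a one-day offer had no chance of arriving on time behind the existing backlog.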

You can read the whole story at this link.

You can continue to part 3…

Marin Radjenovic

Cloud Architect. Developer. 2x Father. 7x AWS certified. AWS Community Builder. AWS UG Montenegro founder 🇲🇪. Working for Crayon