How I saved adding more space to RDS

Ayush Singh
Building Goalwise
Published in
4 min readJan 28, 2019

Working at a startup sometimes means coding and releasing projects in very short time under a lot of pressure. As a CTO it is my responsibility to put more thought into the design and make sure the architecture is robust and flexible. As I have always advocated it is more important to put in extra effort thinking over the design than to start coding and think on the fly while doing it.

The mistake

About 18 months ago there was a project that needed to be completed as soon as possible. We needed to show a copy of all the transactional emails that were being sent to the users on one of our internal dashboards. I was already working on another big project (redeem) for product at the time. So when the developer came with the idea of storing the html content of the email in a database table, without giving much thought to it I approved the project.

The problem

A couple of weeks back I got an automated email notification from AWS stating there was only 4% of free space remaining on the production database. This took me by surprise because I was not expecting to run low on space on production for another year or so. I started looking for the culprit which was eating up all the space. Surprisingly it was actual production data which was consuming the space. This was unexpected.

Free space available on the production database

Looking further in the table details I found that the table with the maximum size was the one storing the emails. On an average we send out about 1200 transactional emails per day. Over a period of a year and half this table had become a mammoth compared to the other tables in the database. How did this happen. Turns out what started as a simple transaction email store was now storing every email that was being sent from the system. Unfortunately I do not have the exact screenshot showing the size of the table from the time of the incident but it looked something as below.

SELECT
table_schema as "Database Name", table_name,
Sum(data_length + index_length) / 1024 / 1024 "Used Space in MB",
Sum(data_free) / 1024 / 1024 "Free Space in MB",
Sum(data_length + index_length + data_free) / 1024 / 1024 "Total Space in MB"
FROM information_schema.tables
GROUP BY table_schema, table_name
ORDER BY
SUM(data_length + index_length + data_free) DESC limit 1;
Size of the table was approximately 10 GB

The fix

The easiest solution would be to add more space to the database but this would be adding band-aid to a bleeding cut and not stitching it up. The rate at which the customer base is growing I would end up adding more and more space to the database every few months. The table was almost growing at about 750mb every 2 weeks.

A better solution would be to:

  • Stop storing the email content in the database and start storing it in html files in S3 and store the link to the file in the table.
  • Migrate the existing data in the table to S3.
  • Update the internal dashboard to handle both type of data stored in the column (link to file and also the actual html so that it does not break during the transition)

Thanks to some quick coding from Harpal Singh this was all done in couple of days.

Now it was time to optimise the table and release the allocated space.

OPTIMIZE TABLE <table name>;

I was hoping this would solve the issue but the command resulted in error optimize, error, Temporary file write failure. This meant I would actually have to add space to the database in order to release it. On talking to the AWS support they suggested to add the same amount of space as the size of the table. Well I thought great I will add some space, run the command and then reduce the size of the database. So I added about 10 GB of space and ran the command. It worked great and all I needed to do was to now reduce the size of the database.

Total free space after successful table optimsation.

The bummer

Turns out increasing the size of AWS RDS is easy to achieve but the same is not true for reducing the size even though the entire database size is significantly smaller than the total free space available. This is what the customer care suggested as a workaround to achieve what I was looking for.

I however did not follow these steps as a week later I need to migrate all AWS services to a different region. I will talk more about the migration in a later blog. Followed the steps outlined in the support ticket in the new region and finally got the database size as I wanted.

Conclusion

  • It is important to think what you are doing and how it is going to affect the overall product in the long run.
  • It is better to apply a permanent solution to a problem than go for temporary fixes.
  • Karma is a bitch :)

--

--

Ayush Singh
Building Goalwise

CTO @ Goalwise.com — a goal-based Mutual Fund investment platform powered by robo-advisory