Redefining Getsafe’s Infrastructure: Migration Journey from Heroku to AWS [Part 1]

Anar Bayramov
Published in Getsafe
Jul 23, 2024 · 7 min read

Getsafe, as a fast-growing company, found that our evolving needs were increasingly difficult for Heroku to meet. In this blog series we want to share our journey and the lessons we learned moving from Heroku to AWS and Qovery. We will cover the reasons behind our migration, the factors that led us to select our new infrastructure setup, and an overview of the architecture we will explain in upcoming parts.

Update: Part 2 is available!

Challenges with Heroku

Despite Heroku’s strengths and developer-friendly environment, in our phase of growth we encountered significant limitations that hindered our progress.

To name a few of those limitations:

  • HTTP/2 Support: Heroku’s lack of HTTP/2 support restricted our ability to optimize web performance and ensure efficient communication between our services. (Amusingly, HTTP/2 is now in public beta on Heroku.)
  • Prolonged Downtimes: We faced prolonged downtimes during database version upgrades, disrupting our operations and impacting our service availability.
  • Private VPC: Heroku offers private VPCs only to enterprise customers, which incurred an additional cost for us.

Besides Heroku’s technical constraints, we wanted enhanced security and better control over costs and infrastructure, while following industry standards such as Kubernetes, Terraform, and Docker.

Our growing demands and our motivation to stay current with industry standards highlighted the need for a more modern and flexible platform. The prolonged absence of these features in Heroku signaled to us that there was insufficient investment in the product, which eventually led us to explore alternatives.

Choosing AWS + Qovery

After investigating some alternatives, we concluded that AWS was the right solution for our infrastructure. Mainly because AWS:

  • Is GDPR compliant.
  • Ensures our customer data and services remain available with minimal downtime.
  • Provides advanced security measures without incurring additional costs.
  • Gives us more power and flexibility to tailor our infrastructure using its managed and self-managed solutions.

Additionally, as we were already partially using AWS across our various engineering and data teams, the transition would be simpler.

Addressing Developer Experience
Since we don’t have a dedicated DevOps team, one important criterion was an intuitive, developer-friendly infrastructure that avoided steep learning curves such as Kubernetes.
This is when we found Qovery, a platform that seamlessly bridged this gap for us. Qovery offers an easy way to create and manage services within our own AWS accounts through either the UI or Terraform. It uses Kubernetes underneath while abstracting much of the Kubernetes-specific infrastructure complexity from end users. They also actively develop their product with a public roadmap and have an active community!

Qovery enabled us to have the best of both worlds: a developer experience close to that of Heroku, but with control where we wanted it, ensuring our development and operations remained efficient and user-friendly.

Preparations for Migration

When we decided to begin the migration to AWS and Qovery, we had 11 production and 11 staging services, each comprising multiple components: primary and replica databases, Redis instances, multiple Sidekiq workers (a background job processor for Ruby), and Rails web services. We will share case studies from some of these migrations in upcoming articles.

Dockerizing Services

As part of the migration, we needed to dockerize our services, since our Heroku infrastructure relied on Procfiles and buildpacks, which was something we wanted to improve in our new setup. Fortunately, most of our services were already dockerized for local development, so the task mainly involved optimizing these Docker configurations for both production and staging environments.

We decided to grant teams the freedom to manage their own dockerization process. This allowed them to choose their preferred base images and decide whether to use multi-stage Dockerfiles from the start. We also wanted teams to gain more knowledge and confidence in working with Docker, which in the near future will be our default development environment.
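To give a flavor of what such a setup can look like, here is a minimal sketch of a multi-stage Dockerfile for a Rails service. The Ruby version, system packages, and Puma entrypoint are illustrative assumptions, not our exact configuration; each team picked its own base images and structure.

```dockerfile
# Build stage: install gems and precompile assets
FROM ruby:3.2-slim AS build
WORKDIR /app
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential libpq-dev && rm -rf /var/lib/apt/lists/*
COPY Gemfile Gemfile.lock ./
RUN bundle config set without 'development test' && bundle install
COPY . .
RUN SECRET_KEY_BASE=dummy bundle exec rails assets:precompile

# Runtime stage: only what the app needs to run
FROM ruby:3.2-slim
WORKDIR /app
RUN apt-get update && apt-get install -y --no-install-recommends \
    libpq5 && rm -rf /var/lib/apt/lists/*
COPY --from=build /usr/local/bundle /usr/local/bundle
COPY --from=build /app /app
EXPOSE 3000
CMD ["bundle", "exec", "puma", "-C", "config/puma.rb"]
```

The multi-stage split keeps build tooling such as compilers out of the final image, which shrinks the image and reduces its attack surface.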

Separate AWS accounts

Since one of our primary goals was to enhance security and protect our customer data, we decided to create two new AWS accounts — one for staging and one for production.

Our aim was to provide a safe infrastructure environment where engineers could freely test, experiment, and modify infrastructure without fear of impacting production systems. By creating a dedicated staging account that mirrored our production setup, developers could innovate and troubleshoot in a risk-free setting.

Terraform Foundation

One thing we wanted to improve was achieving identical setups between our staging and production infrastructure. For this reason, we decided to apply any changes to our fresh AWS accounts through Terraform. As we were already using Terraform in data engineering, we decided to follow the same approach for our backend and frontend services too.
Below is our current Terraform structure. In this article we will explain some of the modules we created during the preparation phase; we’ll dive deeper into other parts in upcoming articles.

├── environments
│   ├── production
│   │   ├── ci-oidc-cr-pipeline.tf
│   │   ├── ecr.tf
│   │   ├── elasticache.tf
│   │   ├── iam-clients.tf
│   │   ├── idp-roles.tf
│   │   ├── kafka.tf
│   │   ├── main.tf
│   │   ├── outputs.tf
│   │   ├── pg-rds.tf
│   │   ├── s3.tf
│   │   ├── terraform.tf
│   │   ├── terraform.tfvars
│   │   └── variables.tf
│   └── staging
│       ├── ci-oidc-cr-pipeline.tf
│       ├── ecr.tf
│       ├── elasticache.tf
│       ├── iam-clients.tf
│       ├── idp-roles.tf
│       ├── kafka.tf
│       ├── main.tf
│       ├── outputs.tf
│       ├── pg-rds.tf
│       ├── s3.tf
│       ├── terraform.tf
│       ├── terraform.tfvars
│       └── variables.tf
├── helm
│   └── datadog
│       ├── production
│       └── staging
└── modules
    ├── backup
    ├── ci-oidc-cr-pipeline
    ├── ecr
    ├── elasticache
    │   ├── cache
    │   ├── shared-resources
    │   └── sidekiq-backend
    ├── iam-clients
    │   ├── claims-service
    │   ├── mlflow
    │   ├── operations-service
    │   ├── prompt-service
    │   └── qovery-deployment
    ├── idp-roles
    │   ├── child-roles
    │   └── idp
    ├── kafka
    ├── kms
    ├── open-search
    ├── pg-rds
    │   └── shared-resources
    ├── s3
    │   └── generic_private_buckets
    ├── stateful-vpc
    ├── qovery
    └── vpn

Similar to our AWS account approach, we also created two Terraform environments with shared modules.
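The pattern is that each environment directory instantiates the same shared modules with its own sizing. A minimal sketch of what that wiring might look like (the module inputs and values here are illustrative assumptions, not our exact interface):

```hcl
# environments/staging/pg-rds.tf — the staging environment consumes
# the shared pg-rds module; production uses the same module with
# larger instance classes and multi-AZ enabled.
module "pg_rds" {
  source = "../../modules/pg-rds"

  environment       = "staging"
  instance_class    = "db.t4g.medium" # production would use a larger class
  multi_az          = false           # true in production
  allocated_storage = 50
}
```

Because both environments call the same module, staging and production can only diverge in the variables they pass, which is what keeps the two setups structurally identical.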

Security and Network Architecture: To achieve a secure and organized network architecture, we established three Virtual Private Clouds (VPCs):

  • Kafka: Dedicated to the Kafka cluster
  • Stateful-VPC: VPC for Redis, Postgres, and OpenSearch
  • Qovery: Isolated VPC for Qovery, peered with the Kafka and Stateful VPCs

This separation ensures that each component has its own isolated environment, enhancing security and performance. The Stateful-VPC is also peered with our data AWS account so that the data team can have read-only access to our replica databases, but none of these VPCs has public access. Recently we provided VPN access to the staging Stateful-VPC, but production data remains isolated.
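In Terraform terms, peering two of these VPCs boils down to a peering connection plus a route on each side. The sketch below uses hypothetical variable names and applies to same-account, same-region peering (where `auto_accept` works); it is not our exact configuration:

```hcl
# Peer the Qovery VPC with the Stateful-VPC (same account and region).
resource "aws_vpc_peering_connection" "qovery_to_stateful" {
  vpc_id      = var.qovery_vpc_id
  peer_vpc_id = var.stateful_vpc_id
  auto_accept = true

  tags = { Name = "qovery-to-stateful" }
}

# Each side also needs a route pointing at the other VPC's CIDR block;
# shown here for the Qovery side only.
resource "aws_route" "qovery_to_stateful" {
  route_table_id            = var.qovery_route_table_id
  destination_cidr_block    = var.stateful_vpc_cidr
  vpc_peering_connection_id = aws_vpc_peering_connection.qovery_to_stateful.id
}
```

Cross-account peering, such as the link to the data account, additionally needs an accepter resource on the other account's side and cannot use `auto_accept`.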

Separate IAM Clients: Each service was assigned its own unique access policies and access keys, with permissions tailored to its specific needs. Access scopes were dynamically defined to ensure strict resource isolation and security. For instance, Service A was granted access exclusively to its own S3 buckets and Bedrock instances, while Service B had permissions solely for its respective S3 buckets. This fine-grained access control not only enhanced security by limiting each service to only the resources it required but also streamlined management by ensuring that services operated within clearly defined boundaries.
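As an illustration of this pattern, here is a sketch of a per-service IAM client whose policy is scoped to a single bucket. The service name and bucket ARNs are hypothetical; the point is that the policy lists only that service's resources:

```hcl
# Policy granting one service access to its own S3 bucket and nothing else.
data "aws_iam_policy_document" "claims_service" {
  statement {
    sid     = "OwnBucketOnly"
    actions = ["s3:GetObject", "s3:PutObject", "s3:ListBucket"]
    resources = [
      "arn:aws:s3:::claims-service-production",
      "arn:aws:s3:::claims-service-production/*",
    ]
  }
}

resource "aws_iam_user" "claims_service" {
  name = "claims-service"
}

resource "aws_iam_user_policy" "claims_service" {
  name   = "claims-service-s3"
  user   = aws_iam_user.claims_service.name
  policy = data.aws_iam_policy_document.claims_service.json
}
```

Wrapping this in a module (as in the `iam-clients` tree above) lets each service get the same shape of policy with only the resource list varying.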

ClickOps vs IaC?

The Qovery Terraform provider is a great way to integrate Terraform with Qovery, and it was our initial approach. Although we encountered no major issues, feedback from our engineering team highlighted that the Qovery UI was significantly more developer-friendly and efficient. Given our need to frequently scale resources and modify various services and components on a daily basis, we ultimately decided to move from the Qovery Terraform provider to their extremely simple and intuitive UI. This shift allowed us to streamline our operations and improve productivity by taking full advantage of Qovery’s user-centric design. It looks like Qovery now supports a GitOps solution for resource management too.

Here are some screenshots from the Qovery UI.

If you want to know more about Qovery, Romaric shares amazing new features almost daily.

Postgres and Redis Instances

Qovery offers also a UI solution for creating PostgreSQL and Redis instances, which we initially explored. However, we soon discovered that certain specific configurations we required could not be adjusted through the Qovery interface. As a result, we decided to create and manage our PostgreSQL and Redis instances directly within AWS RDS terraform resources. This approach provided us with the flexibility to tailor our database configurations precisely to our needs and compatibility with our services. For the record, it’s been a while since we haven’t investigated Qovery database creation solution. Maybe now those configurations could be achieved via their UI too.
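A minimal sketch of what managing these instances directly in Terraform can look like; identifiers, engine versions, and sizing below are illustrative assumptions, not our production values:

```hcl
# Primary Postgres instance for one service, managed outside Qovery
# so every RDS parameter stays under our control.
resource "aws_db_instance" "service_primary" {
  identifier                  = "claims-service-production"
  engine                      = "postgres"
  engine_version              = "15"
  instance_class              = "db.m6g.large"
  allocated_storage           = 100
  username                    = "app"
  manage_master_user_password = true
  multi_az                    = true
  storage_encrypted           = true
  publicly_accessible         = false
}

# Redis replication group backing the service's Sidekiq queues.
resource "aws_elasticache_replication_group" "sidekiq" {
  replication_group_id = "claims-service-sidekiq"
  description          = "Redis for Sidekiq queues"
  engine               = "redis"
  node_type            = "cache.t4g.medium"
  num_cache_clusters   = 2
}
```

Owning these resources in Terraform means options Qovery's UI does not expose, such as parameter groups or encryption settings, remain configurable.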

That concludes the preparation phase of our migration.

Kudos to the Getsafe engineering team for their hard work and to the Qovery team for making this migration happen so smoothly!

In Part 2, we will detail our experience with:

  • Preparing our codebase to support both Heroku and Qovery environments concurrently, ensuring a smooth transition and continued operational efficiency.
  • Migrating our initial staging application.
  • Strategies we employed to minimize downtime and maintain data consistency during the migration process, achieving an average downtime of just 5 minutes per service!

Stay Tuned!
Anar Bayramov
