Containers & Automation at Sortly

At Sortly Engineering, we strive for continuous improvement and high developer productivity so that we can deliver significant innovation to our customers quickly. We strongly believe that we can achieve this by continuously evaluating and improving three pillars: People, Technology and Processes.

In this post, we talk about some of the significant improvements we made over the last six months to the second pillar: our technology stack.

Our technology stack

The Sortly experience consists of an extremely easy-to-use web app, a mobile app with rich offline support (both iOS and Android) and optional APIs for integration with other systems. The Sortly production environment that powers all these experiences runs primarily on AWS. This includes our core Ruby on Rails (RoR) application, an asynchronous processing service based on Sidekiq, Elasticsearch infrastructure and all the supporting infrastructure services. We use AWS managed services such as Amazon Aurora, ElastiCache, OpenSearch, EventBridge and S3 so that we can focus on our business problems rather than on infrastructure. In addition, we use services such as Datadog, Sentry and Firebase, along with other tools, for monitoring and related areas.

3 big goals

As part of our continuous improvement, we looked at our current systems and determined that we needed a quantum leap to achieve the following three big goals:

  1. Predictable, automated and incident-free deployments including stack upgrades.
  2. Enable micro-services, introduce Java services and reduce Ruby on Rails footprint in the long run.
  3. CI/CD: internal environments that are closer to the production environment, automated test results as the primary signal for releases, pipeline promotion and automated deployments.

Before this work, we ran our services on EC2 instances, deployed them via Elastic Beanstalk and configured the managed services manually. As a result, engineers spent a lot of time maintaining the infrastructure, doing weekly releases (sometimes over long hours because of error-prone manual steps) and managing growing drift between environments.

To meet our stated goals, we looked at the state of our infrastructure and technology stack, identified key bottlenecks, evaluated multiple options, ran a few POCs to validate our assumptions and decided on the target state that we wanted to achieve:

  • Manual provisioning of infrastructure → Terraform IaC
  • EC2/VMs → Docker Containers
  • Elastic Beanstalk + EC2 → AWS Fargate + ECS
  • Manual release processes → Automated releases via Bitbucket Pipelines
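
To make the last item concrete, here is a minimal sketch of a promotion gate in which automated test results decide whether a build moves to the next environment. It is an illustration only, not Sortly's actual pipeline: the environment names, the `SuiteResult` fields and the `next_environment` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical promotion order; the post does not name its environments.
PROMOTION_ORDER = ["dev", "staging", "production"]


@dataclass(frozen=True)
class SuiteResult:
    """Summary of one automated test run (illustrative fields only)."""
    environment: str
    passed: int
    failed: int


def next_environment(result: SuiteResult) -> Optional[str]:
    """Return the environment a build may be promoted to, or None to stop."""
    # Any failure blocks promotion; so does a run that executed no tests.
    if result.failed > 0 or result.passed == 0:
        return None
    index = PROMOTION_ORDER.index(result.environment)
    if index + 1 >= len(PROMOTION_ORDER):
        return None  # already in the last environment
    return PROMOTION_ORDER[index + 1]
```

A pipeline step could call `next_environment` after each test stage and trigger the deployment for the returned environment, so that no manual decision sits between a green test run and the next stage.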

Execution

With the decision made on the technology stack, it was time to focus on execution. At Sortly Engineering, we take big bold bets and march towards our big goals in smaller deployable chunks. This enables us to learn from production, course-correct and iterate towards the big bold vision. We insist that our developers figure out a way to chunk the work, plan to deploy to production at the earliest opportunity and cut over to the new stack or big goal incrementally. There is no big bang V2 which we expect to

We took this approach and made multiple changes and deployments to production over the last six months. We feel good about what we achieved (and hence this blog post 😁), especially the fact that we could do this without any production incidents. We feel we are on the path to achieving the goals that we set for ourselves at the start of this transformation.

Developer productivity benefits

It will take some more time to see the complete benefits of the changes and to validate that all our stated goals are met, but we are already seeing strong signals of increased developer productivity:

  • We are now able to do backend deployments in a few minutes compared to about 2 hours before the change.
  • Our deployments are relatively stress-free given the improved confidence in our release quality and deployment predictability.
  • We recently did a minor version upgrade of our RoR stack, which was seamless thanks to Docker. There was zero DevOps work (compared to Elastic Beanstalk-driven deployments on EC2) because the change was only in the Dockerfiles owned by the development team. It also helped that our development environment now runs on Docker too.
  • We are now able to do performance testing on demand because we can spin up a new environment in a matter of minutes with our Terraform IaC, run our tests and tear down the infrastructure.
  • We are now on track to introduce our first Java service, which will implement blazing fast inventory query APIs for the Sortly platform.
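
The on-demand performance testing flow (spin up, test, tear down) can be sketched as a small wrapper around the Terraform CLI. This is a sketch under assumptions, not our actual tooling: it assumes the Terraform configuration lives in the current directory and that one workspace per environment is acceptable. The names `ephemeral_environment` and `run_cli` are hypothetical.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, List

Runner = Callable[[List[str]], None]


def run_cli(command: List[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(command, check=True)


@contextmanager
def ephemeral_environment(name: str, run: Runner = run_cli) -> Iterator[str]:
    """Create a Terraform workspace, apply it, and always destroy it afterwards."""
    # `workspace new` creates the workspace and switches to it.
    run(["terraform", "workspace", "new", name])
    try:
        run(["terraform", "apply", "-auto-approve"])
        yield name
    finally:
        # Tear down even if apply or the tests fail, so nothing is left running.
        run(["terraform", "destroy", "-auto-approve"])
        # The active workspace cannot be deleted, so switch away first.
        run(["terraform", "workspace", "select", "default"])
        run(["terraform", "workspace", "delete", name])
```

A test run would then look like `with ephemeral_environment("perf-test") as env: ...`, with the load tests executed inside the `with` block. Passing a different `run` callable makes it possible to dry-run the sequence without touching real infrastructure.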

A few learnings

Despite best efforts, things don’t always go according to plan. By taking up the work in small chunks and iterating as we progressed, we could mitigate some of the issues that we hadn’t planned for at the start. We faced multiple hurdles along the way, but by having intermediate ship milestones, we could course-correct and stay focused on our goals.

Factor in time for fine-tuning after any major infrastructure change. We hit a few issues, especially around performance, in places where we least expected them. Sometimes underlying bugs get amplified in the new environment. Paying close attention to important metrics helped us catch these issues early and fix them before they became major production incidents.
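
As one illustration of watching a metric across an infrastructure change, the sketch below compares 95th-percentile latency before and after a cutover. The 10% tolerance and the function names are assumptions made for the example; in practice this kind of check would more likely live in a monitoring tool than in a script.

```python
from statistics import quantiles
from typing import Sequence


def p95(samples: Sequence[float]) -> float:
    """Return the 95th percentile of the samples (needs at least two values)."""
    # quantiles(..., n=100) returns 99 cut points; index 94 is the 95th percentile.
    return quantiles(samples, n=100)[94]


def has_regressed(
    before: Sequence[float],
    after: Sequence[float],
    tolerance: float = 0.10,  # assumed threshold, not a figure from the post
) -> bool:
    """Flag a regression when p95 after the change exceeds p95 before by more than the tolerance."""
    return p95(after) > p95(before) * (1 + tolerance)
```

Running `has_regressed` on latency samples collected before and after a migration step gives a simple yes/no signal that can prompt a closer look.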

Communication & team buy-in is important. By communicating continuously and incorporating feedback from all stakeholders, we got buy-in from everybody involved, which led to a frictionless rollout.

Last but not least, the benefits reinforced our belief in “If it can be automated, it should be automated. If it can’t be automated, rethink the design”.

What’s next

We feel good about the changes and think we are on a good footing to double down on automation in the short term. Longer term, there is a lot for us to do in all the key technology areas of Sortly: core inventory platform, blazing fast UI, mobile apps, infrastructure and more. As we make changes in these areas, we will keep you posted on this handle. Stay tuned!

And yes, we are hiring!
