From Beanstalks to ECS, making workloads smaller and faster

Bogdan Katishev
VRT Digital Products
7 min read · Oct 12, 2022

Here at VRT Digital Products, we recently switched from Beanstalks to ECS. In this blog, I will talk about the challenges we faced, how we solved them and what we learned along the way.

History

Since the dawn of VRT Digital Products, we have used Beanstalks (spun up by Cloudformation) for our application stacks. What seemed like very simple management in the beginning quickly evolved into a cycle of frustration.

Simply put: Beanstalk could not keep up with our need for flexible and complex setups.

Issues Ops faced with Beanstalk

  • Immutable deploys → slow deploy times
  • Ebextensions are terrible for managing infra
    → Hard to test
    → Rollback mechanism is poor
    → Debugging/error handling is poor
    → Hard to track (de-centralized config files)
  • Grabbing SecureString params from the SSM Parameter store is not supported by default
    → Custom solution needed
    → Feels hacky
  • Losing hair

Config management

Custom (and hacky) solutions were needed to centralize config management for all the Beanstalks.

Deployment

Because we use Cloudformation to spin up our stacks, the deploy is a wrapper inside a wrapper (one cfn stack with the Beanstalk resources spins up another cfn stack with the EC2/IAM resources, etc.), which for us means that we lose grip on our Beanstalk stack.

In all of this, we also noticed that this setup can sometimes cause drift because:

  • ebextensions/command fails
    → Does not notify Cloudformation stack that deploy has failed
    → Custom solution(s) are needed to catch this behavior

In a nutshell: less experienced users cannot follow the workflow, or are scared off because of previous missteps.

Developers’ problems with Beanstalk

We also asked our developers what their main frustration points were:

  • False positives in deployment
  • Deployment is slow
  • Memory not a standard metric in Beanstalk
  • Hard to mimic local development environment
  • Platform too attached to app

Choosing a successor for Beanstalk

In this phase, we wanted to re-architect our whole application setup. Not only did we want fast deploy times, we also wanted to have full control, with Cloudformation, over the stacks that we deployed.

The ebextension hell was also a focus point. It was time to better centralize the configuration.

We also wanted to have easy access to the Parameter store, without having to write a custom tool. For Beanstalk, we were using a platform hook derived from this blog: https://candrews.integralblue.com/2019/10/using-dynamic-references-to-aws-systems-manager-parameter-store-secure-strings-with-elastic-beanstalk/.

We also wanted to improve the overall performance of builds/deploys and the stability of deployments in general.

And let’s not forget about security. By this we mean frequent OS updates and scanning of application dependencies (node modules/composer packages/maven artifacts, etc.).

All in all we needed to have more flexibility/customization support.

Reviewing the old app stack

The biggest frustration that everyone had was slow deployment cycles, so while reviewing the stack, we were aiming to tackle that.

With this point of view, we quickly noticed that the app itself was too bundled and too bloated with other components. These other components were infrastructure related, for example:

  • Reverse proxy (nginx/apache)
  • Various prometheus exporters
  • Consul agent
  • Scheduled tasks (cronjobs)

We understood that if we wanted to have faster application deploy times, we needed to detach these components from the core application.

When looking at the Parameter store, we wanted the AWS service we pick to offer native support for it, instead of relying on a custom tool.

To improve performance of builds/deploys, we needed to have caching/reproducibility possibilities and faster deployment strategies.

To improve the stability of deployments, we don’t want false positives after a deploy. This means avoiding drift and having a one-to-one mapping between the deployed template and the actual resources.

To centralize configuration, we reused our existing Puppet stack, which we already used for non-Beanstalk setups.

Making the new app stack

ECS Task

When we chose ECS as the “Beanstalk successor”, we introduced some new components and new ways of working in the new stack.

As we wanted to detach all of the infra components from the main stack to achieve faster app deployment times, we ended up with the setup described below.

Since we are using containers, we can adopt the sidecar pattern to achieve this: every component that we think should run in its own container can be detached, and we can do this without losing any functionality.

The end result is an ECS task where our primary “core” app runs in its own standalone container, supported by multiple sidecar containers that extend the functionality of the core app.
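
To make this concrete, below is a minimal sketch of such a task definition, written with boto3 purely for illustration (ours actually lives in a Cloudformation template); the family name, images and ports are made up.

# sketch_task_definition.py
# A hypothetical ECS task: one "core" app container plus sidecars.
import boto3
ecs = boto3.client("ecs")
ecs.register_task_definition(
    family="my-app",  # hypothetical family name
    requiresCompatibilities=["EC2"],
    containerDefinitions=[
        {
            "name": "app",  # the core application container
            "image": "registry.example.com/my-app:1.2.3",
            "essential": True,
            "portMappings": [{"containerPort": 3000}],
        },
        {
            "name": "nginx",  # reverse proxy sidecar
            "image": "registry.example.com/nginx:stable",
            "essential": True,
            "portMappings": [{"containerPort": 80}],
        },
        {
            "name": "node-exporter",  # prometheus exporter sidecar
            "image": "registry.example.com/node-exporter:latest",
            "essential": False,
        },
    ],
)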

Improve performance and stability of builds

Instead of building the entire application from scratch again and again, we leverage layer caching when building container images in our pipelines. This way we gain speed when rebuilding images that have only partly changed.

Most of our applications also need to be compiled, so we use multi-stage builds when building them: we are only interested in keeping the compiled artifacts in the final image and throwing everything else away, which keeps the image size down.

# Dockerfile
# Build stage
FROM node:16.17.1 as build
WORKDIR /artifact
COPY artifact/ .
RUN npm run build
...
...
...
# Production stage
FROM node:16.17.1-alpine
COPY --from=build /artifact /app
...
...

At the end of the build, we verify our container image by running specific smoke tests on it using the Pytest framework, since our container images are scheduled to be rebuilt automatically every week. By running these smoke tests we prevent pushing corrupted images to our internally hosted docker repository.
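
As an illustration, such a smoke test could look roughly like the sketch below; the image name and the individual checks are hypothetical, not our actual test suite.

# test_smoke.py
# Hypothetical pytest smoke tests run against a freshly built image.
import subprocess
IMAGE = "registry.example.com/node-base:latest"  # hypothetical image under test
def run_in_container(*cmd):
    """Run a command inside the image and return its stdout."""
    result = subprocess.run(
        ["docker", "run", "--rm", IMAGE, *cmd],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
def test_node_is_installed():
    assert run_in_container("node", "--version").startswith("v16")
def test_base_os_is_debian():
    assert "Debian" in run_in_container("cat", "/etc/os-release")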

After that, we run a final security scan on our locally built image using Trivy.

Finally, we push the container image to our internally hosted docker repository and write the image metadata, such as the image name and tag, to the Parameter store.
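
A rough sketch of that last step, assuming hypothetical parameter paths and image names:

# publish_image_metadata.py
# Hypothetical sketch: write the image reference to the Parameter store
# so that deployments can pick it up later.
import boto3
ssm = boto3.client("ssm")
image_name = "registry.example.com/my-app"  # hypothetical repository/name
image_tag = "2022.41.0"                     # hypothetical weekly build tag
ssm.put_parameter(Name="/images/my-app/name", Value=image_name, Type="String", Overwrite=True)
ssm.put_parameter(Name="/images/my-app/tag", Value=image_tag, Type="String", Overwrite=True)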

Improve performance and stability of deploys

We deploy our ECS tasks the same way we did with Beanstalk: through a Cloudformation template. In this template we include a custom resource named ServiceImageParams, which fetches all the needed values of the container image we want from the Parameter store. You can learn more about this custom resource in this blog post: Complex AWS ECS container version management with AWS Systems Manager Parameter Store.
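
The real implementation is covered in the linked post; purely as a sketch, a Lambda-backed custom resource along these lines could read the image metadata from the Parameter store and hand it back to the template (the parameter paths and the Service property below are hypothetical):

# service_image_params_sketch.py
# NOT the actual ServiceImageParams implementation (see the linked post),
# just a hypothetical Lambda-backed custom resource handler.
import boto3
import cfnresponse  # helper AWS provides for Lambdas defined inline in Cloudformation
ssm = boto3.client("ssm")
def handler(event, context):
    service = event["ResourceProperties"]["Service"]  # e.g. "my-app" (hypothetical property)
    try:
        name = ssm.get_parameter(Name=f"/images/{service}/name")["Parameter"]["Value"]
        tag = ssm.get_parameter(Name=f"/images/{service}/tag")["Parameter"]["Value"]
        # The template can then reference the image via GetAtt on the custom resource
        cfnresponse.send(event, context, cfnresponse.SUCCESS, {"Image": f"{name}:{tag}"})
    except Exception as exc:
        cfnresponse.send(event, context, cfnresponse.FAILED, {"Error": str(exc)})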

For the deployment strategy, we use rolling deployment because it is fast and reliable. Because we adopted the sidecar pattern, we also only update images that have been changed, making the deployment surface smaller and faster.
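
The rolling behaviour itself is governed by the ECS service’s deployment configuration. Below is a small sketch of those knobs, set via boto3 purely for illustration; in our case they live in the Cloudformation template, and the names and values are hypothetical.

# rolling_deploy_config.py
# Hypothetical sketch of the ECS rolling-deployment knobs.
import boto3
ecs = boto3.client("ecs")
ecs.update_service(
    cluster="my-cluster",   # hypothetical cluster name
    service="my-app",       # hypothetical service name
    deploymentConfiguration={
        "maximumPercent": 200,         # allow one extra set of tasks during rollout
        "minimumHealthyPercent": 100,  # never drop below the desired task count
    },
)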

We also get a better integration between platform and application, since they are bundled into one container image.

After deployment, we no longer get false positives, because there is a one-to-one mapping between the deployed template and the actual resources. So we eliminated the extra “uncontrollable” layer that Beanstalk gave us.

Centralize configuration

We use Puppet to bootstrap our immutable docker images.

We start from a Debian slim base container image and apply Puppet on top of it. That way we can control what is installed on the image, make sure everything is fetched from our internal repositories, and apply the most recent security (mainly kernel) patches to keep our images secure.

More flexibility/customization

ECS also offers us more flexibility and customization options (a few of them are illustrated in the sketch after this list):

  • Define soft/hard limits for container CPU/Memory usage
  • Pinning container images on specific tags
  • Port mapping
  • Resolve secret strings from SSM PS (natively supported)
  • Start/Stop timeout
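
As a rough illustration of how these options map onto a single container definition (shown in the ECS API form, with made-up names, limits and parameter ARNs):

# container_definition_sketch.py
# Hypothetical container definition showing the options listed above.
container_definition = {
    "name": "app",
    "image": "registry.example.com/my-app:1.2.3",  # pinned to a specific tag
    "cpu": 256,                                     # CPU units reserved for this container
    "memoryReservation": 512,                       # soft memory limit (MiB)
    "memory": 1024,                                 # hard memory limit (MiB)
    "portMappings": [{"containerPort": 3000, "protocol": "tcp"}],
    "secrets": [
        {
            "name": "DB_PASSWORD",  # resolved natively from SSM PS at task start
            "valueFrom": "arn:aws:ssm:eu-west-1:123456789012:parameter/my-app/db-password",
        }
    ],
    "startTimeout": 120,  # seconds to wait for container dependencies
    "stopTimeout": 30,    # seconds before the container is force-killed
}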

User-friendly

We also noticed some extra user friendliness along the way. Everyone with access to the AWS console can easily view container logs, and the AWS monitoring dashboards make it easy to view container metrics (CPU/memory usage, etc.).

Developers can also easily mimic the current development/production environment by “just” building the image, which is “just” one Dockerfile. Or by pulling the staging/production image from our hosted docker repository.

We also give our developers/operations the possibility to get interactive access to our running containers in our development cluster by using this handy tool: ecsgo.

Conclusion

Reviewing our old app tech stack was a good exercise for us to realize what we had been supporting and maintaining for the last couple of years. It was a good “pause” to stand still and realize that we had reached a limit, and that the limitations of Beanstalk were blocking us from achieving more complex but robust setups.

Adopting the microservices way of working was not easy, but it paid off in the end. We tackled several of our most painful points and sped up our general development and operations process. Not only is our infrastructure more robust and secure, we also achieved faster response times for our applications.

All in all, this was a great exercise and execution of our new app tech stack.
