From Familiarity to Innovation: Navigating Our Artifactory Dilemma

How we concluded that we were not ready to make the switch to AWS ECR

Andrew den Hertog
Prodigy Engineering
Oct 17, 2023


Photo: Tom Fisk (pexels.com)

Background

Approximately four years ago, we, the Infrastructure team at Prodigy Education, moved from Amazon ECR to JFrog Artifactory. This move was made not only for our Docker registry at the time but also for npm, as it only made sense to consolidate the registries on one host.

Today, we are looking at possibly moving off JFrog for reasons I’ll cover below. One of the options we considered for our Docker container images and Helm charts was returning to Amazon ECR.

A Little Bit of History

Five years ago, we were a development and engineering team that was growing fast. We relied on hosted npm for our managed packages, but it was configured there in a way that was not sustainable for the growth we were experiencing.

At the time, we used Amazon ECR for our container registry, and like npm, we had a sub-optimal setup. For one, we had a registry in each of our AWS accounts (Development, Pre-Production, and Production) with a build pipeline in each account, so there was no guarantee that an image in one account matched the same image in another. Our solution was to consolidate our Docker registries into one centralized registry, but ECR at the time did not provide a cross-account mechanism that was easy to manage.

So, given these details and several other factors, we decided to migrate to a self-managed Artifactory stack. Some of the features that we liked about it at the time were the large number of package types it supported and that it enabled us to create one large registry that included public images and our private images. It also allowed us to run within our firewall as a security measure.

Why do we want to move away from Artifactory?

Long story short, the first couple of years of managing Artifactory were painful and left a bad impression, which has stuck around. Since we worked out the early kinks, most of our remaining issues have come down to Artifactory’s inability to scale: we are bound by the licenses granted by JFrog, which only permit us to scale vertically.

Every year, when we renewed our subscription with JFrog, we were required to upload new keys to license our product. JFrog has been reluctant to provide us with temporary licenses for blue-green deployments of Artifactory, which makes upgrades challenging to test. Because of this, we would usually time our cluster upgrades to the one- or two-day window in which our old and new licenses overlapped.

We also need to update the infrastructure that our Artifactory service runs on. We recently moved from Flux to Argo as our GitOps tool, so we would need to migrate this service as well. Given the size of Artifactory and the complexity of the stack, no one on our team was looking forward to dealing with this.

The other issue is that Artifactory is deprecating the current implementation of user API Keys, which we rely on internally for user authentication, and we would need to update all our workflows to support this.

That is why we are considering alternatives, AWS ECR being one of them. AWS has dramatically improved its feature set since we last relied on it five years ago. Since then, they have added features like OCI support, container image scanning, and cross-account and cross-region replication. They have also improved the ability to pull images from different accounts. Because we already use AWS as our cloud provider, it only made sense to look at their products before looking elsewhere.

So Why Not AWS ECR?

All in all, ECR worked great when we tested it, but a few glaring issues ultimately prevent us from using it as an alternative to Artifactory.

Issue #1. Limited pull-through caching capabilities. ECR only provides three external registries to use as sources for pull-through caches: the Amazon ECR Public registry, quay.io, and registry.k8s.io. There is no support for pull-through from Docker Hub, although the core images on Docker Hub are already cached on ECR Public by default. AWS has a ticket on their container roadmap that would enable pull-through caching from registries that require authentication, so I assume this would include Docker Hub.
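To make the limitation concrete, here is a minimal sketch of how an upstream image reference maps to the URI it would have behind an ECR pull-through cache. The helper name and the repository prefixes (ecr-public, quay, k8s) are assumptions for illustration; the prefix is whatever you choose when creating each cache rule. Anything outside the three supported registries, Docker Hub included, has no rule to map to.

```python
# Hypothetical helper, not part of any AWS SDK.
# Maps each supported upstream registry to an assumed repository prefix.
# The prefixes are illustrative; they are chosen when a cache rule is created.
UPSTREAM_PREFIXES = {
    "public.ecr.aws": "ecr-public",
    "quay.io": "quay",
    "registry.k8s.io": "k8s",
}


def cached_image_uri(image: str, account_id: str, region: str) -> str:
    """Return the private ECR URI that fronts an upstream image."""
    # Split "quay.io/org/name:tag" into the registry host and the rest.
    registry, _, path = image.partition("/")
    prefix = UPSTREAM_PREFIXES.get(registry)
    if prefix is None or not path:
        raise ValueError(f"no pull-through cache rule for {image!r}")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{prefix}/{path}"
```

For example, with a placeholder account ID and region, cached_image_uri("quay.io/prometheus/node-exporter:v1.6.1", "123456789012", "us-east-1") yields a URI under the quay/ prefix of the private registry, while a docker.io reference raises ValueError.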

Issue #2. Repository creation is not automatic. While this was not a dealbreaker for us, it’s an annoyance. With Artifactory, whenever we had a new service to push images for, it was as simple as pushing the image, and Artifactory would automatically create the associated repository. With ECR, we would need to create the repository programmatically, probably with Terraform, before any new service could push images.
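As a rough illustration of that extra step, a CI job could make sure the repository exists before it pushes. This is only a sketch of one alternative to the Terraform route, not how we do it: the function name is hypothetical, and it assumes a boto3 ECR client is passed in by the caller.

```python
def ensure_repository(client, name: str) -> bool:
    """Create the ECR repository if it is missing.

    `client` is assumed to be a boto3 ECR client, for example
    boto3.client("ecr"). Returns True if the repository was created
    and False if it already existed.
    """
    try:
        client.create_repository(repositoryName=name)
        return True
    except client.exceptions.RepositoryAlreadyExistsException:
        # Another pipeline run (or Terraform) already created it.
        return False
```

Calling it once before each push keeps the step idempotent, at the cost of granting the CI role permission to create repositories, which Terraform-managed repositories would avoid.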

Issue #3. OCI container support. This isn’t an issue with ECR itself. The fact that we can host OCI packages on it is great, as this would be our solution for our Helm charts. Unfortunately, our GitOps setup relies heavily on Kustomize, which currently does not support pulling Helm charts from OCI repositories. There is now a PR in progress that will enable this, but we would still need to wait for it to be released before we could update Kustomize within Argo to the new version. Ultimately, we could not use ECR to host our Helm charts as we currently consume them, and we would require another service to host them.

Given these issues and the considerable effort required to migrate all our container images and CI/CD workflows to ECR, we decided that switching was not the right move.

Summary

So, in the end, we have decided not to switch to AWS ECR from Artifactory. While ECR has come a long way since we last used it, it still doesn’t have everything we need to make the switch.

After doing some testing with ECR, a couple of things we liked about it were its integration with IAM and how it makes cross-account repository access easy to configure. But unfortunately, these features aren’t enough to sway our opinion now.

We are still open to revisiting this down the road and investigating other possible replacements for Artifactory. However, we plan to avoid self-managed services in the future. The time required to manage them would be better spent on work that supports our infrastructure and serves our developers.

Update (October 30, 2023): The Kustomize PR has been merged, and Argo has upgraded its dependency. We will be testing the OCI integration soon.

Update (November 27, 2023): AWS has released pull-through caching support for Docker Hub. This brings ECR closer to being a viable replacement for Artifactory. We are still waiting for support for automatic repository creation.
