AWS CodeArtifact — An Introduction

Mark Faiers
Contino Engineering
May 26, 2022


Setting the scene

Having recently evaluated AWS CodeArtifact for one of Contino’s clients, I thought I would share my thoughts on the service: its main components, some suggested patterns for using it within an organisation, and a comparison with other popular options in the space, such as JFrog Artifactory and a ‘do-it-yourself’ solution that uses an S3 bucket as a package repository.

After reading this post you should have a good understanding of the constituent parts of CodeArtifact, and be able to evaluate its suitability as a tool for the development and DevOps teams in your organisation.

What is CodeArtifact?

If you are here already, you probably have some idea of what AWS CodeArtifact is, but to summarise: CodeArtifact is a software package management service that stores software packages and libraries and serves them for use in development and in applications. It can also act as a mirror of popular public package repositories, such as npm, PyPI (the index behind pip), and Maven Central.

Building Blocks

CodeArtifact is made up of two main components: the Domain and the Repository. A CodeArtifact Domain is actually very simple. It is really just a wrapper that allows you to group Repositories together. It does some clever things under the hood, like deduplicating packages across all Repositories that belong to the Domain, so that you are not paying multiple times to store the same package. It also gives you central control over the Domain and the Repositories that belong to it via the Domain policy, which is just your run-of-the-mill JSON IAM-style policy.
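To make the Domain policy a little more concrete, here is a minimal sketch of what such a JSON policy might look like, built in Python. The account ID and the statement ID are hypothetical, and the choice of actions is only an assumption about what a platform team might want to grant; check the CodeArtifact documentation for the permissions that fit your setup.

```python
import json

# Hypothetical account ID of a development team's AWS account.
TEAM_ACCOUNT_ID = "111122223333"


def build_domain_policy(account_ids):
    """Return a Domain policy (as a dict) that lets the given accounts
    fetch auth tokens, describe the Domain and create Repositories in it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowTeamAccounts",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": {
                    "AWS": [f"arn:aws:iam::{a}:root" for a in account_ids]
                },
                "Action": [
                    "codeartifact:GetAuthorizationToken",
                    "codeartifact:DescribeDomain",
                    "codeartifact:CreateRepository",
                ],
                "Resource": "*",
            }
        ],
    }


# Serialise to the JSON text that would be attached to the Domain.
policy_document = json.dumps(build_domain_policy([TEAM_ACCOUNT_ID]), indent=2)
print(policy_document)
```

Keeping the policy as a function of the account list makes it easy to add a new team without hand-editing JSON.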

Inheritance

CodeArtifact Repositories also support inheritance, in the form of ‘upstream connections’. This means that you can specify one Repository as an upstream of another, so that if a developer or a CI/CD pipeline requests a package that isn’t in the immediate Repository, its upstream Repositories are searched as well. Like Domains, Repositories have an attached policy, which allows access to each Repository to be managed independently while control is retained at the Domain level. In a realistic example, you may have a central platform team managing the Domain, with separate development teams each managing their own Repository within that Domain.
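The lookup behaviour described above can be sketched as a toy model. This is a simplified illustration and not how the service is implemented: the `Repo` class, the `resolve` function and the repository and package names are all hypothetical, and upstreams are simply searched depth-first in the order they are listed.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Toy stand-in for a Repository: a name, its own packages, its upstreams."""
    name: str
    packages: set = field(default_factory=set)
    upstreams: list = field(default_factory=list)


def resolve(repo, package, _seen=None):
    """Return the name of the first repository that holds `package`, or None.

    The immediate repository is checked first, then each upstream in turn.
    """
    seen = _seen if _seen is not None else set()
    if repo.name in seen:  # guard against visiting a repository twice
        return None
    seen.add(repo.name)
    if package in repo.packages:
        return repo.name
    for upstream in repo.upstreams:
        found = resolve(upstream, package, seen)
        if found is not None:
            return found
    return None


shared = Repo("shared", {"requests"})
team_a = Repo("team-a", {"team-a-utils"}, [shared])

print(resolve(team_a, "team-a-utils"))  # found in the immediate repository
print(resolve(team_a, "requests"))      # found via the upstream
print(resolve(team_a, "missing"))       # not found anywhere
```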

Connection to public package repositories

As well as specifying upstream connections to your own repositories, you can also specify popular public package repositories, such as npm, PyPI, and Maven Central, as ‘external connections’. This is an easy way to provide a wide range of packages to your development and DevOps teams without taking on the overhead of manually pulling and storing packages locally and managing versions yourself.

Patterns of Use — Private Repo (Air Gapped environments)

Courtesy of being a native AWS service that supports PrivateLink endpoints, CodeArtifact has the capacity to act as a repository that is accessed only over the private AWS backbone, completely closed off from the public internet. This may be important if you have particular security concerns, or compliance requirements to meet.

In this scenario you might have one CI/CD pipeline that builds a software package and pushes it to CodeArtifact via the VPC endpoint, and another that pulls the package from CodeArtifact for use in another solution and deploys that solution to a compute service such as EC2 or ECS. A diagram of what that might look like is shown below. Please note that this solution requires endpoints for both CodeArtifact Repositories (com.amazonaws.<region>.codeartifact.repositories) and S3.
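As a small aid when scripting the networking for this pattern, the sketch below builds the endpoint service names for a given region. The function name and the example region are hypothetical. The `codeartifact.api` entry is not mentioned above; it is included on the assumption that clients in the VPC also need to reach the CodeArtifact API, for example to fetch an authorization token.

```python
def required_endpoint_services(region):
    """Return the VPC endpoint service names for a private CodeArtifact setup."""
    return {
        # Named in the text above.
        "codeartifact_repositories": f"com.amazonaws.{region}.codeartifact.repositories",
        "s3": f"com.amazonaws.{region}.s3",
        # Assumption: also needed so clients can call the CodeArtifact API.
        "codeartifact_api": f"com.amazonaws.{region}.codeartifact.api",
    }


services = required_endpoint_services("eu-west-2")  # example region
for purpose, service_name in services.items():
    print(f"{purpose}: {service_name}")
```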

Patterns of Use — Proxy for public repos, with more auditability

As previously mentioned, CodeArtifact can act as a proxy for public package repositories, such as PyPI and Maven Central. CodeArtifact repositories can also specify other CodeArtifact repositories as ‘upstream connections’, meaning that when a requested package is not found in the immediate repository, any upstream repositories are then searched.

Using an architecture like the one below, where you have separate repositories for each team, allows you to retain some degree of independence around package management, while also providing access control based on each team’s specific requirements.

A shared central repository with external connections to public repositories also means that every team (two in this case, although it could be many more in reality) has access to the public packages and libraries it requires. With a central shared repository, the organisation also keeps a single point of auditability and access control.
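The layout described in this section can be sketched as a plan of API calls. This is only an illustration under stated assumptions: the domain and repository names are hypothetical, `plan_layout` is a helper invented for this post, and the shared repository is given a single external connection (`public:npmjs`) to keep the example small. The operation names and parameters mirror the boto3 CodeArtifact client's `create_repository` and `associate_external_connection` calls.

```python
def plan_layout(domain, shared_repo, team_repos, external_connection):
    """Return (operation, kwargs) pairs describing the shared/team layout."""
    # 1. The shared repository that every team will use as its upstream.
    calls = [("create_repository", {"domain": domain, "repository": shared_repo})]
    # 2. Connect the shared repository to a public package repository.
    calls.append((
        "associate_external_connection",
        {
            "domain": domain,
            "repository": shared_repo,
            "externalConnection": external_connection,
        },
    ))
    # 3. One repository per team, each with the shared repository upstream.
    for team in team_repos:
        calls.append((
            "create_repository",
            {
                "domain": domain,
                "repository": team,
                "upstreams": [{"repositoryName": shared_repo}],
            },
        ))
    return calls


# Hypothetical names throughout.
plan = plan_layout("my-org", "shared", ["team-a", "team-b"], "public:npmjs")
for operation, kwargs in plan:
    print(operation, kwargs)
```

With boto3 installed and credentials configured, each step could then be applied with something like `getattr(boto3.client("codeartifact"), operation)(**kwargs)`. Building the plan as plain data first makes it easy to review before anything is created.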

Critical Evaluation

CodeArtifact uses S3 storage under the hood but abstracts away a lot of the implementation details required for S3 to act as a package management tool. In theory you could roll your own tool using S3 buckets and policies. It might even end up costing less in storage, but setting it up and maintaining it would take considerably more effort, given that you’d potentially have to manage things like de-duplication of packages and versioning yourself.
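To give a feel for the kind of work a do-it-yourself approach takes on, here is a minimal sketch of de-duplication by content hash. It is an illustration only and says nothing about how CodeArtifact does this internally: the `DedupStore` class is hypothetical, an in-memory dictionary stands in for the S3 bucket, and the package names and bytes are made up.

```python
import hashlib


class DedupStore:
    """In-memory stand-in for a bucket that stores each distinct payload once."""

    def __init__(self):
        self.blobs = {}  # sha256 hex digest -> package bytes
        self.index = {}  # (repository, package, version) -> digest

    def publish(self, repository, package, version, payload):
        """Record a package version, storing the bytes only if they are new."""
        digest = hashlib.sha256(payload).hexdigest()
        self.blobs.setdefault(digest, payload)
        self.index[(repository, package, version)] = digest
        return digest

    def fetch(self, repository, package, version):
        """Return the bytes for a package version published to a repository."""
        return self.blobs[self.index[(repository, package, version)]]


store = DedupStore()
# Two teams publish identical bytes; only one copy is kept.
store.publish("team-a", "left-pad", "1.3.0", b"tarball bytes")
store.publish("team-b", "left-pad", "1.3.0", b"tarball bytes")
print(len(store.index), "index entries,", len(store.blobs), "stored blob(s)")
```

Even this toy version needs an index, a hashing scheme and a lookup path, before you get to access control, garbage collection or package-manager protocols.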

At the other end of the scale, you might compare CodeArtifact to a third-party solution like JFrog Artifactory. It must be said that Artifactory is a lot more feature-rich than CodeArtifact. It has built-in security scanning, and it supports more binary types as well as containers. As you may expect, you pay more for this, a lot more. I did a cost comparison, and you’d be paying only around $14/month in CodeArtifact for the same storage and processing that you get for $1199/month as part of the Artifactory SaaS Enterprise licence with the added security pack. And it’s almost an order of magnitude more again if you want to use the self-hosted version, once you take into account what you’d be paying for cloud resources to host it.
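For a quick sense of scale, the two monthly figures quoted above can be compared with a little arithmetic. The only inputs are the numbers from this post; the variable names are just for illustration.

```python
# Monthly figures quoted in this post, in US dollars.
codeartifact_monthly = 14
artifactory_monthly = 1199

# How many times more expensive the Artifactory SaaS figure is.
ratio = artifactory_monthly / codeartifact_monthly

# Difference over a year at those monthly rates.
annual_difference = (artifactory_monthly - codeartifact_monthly) * 12

print(f"Artifactory SaaS is roughly {ratio:.0f}x the CodeArtifact figure")
print(f"Difference over a year: ${annual_difference}")
```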

Wrap Up

So there you have it. CodeArtifact is very quick to set up and start using, and it works out to be very cheap compared to other solutions on the market. Using ECR alongside CodeArtifact helps to support more scenarios, such as container images, but CodeArtifact itself still supports a limited range of artifact types, so be sure to check that it meets your needs.

If you enjoyed this please check out contino.io, or our offerings in the AWS Marketplace to learn about what we do and how we may be able to help your organisation on its cloud journey.
