Serving AI models on the edge — configuring k3s to use AWS ECR — Part 3
Introduction
This article is part 3 of a series and shows how to configure k3s on the edge to use an external container registry.
The first two parts focused on containerizing an AI model and deploying it in Kubernetes.
The source for this article is here:
https://github.com/sparquelabs/ai-serving/tree/main/cogs/textgen-gpt2
Steps
Installing k3s
First, we install k3s. This is done using the following:
curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig-mode 644
# check if k3s service is running
sudo systemctl status k3s
# check if kubectl can be used
k3s kubectl get pods -A
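If you prefer a standalone kubectl over the bundled k3s kubectl, you can point it at the kubeconfig that k3s writes, which is readable by non-root users thanks to the --write-kubeconfig-mode 644 flag above:
# optional: use a standalone kubectl against the k3s cluster
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
kubectl get nodes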
Configuring k3s to use an external registry
Now, we configure k3s to use an external registry, in this case AWS ECR. We fetch a short-lived ECR authorization token and write it into the registries.yaml file that k3s reads at startup:
export ECR_TOKEN=$(aws ecr get-login-password --region "us-east-1")
# add the ECR repo login (written to /tmp first, then moved into place as root)
cat <<EOF > /tmp/registries.yaml
configs:
  "<your-registry-name>.dkr.ecr.us-east-1.amazonaws.com":
    auth:
      username: AWS
      password: ${ECR_TOKEN}
EOF
sudo mv /tmp/registries.yaml /etc/rancher/k3s/registries.yaml
# force reload k3s
sudo systemctl force-reload k3s
# check if k3s configuration picked up the ECR configuration
sudo cat /var/lib/rancher/k3s/agent/etc/containerd/config.toml
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
...
[plugins."io.containerd.grpc.v1.cri".registry.configs."<your-registry-name>.dkr.ecr.us-east-1.amazonaws.com".auth]
  username = "AWS"
  password = "...."
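Note that ECR authorization tokens expire after 12 hours, so the credentials baked into registries.yaml go stale. Below is a minimal sketch of a refresh script you could run periodically, for example from cron; the script path /usr/local/bin/refresh-ecr-token.sh is a hypothetical name, and the region and registry placeholder must be adjusted for your setup:
#!/bin/sh
# refresh-ecr-token.sh (sketch) - regenerate registries.yaml with a fresh
# ECR token and reload k3s; run as root so it can write under /etc/rancher
ECR_TOKEN=$(aws ecr get-login-password --region "us-east-1")
cat <<EOF > /tmp/registries.yaml
configs:
  "<your-registry-name>.dkr.ecr.us-east-1.amazonaws.com":
    auth:
      username: AWS
      password: ${ECR_TOKEN}
EOF
mv /tmp/registries.yaml /etc/rancher/k3s/registries.yaml
systemctl force-reload k3s
A crontab entry such as 0 */8 * * * /usr/local/bin/refresh-ecr-token.sh would refresh the token every 8 hours, comfortably inside the 12-hour expiry window.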
Check whether you can pull a Docker image from ECR using the k3s containerd. Verifying this now avoids problems later when deploying the pod, because you will already have confirmed that k3s can pull images from ECR.
# use docker to push an image to ECR
aws ecr get-login-password --region "us-east-1" | docker login --username AWS --password-stdin 782340374253.dkr.ecr.us-east-1.amazonaws.com
docker tag local-image <your-ecr-registry>/<image-name>
docker push <your-ecr-registry>/<image-name>
# now check if we can pull the image using k3s containerd
k3s crictl pull <your-ecr-registry-name>/<image-name>
# list images to confirm the pull succeeded
k3s crictl images
Now we are ready to deploy the AI model pod/Deployment manifest as we saw in part 2; a minimal sketch follows.
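For reference, here is a minimal Deployment sketch that pulls the image from ECR. The name, labels, and containerPort are illustrative placeholders, not the exact manifest from part 2:
# deployment.yaml (sketch; names and port are placeholders)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: textgen-gpt2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: textgen-gpt2
  template:
    metadata:
      labels:
        app: textgen-gpt2
    spec:
      containers:
      - name: textgen-gpt2
        # pulled via the registries.yaml credentials configured above
        image: <your-registry-name>.dkr.ecr.us-east-1.amazonaws.com/<image-name>
        ports:
        - containerPort: 5000  # assumed serving port for this model
Apply it with k3s kubectl apply -f deployment.yaml. Because the registry credentials live in registries.yaml on the node, no imagePullSecrets entry is needed in the manifest.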
Summary
As shown above, we have set up an edge Kubernetes distribution (k3s) to pull images from AWS ECR and serve AI models on the edge.
With this configuration, you can serve models on k3s using CPUs.
In the next part, we will see how to configure k3s to do the same using NVIDIA GPUs.