Photo by Patrick Tomasso on Unsplash

Deploy Code to AWS Kubernetes with Prefect: a Step-by-Step Walkthrough

Seamlessly transition from local development to the cloud

Kingsley
Jun 17, 2020 · 8 min read

Prefect’s innovative Hybrid model allows users to leverage Prefect Cloud’s API for workflow orchestration and monitoring from within any execution environment.

By following the steps in this blog post, you’ll be able to build and run a Prefect flow on your local machine and then, with a few extra steps, have it running in an AWS-hosted Kubernetes cluster!

Prerequisites

  1. AWS account and AWS CLI downloaded and authenticated
  2. Docker installed
  3. Kubernetes command-line tool, kubectl, installed
  4. Prefect Cloud account

Before we get started, please make sure you have a Prefect Cloud account, have configured your Prefect environment on the machine that you’ll be working with and have at a minimum registered and run a flow with Prefect Cloud.

Prefect’s native support for Kubernetes, with a dedicated agent, makes the transition from developing workflows locally to running jobs in a Kubernetes cluster seamless. However, for users who are less familiar with Kubernetes, the setup can seem daunting. This blog post connects the dots so you can make that transition yourself.

Before we write our Prefect flow, we need to create a Kubernetes cluster and a Docker repository to save a containerized version of our flow. For this AWS example, we’re going to use Prefect’s Docker Storage that will put the flow inside a Docker image and push it to Elastic Container Registry (ECR). Elastic Kubernetes Service (EKS) will then pull the image from this repository to create and run the jobs.

Let’s get started!

  1. To create a repository, log in to the AWS console and navigate to the ECR console.

Note: If this is your first time creating a Kubernetes cluster, I recommend logging in to the console with the same IAM user that you intend to use for programmatic access from your terminal window in the later steps. This will avoid any issues that arise around role permissions to the cluster.

Make a copy of the image URI as you will need this in step 6 and step 9.

2. Next, we’ll create the Kubernetes cluster and worker nodes.

In the first configuration screen you’ll need to select a Cluster Service role. If one does not already exist, create it in the IAM console: create a new role, select EKS from the list of use cases on the first screen, and then choose EKS-Cluster. Follow the Next buttons to finish creating the role.

Continuing with your cluster configuration, the following screens will ask you to set up networking and cluster endpoint access. These settings depend on your infrastructure setup, so the choices are up to you.

Note: If you allow public access to the cluster endpoint, by default, it will be open to all traffic. In the advanced settings there is an option to restrict this by IP.

The network configuration will impact how the agent on your dev machine can communicate with the cluster, i.e. depending on how you are connected to your VPC, it may need to access the cluster over the public internet (see Step 10). Also, the cluster will need to have an outgoing route to the internet so that in the final step when you run the Kubernetes agent in the cluster itself, it will be able to communicate with the Prefect Cloud API to poll for any flows to run (see Step 11). This configuration may also impact any other services you may want to communicate with when you build more complex flows that depend on third parties or other AWS services.

Next, you’ll be able to specify logging options and finally review and create.

3. Once your cluster is created, you’ll be able to set up your compute by clicking Add Node Group.

In this configuration, you’ll need an IAM role for the nodes to use. This role needs the following policies attached:

  • AmazonEKSWorkerNodePolicy
  • AmazonEKS_CNI_Policy
  • AmazonEC2ContainerRegistryReadOnly

These policies are available under the common use cases from the EC2 list.
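If you prefer to script the IAM setup instead of clicking through the console, the sketch below builds the two pieces such a script would need: the full ARNs of the three managed policies listed above, and a trust policy that lets EC2 instances assume the role. The function names are illustrative only and do not come from an AWS SDK, and the `arn:aws:iam::aws:policy/` prefix assumes the standard commercial AWS partition.

```python
import json

# The three AWS-managed policies the node role needs (from the list above).
NODE_POLICY_NAMES = [
    "AmazonEKSWorkerNodePolicy",
    "AmazonEKS_CNI_Policy",
    "AmazonEC2ContainerRegistryReadOnly",
]


def managed_policy_arn(name):
    """Return the ARN of an AWS-managed policy (assumes the 'aws' partition)."""
    return "arn:aws:iam::aws:policy/{}".format(name)


def node_trust_policy():
    """Trust policy that allows EC2 instances (the worker nodes) to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    for policy_name in NODE_POLICY_NAMES:
        print(managed_policy_arn(policy_name))
    print(json.dumps(node_trust_policy(), indent=2))
```

The printed ARNs and trust document could then be passed to whichever tool you use to create the role; the console route described above produces the same result.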

The remaining configuration is based on your networking and remote access preferences.

Note: if you allow remote access to your nodes, you can choose which source IP ranges you want to allow traffic from. If you select all, it will create a security group with SSH access from 0.0.0.0/0, which allows traffic from any machine in the world.
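To see why 0.0.0.0/0 is so permissive, here is a small sketch using Python’s standard `ipaddress` module. The helper name `is_allowed` and the sample addresses (taken from the reserved documentation ranges) are assumptions for illustration; a security group evaluates source ranges in a comparable way, but this is not AWS code.

```python
import ipaddress
from typing import Iterable


def is_allowed(source_ip: str, allowed_cidrs: Iterable[str]) -> bool:
    """Return True if source_ip falls inside any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


if __name__ == "__main__":
    # 0.0.0.0/0 matches every IPv4 address.
    print(is_allowed("203.0.113.7", ["0.0.0.0/0"]))
    # A narrower range only admits addresses inside it.
    print(is_allowed("203.0.113.7", ["198.51.100.0/24"]))
    print(is_allowed("198.51.100.42", ["198.51.100.0/24"]))
```

Restricting the source range to your office or VPN CIDR keeps SSH reachable for you while closing it to everyone else.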

Then finally you’ll be asked to choose your compute and scaling configuration prior to reviewing and creating.

Before leaving the AWS console, make sure the NodeInstanceRole that you created and attached to your compute has the correct permissions set up on your ECR repository.

4. Add permissions to your ECR repository

Click the “Edit Policy JSON” button in the top right corner and add the following JSON to the permissions.

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Sid": "NodeInstanceRoleToECR",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<AWS ACCOUNT NUMBER>:role/<NODE INSTANCE ROLE>"
      },
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:GetAuthorizationToken",
        "ecr:GetDownloadUrlForLayer"
      ]
    }
  ]
}
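If you would rather generate this policy than edit it by hand, the following sketch fills in the two placeholders and prints JSON you can paste into the console. The function name `ecr_pull_policy`, the account number `123456789012`, and the role name `my-node-instance-role` are made up for illustration; substitute your own values.

```python
import json


def ecr_pull_policy(account_id, role_name):
    """Build the repository policy shown above for a given account and node role."""
    return {
        "Version": "2008-10-17",
        "Statement": [
            {
                "Sid": "NodeInstanceRoleToECR",
                "Effect": "Allow",
                "Principal": {
                    "AWS": "arn:aws:iam::{}:role/{}".format(account_id, role_name)
                },
                "Action": [
                    "ecr:BatchCheckLayerAvailability",
                    "ecr:BatchGetImage",
                    "ecr:GetAuthorizationToken",
                    "ecr:GetDownloadUrlForLayer",
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical account number and role name, for illustration only.
    policy = ecr_pull_policy("123456789012", "my-node-instance-role")
    print(json.dumps(policy, indent=2))
```

Generating the document this way avoids the most common mistake with hand-edited policies: a stray comma or quote that makes the JSON invalid.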

Now your Kubernetes cluster and Docker storage repository are ready!

As mentioned earlier in the prerequisites, it’s expected that you have a Prefect Cloud account and have at a minimum run a first flow with Cloud’s API. For a tutorial on configuring your environment and running your first flow with Cloud, check out the Prefect docs.

5. Authenticate your machine with your Kubernetes cluster.

>> aws eks get-token --cluster-name <YOUR_CLUSTER_NAME>

6. Next, let’s make sure your docker client is authenticated to push to Amazon ECR.

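# run this command to generate the token
>> aws ecr get-login-password --region <YOUR_REGION>
# copy the token from the output of the command above
>> docker login -u AWS -p <COPIED_TOKEN> <IMAGE_URI>
# or pipe the token straight in, which keeps it out of your shell history
>> aws ecr get-login-password --region <YOUR_REGION> | docker login --username AWS --password-stdin <IMAGE_URI>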

7. Lastly, you’ll need a kubeconfig file on your machine

In practice, the Kubernetes agent is usually deployed within a Kubernetes cluster and will run flows in the cluster. However, with a kubeconfig file you’ll be able to run flows from your dev machine with a Kubernetes agent running locally and also deploy the Kubernetes agent from your local machine to the cluster, with a simple command. See step 11.

>> aws eks --region <YOUR_REGION> update-kubeconfig --name <YOUR_CLUSTER_NAME>

With kubectl installed, you’ll be able to verify the connection to your cluster with…

>> kubectl get svc

8. The next thing we’ll do is create a project in Prefect Cloud.

This can be done via the UI or the Prefect CLI.

With the CLI:

>> prefect create project "EKS"

Or with the UI: click the hamburger button in the top left and select Team. Select Team Settings, followed by Projects, and add your new project.

9. Now, you can create and run your first flow

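import prefect
from prefect import task, Flow
from prefect.environments.storage import Docker

@task
def hello_task():
    # Prefect makes a logger available in the task's run context
    logger = prefect.context.get("logger")
    logger.info("Hello, Kubernetes!")

flow = Flow("hello-k8s", tasks=[hello_task])

# Store the flow in a Docker image that is pushed to your registry (ECR here)
flow.storage = Docker(registry_url="<your-registry.io>")

# Register the flow with the Prefect Cloud project created in step 8
flow.register(project_name="EKS")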
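The image URI you copied in step 1 contains both the registry host and the repository name. The sketch below splits it into the two values, assuming the usual ECR form `<account>.dkr.ecr.<region>.amazonaws.com/<repository>`. The helper `docker_storage_kwargs` is hypothetical and not part of Prefect, and the sample URI is made up. It also assumes that the Docker storage accepts `registry_url` and `image_name` keyword arguments, as it did in the Prefect release this post was written against.

```python
def docker_storage_kwargs(image_uri):
    """Split an image URI into keyword arguments for Prefect's Docker storage.

    Everything before the first "/" is treated as the registry host and
    everything after it as the repository (image) name.
    """
    registry_url, _, image_name = image_uri.partition("/")
    if not registry_url or not image_name:
        raise ValueError("expected '<registry-host>/<repository>', got %r" % image_uri)
    return {"registry_url": registry_url, "image_name": image_name}


if __name__ == "__main__":
    # Hypothetical URI, for illustration only.
    uri = "123456789012.dkr.ecr.us-east-1.amazonaws.com/prefect-flows"
    print(docker_storage_kwargs(uri))
    # With Prefect installed, the result could be used as:
    #   flow.storage = Docker(**docker_storage_kwargs(uri))
```

Passing the repository name explicitly means the image is pushed to the repository you created in step 1, the one the node role has permission to pull from.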

10. To run on your local machine

To run the agent you will need a Runner Token. This can be created in the UI, in Team Settings > API Tokens > Create Token (scope RUNNER) or with the Prefect CLI, see below.

# create runner token 
>> prefect auth create-token -n my-runner-token -r RUNNER
# store it as an environment variable for future steps
>> export PREFECT__CLOUD__AGENT__AUTH_TOKEN=<COPIED_RUNNER_TOKEN>
# make sure you have installed prefect with Kubernetes extra
# pip install "prefect[kubernetes]"
>> prefect agent start kubernetes
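The variable name `PREFECT__CLOUD__AGENT__AUTH_TOKEN` follows Prefect’s convention of using double underscores to separate nested configuration sections, so it corresponds to the `cloud.agent.auth_token` setting. The sketch below illustrates that mapping only; it is not Prefect’s actual loader, the function name is made up, and it ignores edge cases such as a name that is both a value and a section.

```python
import os


def nested_config_from_env(environ, prefix="PREFECT__"):
    """Turn PREFIX__SECTION__KEY=value variables into a nested dict."""
    config = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # not a Prefect setting
        keys = name[len(prefix):].lower().split("__")
        node = config
        for key in keys[:-1]:
            node = node.setdefault(key, {})  # descend, creating sections as needed
        node[keys[-1]] = value
    return config


if __name__ == "__main__":
    # Hypothetical token value, for illustration only.
    sample = {"PREFECT__CLOUD__AGENT__AUTH_TOKEN": "my-runner-token", "HOME": "/home/me"}
    print(nested_config_from_env(sample))
    # Inspect the Prefect-prefixed variables actually set in your shell:
    print(sorted(nested_config_from_env(os.environ)))
```

If the agent reports an authentication error, checking that the variable is exported in the same shell session that starts the agent is a good first step.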

Note: As mentioned in step 2, this agent will need access to the Kubernetes Cluster in AWS from the machine that it’s running on.

Next, log in to your Cloud account and select your project from the dropdown.

Select the flow in the list.

Navigate to the Run screen and hit run…

You’ll then be able to view the state of your flow run while it executes and after it finishes.

11. To deploy an agent to the Kubernetes cluster

This step is a good example of how Prefect simplifies deploying and executing your flows in a Kubernetes cluster. The Kubernetes agent requires Role-Based Access Control (RBAC) permissions that determine which jobs it can access in the cluster. With the --rbac flag, the Prefect CLI automatically adds the required Role and RoleBinding to the agent deployment YAML.

# Stop your agent that is running locally and ...
>> prefect agent install kubernetes -t <YOUR_RUNNER_TOKEN> --rbac | kubectl apply -f -

Note: As mentioned in step 2, this agent will need to be able to access the public internet from the Kubernetes cluster to communicate with the Prefect Cloud API.

Then repeat the steps at the end of step 10 to run your flow from Prefect Cloud.

You’re done! Explore more at prefect.io.

Please continue reaching out to us with your questions and feedback — we appreciate the opportunity to work with all of you!

Happy Engineering!

— The Prefect Team
