Published in The Prefect Blog

How to use Kubernetes with Prefect

Part 1: Up and running


Why would you use K8s with Prefect?

If you’re already using Kubernetes to run your Python data engineering, ML training, or other backend workflows, you can benefit from all the observability and orchestration that Prefect provides.

Using infrastructure

Prefect makes it easy to run your dataflow code on a variety of infrastructure. By default, your code runs in a local subprocess on your machine. Alternatively, you can create a deployment and specify the infrastructure your code should run on. Prefect provides pre-built infrastructure blocks for integrating with Docker, Kubernetes, AWS ECS, GCP Cloud Run, and Azure Container Instances.

[Image: diagram showing how Prefect deployments and flow runs work]

Use the default Kubernetes Job infrastructure

We’ll start off with a basic setup and iterate in future posts.

Setup

Install Docker Desktop and enable K8s

Let’s run Kubernetes locally. If you don’t have K8s installed, I suggest using the version that ships with Docker Desktop.

[Image: Docker Desktop menu to enable Kubernetes]

Download and install Prefect

In your Python virtual environment, install the latest version of Prefect with pip install -U prefect, or pin the version used in this post with pip install prefect==2.6.8.

Set up Prefect Cloud

Sign up for a free Prefect Cloud account if you don’t have one yet.

[Image: form to create a Prefect API key]

Create flow code

We just want to demonstrate that things are working as expected. Let’s use some basic code that logs information about the network and instance. 🙂

from prefect import flow, get_run_logger
from platform import node, platform

@flow
def check():
    logger = get_run_logger()
    logger.info(f"Network: {node()}. ✅")
    logger.info(f"Instance: {platform()}. ✅")

if __name__ == "__main__":
    check()
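
If you want a feel for what those two log lines will contain, here’s a quick sketch that uses only the standard library, no Prefect required. Run locally, node() returns your machine’s hostname. Inside a Kubernetes pod the hostname defaults to the pod name, so the logged value is a handy hint about where the flow actually ran. The describe_host helper is a hypothetical name used only for illustration.

```python
from platform import node, platform


def describe_host():
    """Return the same two values that the check flow logs."""
    return {"network": node(), "instance": platform()}


info = describe_host()
print(f"Network: {info['network']}")
print(f"Instance: {info['instance']}")
```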

Create AWS S3 bucket

Create an AWS S3 bucket with the default settings.

[Image: S3 bucket creation page]

Set IAM User

For this tutorial, you could leave your bucket unsecured, but that’s not a great practice. Instead, let’s use the AWS credentials of an IAM user that has permission to access S3.

Create IAM user
Create access key

Create S3 remote storage block

You can create a Prefect block from the UI or Python code. Here’s how you can create an S3 block from the UI.

[Image: form to create an S3 block in Prefect]
[Image: Prefect S3 remote storage block form]
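
If you’d rather create the block from Python, a minimal sketch might look like the following. It assumes Prefect 2.x’s S3 filesystem block and that the IAM user’s keys are exported as the usual AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables. The bucket path and the helper names are placeholders of mine, not values from this post.

```python
import os

# Assumed names: "myawsblock" matches the storage block used later in this post;
# the bucket path is a placeholder to replace with your own bucket.
BLOCK_NAME = "myawsblock"
BUCKET_PATH = "my-bucket-name/flows"


def s3_block_fields(env=None):
    """Collect the S3 block fields, reading the IAM user's keys from env vars."""
    env = os.environ if env is None else env
    return {
        "bucket_path": BUCKET_PATH,
        "aws_access_key_id": env.get("AWS_ACCESS_KEY_ID"),
        "aws_secret_access_key": env.get("AWS_SECRET_ACCESS_KEY"),
    }


def save_s3_block():
    """Create the block and save it to the current Prefect workspace."""
    # Imported inside the function so the helper above works without Prefect.
    from prefect.filesystems import S3

    block = S3(**s3_block_fields())
    # overwrite=True allows re-running this after rotating the access key.
    block.save(BLOCK_NAME, overwrite=True)
    return block
```

Calling save_s3_block() once, while logged in to your Prefect workspace, should register the block under the name myawsblock.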

Create your Deployment

Next we’ll build and apply our deployment from the command line. Alternatively, we could have defined our deployment in a Python file, as I showed here.

prefect deployment build flows.py:check -n k8sjob -sb s3/myawsblock \
    -i kubernetes-job --override env.EXTRA_PIP_PACKAGES=s3fs -a
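
For comparison, here’s a rough sketch of what the same deployment could look like defined in Python. It assumes Prefect 2.x’s Deployment.build_from_flow, the S3 storage block saved as myawsblock, and a default KubernetesJob infrastructure block with one extra environment variable. The function name and constants are mine, and the explicit work_queue_name is an assumption so that the deployment lands in the default work queue.

```python
# Values taken from the CLI command in this post.
DEPLOYMENT_NAME = "k8sjob"
STORAGE_BLOCK = "myawsblock"
JOB_ENV = {"EXTRA_PIP_PACKAGES": "s3fs"}


def build_k8s_deployment(flow):
    """Build and apply a deployment roughly equivalent to the CLI command."""
    # Imported inside the function so this module loads without Prefect.
    from prefect.deployments import Deployment
    from prefect.filesystems import S3
    from prefect.infrastructure import KubernetesJob

    deployment = Deployment.build_from_flow(
        flow=flow,
        name=DEPLOYMENT_NAME,  # -n k8sjob
        storage=S3.load(STORAGE_BLOCK),  # -sb s3/myawsblock
        infrastructure=KubernetesJob(env=JOB_ENV),  # -i kubernetes-job --override ...
        work_queue_name="default",  # assumption: the queue our agent polls
    )
    deployment.apply()  # -a
    return deployment
```

With the flow above saved in flows.py, you would import check from that module and call build_k8s_deployment(check).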

Results

Let’s see the results of creating our deployment. In the UI, click on Deployments and then click on check/k8sjob. Then click on the link to the anonymous infrastructure block that looks like this:

[Image: anonymous infrastructure block example]

Start your agent

Your agent will run locally, on your machine, polling your Prefect Cloud default work queue. Here’s the command to fire it up:

prefect agent start -q default

Schedule a flow run

We’ll create an ad hoc flow run. We could use the UI or the CLI. Let’s use the CLI. Open another terminal window and run the following command:

prefect deployment run check/k8sjob

Flow run details

  • When our agent sees a scheduled flow run for our deployment in the work queue, it starts the flow run on our local K8s infrastructure.
  • The specified Prefect Docker image is pulled. You can see the image history in Docker Desktop.
  • The extra pip package s3fs we specified is downloaded and installed.
  • The K8s pod starts, the code runs, and the pod exits.
  • You can see the current pod status in your terminal with kubectl get pods.
  • And you can see all the details about your pod with kubectl describe pods <your pod name here>. My pod name was super-rabbit-97p8g-6wrvq.

[Image: deployment in the Prefect UI]

Wrap

You’ve seen how to run your Prefect flows with Kubernetes without creating a custom infrastructure block.


Jeff Hale

I write about data science. Join my Data Awesome mailing list to stay on top of the latest data tools and tips: https://dataawesome.com