How to use Kubernetes with Prefect: Part 3

Store your flow code in your Docker Image

Jeff Hale
The Prefect Blog
4 min read · Dec 7, 2022


snowy mountain at sunset
Source: pixabay.com

In the first post in this series we saw how to use Prefect with Kubernetes without explicitly making a Prefect infrastructure block. In the second post we saw three ways to make a custom Kubernetes Job block. In all of our examples to this point we’ve stored our code in AWS S3 and used a custom S3 storage block to reference that code.

In this post we’ll see how to bake our flow code into a Docker image for our K8s job. With this method we don’t need to create a remote storage block or worry about connecting to a cloud provider! 🎉

We’ll continue using Prefect Cloud and running our agent and K8s cluster locally.

Custom Docker Image with flow code baked in 🧑‍🍳

In a folder, create the following files:

  1. flows.py
  2. requirements.txt
  3. k8s-block-flow-code.py
  4. Dockerfile

Let’s look at the file contents. You can see the files in this GitHub repo, too.

Our flows.py file looks the same as it did in our previous examples.

from prefect import flow, get_run_logger
from platform import node, platform


@flow
def check():
    logger = get_run_logger()
    logger.info(f"Network: {node()}. ✅")
    logger.info(f"Instance: {platform()}. ✅")


if __name__ == "__main__":
    check()

Once again, the requirements.txt file has only s3fs in it.
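
In other words, the whole requirements.txt file is just this one line:

s3fs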

We could create our K8s job block from the UI, but here’s the code to create it in Python:

from prefect.infrastructure import KubernetesJob

k8s_job = KubernetesJob(
    image="discdiver/prefect:py3.10-baked",
    image_pull_policy="Always",  # always pull so K8s picks up newly pushed images
)

k8s_job.save("k8s-flow", overwrite=True)
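
As an aside, if you ever need this block in another script, you can load it by name with the standard block API. A minimal sketch:

from prefect.infrastructure import KubernetesJob

# load the saved block by the name we gave it above
k8s_job = KubernetesJob.load("k8s-flow")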

Note that our Dockerfile has one addition compared to the previous posts’ version: on the last line, we copy our flows.py file into the flows folder in the image.

# start from the official Prefect image with Python 3.10
FROM prefecthq/prefect:2-python3.10

COPY requirements.txt .

RUN pip install -r requirements.txt --trusted-host pypi.python.org --no-cache-dir

# the addition: bake the flow code into the image
COPY flows.py flows/

Build and push Docker image

Let’s build our Docker image and push it to DockerHub. With Docker running, run the following commands, substituting your own DockerHub username:

docker image build -t discdiver/prefect:py3.10-baked .

Log in, if needed, with docker login.

docker image push discdiver/prefect:py3.10-baked

Here’s my new image on DockerHub:

Jeff’s DockerHub image showing the tag name

You should see something similar.

Create K8s block

Let’s create our KubernetesJob block by running our Python script.

python k8s-block-flow-code.py

You can see your new block in the UI.

UI screenshot of block
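
If you prefer the terminal, you can also list your blocks with the Prefect CLI:

prefect block ls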

Build deployment

Now let’s build and apply our deployment.

prefect deployment build flows.py:check -n k8s-baked -ib kubernetes-job/k8s-flow -a

Here flows.py:check is the path to the flow file and the flow function name, -n names the deployment k8s-baked, -ib attaches our kubernetes-job/k8s-flow infrastructure block, and -a applies the deployment after building it. Note that we didn’t have to specify a storage block with the -sb flag because our flow code is already baked into the Docker image referenced by our KubernetesJob infrastructure block.

Now we can see our new deployment in the UI. 🎉

screenshot showing deployment name from the UI
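
You can also confirm the deployment from the terminal:

prefect deployment ls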

Start agent

We didn’t specify a work queue when we created our deployment, so we’re rocking the default queue. 🎸

prefect agent start -q default

Run it!

prefect deployment run check/k8s-baked

If all goes well, you should see output like this in the Prefect UI:

Looks like my flow run got the name ludicrous-shrimp. 🦐
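
If you’re curious what’s happening behind the scenes, you can watch Kubernetes spin up a pod for the flow run (assuming kubectl is pointed at your local cluster):

kubectl get pods --watch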

Wrap

In this post you’ve seen how to run your flows on a local Kubernetes cluster with the flow code baked into your custom Docker image. Sweet! 🚀

I hope you found this guide to baking your flow code into a Docker image useful. If you did, please share it on your favorite social media so other people can find it, too. 🚀

Got questions? Reach out on Prefect’s 20,000+ member Community Slack.

In the next article in this series we’ll be moving to the cloud, so follow me to make sure you don’t miss it! 🙂

snow on steep mountain
Source: pixabay.com

Happy engineering!


I write about data things. Follow me on Medium and join my Data Awesome mailing list to stay on top of the latest data tools and tips: https://dataawesome.com