Make a prediction every day with Serverless Machine Learning.

Lex Avstreikh
Nov 28, 2022


A tutorial on how to create your own prediction service with open tools and Hopsworks.


Feature stores have recently burst onto the machine learning (ML) scene as a new centralized piece of infrastructure for organizing, sharing, and using features. Until now, though, they have been almost exclusively available to enterprises with large infrastructure and extensive engineering teams. This year’s Feature Store Summit showed how large enterprises, from Uber to Doordash, create value and solve problems with feature stores.

Whilst innovation comes from those companies, the feature store is now also accessible to the wider audience of ML enthusiasts and practitioners. In early September 2022, the first modules of the Serverless ML course were released, in which Jim Dowling, CEO and Co-Founder of Hopsworks, showed how to create a simple prediction service from scratch in less than one hour, with free serverless infrastructure.

In this article, we will revisit the first lab in the Serverless ML course, the Iris Flower as a Serverless ML System, and use Orchest as an orchestration and scheduling tool to power the computation engine for the prediction system. Orchest offers a free plan that we will use, meaning you can still run your prediction service for free.

First: what is Serverless ML?

Serverless ML is an ML service that runs on fully managed cloud-based infrastructure. This removes the need for the developer to manage, deploy, secure, monitor, maintain, and upgrade the infrastructure powering their ML service, whether that service runs on a schedule (e.g., hourly) or is online (running 24x7).

Initial Setup:

We are going to use 3 serverless services to power our ML system. They are all open tools with unlimited free tiers — there won’t be any cost to run the prediction service(s) you build and run!

The prerequisites are:

Github will manage our source code and provide a serverless UI. Given that most of you already have an account, Github Pages is a simple way to host a UI page for free.

Orchest works as an orchestration tool and powers the compute engine of our service. We will be able to set up a flow of script/notebooks which we wish to run in order and on a specific schedule.

Finally, Hopsworks is a MLOps platform which will manage our features and model. Hopsworks will store the data (features) that we use to train our models and to power predictions made by our models (inference). Hopsworks will also store the models in its model registry.

Let’s start with Github by forking the serverless-ml repository. Be sure to unselect “copy the main branch only”, as we will also fork the gh-pages branch, where we will store the content for our Github Pages UI.

Once the repository is forked, we can copy the URL under the Code button and head to Orchest.

When the account is created, we can create a new free instance. Here we named our instance hops_serverless.

We will now open this instance and import our forked Github repository.

There are a few things to set up for the environment to work properly: the API keys and tokens as environment variables, and a couple of packages. The notebooks will need these later on, so we want to make sure that they are all installed correctly.

Let’s start with the packages. Head to the Environment tab in the top right corner, add the following to the shell script, and press build.

conda install twofish -y
pip install -U hopsworks --quiet
pip install pandas
pip install seaborn
pip install scikit-learn
pip install dataframe-image
pip install nbconvert
pip install pillow
pip install plotly
pip install lxml
pip install gitpython

Now the environment has been built, head to your project page (top left corner) and select the project settings:

You can enter new environment variables here. You will need two API keys: one for Hopsworks and one for Github. Since your Github account is likely already open, let’s head back there and create a new token.

Under your personal account settings (not the repository settings), select the developer settings and then personal access tokens; here we will generate a new classic access token. You only need to select the repo scope. Save the token someplace safe, as you won’t be able to retrieve it again, and head back to Orchest.

We will call this environment variable GIT_TOKEN and save it.

We now need to create a Hopsworks account. First, go to the Hopsworks website.

Register a new account. Once you have completed registration, the website redirects you to a quickstart page; we can skip it and jump directly into the platform.

Under the Account Settings, you will find an option to create a new API Key.

Simply create a new one and give it a name; we will choose Hops_Serverless, with all the scopes. Once again, be sure to save it, as you won’t be able to see it again here either.

Back in Orchest, you can now enter the new variable for the Hops_Serverless API key we just created. Set HOPSWORKS_API_KEY as the environment variable name and the key itself as the value.

You should now have two environment variables set: one for Github and one for Hopsworks. Before starting, you need to add two more: your Github account name and the repository’s name.

It should end up looking like this. Be mindful that those variables are case sensitive; if you do not set them properly, there will likely be issues downstream.
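A quick way to catch a missing or misnamed variable before running a pipeline is a small check like the sketch below. GIT_TOKEN and HOPSWORKS_API_KEY are the names used in this tutorial; ACCOUNT_NAME and REPO_NAME are placeholder names used only for illustration, so replace them with whatever names you actually entered for your account and repository.

```python
import os

# GIT_TOKEN and HOPSWORKS_API_KEY come from this tutorial; the last two
# names are hypothetical placeholders for the account and repository variables.
REQUIRED_VARS = ["GIT_TOKEN", "HOPSWORKS_API_KEY", "ACCOUNT_NAME", "REPO_NAME"]


def missing_variables(env, required=REQUIRED_VARS):
    """Return the required names that are absent or empty in env (case sensitive)."""
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables(os.environ)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All environment variables are set.")
```

Because the lookup is case sensitive, a variable saved as git_token would be reported as missing, which is the kind of issue the note above warns about.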

All variables are set and your environment has been built with the necessary packages. We can start building our workflows.

Go to src>01-module>orchest under the project navigation, you can find the notebooks you are going to run:

  • clean-repository.ipynb
  • Iris-train-pipeline-orchest.ipynb
  • Iris-batch-inference-pipeline-orchest.ipynb
  • push-work.ipynb

Additionally we will use the iris-feature-pipeline.ipynb from the 01-module folder as it is.

As we won’t retrain a model each time new data comes in, we will leave the training step out of the workflow for now and simply run it once (after the feature pipeline) to train and store our initial model. You run the training pipeline (or step) only when you need to improve or update your model.

You will create a second scheduled job where the model will be trained at a different cadence and we will alter the inference pipeline to always use the best available model in Hopsworks.

For now, your project in Orchest should look something similar to this:

We will come back to the clean and push notebooks at the end. For now, we start building our initial flow; the model training step will be removed from this pipeline and added to its own pipeline at a later stage.

01 — Iris Feature Pipeline

The Feature Pipeline notebook has two purposes: create the initial Feature Group with the BACKFILL data (the historical data), and generate new input data. The new data will be appended to the same Feature Group at each run of the notebook.

The new data is synthetic: each run generates a single new Iris flower with randomly created features and returns it as a row in a DataFrame. This simulates the new data arriving for our prediction service.

First, let’s run the BACKFILL. Open the notebook by double-clicking on it and set BACKFILL=True to get all the data from iris.csv and create our initial iris Feature Group.
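To make the idea of a synthetic flower concrete, here is a minimal sketch using only the standard library. The value ranges and the generate_flower helper are assumptions made for illustration; the notebook defines its own generator and returns a DataFrame row rather than a dict.

```python
import random

# Hypothetical (min, max) ranges in cm, chosen for illustration only;
# the notebook defines its own ranges.
FEATURE_RANGES = {
    "sepal_length": (4.5, 8.0),
    "sepal_width": (2.0, 4.5),
    "petal_length": (1.0, 7.0),
    "petal_width": (0.1, 2.5),
}
VARIETIES = ["Setosa", "Versicolor", "Virginica"]


def generate_flower(rng=random):
    """Return one synthetic flower as a dict of four features plus a variety."""
    flower = {
        name: round(rng.uniform(low, high), 2)
        for name, (low, high) in FEATURE_RANGES.items()
    }
    flower["variety"] = rng.choice(VARIETIES)
    return flower


if __name__ == "__main__":
    print(generate_flower())
```

Passing a seeded generator, for example generate_flower(random.Random(42)), makes the output reproducible, which can help when debugging a pipeline run.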

import random
import pandas as pd
import hopsworks

Exploring the code a bit further down, we can see how we connect to Hopsworks with a call to hopsworks.login(). In a notebook, this would normally prompt you to enter the API key; since we added it to the environment, it is picked up automatically.

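project = hopsworks.login()
fs = project.get_feature_store()
# the notebook's call may pass further arguments (such as a primary key), not shown here
iris_fg = fs.get_or_create_feature_group(name="iris",
                                         version=1,
                                         description="Iris flower dataset")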

Run the notebook via the pipeline view. Once done, make sure you set BACKFILL=False, as this notebook will also be your main notebook for generating new data.

Navigate to Hopsworks to see the Feature Group you just created.

You can explore further and look at the features that will be used to predict the variety of the iris flower depending on the sepal/petal length/width; this is also what the notebook will be generating at random at each scheduled run.

In the activity tab, you see the jobs with the number of added rows, the commit, and any errors.

Finally you can also view the data in the Data Preview and see some descriptive statistics in Feature Statistics: min, max, mean, std, and more.

Time to move to training the model.

02 — Iris Training Pipeline (Orchest Version)

We can now run the second notebook. It loads the Iris flower dataset into random train/test splits using a Feature View and trains a classifier using the k-nearest neighbors algorithm in scikit-learn. This classification model predicts which of the three varieties of Iris flower a sample belongs to (Setosa, Virginica, or Versicolor), based on the input features (the sepal/petal length/width).

While it runs, let’s have a look at the notebook itself. The Feature View will have 5 columns: 4 will be the features (sepal and petal length and width) and the final column will be the label (or target), the variety of the flower.
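To give an intuition for what a k-nearest neighbors classifier does, the sketch below implements the idea in plain Python: a new flower gets the label held by the majority of its k closest training flowers. This is a simplified illustration with a handful of sample measurements, not the scikit-learn implementation the notebook uses, and knn_predict is a hypothetical helper.

```python
import math
from collections import Counter


def knn_predict(train_rows, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest rows."""
    distances = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Illustrative measurements: (sepal length, sepal width, petal length, petal width)
train_rows = [
    (5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2), (5.0, 3.4, 1.5, 0.2),
    (6.4, 3.2, 4.5, 1.5), (6.9, 3.1, 4.9, 1.5), (5.5, 2.3, 4.0, 1.3),
    (6.3, 3.3, 6.0, 2.5), (5.8, 2.7, 5.1, 1.9), (7.1, 3.0, 5.9, 2.1),
]
train_labels = ["Setosa"] * 3 + ["Versicolor"] * 3 + ["Virginica"] * 3

print(knn_predict(train_rows, train_labels, (5.0, 3.3, 1.4, 0.2)))  # Setosa
```

The short petals of the query flower place it closest to the three Setosa rows, so the vote is unanimous.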

project = hopsworks.login()
fs = project.get_feature_store()
# Reuse the Feature View if it already exists, otherwise create it
try:
    feature_view = fs.get_feature_view(name="iris", version=1)
except Exception:
    iris_fg = fs.get_feature_group(name="iris", version=1)
    query = iris_fg.select_all()
    feature_view = fs.create_feature_view(name="iris",
                                          version=1,
                                          description="Read from Iris flower dataset",
                                          labels=["variety"],
                                          query=query)

Once the Feature View and the training dataset are created, we train our model. The model is then versioned and available for download. If we train a new version of the model, we can later update our inference pipeline to specify which model version to download and use.

In our case, as we want to select the best model at each run, we comment out the version number. Each training run then adds a new version instead of replacing the current model. Note that we also register a metric, here accuracy; this is what the inference pipeline will use to select the best model.

iris_model = mr.python.create_model(
    name="iris",
    # version=1,  # removing version to incrementally create a new version at each training
    metrics={"accuracy": metrics['accuracy']},
    description="Iris Flower Predictor")

As our scheduled feature pipelines periodically insert new rows into the Feature Group, the dataset increases in size. Over time, with more data, you can retrain models to improve model performance.

In Hopsworks, we can also see the Feature View and which Feature Group it was generated from. A Feature View can be created from multiple Feature Groups joined together. In our case the requirements are quite simple: the features for our Feature View come from a single Feature Group.

03 — Batch Inference Pipeline (Orchest Version)

This is the final notebook for the purely ML related part of the workflow. Here, we will load the batch inference data that arrived in the previous 24 hours, and predict the variety of the first Iris Flower found in it using the best trained model available.

Ultimately, this will also output the prediction, the flower, and a few other items as PNG files, which we will later push to our Github repository to update our UI.

In the notebook, we first connect to Hopsworks and then prepare our model. The call model = mr.get_model("iris", version=1) is commented out because we do not want to pin version 1 of the model; instead, the next line uses get_best_model and specifies the metric (accuracy) and the direction (max). In Hopsworks, we can look at the model.

import pandas as pd
import hopsworks
import joblib

project = hopsworks.login()
fs = project.get_feature_store()
mr = project.get_model_registry()
# model = mr.get_model("iris", version=1)  # selecting a specific model
model = mr.get_best_model("iris", "accuracy", "max")  # selecting the best model for accuracy
model_dir = model.download()  # download the model files and get the local directory
model = joblib.load(model_dir + "/iris_model.pkl")
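Conceptually, selecting the best model means picking the registered version whose metric is highest (or lowest, depending on the direction). The plain-Python sketch below illustrates that idea with made-up versions and accuracy values; it is not how Hopsworks implements get_best_model, and pick_best is a hypothetical helper.

```python
def pick_best(models, metric, direction):
    """Return the entry whose metric is largest ('max') or smallest ('min')."""
    if direction not in ("max", "min"):
        raise ValueError("direction must be 'max' or 'min'")
    candidates = [m for m in models if metric in m["metrics"]]
    if not candidates:
        return None
    chooser = max if direction == "max" else min
    return chooser(candidates, key=lambda m: m["metrics"][metric])


# Made-up registry contents for illustration only.
registry = [
    {"version": 1, "metrics": {"accuracy": 0.93}},
    {"version": 2, "metrics": {"accuracy": 0.97}},
    {"version": 3, "metrics": {"accuracy": 0.95}},
]

best = pick_best(registry, "accuracy", "max")
print(best["version"])  # 2
```

This is why registering the accuracy metric at training time matters: without it, there would be nothing to compare versions on.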

Calling the Feature View object we created earlier in the training notebook, we load all the data used for scoring:

feature_view = fs.get_feature_view(name="iris", version=1)

The prediction happens in the next cells as y_pred and it will be done on the last flower that has been entered in the Feature Group.

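import datetime

# Start of the window: 24 hours before now
start_ts = (datetime.datetime.now() - datetime.timedelta(hours=24))
batch_data = feature_view.get_batch_data(start_time=start_ts)
y_pred = model.predict(batch_data)
flower = y_pred[y_pred.size - 1]  # prediction for the most recent flower
label = df.iloc[-1]["variety"]  # actual variety; df holds the Feature Group rows loaded elsewhere in the notebook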
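The 24-hour window can be illustrated on its own with the standard library. In the sketch below, the rows and their "timestamp" field are made up for the example, and rows_since is a hypothetical helper; in the notebook, the filtering is done by the Feature View when it returns the batch data.

```python
import datetime


def rows_since(rows, now, hours=24):
    """Keep rows whose 'timestamp' falls within the last `hours` before `now`."""
    start = now - datetime.timedelta(hours=hours)
    return [row for row in rows if row["timestamp"] >= start]


now = datetime.datetime(2022, 11, 28, 12, 0)
rows = [
    {"id": 1, "timestamp": now - datetime.timedelta(hours=30)},
    {"id": 2, "timestamp": now - datetime.timedelta(hours=5)},
    {"id": 3, "timestamp": now - datetime.timedelta(minutes=10)},
]
recent = rows_since(rows, now)
print([row["id"] for row in recent])  # [2, 3]
```

With a daily schedule, a 24-hour window means each run scores the flower or flowers inserted since the previous run.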

In the rest of the notebook, we save the predicted flower in your assets folder as a PNG, along with the actual flower and, if the predictions cover all three varieties, a confusion matrix. Otherwise, the notebook prompts you to run the inference pipeline more times until you get the three different varieties.

This concludes the main part of the ML pipelines, from data generation and insertion to training and inference. The lecture and lab videos go into much more detail on each of those aspects, and you can view them on our YouTube channel here.
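The "all three varieties" condition can be sketched as follows. The lists of actual and predicted labels are made up, and both helpers are hypothetical; the notebook builds its confusion matrix with its own libraries.

```python
from collections import Counter

VARIETIES = {"Setosa", "Versicolor", "Virginica"}


def has_all_varieties(predictions):
    """True once every variety appears at least once among the predictions."""
    return VARIETIES <= set(predictions)


def confusion_counts(actual, predicted):
    """Count (actual, predicted) pairs; these are the confusion matrix cells."""
    return Counter(zip(actual, predicted))


# Made-up labels for illustration only.
actual = ["Setosa", "Virginica", "Versicolor", "Virginica"]
predicted = ["Setosa", "Virginica", "Virginica", "Virginica"]

if has_all_varieties(predicted):
    print(confusion_counts(actual, predicted))
else:
    print("Run the inference pipeline more times to see all three varieties.")
```

Here no flower has been predicted as Versicolor yet, so the sketch prints the reminder instead of the counts.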

Setup of the Pipelines & UI

You will now finalize your project to run your pipelines on a schedule and update your project’s Github Page. In your repository, go to the gh-pages branch; this is where your Github Page (the UI that will be hosting the results of the model) will be.

Open the file and replace ‘featurestoreorg’ with your own user name, so that the page displays your outputs rather than the ones from featurestoreorg’s repository. In my case, I replace every occurrence of featurestoreorg with my username, magiclex, so that the images come from my gh-pages branch.

Commit your changes and get back to your main branch.

Under Settings, in Pages, you will see that the branch is already selected; in fact, the page is being rebuilt because you just updated the file. You can select a different branch if you alter your project, but by default the page is built automatically since you forked the main project.

Under the Actions tab you will see your action: the page being built. You won’t need to do anything, as this is triggered by Github Pages whenever this branch is updated.


If you go to https://{ your user name}, you should now see the page with the predictions.

Let’s now make it a scheduled service. As mentioned before the final workflow will look like this:

We removed the training pipeline as we won’t use it at the same cadence.

The code of the two Github notebooks is much simpler. As you already set your account, token, and repository names as environment variables earlier, you won’t have to worry about them here. We simply set the project directory as our local repository path and connect to the remote.

# Setup
import git

full_local_path = "/project-dir/"
# secret, account and repo_url come from the environment variables set earlier
remote = f"https://{secret}@github.com/{account}/{repo_url}.git"
repo = git.Repo(full_local_path)
origin = repo.remote(name="origin")
if origin.url != remote:
    origin.set_url(remote, origin.url)

We then force a checkout of both branches so that they are up to date.

repo.git.checkout('gh-pages', force=True)
# Going back to the main branch
repo.git.checkout('main', force=True)

As the workflow continues with the creation of a new flower and then the batch inference, new files are produced. The last notebook takes those files and pushes them to our remote gh-pages branch.

# move the files to the /data folder in orchest
import shutil

assets_folder = r"../../../assets/"
env_folder = r"/data/"
files_to_move = ['latest_iris.png', 'actual_iris.png', 'confusion_matrix.png', 'df_recent.png']
for file in files_to_move:
    # construct full file path
    source = assets_folder + file
    destination = env_folder + file
    # move file
    shutil.move(source, destination)

We use the /data folder in our orchest environment as a temporary storage for the files while we switch branches.
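The staging idea can be tried in isolation with temporary folders. In this sketch, move_files is a hypothetical helper and the folder names are placeholders; unlike the notebook code, it skips files that do not exist, for instance when the confusion matrix has not been produced yet.

```python
import shutil
import tempfile
from pathlib import Path


def move_files(names, source_dir, destination_dir):
    """Move each named file from source_dir to destination_dir; return moved paths."""
    destination = Path(destination_dir)
    destination.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in names:
        source = Path(source_dir) / name
        if source.exists():  # skip files the pipeline did not produce
            moved.append(Path(shutil.move(str(source), str(destination / name))))
    return moved


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        assets, staging = Path(tmp) / "assets", Path(tmp) / "data"
        assets.mkdir()
        (assets / "latest_iris.png").write_bytes(b"placeholder")
        print(move_files(["latest_iris.png", "missing.png"], assets, staging))
```

Staging the files outside the repository matters because a forced checkout of another branch replaces the working tree, so tracked files left in the assets folder would be overwritten.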

# move to the branch for pages
repo.git.checkout('gh-pages', force=True)
# move back to an asset folder in the gh-pages branch
for file in files_to_move:
    # construct full file path
    source = env_folder + file
    destination = assets_folder + file
    # move file
    shutil.move(source, destination)
# Add our files, and set our commit
repo.git.add('assets/latest_iris.png', 'assets/actual_iris.png', 'assets/confusion_matrix.png', 'assets/df_recent.png')
current = datetime.datetime.now()  # requires "import datetime"
repo.index.commit(f'New prediction! time and date: {current}')
# Push to the pages repository
origin.push()

Once on the gh-pages branch, we commit and push those files to our remote branch. As there are new files in the repository, Github rebuilds the page with the new content.

repo.git.checkout('main', force=True)

We move back to the main local branch for the workflow to start properly on the next iteration.

You are done! You can go to your job and create a recurring schedule to run daily; your prediction service is now in production; it will run every day on a schedule!

You can also set a new pipeline in Orchest containing solely the training notebook.

Head back to the job and schedule it to run once a week.

Wrapping things up

What we have done here is create a prediction service where the feature and batch inference pipelines run daily at a specified time on serverless infrastructure. You did not have to write a Dockerfile, deploy to Kubernetes, or run Terraform, and you did not have to pay anything to deploy this (small) project to production.

Where can you go from here? This is obviously a very simple project, but it gives you an idea of the serverless ML framework without adding too much complexity. You could set up your pipeline to run more frequently, every hour or every few minutes, and use much bigger datasets for your historical data. You could also improve your UI or the model’s performance.

Hopefully this will inspire you to create your own prediction service, and if so, let us know!

In the meantime, throw us some github stars;

Do not hesitate to join us on Slack. If you are interested in more complex aspects of serverless infrastructure, we are still running the free serverless-ml course, on which most of this tutorial is based.