Getting started with Databricks Feature Serving

Vechtomova Maria
Published in Marvelous MLOps
5 min read · Jun 23, 2024

In March 2024, Databricks announced the general availability of Feature Serving. This functionality is useful for exposing data available on Databricks to other applications.

To make it clearer when you need Feature Serving versus Model Serving, see the diagram below.

In the case of machine learning, Feature Serving can be used to expose your pre-computed predictions: for example, user recommendations for an online store that are generated in a batch process daily or weekly, depending on your business needs. This is the use case we are going to explore.

Remember that even though Feature Serving is in GA, it relies on Online Tables, which are still in Public Preview. Also, use databricks-sdk version 0.28.0 or higher to run the code; the SDK is currently in Beta.
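If you run the code from a Databricks notebook, you can install or upgrade the required packages in place. A minimal sketch; the package names are assumptions based on the imports used later in this article:

# Install the SDK and the feature engineering client used below
%pip install --upgrade "databricks-sdk>=0.28.0" databricks-feature-engineering
dbutils.library.restartPython()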

High-level overview

Let’s start with the high-level overview, and then go through the process step-by-step.

  • In this article, we are not going to explain how Databricks workflows work or how batch recommendations are generated; we simply take a recommendations dataframe as given.
  • We store the recommendations in a feature table in Unity Catalog.
  • We create an online table, which is a read-only copy of your feature table designed for fast data lookups. According to Databricks documentation, “online tables provide low latency and high throughput access to data of any scale”. Feature Serving does not work without online tables.
  • We create a FeatureSpec, a user-defined set of features and functions; this is what is served behind the Feature Serving endpoint.
  • In the end, we deploy the feature serving endpoint and query it to retrieve the predictions.

Recommendations dataframe

Let’s mimic the recommendations dataframe using the online retail dataset available on your Databricks workspace.

import pyspark.sql.functions as F
from pyspark.sql import Window

# Read the online retail dataset that ships with the Databricks workspace
df = spark.read.load(
    "/databricks-datasets/online_retail/data-001/data.csv",
    format="csv", sep=",", inferSchema="true", header="true",
)

# Keep unique (CustomerID, StockCode) pairs with a known customer
df = df[["CustomerID", "StockCode"]].filter(df["CustomerID"].isNotNull()).drop_duplicates()

# Take the first five products per customer as mock recommendations
w = Window.partitionBy("CustomerID").orderBy("StockCode")
df = df.withColumn("rank", F.rank().over(w))
df = df.filter(df.rank <= 5).withColumnRenamed("StockCode", "ProductId")

# Collect the product ids into a single list per customer
df = df.groupby("CustomerID").agg(F.collect_list("ProductId")).withColumnRenamed(
    "collect_list(ProductId)", "Recommendations"
)

For each CustomerID, we have a list of product ids available as recommendations. This is what the dataframe looks like:
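If you want to inspect the result yourself, a quick way to do so in a notebook (the exact rows depend on the dataset snapshot in your workspace):

df.show(5, truncate=False)  # one row per CustomerID with a list of up to five ProductIds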

Create a feature table and an online table

Both the feature table and the online table are stored in Unity Catalog. To run the commands below, the catalog "mlops_test" and the schema "feature_serving" must already exist, and you must have USE privileges on both. Users with CREATE privileges can create the catalog and schema.
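If they do not exist yet, a user with the required CREATE privileges can create them, for example from a notebook (using the names assumed in this article):

# Create the catalog and schema used throughout this article
spark.sql("CREATE CATALOG IF NOT EXISTS mlops_test")
spark.sql("CREATE SCHEMA IF NOT EXISTS mlops_test.feature_serving")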

from databricks import feature_engineering

catalog_name = "mlops_test"
schema_name = "feature_serving"

fe = feature_engineering.FeatureEngineeringClient()

feature_table_name = f"{catalog_name}.{schema_name}.user_recommendations"
online_table_name = f"{catalog_name}.{schema_name}.user_recommendations_online"

# Create a feature table from the recommendations dataframe
fe.create_table(
    name=feature_table_name,
    primary_keys=["CustomerID"],
    df=df,
    description="User recommendations",
)

With the "create_table" method, we create a new feature table. To update an existing feature table, use the "write_table" method instead.
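For example, a scheduled job that refreshes the recommendations could upsert them into the same table. A minimal sketch, assuming a dataframe with the same schema and primary key:

# Upsert new recommendations into the existing feature table (matched on CustomerID)
fe.write_table(
    name=feature_table_name,
    df=df,
    mode="merge",
)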

After the feature table is created, we need to generate an online table. The commands below will only run with databricks-sdk version 0.28.0 or higher.

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.catalog import (
    OnlineTableSpec,
    OnlineTableSpecTriggeredSchedulingPolicy,
)

workspace = WorkspaceClient()

# Create an online table that is synced from the feature table on demand
spec = OnlineTableSpec(
    primary_key_columns=["CustomerID"],
    source_table_full_name=feature_table_name,
    run_triggered=OnlineTableSpecTriggeredSchedulingPolicy.from_dict({"triggered": "true"}),
    perform_full_copy=True,
)

online_table_pipeline = workspace.online_tables.create(name=online_table_name, spec=spec)

In the example above, the online table is updated using a trigger. The online_table_pipeline object contains the pipeline ID (in our case, 'f6bn131e-da4e-4431-abc7-4ae189c28001'), which can be used to trigger an update later:

workspace.pipelines.start_update(
    pipeline_id="f6bn131e-da4e-4431-abc7-4ae189c28001", full_refresh=True
)

If the online table is not updated, it will not reflect changes in the feature table.

The online table can also be updated continuously if it is created with a different scheduling policy, run_continuously=OnlineTableSpecContinuousSchedulingPolicy.from_dict({"continuous": "true"}) (see the sketch after the SQL command below). For continuous updates, change data feed must be enabled; you can do that by running the following SQL command on your feature table:

ALTER TABLE myDeltaTable SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
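For reference, this is roughly what the continuous variant of the spec would look like, a sketch based on the class mentioned above; the rest of the create call stays the same:

from databricks.sdk.service.catalog import (
    OnlineTableSpec,
    OnlineTableSpecContinuousSchedulingPolicy,
)

continuous_spec = OnlineTableSpec(
    primary_key_columns=["CustomerID"],
    source_table_full_name=feature_table_name,
    # Continuously sync changes from the feature table (requires change data feed)
    run_continuously=OnlineTableSpecContinuousSchedulingPolicy.from_dict({"continuous": "true"}),
)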

It is unclear what consequences this has for costs, but I would assume a triggered update is cheaper.

Create FeatureSpec

FeatureSpec is a user-defined set of features and functions that is served behind the Feature serving endpoint.

In our scenario, we just need to use feature lookup.

from databricks.feature_engineering import FeatureLookup

# Look up all features from the feature table by CustomerID
features = [
    FeatureLookup(
        table_name=feature_table_name,
        lookup_key="CustomerID",
    )
]

feature_spec_name = f"{catalog_name}.{schema_name}.recommendation"
fe.create_feature_spec(name=feature_spec_name, features=features, exclude_columns=None)

The FeatureSpec is also stored in Unity Catalog and can be viewed in the UI under "Functions". Note: you can see both the feature table and the online table under "Tables".

Deploy a Feature Serving endpoint and send a request

To create the Feature Serving endpoint, you need to specify a served entity, which is the FeatureSpec defined above.

from databricks.sdk.service.serving import EndpointCoreConfigInput, ServedEntityInput

endpoint_name = "recommendations-endpoint"

# Create a Feature Serving endpoint that serves the FeatureSpec
workspace.serving_endpoints.create(
    name=endpoint_name,
    config=EndpointCoreConfigInput(
        served_entities=[
            ServedEntityInput(
                entity_name=feature_spec_name,
                scale_to_zero_enabled=True,
                workload_size="Small",
            )
        ]
    ),
)

After the endpoint is deployed, we can send our first request:

import mlflow.deployments

client = mlflow.deployments.get_deploy_client("databricks")

# Look up the recommendations for a single customer
response = client.predict(
    endpoint=endpoint_name,
    inputs={
        "dataframe_records": [
            {"CustomerID": 12356},
        ]
    },
)

This is the output we get:

It is important to mention that we do not have to update the endpoint every time the online table is updated. The endpoint must be updated only if the underlying FeatureSpec changes.
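If the FeatureSpec does change (for example, you add another lookup or a function), the endpoint configuration can be updated in place. A sketch using the same workspace client as above:

from databricks.sdk.service.serving import ServedEntityInput

# Point the existing endpoint at the updated FeatureSpec
workspace.serving_endpoints.update_config(
    name=endpoint_name,
    served_entities=[
        ServedEntityInput(
            entity_name=feature_spec_name,
            scale_to_zero_enabled=True,
            workload_size="Small",
        )
    ],
)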

To send requests to the endpoint in a production setting on Azure, use the service principal's Microsoft Entra ID access token to access the Databricks REST API: https://learn.microsoft.com/en-us/azure/databricks/dev-tools/service-prin-aad-token
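A sketch of such a request using plain requests, assuming you have already obtained an Entra ID access token for the service principal; the workspace URL and token below are placeholders:

import requests

workspace_url = "https://<your-workspace>.azuredatabricks.net"  # placeholder
entra_token = "<service-principal-entra-id-access-token>"  # placeholder

response = requests.post(
    f"{workspace_url}/serving-endpoints/recommendations-endpoint/invocations",
    headers={"Authorization": f"Bearer {entra_token}"},
    json={"dataframe_records": [{"CustomerID": 12356}]},
)
print(response.json())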

Conclusions and follow-up

Feature Serving is a great feature that enables you to serve batch predictions. In such a scenario, Model Serving would be overkill.

As with Model Serving, you cannot define the payload in any way you like; in the Model Serving scenario you have a bit more influence on how the payload looks than with Feature Serving. Existing integrations that migrate to Feature Serving endpoints will therefore require a change.

In the follow-up articles, we will explore:

  • A deep dive into use cases and architectures for Feature Serving vs Model Serving.
  • How Feature Serving endpoints can be used in scenarios where we want to trigger long-running tasks via an API call, and then retrieve the results via another API call (something you can achieve with FastAPI + Celery if using open-source solutions).
  • How we can monitor endpoints using inference tables.
