Creating a simple Dashboard with Lakeview

Jason Drew
DBSQL SME Engineering
7 min read · Oct 26, 2023


Intro:

This article will show how to build a dataset in Databricks Lakeview and create a dashboard with almost no code, so BI users can get up and running quickly! If you want to understand the details and differences between Classic DBSQL Dashboards and Lakeview Dashboards, check out the article Classic vs Lakeview Dashboards. Here I'll explain how I created both the dataset and the dashboard referenced in that article.

A core difference and feature of Lakeview is that you can now create "gold" data models that are specific to a Lakeview dashboard. This lets BI users create gold tables, views, and SQL queries that all align to a specific dashboard, whereas the classic DBSQL dashboards were query based and harder to develop and manage as a unit for BI. This dashboard-centric model is much closer to the way Tableau and Power BI let BI users create a data model for a particular dashboard report and save the model with it.

In this article, we are going to get a raw dataset, clean it up in the Databricks UI, and build a Lakeview Dashboard with the following steps:

  1. Create a schema for data
  2. Import data from files
  3. Transform data and create tables
  4. Build a data model in the Data tab of a Lakeview Dashboard
  5. Add Visualizations and create our Lakeview Dashboard

Building the data model:

Perhaps I'm showing my age here, but when contemplating a simple dataset that would be good for basic dashboarding, the Northwinds Database immediately came to mind. For our younger viewers, this was a simple training database that Microsoft originally provided for Access 97 so you could practice writing basic queries against a simple but somewhat real-world dataset. A simple Google search found me a GitHub link that had all the tables available as CSV downloads. While there have been subsequent versions and updates to this schema, the original is shown below, and I highlighted the tables and fields I'll be using.

Step 1: Creating the schema where the tables will live

After downloading the .csv files for Categories, Products, Order Details, Orders and Customers, create a new schema in Unity Catalog called Northwinds.

In Catalog Explorer, select a Unity Catalog, then click Create schema.
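If you prefer SQL to the UI, the same schema can be created from a notebook or the SQL editor. A minimal sketch, assuming the catalog is named main (use whichever catalog you selected in Catalog Explorer):

```sql
-- Create the northwinds schema in the main catalog.
-- The catalog name `main` is an assumption; substitute your own.
CREATE SCHEMA IF NOT EXISTS main.northwinds
COMMENT 'Northwinds tables for the Lakeview dashboard walkthrough';
```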

Step 2 — Import raw data with Databricks UI

Next, we navigate back to the schema and click "+ Add" to import the new tables into the new schema, then select the option "Add data". This is by far the simplest way to get data into Databricks, but there are many other ways, such as uploading to a source volume and loading it with SQL (sketched after the steps below), using a Databricks partner integration, or coding up your own ingestion pipeline. The Data Import UI lets any user do ad-hoc, self-serve data import and BI on Databricks without requiring a data engineering team at all.

Click + Add then Add Data
Click Create or modify table
Click and drag the .csv file to upload
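If you'd rather script this step than use the upload UI, the same load can be done in SQL after staging the CSVs in a Unity Catalog volume. A minimal sketch, where the volume path /Volumes/main/northwinds/raw is a hypothetical example:

```sql
-- Load one of the downloaded CSVs into a managed table.
-- The volume path is a made-up example; point it at wherever you staged the files.
CREATE TABLE IF NOT EXISTS main.northwinds.categories_raw AS
SELECT *
FROM read_files(
  '/Volumes/main/northwinds/raw/categories.csv',
  format => 'csv',
  header => true
);
```

For the rest of this walkthrough, though, we'll stick with the no-code upload flow.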

Step 3 — Transform data to clean tables with Databricks UI

As we import each file, we can make edits to the tables on the fly in the Databricks UI. The screenshots below walk through the process:

Categories: Select the main catalog and the northwinds schema. Exclude the description and picture columns. Click Create table

Pro tip: in the Databricks Import UI, you can perform transformations like column exclusions, data type conversions, and more, all without writing any code. A few examples are shown in the screenshots below, and an equivalent SQL sketch for the Orders table follows them.

Products: Only keep the productID, productName, and categoryID columns
Order Details: Keep all columns
Orders: Exclude everything but orderID, customerID, orderDate, requiredDate, and shippedDate. Change the datatype of orderDate, requiredDate, and shippedDate from timestamp to date.
Customers: Exclude all columns but customerID and companyName
Example of an uploaded table in Unity Catalog
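For context, what the Import UI does here can also be expressed as plain SQL. The sketch below shows the Orders case, with orders_raw as a hypothetical name for the table as it would land before any column exclusions or type changes:

```sql
-- Keep only the columns the dashboard needs and cast the timestamps to dates,
-- mirroring the exclusions and type changes made in the Import UI.
-- `orders_raw` is a hypothetical staging name, not a table created in this walkthrough.
CREATE OR REPLACE TABLE main.northwinds.orders AS
SELECT
  orderID,
  customerID,
  CAST(orderDate    AS DATE) AS orderDate,
  CAST(requiredDate AS DATE) AS requiredDate,
  CAST(shippedDate  AS DATE) AS shippedDate
FROM main.northwinds.orders_raw;
```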

When the uploads are all complete, all five tables should be in the Unity Catalog schema we created, and we can create our Lakeview dashboard!
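Before building it, an optional sanity check in SQL confirms everything landed (again assuming the catalog is named main):

```sql
-- List the tables created by the uploads.
SHOW TABLES IN main.northwinds;
```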

Step 4 — Build Lakeview Dataset

Now that we have our data uploaded to our schema and properly transformed into tables, we can navigate to the "Dashboards" section of our Databricks workspace and click on "Create Lakeview dashboard".

Once the dashboard is created, we can navigate to the "Data" tab of the Lakeview dashboard. The Data tab allows us to pull in existing tables or write custom datasets in SQL. We could have created a view for this example and simply selected it with the table picker, but instead let's show how to build the dashboard-specific dataset within the dashboard Data tab with SQL. This is similar to creating a "gold" layer in the Medallion architecture, but here the layer is purpose-built for the core dashboard use case.

To create a new Lakeview dashboard, click Dashboards → Lakeview Dashboards → Create Lakeview dashboard
Click the Data tab and click Create from SQL. Name your dataset All Orders. I grabbed the needed fields, renaming ProductName to Product and CompanyName to Customer, and created a simple calculated field for Revenue.
Create a second dataset named Unfulfilled Orders, which is the same as the first except we add a WHERE clause at the end to keep only rows where ShippedDate is null. (A sketch of both queries follows.)
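The exact SQL isn't reproduced in the screenshots, so here is a rough sketch of what these datasets could look like. It assumes the table names created by the upload UI (orders, order_details, products, categories, customers) and uses the common Northwinds revenue formula UnitPrice * Quantity * (1 - Discount); adjust the names and formula to match your own schema.

```sql
-- "All Orders" dataset: one row per order line, with friendly column names
-- and a calculated Revenue field.
SELECT
  o.orderID,
  o.orderDate,
  o.requiredDate,
  o.shippedDate,
  cu.companyName                                 AS Customer,
  p.productName                                  AS Product,
  ca.categoryName                                AS CategoryName,
  od.quantity                                    AS Quantity,
  od.discount                                    AS Discount,
  od.unitPrice * od.quantity * (1 - od.discount) AS Revenue
FROM main.northwinds.orders        AS o
JOIN main.northwinds.order_details AS od ON o.orderID    = od.orderID
JOIN main.northwinds.products      AS p  ON od.productID = p.productID
JOIN main.northwinds.categories    AS ca ON p.categoryID = ca.categoryID
JOIN main.northwinds.customers     AS cu ON o.customerID = cu.customerID
```

The Unfulfilled Orders dataset is the same query with WHERE o.shippedDate IS NULL added at the end.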

Step 5 — Dashboard Creation

Now that our datasets have been created, let's start building our dashboard. We'll start with an image that we need to upload to the FileStore directory of DBFS. You'll need to have the DBFS File Browser enabled under Admin Settings to do this. This lets you browse DBFS from Catalog Explorer and upload an image file.

From Admin Settings → Workspace Settings → Advanced, DBFS File Browser needs to be enabled.
Once DBFS File Browser is enabled, you can upload a file to FileStore

Back in the canvas of the new Lakeview dashboard, we add the first widget, which is a text box. By using the syntax ![test image](files/northwinds.png) we can directly reference the image uploaded to DBFS.

After resizing, we added another text box with a # to format the title as a heading. These text boxes support Markdown.

Next, let's add our first visualization. Select a Bar chart with OrderDate for the X axis, Revenue for the Y axis, and CategoryName for the Group by.

Now let's add the first filter. Choose a Dropdown (multi-select) and set it to filter on the CategoryName field of both the All Orders and Unfulfilled Orders datasets. Give it a simple description of "Categories".

Repeat this for a second filter on Product, shown below:

My personal favorite thing about Lakeview dashboards is how easily you can set the aggregation on values, especially with Counters, which are great for displaying KPIs. Here we added one called "Unfulfilled Revenue", which is the Sum of Revenue from our Unfulfilled Orders dataset.

We then repeat this for Total Revenue, which of course comes from the All Orders dataset, and add a count of OrderID to represent total orders.

We can then insert a table visualization off of the All Orders dataset showing OrderID, Customer, Quantity, Revenue, OrderDate, RequiredDate, and Product. To order the columns, simply click and drag them into the order you wish them to appear. We can also format the Revenue column with a $ using the "Number format" option in the visualization editor.

Finally, let's add a Scatter visualization off of the All Orders dataset with Product on the X axis, Revenue on the Y axis, and Discount as the color.

After a little resizing, the last thing to do is name and publish the dashboard!

Result:

Once published, you can see what the end user will experience. Notice that the filters affect both datasets, correctly adjusting our Unfulfilled Revenue as well as cascading our Product options based on our Category selection(s).

Databricks Lakeview dashboards are just getting started at making BI simple and easy on data of any size, all in one platform!
