Plotly
Published in

Plotly

Molson Coors Streamlines Supply Planning Workflows with Databricks & Plotly Dash

Migrating from Excel using Databricks SQLAlchemy & Plotly Dash AG Grid to deliver editable data apps with write-back capabilities

TL;DR — For building production grade data apps using Plotly Dash and Databricks, consider using Databricks’ newly supported SQL Alchemy engine. And where the workflow includes tabular data, the recently open-sourced Dash AG Grid component is relevant. [Practical example will be covered in this blog.] Check out the Github repo & live app!

Introduction

Overview

  • Globally defined engine to read and write large amounts of data from Databricks.
  • You don’t need to open a cursor connection every time you want to read data from Databricks. Instead define a SQL alchemy engine using Databricks constants SERVER_HOSTNAME, HTTP_PATH, AND ACCESS_TOKEN then use the engine to write your SQL query directly to a pandas data frame using pandas “read_sql_query” statement.
  • The DBX SQL connector is suitable for thousands of records at a time and is slow compared to the SQL Alchemy ORM, where as many rows as necessary can be inserted.
  • Ability to manage DDL as objects in SQL Alchemy. Users can now define tables as Python class objects and manage their databases/schemas directly from their ORM using SQLAlchemy / Alembic.Use the same engine to bulk write large amounts of data back to Databricks via the pandas “to_sql” method, or use an ORM to create and drop tables all together.
  • Manage data by manually inserting records and modifying column null-ability.
  • The SQL alchemy engine includes arguments to manage and modify data both when it is being taken out of Databricks and when writing to Databricks databases. Specifically, users can nullify individual columns and manually add records as needed.
  • Seamless and complete integration with Dash AG Grid and other pandas compatible tools like Plotly’s Dashboard Engine.
  • Call data into a Dash AG Grid table using just one line of code. No need for extensive schema and column definition inside your dash app. The SQL alchemy engine automatically detects schema and assigns column headers on the grid.
  1. A Databricks workspace with DatabricksSQL enabled (DBSQL enabled in Premium Workspaces or above)
  2. A Serverless SQL Warehouse or Databricks cluster with 11.3 LTS or higher (data engineering cluster)
  3. A personal access token in Databricks to authenticate to a SQL Endpoint via API
  4. A Python development environment (>=v 3.8). We recommend VSCode for a local IDE and using conda or virtual env to manage dependencies, as well as black to automatically format your code
  5. Unity Catalog enabled account wide — Unity Catalog is Required for full Data warehousing functionality when using an ORM like SQLAlchemy
  1. Spin up a DatabricksSQL (DBSQL) endpoint on either classic or serverless
  2. Copy and paste the SQL code under utils/BuildBackendIoTDatabase.sql into the DBSQL Query Editor and run it. (Note: You can also run this code in a notebook directly from an imported repo in Databricks.)
  3. Clone the Git repo above into your local IDE.
  4. Install dependencies wherever you are running the Dash app with pip install -r requirements.txt in your IDE.
  5. Set environment variables of SERVER_HOSTNAME, HTTP_PATH, ACCESS_TOKEN from your Databricks cluster. You can find this by selecting the SQL endpoint and clicking the “Connection Details” tab in the endpoint UI.
  6. Run your Dash app on a local server by running python app.py

Building the Back End to the Dash App

Write Database Connection Functions using SQL Alchemy

“main” and “default” are automatically assigned as Catalog and Schema names when the Unity Catalogue is activated and the metastore created.

Set Up Plotly UI and Dash AG Grid

Use Callbacks to Enable Interactivity in Dash App:

Write Back to Databricks Database Using SQL Alchemy:

Adding SQL Alchemy to the Databricks SQL Connector for Python

Run the App

Next steps

--

--

Plotly is a data visualization company that makes it easy to build, test, and deploy beautiful interactive web apps, charts and graphs—in any programming language.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Plotly

The low-code framework for rapidly building interactive, scalable data apps in Python.