Effective ML Workflows

Integrating Dash Enterprise within the ML Lifecycle

Elliot Gunn
Plotly
6 min read · Mar 28, 2022

Most data science projects fail to make it to production. We can increase the likelihood of a successful outcome by building enterprise-grade Dash apps that visualize the data at every stage of the ML lifecycle, including exploratory data analysis, explainability, feature development, model deployment, and model monitoring.

By publishing analytic apps at each stage of the ML lifecycle, data scientists can give stakeholders access to intermediate insights and opportunities to realign objectives. Models can be presented to stakeholders as interactive web apps rather than as APIs or notebooks.

The Dash Enterprise platform empowers data scientists to own the entire ML lifecycle. They can transform Jupyter notebooks and Python scripts into interactive web applications and dashboards that enable actionable insight throughout the project’s lifecycle.

Doing exploratory data analysis with Dash

Quickly perform exploratory data analysis (EDA) on raw data with Plotly’s open-source graphing libraries. Data scientists can read data from any source via Python, clean and transform features, create basic visualizations, and view them in Workspaces, Dash Enterprise’s built-in IDE. Workspaces are immediately accessible in the browser (nothing to download!) but run on a server behind your corporate firewall. Teams can develop in a standardized production environment with Jupyter, a terminal, and network access to your organization’s databases and APIs. Within a Dash Enterprise workspace, data scientists can visualize data with Plotly’s graphing libraries in Jupyter notebooks and rapidly publish internal Dash apps for stakeholders, creating tight feedback loops while exploring data and building models.

Workspaces, Dash Enterprise’s built-in IDE that is accessible via the browser
Cross-filter the data against key features (e.g. year, location, category) that may be important for feature engineering when creating a predictive sales model

Building apps on top of AI & ML APIs

Sophisticated AI & ML projects built on platforms like DataRobot, Prefect, and Databricks, or on a homegrown platform, will often be published as an API.

With Dash Enterprise, data scientists can rapidly build Python web apps that call these APIs and provide stakeholders with a rich user interface for interacting with the model. Since data scientists know the data and model inside and out, they are the team best suited to build the app and present it to their stakeholders. By owning both the model and the app, data scientists can rapidly incorporate stakeholders’ feedback.

Visualizing data from a model, connected to an external model API

Leveraging open source analytics libraries

Not all machine learning models are deployed as sophisticated APIs. Many of our customers create significant value by incorporating open source libraries, like scikit-learn, SciPy, statsmodels, Prophet, or their own numerical methods, directly into their apps and dashboards. These methods run on-the-fly within a Dash callback. Data scientists can use machine learning to perform image segmentation, run YOLO v3 for real-time object detection, and perform natural language processing.

Use scikit-learn’s t-SNE tool to visualize high-dimensional bank customer complaint data and interactively explore the spatial distribution in NLP analysis.

Understanding model explainability and feature development

Once a model has been created and tuned, data scientists can use libraries like SHAP to better understand the outputs of machine learning models, which largely remain black boxes to AI developers. SHAP values help quantify the importance of a feature in a model, lending some explainability to complicated black box algorithms and allowing for further fine-tuning. Data scientists can visualize SHAP values within a Dash app and immediately see how predicted outputs change as they interact with its controls (e.g. feature values).

Deploying Dash apps that incorporate SHAP makes explainability part of production, allowing stakeholders to continuously monitor model drift as the data evolves.

Using SHAP to understand the impact of individual features on the predicted amount of tips received by waiters
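To show what a per-feature attribution means, the sketch below computes exact Shapley values by brute force for a toy tip model. It deliberately does not use the SHAP library: the `predict_tip` model, its coefficients, and the single baseline row are assumptions chosen for illustration, whereas SHAP itself uses more efficient estimators and a background dataset.

```python
from itertools import combinations
from math import factorial


def predict_tip(row):
    """Toy tip model (hypothetical coefficients) with an interaction term."""
    bill, size = row["total_bill"], row["size"]
    return 0.15 * bill + 0.3 * size + 0.01 * bill * size


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline row."""
    names = list(x)
    n = len(names)

    def value(present):
        # Features in `present` take their real value, the rest the baseline.
        mixed = {k: (x[k] if k in present else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for r in range(len(others) + 1):
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for subset in combinations(others, r):
                present = set(subset)
                total += weight * (value(present | {name}) - value(present))
        phi[name] = total
    return phi


x = {"total_bill": 40.0, "size": 4}
baseline = {"total_bill": 20.0, "size": 2}
contributions = shapley_values(predict_tip, x, baseline)
```

The contributions add up to the difference between the prediction for `x` and the prediction for the baseline, which is the property that makes charts of these values easy to read. Brute force enumerates every feature subset, so it is only practical for a handful of features.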

Deploying apps within the firewall

Successful machine learning projects can be accompanied by several apps:

  • Exploratory data analysis apps for each dataset
  • A control panel for running and exploring the model API
  • Explainability apps for each model
  • Monitoring apps to keep tabs on the model, drift, and the downstream business impacts

Dash Enterprise enables data scientists to rapidly build and deploy these applications within their organization’s firewall.

Your IT department installs and configures Dash Enterprise within your organization’s cloud (AWS, GCP, Azure) or on a bare-metal server. Plotly does not host Dash Enterprise: the installation runs entirely within your organization’s VPC or firewall.

Once installed, data scientists have the agency to build, deploy, and share Dash apps on the platform without needing additional IT resources.

Dash Enterprise enables app development by data scientists

We’ve found that ML & AI projects are most successful when data scientists and domain specialists share live data apps and dashboards, not just APIs, with their stakeholders.

To shift app and dashboard development onto data scientists, traditional app development and deployment need to be radically simplified. The traditional tools for building and deploying websites were built for the traditional software or IT developer, not for a data scientist. In the traditional stack, every aspect of app development is performed by a specialist: the frontend by a frontend engineer, the backend by a backend engineer, the design by a CSS developer, and the deployment by a DevOps engineer.

Dash Enterprise provides the platform for data scientists to build these apps without becoming specialists in every layer of the traditional software development stack.

The Dash application framework enables data scientists to build these apps entirely in Python. A data scientist’s Jupyter notebook can be transformed into an interactive app and published.

Interacting with a figure in Jupyter notebook within Workspaces

The layout and theme of the application are powered by Dash Enterprise Design Kit, enabling data scientists to create beautiful, on-brand apps with minimal lines of code (and no CSS required!):

Theme Editor in Dash Enterprise Design Kit

App- and callback-level security is built into the platform. Stakeholders log in to your apps with your organization’s Single Sign-On (SSO) over LDAP or SAML. Data scientists and IT admins configure who gets to see the app through the UI:

App configurations in Dash Enterprise

Apps are built entirely in Python, integrating directly with the data scientist’s ML routines and APIs. Code can be written on the platform in a Dash Enterprise workspace or uploaded to the platform with git.

Click the “Deploy” button in the top bar to deploy with one click!

Additional enterprise capabilities are built on top of the open source framework for specific use cases like PDF reporting or embedding the apps within other websites.

Embedding apps in other websites

Conclusion

Successful AI & ML projects include interactive data visualization at every stage of the process, including exploratory data analysis, exploring model APIs, explainability, model monitoring, and more.

Dash Enterprise enables data scientists to publish analytic apps and dashboards to accompany every stage of the ML lifecycle.

Elliot Gunn is an engineer at Plotly, previously data + editorial at TDS.