Introducing Data Prism, The Automatic Chart Builder

How automatic data visualization works and how you can use it in your visualization workflows

Elijah Meeks
Apr 3, 2023
Collections of charts automatically generated from several different datasets — explore the interactive version here.

At Noteable we’re proud to provide you with DEX, the best-in-class no-code data visualization tool available in a data notebook. DEX gives you an integrated BI tool in your notebook, enabling you to explore and explain your data without writing any code. It provides access to over 40 chart types, ranging from classics like bar and line charts to complex views into your data like the scatterplot matrix, funnel/clickstream views, maps, and network diagrams, with options to customize each. We’ve created the largest library of built-in visualization types because we know that with more views into your data, you can find more insights that aren’t apparent in just the basic charts.

But we know that these less familiar charts can be intimidating and hard to use with your data. That’s why we’ve created Data Prism, an automatic data visualization creator that suggests charts based on the shape of your data. It gets you to insights faster and shows you how you might discover new insights with charts you might not see in more limited BI tools.

Fully automatic views into your data

Simply by clicking Data Prism, you’ll see a collection of suggested charts providing a quick visual overview, revealing different patterns in your data without any effort.

In this example, we see a visual analysis of data from The Squirrel Census. The first chart is a tile map because Data Prism correctly identifies the lat/long columns and plots your squirrels colored by the Shift column. Alongside the map are distributions and plots of other fields using bar charts, boxplots, treemaps, scatterplots or even, in the case of Unique Squirrel ID, a word cloud.

How does it work?

Data Prism takes your dataset and maps it to the charts available in DEX to provide a set of different views into your data, adjusting the settings of each chart to suit the data. Under the hood, we’ve associated collections of metrics and dimensions with different chart types and settings. We use heuristics to identify whether your dimensions more closely resemble categories, individual items, or possible network data. The metrics are compared with each other to determine whether it makes sense to use approaches like dumbbell plots or multi-axis charts. Finally, further rules check the viability of each chart so that you don’t get single-item pie charts or a histogram of row-level data.
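To make the idea concrete, here is a minimal Python sketch of what a dimension heuristic and a viability rule could look like. It is not Data Prism's actual implementation: the function names, the 0.5 threshold, and the chart lists are all assumptions made for illustration.

```python
def classify_dimension(values, category_ratio=0.5):
    """Guess whether a column of strings acts like categories or individual items.

    Assumption: a column whose distinct values make up at most `category_ratio`
    of its rows is treated as categorical. The threshold is illustrative only.
    """
    non_null = [v for v in values if v is not None]
    if not non_null:
        return "empty"
    distinct = len(set(non_null))
    return "category" if distinct / len(non_null) <= category_ratio else "item"


def viable_charts(values):
    """Return hypothetical chart names that pass simple viability rules."""
    kind = classify_dimension(values)
    distinct = len({v for v in values if v is not None})
    charts = []
    if kind == "category":
        charts.append("bar chart")
        # A pie chart with a single slice says nothing, so require two or more.
        if distinct >= 2:
            charts.append("pie chart")
    elif kind == "item":
        # Mostly unique values (IDs, names) suit item-level views instead.
        charts.append("word cloud")
    return charts


print(viable_charts(["AM", "PM", "AM", "PM"]))
print(viable_charts(["AM", "AM", "AM", "AM"]))
print(viable_charts(["id-1", "id-2", "id-3"]))
```

A ratio of distinct values to rows is only one possible signal. A real system would likely combine several, but even this simple rule is enough to keep a column of unique IDs out of a pie chart.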

Identifying special data cases

Data Prism doesn’t see all metrics and dimensions as the same. If your column of data seems to have country names, it will suggest a choropleth map. If it has what looks like latitude and longitude, it will plot points on a tile map. If it sees dates, you’ll receive time series charts, with specific cases that reveal more complex time series views like area charts and candlestick charts. And because DEX provides functionality for visualizing networks and funnels, we pay special attention to columns of strings that have comma-delimited data in order to direct those fields into funnel and network charts.
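The special cases above can be sketched as a handful of column checks. This is an illustrative Python sketch under stated assumptions, not the rules Data Prism actually uses: the column-name sets, the tiny country list, the single date format, and the function names are all hypothetical.

```python
from datetime import datetime

# Hypothetical lookups, kept deliberately small for the example.
LAT_NAMES = {"lat", "latitude"}
LON_NAMES = {"lon", "lng", "long", "longitude"}
COUNTRIES = {"brazil", "canada", "france", "japan", "kenya"}


def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_iso_date(value):
    """True if the string parses as YYYY-MM-DD (the only format assumed here)."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def detect_special_case(name, values):
    """Label a column as latitude, longitude, date, country, delimited, or plain."""
    if not values:
        return "plain"
    lowered = name.strip().lower()
    if lowered in LAT_NAMES and all(is_number(v) and -90 <= v <= 90 for v in values):
        return "latitude"
    if lowered in LON_NAMES and all(is_number(v) and -180 <= v <= 180 for v in values):
        return "longitude"
    if all(isinstance(v, str) for v in values):
        if all(is_iso_date(v) for v in values):
            return "date"
        if all(v.strip().lower() in COUNTRIES for v in values):
            return "country"
        if all("," in v for v in values):
            return "delimited"
    return "plain"


def suggest_special_charts(columns):
    """Map detected column kinds to the chart families named in the text."""
    found = {detect_special_case(name, values) for name, values in columns.items()}
    charts = []
    if {"latitude", "longitude"} <= found:
        charts.append("tile map")
    if "country" in found:
        charts.append("choropleth map")
    if "date" in found:
        charts.append("time series")
    if "delimited" in found:
        charts.extend(["funnel", "network"])
    return charts


sample = {
    "Lat": [40.78, 40.79],
    "Long": [-73.97, -73.96],
    "Date": ["2018-10-06", "2018-10-07"],
    "Path": ["home,cart,checkout", "home,cart"],
}
print(suggest_special_charts(sample))
```

Checking both the column name and the value range for coordinates is a design choice in this sketch. A name match alone would misfire on a text column that happens to be called "long".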

How does it work with you?

Automatic data visualization is always exciting. The promise of the tool finding and highlighting insights without any work from an analyst has its appeal whether you’re the analyst or the leader of a data organization. But as we’ve seen with ChatGPT, automation is here to supplement the work of experts, not replace it. Data Prism was designed to be part of the new workflows that are popping up around AI and other automated tools.

Zooming into generated charts

We can’t guarantee that every view is going to reveal new insights, but the charts that Data Prism shows might start you down a path of analysis that you didn’t predict. That’s why we let you zoom into the chart and modify its settings based on your domain knowledge.

A messy connected scatterplot quickly turns into a structured pattern with a few adjustments.

In this case, hourly data about power generation is first shown with a set of automatically determined charts, but the user sees something interesting in the connected scatterplot even though it doesn’t have the settings that a domain expert knows would reveal deeper insights. By zooming in and adjusting those settings, the user becomes the human-in-the-middle of a feedback loop between automatic data visualization and insight generation.

Prioritizing fields

You know your data best, so Data Prism also allows you to select specific fields and return a visual report of different views based on those fields. Because it’s only focused on a subset of the data, the results will likely be more useful for insight discovery and you’ll also receive more chart variety for just the selected fields.

In this case, the automatic results for mortgage data are interesting but there’s one dimension (Geography) and two metrics (Delinquency Rate and Mortgage Amount) that we’re particularly interested in. By specifying those, we get a much more detailed view into just how those fields interact as a multi-axis bar chart, a scatterplot, a metric bar chart and a parallel coordinates view showing just those three fields. By using the built-in filtering functionality of the parallel coordinates, we quickly get to charts for the three top regions and can immediately see how one of those regions has a very different correlation between delinquency and mortgage amount.

What makes it different from other solutions?

Automatic data visualization approaches aren’t new. They range from the simplest interfaces, which give you a bar chart when you select one metric and a scatterplot when you select two, all the way to sophisticated research solutions that analyze the patterns in the data to facet across metrics and dimensions that show strong anomaly and outlier signals.

Data Prism is different because of the complexity of the different views it provides. If you want to visualize two dimensions and a metric in a typical automatic data visualization solution, it will give you one possible chart (and most won’t handle it well at all). Data Prism gives you a dimension matrix, a Sankey diagram that shows the connections between those dimensions and encodes the magnitude in its edges, and even parallel coordinates to see all the data at once.
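One way to picture the difference is a lookup from the shape of a field selection to several charts rather than one. The Python sketch below is purely illustrative: the table and function name are assumptions, and the entries only restate the two combinations described in this post (two dimensions with one metric, and the one-dimension, two-metric mortgage example).

```python
# Hypothetical table keyed by (number of dimensions, number of metrics).
# Only the combinations mentioned in this post are filled in.
CHARTS_BY_SHAPE = {
    (2, 1): ["dimension matrix", "sankey diagram", "parallel coordinates"],
    (1, 2): [
        "multi-axis bar chart",
        "scatterplot",
        "metric bar chart",
        "parallel coordinates",
    ],
}


def charts_for_selection(dimensions, metrics):
    """Return every chart suggested for this mix of dimensions and metrics."""
    shape = (len(dimensions), len(metrics))
    # Return a copy so callers can't accidentally modify the table.
    return list(CHARTS_BY_SHAPE.get(shape, []))


print(charts_for_selection(["Geography"], ["Delinquency Rate", "Mortgage Amount"]))
```

Returning a list instead of a single chart is the whole point of the sketch: each selection produces several views of the same fields, which is what faceting by chart type means in practice.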

We do all this because we designed DEX from the ground up to enable faceting, an approach well-known in data science for seeing data across metrics and dimensions. But while traditional tools normally only provide faceting by color, dimension, or metric (which DEX already supports), we believe that faceting by chart type gives you an added level of power in your analysis and explanation.

How can you use it?

Data Prism is easy to use. After loading your data you can click Data Prism Suggestions and see the Fully Automatic Approach immediately. And once you have that visual overview of your data, you can prioritize fields and see more focused reports. Here are some approaches you might consider.

To create new learnings from datasets you’re familiar with

Data Prism can help you uncover unique relationships and insights even on datasets that you are highly familiar with. Discover alternate approaches and visualizations that could inspire your next report.

For new datasets

Similarly, we’ve all been tasked with diving into new datasets that we have little context for or familiarity with. Using Data Prism, you can get an immediate sense of the shape of the data. From there, diving deeper into a particular combination of metrics and dimensions is simply a matter of a few clicks.

To grow your team’s data visualization literacy

By integrating Data Prism into your practice, your team will be exposed to and grow more familiar with different visual forms that might have once been too intimidating to use for analysis and explanation. Every time you use Data Prism you’ll get results that are familiar along with results that aren’t, and that exposure will help you and your stakeholders break down barriers to using “exotic” or “complex” charts because that familiarity will grow into literacy. This is a key theme of my video series Designing for the Data Visualization Lifecycle, which explains in more detail the virtuous cycle that emerges from growing your team’s data visualization literacy to tackle new and effective approaches to understanding your data visually.

Get started right now

You can experience the fully interactive Data Prism in any Noteable notebook just by loading some data using SQL or Python. Here’s an example notebook that shows off the functionality to get you started.

Elijah Meeks

Chief Innovation Officer at Noteable. Formerly Apple, Netflix, Stanford. Wrote D3.js in Action, Semiotic. Ex-Data Visualization Society Executive Director