Effortless Exploration with Zenvisage 0.2
This blog post describes our new software release for Zenvisage. Blog post written by Doris Lee and Tarique Siddiqui, edited by Karrie Karahalios and me. Contributors to this release (in alphabetical order) include: Jaywoo Kim, Doris Lee, John Lee, Tarique Siddiqui, Chao Wang, Ed Xue, and Zhiwei Zhang.
Zenvisage is a visual query system that accelerates exploratory data analysis by helping analysts fast-forward to their desired visual patterns or insights.
Our second release of Zenvisage features a number of new querying mechanisms, improvements to existing functionality, and many more options for customization. We developed these features through a year-long collaboration with users across several domains, including astronomy, genetics, and materials science. Moreover, as part of this release, we now have a live deployment open to the public, where users can play with already uploaded datasets and can even upload and explore their own.
Here’s my current favorite example of Zenvisage in action: a one-screen summary of the fact that global warming is real. All of our representative patterns for temperature over time are increasing trends of various shapes; all of our outlier trends for temperature over time are decreasing trends of various shapes. You can play with this dataset, as well as many others, in our live version.
Here are some highlights; for more details you can refer to our wiki here.
In addition to sketching and drag-and-drop, we now support other querying modalities including querying with input equations and externally loaded patterns.
Sketching allows users to draw a desired pattern and submit it as a query. For example: find me all the cities whose housing prices are on the rise.
Zenvisage allows you to drag a visualization of interest either from the results or overview panel, and drop it onto the querying canvas to submit as a query.
Sometimes the pattern you are interested in may not be easily found through representative or outlier patterns, and sketching is too imprecise as a query. Pattern loading enables users to upload the x,y data points of the pattern of interest as a query. This is especially useful when you have labeled data from other experiments, surveys, or outputs from simulations that you want to compare against your dataset.
An Equation As Input
This function allows you to specify a query pattern using an equation, e.g. y=sin(0.5*x). This is useful if you know upfront that the data that you are looking at follows some known functional form that relates the x and y axis variables.
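To make the idea concrete, here is a minimal sketch of how an equation could be turned into a query pattern by sampling it at evenly spaced x values. This is not Zenvisage's actual implementation; the function name `sample_equation`, the x range, and the number of points are all assumptions chosen for illustration.

```python
import math


def sample_equation(f, x_min, x_max, n_points):
    """Evaluate f at n_points evenly spaced x values in [x_min, x_max].

    Hypothetical helper: returns a list of (x, y) pairs that could serve
    as a query pattern. Not part of Zenvisage.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    step = (x_max - x_min) / (n_points - 1)
    xs = [x_min + i * step for i in range(n_points)]
    return [(x, f(x)) for x in xs]


# The example equation from the text: y = sin(0.5 * x)
query_points = sample_equation(lambda x: math.sin(0.5 * x), 0.0, 10.0, 5)
```

The resulting (x, y) pairs have the same shape as a sketched or uploaded pattern, which is why an equation can stand in for either one as a query.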
Dynamic Faceting and Comparisons
Zenvisage offers two ways to dynamically facet the data and compare different subsets of it: filtering and dynamic classes.
Filtering is a common operation in interactive visual analytics. In Zenvisage, users working with large datasets that have many attributes can narrow their search to a subset of the data, which may increase their chances of finding an interesting pattern for a given query. To filter the data, users can submit one or more SQL-like predicates (e.g. income < 50000 AND age < 50) as filter constraints in a text field; the filter is then applied globally to all displayed query results.
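As an illustration of what such a predicate selects, the sketch below applies the example filter to a handful of made-up rows using Python's built-in `sqlite3` module. The table name `people`, the sample rows, and the helper `filter_rows` are hypothetical; Zenvisage's own filter handling may work quite differently.

```python
import sqlite3


def filter_rows(rows, predicate):
    """Return the rows that satisfy an SQL WHERE-style predicate.

    rows is a list of (name, income, age) tuples. The predicate is pasted
    into the query as-is, so this sketch is only suitable for trusted input.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE people (name TEXT, income INTEGER, age INTEGER)"
        )
        conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            f"SELECT name, income, age FROM people WHERE {predicate} ORDER BY name"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Made-up sample data for illustration only.
rows = [
    ("ann", 42000, 35),
    ("bob", 61000, 41),
    ("cat", 38000, 58),
    ("dan", 47000, 29),
]
selected = filter_rows(rows, "income < 50000 AND age < 50")
```

Here only the rows for "ann" and "dan" pass both conditions; "bob" fails the income condition and "cat" fails the age condition.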
While studying how people perform visual query analysis, we learned that users often want to create subsets (or classes) of data on the fly and compare them. We implemented a feature called dynamic class creation, which lets users bucket data points into customized classes based on existing properties and then compare those classes. For example, consider a dataset of personal information consisting of income, age, high school GPA, and miles driven in a year. Using dynamic class creation, you could divide the records into four groups based on income and age, then compare the GPA vs. miles/yr visualizations across those groups.
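The bucketing step in that example can be sketched in a few lines of Python. The cutoff values, the class labels, and the field names (`income`, `age`, `gpa`, `miles_per_year`) are assumptions made for illustration, not Zenvisage's actual schema or interface.

```python
from collections import defaultdict


def assign_class(record, income_cutoff, age_cutoff):
    """Label a record by which side of each cutoff it falls on."""
    income_label = "low_income" if record["income"] < income_cutoff else "high_income"
    age_label = "younger" if record["age"] < age_cutoff else "older"
    return f"{income_label}/{age_label}"


def group_by_class(records, income_cutoff=50000, age_cutoff=50):
    """Bucket records into up to four classes and keep the (gpa, miles) pairs.

    Each class's list of pairs is what would be plotted as one
    GPA vs. miles/yr visualization.
    """
    classes = defaultdict(list)
    for record in records:
        label = assign_class(record, income_cutoff, age_cutoff)
        classes[label].append((record["gpa"], record["miles_per_year"]))
    return dict(classes)


# Made-up sample records for illustration only.
records = [
    {"income": 42000, "age": 35, "gpa": 3.1, "miles_per_year": 9000},
    {"income": 61000, "age": 41, "gpa": 3.6, "miles_per_year": 12000},
    {"income": 38000, "age": 58, "gpa": 2.9, "miles_per_year": 7000},
    {"income": 75000, "age": 63, "gpa": 3.4, "miles_per_year": 5000},
    {"income": 47000, "age": 29, "gpa": 3.8, "miles_per_year": 11000},
]
classes = group_by_class(records)
```

Two binary cutoffs yield at most four classes, matching the four income and age groups in the example above.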
Interactive Data Smoothing
When the data is noisy, the global trends in the visualizations can often be overwhelmed by local noise, leading to poor pattern matching. Smoothing tries to capture the important features in the data while reducing the noise. To tighten the loop between data preprocessing and querying, we've incorporated a module that enables users to interactively choose the smoothing algorithm and tune its parameters, so they can see how different settings affect query results.
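As a rough illustration of what a smoothing parameter does, the sketch below implements a centered moving average, one common smoothing technique. The post does not say which algorithms Zenvisage offers, so the choice of moving average, the function name, and the `window` parameter are assumptions.

```python
def moving_average(values, window):
    """Smooth a series with a centered moving average.

    window must be a positive odd integer. Near the edges the window is
    truncated, so the output has the same length as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# A noisy zig-zag series; a wider window flattens it more.
noisy = [1, 5, 1, 5, 1]
smoothed = moving_average(noisy, 3)
```

A window of 1 leaves the series unchanged, while larger windows trade local detail for a clearer global trend, which is the kind of trade-off interactive tuning lets a user explore.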
There are many more features in this release, including data export and display options, which we have detailed in the user manual. Feel free to try out a live demo of Zenvisage or download the source code from GitHub. More details can be found in a previous blog post about our VLDB paper or in our technical report.