Providing a sneak peek into one’s data through visual autocompletion

Vidya Setlur
Published in ACM UIST
Oct 19, 2020

During web search, a user may deliberately want to know something, but may not necessarily know exactly how to formulate their query. The translation of some vaguely-felt intent to a more concrete description of the desired information is often a cognitively demanding activity that depends on recall and sense-making.

Autocompletion is a ubiquitous tool that helps alleviate this cognitive load by offering predictions of the rest of the query as typeahead suggestions. Those suggestions are often related to the query and help the user complete it. Autocompletion mainly focuses on syntactic query completion, guiding the user by prompting them with likely completions and alternatives to the text as they type. The advantage of this affordance is that it reduces the number of characters a user needs to type before executing a search, leading to an improved search experience.

Visual analysis querying is like shooting darts in the dark

Natural language (NL) interaction in visual analysis tools is a promising modality for supporting expressivity in the way users interact with their data. Users of these NL interfaces face challenges around query formulation similar to those in web search. However, visual analysis queries are different, as they also contain user intent about the characteristics of the underlying data and its domain. Standard autocompletion typically helps with syntactic query completion and does not provide any guidance on formulating successful queries about the underlying data per se. These visual analysis queries end up being too broad, too narrow, or ill-formulated, so the user has to reformulate them to get useful visualization results; it’s like shooting darts in the dark.

There is a need for autocompletion to extend beyond syntactic completion and guide users with helpful previews of the data. These previews indicate which aspects of the data domain would be useful to explore further through thoughtful filtering of the data in the queries. We developed a novel interface system called Sneak Pique that provides suggestions for analytical expressions during a user’s visual analysis workflow.

The Sneak Pique interface

Fig. 1. The Sneak Pique interface showing a dataset of coronavirus cases around the world. Here, the user starts typing “show me cases in” and suggestions show up as map and calendar widgets for adding place and date values to the query.

The system implements a set of text- and widget-based autocompletion suggestions that provide data previews of the results before they are realized in the visualization. Figure 1 shows autocompletion suggestions generated in Sneak Pique as a user explores a dataset of coronavirus cases around the world. The widgets provide data previews of the underlying geospatial and temporal data values; darker colors indicate a higher data frequency.

How the system works

Fig. 2. Overview of the Sneak Pique system.

The Sneak Pique system is based on a web-based client-server architecture, as shown in Figure 2. The user’s query is processed by an ANTLR parser (A) that uses a context-free grammar with both predefined and dynamic rules based on the attribute values from the dataset. The parser can access the dataset and its properties through the Data Manager (B). The Autocompletion Detection module (C) polls the query as the user is typing and triggers grammar parse tree errors when the query is partially complete. These parse errors are passed to the Autocompletion Generator (D), which introspects on the syntactic structure of the partial query along with the relevant grammar rules that would be satisfied if the query were complete. The module determines the type of autocompletion suggestion required to resolve the partial query into a complete one. With the help of the Data Manager, the module computes the data preview information to be displayed in the autocompletion suggestion. The autocompletion suggestion is then rendered in the user interface of the client (E). Any interaction that the user performs with these autocompletion suggestions is captured by the Event Manager (F). Upon execution, the query updates the D3 visualization result (H) through the Analytics Module (G).
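
To make the detection step more concrete, here is a minimal sketch in Python. It does not use ANTLR or reproduce the system's grammar; instead it assumes a toy lookup table (`EXPECTED_AFTER`) that stands in for the grammar rules a parse error would point to, and a hypothetical mapping from slot type to widget (`WIDGET_FOR_TYPE`). All names are illustrative.

```python
from typing import Dict, List

# Hypothetical stand-in for grammar rules: the last keyword of a partial
# query determines which slot types could legally follow it.
EXPECTED_AFTER: Dict[str, List[str]] = {
    "in": ["place", "date"],   # "show me cases in" -> a place or a date
    "between": ["date"],       # "... between" -> a start date
    "and": ["date"],           # "... between April 2020 and" -> an end date
}

# Hypothetical mapping from slot type to the widget that would be rendered.
WIDGET_FOR_TYPE: Dict[str, str] = {"place": "map", "date": "calendar"}


def detect_missing_slots(query: str) -> List[str]:
    """Return the slot types that would complete a partial query.

    An empty list means the query does not end on a keyword that
    expects a value, so no suggestion is triggered.
    """
    tokens = query.lower().split()
    if not tokens:
        return []
    return EXPECTED_AFTER.get(tokens[-1], [])


def suggest_widgets(query: str) -> List[str]:
    """Pick one widget per missing slot type."""
    return [WIDGET_FOR_TYPE[slot] for slot in detect_missing_slots(query)]
```

With these assumptions, `suggest_widgets("show me cases in")` yields a map and a calendar widget, which mirrors the behavior described for Figure 1. A real implementation would derive the expected slot types from the parser's error state rather than from the last token alone.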

Crafting the design space for Sneak Pique widgets

To implement the actual suggestions in the Sneak Pique interface, we drew from a set of autocompletion best practices. We ensured that the suggestions showed up in the context of the query as the user was typing and were relevant to the task. For easy lookup, we ensured that the items were grouped by data type with minimal visual and interaction complexity. After all, autocompletion should only provide a “sneak peek”, and not the final visualization result. Using these design guidelines, we implemented various text- and widget-based autocompletion representations, as shown in Figure 3 below.

Fig. 3. The design space for autocompletion variants implemented in Sneak Pique. Here, Data Preview (DP) shows suggestions with data frequency numbers of the data values.

To further explore how useful showing the data previews in the suggestions is, we generated variants with and without data frequency numbers. We also explored different sort orders based on the data type, such as alphabetical and data frequency order. We generated different visual representations for the autocompletion suggestions based on the data type. For example, a text list and a bar chart were used for categorical text strings (see Row 1). A text list and a slider widget were used to display numerical data, with histograms showing the data frequency information (Row 2). Map widgets showed geospatial data with color indicating the data frequency; the darker the color, the higher the data frequency (Row 3). Slider and calendar widgets were used to display date values. Similar to the map, color was used to encode data frequency information (Row 4). We also generated a text suggestion variant grouping place and date strings for completing ambiguous queries such as “show earthquakes in” that could be completed with either a place or a date (Row 5). A nested text list was also generated to surface hierarchical data such as states and cities.
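
The data frequency information behind these previews can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the system's code: the function names, the equal-width binning, and the five shade levels are all hypothetical choices.

```python
from collections import Counter


def categorical_preview(values):
    """Count how often each non-null value occurs (text list or bar chart)."""
    return Counter(v for v in values if v is not None)


def histogram_preview(values, bins, lo, hi):
    """Equal-width bin counts for numeric values in [lo, hi] (slider histogram).

    Assumes hi > lo and bins >= 1; values outside the range are ignored.
    """
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if v is None or not (lo <= v <= hi):
            continue
        # The maximum value falls exactly on the upper edge, so clamp it
        # into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def shade(count, max_count, levels=5):
    """Map a frequency to a shade index; a higher index means a darker color."""
    if max_count == 0:
        return 0
    return min(int(count / max_count * levels), levels - 1)
```

For a map or calendar widget, `shade` would be applied per region or per day, so that the most frequent value gets the darkest color.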

Which autocompletion variants are actually useful?

Our work is one of the first systems to explore autocompletion suggestions specifically designed for visual analysis search tasks. There was no precedent to indicate what user preferences are for each of these autocompletion variants and how those preferences vary based on data type, sort order, or actual representation. We ran three experiments to tease apart user preferences for these various factors:

  • Experiment 1 compared autocompletion variants with and without data frequency information displayed to understand if such data previews are useful to the user.
  • Experiment 2 examined the sort orders that would be useful to apply to items shown in text autocompletion suggestions.
  • Experiment 3 compared autocompletion variants with and without hierarchies to better understand the handling of hierarchical data in the suggestions.

We hypothesized that participants would find data preview information to be useful with the text suggestions showing relevant items sorted in descending order of their data frequencies. We also hypothesized that temporal information is more useful when sorted in chronological order unless it is displayed in a calendar widget.

We ran the three experiments on Mechanical Turk, and the results helped identify a subset of autocompletion variants for the final implementation of Sneak Pique. Observations from the studies indicated that showing data previews was indeed useful. Users also wanted nulls filtered out of the suggestions, as they did not want to select an item that had no values displayed in the visualization result. Interestingly, none of the participants chose bars for displaying data frequency numbers, and that variant was eliminated in the final implementation. As expected, data frequency was the preferred sort order, except for date values, where chronological order was preferred. Text autocompletion was shown to be more conducive than widgets for drilling up and down hierarchical data values.
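
The sort and filter preferences from the studies can be expressed as a small ranking step. The sketch below is hypothetical: it assumes suggestions arrive as a mapping from value to frequency, treats both `None` keys and zero counts as "nulls", and breaks frequency ties alphabetically, none of which is specified in the studies.

```python
from datetime import date


def rank_text_suggestions(freq):
    """Drop values with no data, then sort by descending data frequency.

    Ties are broken alphabetically (an assumption made for determinism).
    """
    items = [(v, n) for v, n in freq.items() if v is not None and n > 0]
    return sorted(items, key=lambda item: (-item[1], str(item[0])))


def rank_date_suggestions(freq):
    """Drop dates with no data, then keep chronological order."""
    items = [(d, n) for d, n in freq.items() if d is not None and n > 0]
    return sorted(items, key=lambda item: item[0])


# Example with made-up counts
regions = {"Asia": 5, "Africa": 5, "Europe": 9, None: 3, "Antarctica": 0}
months = {date(2020, 5, 1): 2, date(2020, 4, 1): 7}

ranked_regions = rank_text_suggestions(regions)
ranked_months = rank_date_suggestions(months)
```

Here `ranked_regions` lists Europe first, followed by Africa and Asia, while `ranked_months` keeps April before May regardless of frequency.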

Based on the variants in Figure 3, we finalized the following widgets: text lists (Row 1c) for categorical values, histogram sliders (Rows 2c and 4g) for numerical and temporal ranges, maps (Row 3e) for geospatial values, and calendars (Row 4f) for temporal value completion.

Final thoughts and future directions

The final implementation of Sneak Pique proved to be a useful tool for helping people craft productive queries for visual analysis. The data previews were found to be useful, with participants putting in more deliberate effort while trying to filter to a subdomain of the underlying data.

Observations from the study showed that autocompletion affected how queries were formulated; there were fewer query reformulations as participants used data previews to complete the tasks. Participants used the data previews as a scaffold to construct compound queries incrementally by adding filters to the original queries, such as “show me cases in Africa between April 2020 and May 2020.” While our analytics platform was simple, future work should explore how visual autocompletion workflows adapt in the context of more established, complex visual analytics tools.
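
One way to picture this incremental construction is as a growing list of filters applied to the data. The Python sketch below is only an illustration: the rows, field names, and helper functions are made up, and it does not model how Sneak Pique parses or executes queries.

```python
from datetime import date

# Made-up rows standing in for a coronavirus cases dataset
ROWS = [
    {"continent": "Africa", "day": date(2020, 4, 15), "cases": 120},
    {"continent": "Africa", "day": date(2020, 6, 1), "cases": 300},
    {"continent": "Asia", "day": date(2020, 4, 20), "cases": 500},
]


def in_place(place):
    """Filter for "... in <place>"."""
    return lambda row: row["continent"] == place


def between(start, end):
    """Filter for "... between <start> and <end>" (inclusive)."""
    return lambda row: start <= row["day"] <= end


def apply_filters(rows, filters):
    """Keep only the rows that satisfy every filter added so far."""
    return [row for row in rows if all(f(row) for f in filters)]


# "show me cases in Africa"
filters = [in_place("Africa")]
step1 = apply_filters(ROWS, filters)

# "... between April 2020 and May 2020" adds a second filter
filters.append(between(date(2020, 4, 1), date(2020, 5, 31)))
step2 = apply_filters(ROWS, filters)
```

Each suggestion the user accepts corresponds to appending one more filter, so the result narrows from two rows in `step1` to one row in `step2`.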

Balancing interaction simplicity against the need to provide complex data previews for a variety of data types is a tension in user interface design. Finding that sweet spot for visual analysis while keeping the behavior fast and interactive is definitely a challenge. Personalizing the autocompletion experience to a specific user and the semantics of the data is another interesting area of research.

In summary, this work provides new insights for developing novel user interaction experiences for natural language tools in visual analysis.

Authors

Vidya Setlur, Tableau Research
Enamul Hoque, York University
Dae Hyun Kim, Stanford University
Angel X. Chang, Simon Fraser University

Citation

Vidya Setlur, Enamul Hoque, Dae Hyun Kim, and Angel X. Chang. 2020. Sneak Pique: Exploring Autocompletion as a Data Discovery Scaffold for Supporting Visual Analysis. In Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology (UIST ’20). Association for Computing Machinery, Virtual Event, USA.

Vidya Setlur

Research scientist at Tableau interested in semantics, computer graphics, and natural language interaction for visual analysis. Opinions are my own.