How Can Visualization Help Enterprise Analysts?

Published in VisUMD on Sep 30, 2019

The role of visualization for business intelligence.

Many companies today rely on data analysis to make decisions in their everyday business activities. Numerous visualization tools and applications have been deployed to help with the process. However, are these tools used effectively? To answer this question, Sean Kandel and his research team conducted an in-depth interview study involving 35 data analysts from 25 organizations in the healthcare, retail, marketing, and finance sectors. Based on the challenges and trends found in the interviews, the team provides four main suggestions for designing visual applications: (1) dealing with breakdowns, (2) providing scalable visual analytics, (3) bridging gaps in programming proficiency, and (4) creating metadata for convenient documentation of intermediate products.

Dealing with Breakdowns

According to Kandel and his coauthors, the typical analysis workflow is split into five tasks: discover, wrangle, profile, model, and report. The team found that many of the breakdowns in analysis occur in the early phases of the workflow, i.e., the discovery and wrangling phases, or in the transitions between them.

However, current tools for these early phases have bottlenecks that keep them from being widely used: they demand programming skill, and migrating files between formats is difficult. For example, Splunk deals flexibly with unstructured data, but may be too difficult for novice programmers to use. OpenRefine is a well-known tool for data transformation, but can be difficult to use when converting data into another format. In other words, support for discovery and wrangling is an underdeveloped area where better tools could significantly benefit enterprise data analysis.

Towards Scalable Visual Analytics

In many cases, analysts need to handle large-scale data. However, popular visual analytics tools such as Tableau, Microsoft Excel, and Power BI limit the size of the data they can process. In the interviews, 31 out of 35 respondents claimed that the existing analytic tools and algorithms were not compatible with the size of their datasets. Even the so-called “hackers”, analysts proficient in programming sophisticated tools, said that it is “difficult to take powerful algorithms that work on medium data and make them pluggable in the big data stack.”

To address this issue, the paper suggests (1) leveraging existing data processing engines (e.g., Hadoop) for manipulating data, and (2) using approximation approaches (e.g., sampling).
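
To make the sampling idea concrete, here is a minimal sketch in Python. The article does not say which sampling method the paper has in mind; reservoir sampling is assumed here only because it is a common way to draw a fixed-size uniform sample from data too large to load at once. The function names reservoir_sample and approximate_mean are hypothetical.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Keep k items chosen uniformly at random from a stream of unknown length.

    Only k items are held in memory, so the stream itself can be far
    larger than what a desktop tool could load. (Assumed technique:
    reservoir sampling, not something the paper prescribes.)
    """
    rng = random.Random(seed)
    sample: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(row)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = row
    return sample


def approximate_mean(values: Iterable[float], k: int = 1000, seed: int = 0) -> float:
    """Estimate the mean of a large stream from a fixed-size sample."""
    sample = reservoir_sample(values, k, seed)
    if not sample:
        raise ValueError("cannot estimate the mean of an empty stream")
    return sum(sample) / len(sample)
```

A visualization could then be drawn from the sample rather than the full dataset, trading some accuracy for responsiveness.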

Alleviating Programming Gaps

The paper categorizes analysts into three archetypes based on their level of programming skill: scripters, hackers, and application users. Hackers are the most proficient programmers but have relatively less knowledge of statistics. Application users have the least programming ability but more knowledge of statistics.

The paper states that the inability of scripters and application users to code limits the effectiveness of their skills and hinders collaboration between different analysis groups. To address this, the paper asserts that visual analytics systems should bridge this programming gap by providing new interfaces for data acquisition and wrangling.

Metadata for Intermediate Products

In an organization, people often conduct numerous intermediate tasks to accomplish their target goals. However, the products of these intermediate tasks are not accumulated, so analysts must repeat the same procedures from scratch when performing similar tasks. In the interviews, 18 out of 35 analysts said it was difficult to search for an intermediate product; in fact, searching was often more time-consuming than rewriting the script from scratch. The research team ascribes this to the way data and scripts are shared. In many cases, analysts are hesitant to spend time documenting the process. One approach to tackle this, the team suggests, is to impose conventions or constraints on file naming. If some structure is imposed on the naming procedure, the resulting metadata can be searched more easily in the future.
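
As an illustration of how a naming convention turns file names into searchable metadata, here is a minimal Python sketch. The convention itself (project__stage__YYYYMMDD__author.ext, with the stage drawn from the five workflow tasks) is an assumption made up for this example, as are the function names parse_name and find_products; the paper does not specify a format.

```python
import re
from typing import Dict, Iterable, List, Optional

# Hypothetical convention: <project>__<stage>__<YYYYMMDD>__<author>.<ext>
# The stage is one of the five workflow tasks named in the article.
PATTERN = re.compile(
    r"^(?P<project>[a-z0-9-]+)__"
    r"(?P<stage>discover|wrangle|profile|model|report)__"
    r"(?P<date>\d{8})__"
    r"(?P<author>[a-z]+)\.(?P<ext>\w+)$"
)


def parse_name(filename: str) -> Optional[Dict[str, str]]:
    """Return the metadata encoded in a file name, or None if it does not follow the convention."""
    match = PATTERN.match(filename)
    return match.groupdict() if match else None


def find_products(filenames: Iterable[str], **criteria: str) -> List[str]:
    """List the files whose encoded metadata matches every given field."""
    hits = []
    for name in filenames:
        meta = parse_name(name)
        if meta is not None and all(meta.get(k) == v for k, v in criteria.items()):
            hits.append(name)
    return hits


files = [
    "churn__wrangle__20190930__kim.csv",
    "churn__model__20191002__lee.py",
    "notes_final_v2.txt",  # does not follow the convention, so it is skipped
]
print(find_products(files, project="churn", stage="model"))
```

Files that ignore the convention simply drop out of the search, which is one reason such a scheme only helps if a team applies it consistently.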

Summary

As the importance of collecting data grows, more people are employed in analysis teams and more sophisticated skills are required to analyze the resulting data. Numerous visualization tools exist that can make data analysis within organizations more effective. Nonetheless, there is still room for visualization tools to improve the analysis process in companies.

This article is inspired by the following paper:

Sean Kandel, Andreas Paepcke, Joseph M. Hellerstein, Jeffrey Heer. “Enterprise data analysis and visualization: An interview study.” IEEE Transactions on Visualization and Computer Graphics 18(12):2917–2926, 2012. DOI: 10.1109/TVCG.2012.219
