Data Analysis for Everyone: Self-Service Data Exploration Made Easy

Jim Scott
Published in The Ramp
4 min read · Oct 26, 2015

Every company is now a data-driven company — and data analysis is too important to limit its use to experts.

A new report from Capgemini, a leading provider of consulting, technology, and outsourcing services, indicates that two-thirds of organizations (67 percent) surveyed feel that they are at risk of becoming uncompetitive unless they embrace new data analytics solutions.

While the biggest and most critical data-driven decisions should likely be left to data scientists and business experts, that’s no reason to deny other professionals the insight and intelligence that can be gleaned from big data exploration. But how does an enterprise enable those who are uncomfortable with creating a spreadsheet with pivot tables to dive into data analysis? The answer is simple — give them powerful tools that are designed to be user-friendly.

Being able to use an application without ever reading the manual is a delightful concept for many people, but the more technically inclined will wonder if “simple” translates to limited functionality. It can, of course — but the right tools will offer robust features with an interface that makes sense to users.

Defining Big Data

Before we move on to the solutions, let’s describe Big Data. For some, the term indicates a huge and ever-growing collection of useful information. For others, it refers to a collection of unstructured data, or data that is sourced from venues such as social media, sensors, and public databases. But Big Data is all of these things and much more: It’s a way to extract information from all types of data (large, small, new, old, numerical, textual, and images). Big Data can be contained in warehouses, lakes, hubs, sandboxes, and (unfortunately) swamps. It’s historical, near-time, and real-time. And it’s up to each organization to define which benefits Big Data can bring to their business processes.

Most companies wisely launch their Big Data initiatives by choosing a project or two where data analysis can provide a measurable boost to their bottom line, efficiency, and/or ability to manage risk. Once Big Data has proven its worth, more projects are rolled out. Many enterprises now have multiple Big Data initiatives in place, and as we move forward from a time when data was carefully structured and stored in silos, we should also think about utilizing Big Data as a daily activity that powers a multitude of business decisions. In other words, everyone is, will be, or should be a data analyst.

Dig Deep with Drill

Industry research firm Wikibon states that 52% of Big Data tool investment is devoted to technologies that ingest and organize data so that it can be more efficiently prepared for analysis.

Preparing data for use with analytic tools has long been a huge time investment, with the Wall Street Journal reporting that many organizations’ IT departments devote 75%-80% of their energies to upfront data engineering. The Journal, in an article on “Big Data 2.0” by Randy Bean, goes on to note that “Big Data approaches democratize data integration by enabling non-technical users to directly access the data they need for analysis. As a result, businesses have more options to choose from and more approaches to consider. Developing an effective data integration framework becomes the first step in deriving business value from their data.”

Tools that provide an integrated view of data are equally important to the success of any Big Data project. Among the tools available, Apache Drill enjoys strong popularity. Drill is not itself a database; it is a standalone query engine that can combine data from multiple sources in a single query.

Drill is well-suited for expert-use analysis and tasks such as data exploration, discovery, ad hoc BI queries, and day-zero analytics. It scales from a single laptop to a large cluster of servers. Data can be queried in its native formats, including nested data, schema-less data, and dynamic data. Since there is no need to explicitly define and maintain schemas, Drill enables self-service data exploration. It can be utilized by someone whose job description doesn’t include “data scientist” or “business analyst” (though people who haven’t worked with data may find Drill a bit intimidating).
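To make the schema-free idea concrete, here is a minimal Python sketch of how a query might be sent to Drill over its REST interface. It assumes a Drill instance running locally with the web port at its default of 8047. The first query reads the `employee.json` sample file that ships on Drill's classpath, with no table or schema defined beforehand. The second query is purely hypothetical: the file path `/data/clicks.json`, the Hive table `orders`, and their column names are invented for illustration, and it only shows the general shape of a single query that spans two sources. The helper names `build_drill_request` and `run_query` are also illustrative, not part of Drill.

```python
import json
import urllib.request

# Assumption: Drill is running locally and its REST endpoint uses the default port.
DRILL_URL = "http://localhost:8047/query.json"

# Reads a raw JSON file directly; no schema or table definition is created first.
SAMPLE_SQL = "SELECT full_name, position_title FROM cp.`employee.json` LIMIT 5"

# Hypothetical example: file path, table name, and columns are invented.
# It joins a JSON file on the file system with a Hive table in one statement.
CROSS_SOURCE_SQL = (
    "SELECT o.order_id, c.page "
    "FROM dfs.`/data/clicks.json` c "
    "JOIN hive.`orders` o ON c.user_id = o.user_id"
)


def build_drill_request(sql, url=DRILL_URL):
    """Wrap a SQL string in the JSON body that Drill's REST API expects."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, url=DRILL_URL):
    """Send the query and return the result rows (requires a running Drill)."""
    with urllib.request.urlopen(build_drill_request(sql, url)) as response:
        return json.load(response).get("rows", [])
```

With Drill running, `run_query(SAMPLE_SQL)` would return the matching rows as a list of dictionaries. The request is built separately from the network call so that the payload can be inspected without a server.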

Explore Easily with MicroStrategy Analytics Desktop

MicroStrategy Analytics Desktop was designed for people who need to visualize, analyze, and share data but lack the technical experience or IT resources to carry out the task. Using this self-service business analytics solution, anyone can access and explore data without expert support. Chances are a user won’t even need to glance at a manual or the help files.

There’s no need for complex scripting. Users can simply connect to a data source; discover trends, patterns, and anomalies; and then organize their findings into visualizations, display them on interactive dashboards, and even click to share those dashboards by email. This is true even for more ‘semi-structured’ data sources such as social media, which normally require more upfront work to analyze.

Data can be dragged and dropped into dashboards, and effortlessly sorted, pivoted, aggregated, and statistically analyzed. No expertise in database querying languages, calculus, statistics, machine learning, or data munging (cleanup) is required.

See for yourself in this video demo of MicroStrategy Analytics Desktop working in conjunction with Apache Drill.

And if you would also like a hands-on test drive of Drill, download MapR’s Hadoop Sandbox with Apache Drill.

Originally published at www.socialmediatoday.com on October 26, 2015.

Jim Scott
The Ramp

Digital Transformation and Emerging Technologies Leader | Head of Developer Relations, Data Science @NVIDIA