How to Explain, “Why Self Service Data Prep?”

How to Do It Right

O'Reilly Media
Dec 10, 2020

Editor’s Note: Data preparation encompasses many tasks, such as confirming the accuracy of data points, removing extraneous data, reshaping the data set to optimize it for easy analysis, and doing all of this efficiently so people aren’t left waiting for the data to be ready so they can answer their questions. In this piece from Tableau Prep: Up & Running, Tableau Zen Master and Data Coach Carl Allchin gets you up to speed on how self-service data preparation reduces the time it takes to complete data projects and improves the quality of your analysis.

With every organization swimming in data lakes, repositories, and warehouses, never before have their employees had such an enormous opportunity to answer their questions with information rather than just their experience and gut instinct.

This isn’t that different from where organizations stood a decade ago, or even longer. What is different is who wants access to that data to answer their questions. No longer is the expectation that a separate function of the business will be responsible for getting that data; now, everyone feels they should have access to it. So what has changed? Self-service data visualization. What is about to change to take this to the next level? Self-service data preparation.

The rise of self-service data visualization, and its entrenchment in individuals’ roles, has surfaced needs and tensions in the analytical cycle. The analytical cycle involves:

  1. Having a question posed by someone
  2. Sourcing data that may help answer the question
  3. Preparing the data for analysis
  4. Analyzing the data
  5. Forming new or additional questions (returning to step 1)

Enabling self-service requires opening access to data sources, which has traditionally been a pain point in this cycle. With the right data, optimized for use in visual analysis tools, we can now find answers as soon as the business expert can form the questions. But accessing the “right data” is not that easy. The data assets owned by organizations are optimized for storage, optimized for tools that now seem to work against users rather than with them, and regulated by strict security layers requiring coding to access the data.

Many data projects are now focused on extracting data from its storage locations. The specialists on these projects use their data skills to:

  • Find data in existing repositories, including Excel workbooks
  • Find data in public or third-party repositories
  • Create feeds of data from previously inaccessible sources and systems

The gap in the analytical cycle now sits between taking these sources and preparing them for visual analytics. This gap is being addressed by new tools that enable business experts to access data and answer their questions using self-service visual analytics. Tableau Prep Builder makes the process of data preparation easier than other tools by bringing the same logic that enabled visual analytics to the data preparation process. By using a user interface (Figure 1–1) similar to the one that data visualizers are already accustomed to, Prep Builder makes the transition to self-service data preparation a simple one, even for those trying to complete these tasks for the first time.

Figure 1–1. The Tableau Prep Builder interface

Data preparation isn’t just the process of preparing a data set to make some charts as a one-off exercise. It encompasses many tasks, such as confirming the accuracy of the data points, removing extraneous data, reshaping the data set to optimize it for easy analysis, and doing all of this efficiently so people aren’t left waiting for the data to be ready so they can answer their questions. Good data preparation enables the use of the data set in a timely manner, avoids wasting time on manual manipulation (thanks to tools like Tableau Prep Builder), and creates a repeatable process that anyone can use. The aim of this book is that the time you spend learning these data preparation skills will be repaid many times over as you begin to deploy them in the real world. After all, the more time you spend manually preparing data, the less time you have to analyze it, if you decide to use such a messy data set at all.
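To make the three tasks named above concrete (confirming accuracy, removing extraneous data, and reshaping), here is a minimal sketch in plain Python. It is not Tableau Prep Builder, which does this work through a visual interface; it only illustrates the kind of repeatable logic involved. The sample data, the column names, and the `prepare` and `is_valid_amount` helpers are all hypothetical.

```python
# Hypothetical wide-format extract: one row per store, one column per month.
RAW_ROWS = [
    {"store": "Leeds", "notes": "n/a", "Jan": "120", "Feb": "135"},
    {"store": "York", "notes": "check", "Jan": "98", "Feb": "oops"},
]

MONTH_COLUMNS = ["Jan", "Feb"]   # assumed columns to reshape into rows
EXTRANEOUS_COLUMNS = {"notes"}   # assumed columns the analysis does not need


def is_valid_amount(value):
    """Accuracy check: the value must parse as a non-negative number."""
    try:
        return float(value) >= 0
    except (TypeError, ValueError):
        return False


def prepare(rows):
    """Drop extraneous columns, turn month columns into rows, set aside bad values."""
    clean, rejected = [], []
    for row in rows:
        # Remove extraneous data.
        kept = {k: v for k, v in row.items() if k not in EXTRANEOUS_COLUMNS}
        # Reshape: one output record per (store, month) pair.
        for month in MONTH_COLUMNS:
            value = kept.get(month)
            record = {"store": kept["store"], "month": month}
            if is_valid_amount(value):
                clean.append({**record, "sales": float(value)})
            else:
                # Keep rejected values visible instead of silently dropping them.
                rejected.append({**record, "raw_value": value})
    return clean, rejected


clean_rows, rejected_rows = prepare(RAW_ROWS)
```

Because the steps live in one function, the same preparation can be rerun whenever a new extract arrives, which is the repeatability the paragraph describes. Returning the rejected values separately keeps data-quality problems available for review.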

There is still a significant gap between potential data preparers (“data preppers”) and skilled ones. Learning what to do with the self-service data preparation tools, and why they are needed, is a significant undertaking but one that is worthwhile.

Carl Allchin is a multiple-time Tableau Ambassador and is ‘the Other’ Head Coach at The Data School in London, one of the world’s leading data analytics training programs. After over a decade in Financial Services as a Business Intelligence Analyst and Manager, he supported hundreds of companies through consulting, blogging, and teaching on market-leading data solutions. Carl is the co-founder of Preppin’ Data, the only weekly Data Preparation challenge on Tableau and other data tools.

O'Reilly Media spreads the knowledge of innovators through its books, video training, webcasts, events, and research.