Data Refining simplified: Part 1 — Visual exploration and analysis development (video)

Sonali Surange-Dev
2 min read · Aug 19, 2019

--

IBM’s Data Refinery accelerates the end-to-end experience of refining data from development to production.

A data scientist's day often consists of exploring data and building an analysis on a test data set, refining the workflow, and then automating it for real-world data. This can be a time-consuming, repetitive, and laborious process.

Data Refinery accelerates the effort to refine raw data into analysis-ready data. To expedite data refining, the tool provides visual snapshots, quality profiles, and intelligent feedback at each step.

Using Data Refinery, data scientists can transform, normalize, and combine data from any domain using over 100 operations on numeric and non-numeric data such as text and dates. Over 23 charts are provided as visualization aids. These include scatter plots, word clouds, t-SNE, heat maps, 3D charts, and others to expedite visual understanding of the data.
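To make "transform, normalize, and combine" concrete, here is a minimal sketch in plain Python. It is not Data Refinery's API or how the tool implements its operations; the helper names (`clean_text`, `min_max_normalize`, `inner_join`) and the sample records are hypothetical and only illustrate the kind of operation described above on text, date, and numeric columns.

```python
from datetime import datetime


def clean_text(value):
    """Transform text: collapse repeated whitespace and use title case."""
    return " ".join(value.split()).title()


def min_max_normalize(values):
    """Normalize numbers to the range [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def inner_join(left, right, key):
    """Combine two lists of records, keeping rows whose key appears in both."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Hypothetical sample data, invented for this illustration.
orders = [
    {"customer_id": 1, "amount": 20.0, "ordered_on": "2019-08-01"},
    {"customer_id": 2, "amount": 80.0, "ordered_on": "2019-08-03"},
    {"customer_id": 3, "amount": 50.0, "ordered_on": "2019-08-05"},
]
customers = [
    {"customer_id": 1, "name": "  ada   lovelace "},
    {"customer_id": 2, "name": "grace HOPPER"},
]

scaled = min_max_normalize([o["amount"] for o in orders])
prepared_orders = [
    {
        "customer_id": o["customer_id"],
        "amount_scaled": s,
        # Transform the date string into a real date value.
        "ordered_on": datetime.strptime(o["ordered_on"], "%Y-%m-%d").date(),
    }
    for o, s in zip(orders, scaled)
]
prepared_customers = [
    {"customer_id": c["customer_id"], "name": clean_text(c["name"])}
    for c in customers
]

refined = inner_join(prepared_orders, prepared_customers, "customer_id")
for row in refined:
    print(row)
```

The order for customer 3 drops out of `refined` because no matching customer record exists, which is the usual behavior of an inner join.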

Visual Data Refining

New features in Data Refinery enable data scientists to proactively adjust their complex flows developed on test data to be ready for real-world data, before executing them. They can automate their analysis to run on an hourly, daily, weekly or monthly schedule. As an added benefit, they can personalize the execution profile to match real-world data volumes.

This video shows how Data Refinery expedites visually creating an analysis on test data:

  1. Visual exploration of data to get a sense of data distributions and quality
  2. Visualization-guided data transformation to build the use-case-specific analysis
  3. Validation of the analysis on test data
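The three steps above can be sketched as a small profile, transform, validate loop. This is an illustrative plain-Python sketch, not Data Refinery itself, which performs these steps visually; the function names (`profile_column`, `fill_missing`, `find_invalid`), the sample rows, and the 0 to 120 age range are assumptions made for the example.

```python
from statistics import mean, median


def profile_column(rows, column):
    """Step 1: summarize a column's quality and distribution."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    if present and all(isinstance(v, (int, float)) for v in present):
        summary.update(min=min(present), max=max(present), mean=mean(present))
    return summary


def fill_missing(rows, column, default):
    """Step 2: a transformation suggested by the profile (fill missing values)."""
    return [
        {**row, column: default if row.get(column) is None else row[column]}
        for row in rows
    ]


def find_invalid(rows, column, lo, hi):
    """Step 3: return rows whose value is missing or outside [lo, hi]."""
    return [
        row for row in rows
        if row.get(column) is None or not lo <= row[column] <= hi
    ]


# Hypothetical test data set with one missing value.
test_rows = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 3, "age": 51},
    {"id": 4, "age": 34},
]

before = profile_column(test_rows, "age")
print("before:", before)

# Fill the gap with the median of the known values.
known_ages = [r["age"] for r in test_rows if r["age"] is not None]
refined_rows = fill_missing(test_rows, "age", median(known_ages))

after = profile_column(refined_rows, "age")
print("after:", after)

problems = find_invalid(refined_rows, "age", 0, 120)
print("rows failing validation:", problems)
```

Profiling before and after the transformation is what lets the validation step confirm that the fix worked on the test data before the flow is pointed at real-world data.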

IBM’s Data Refinery is available with Watson Studio and Watson Knowledge Catalog on public cloud and private cloud, and with Watson Studio Desktop.
