Facets: An Open Source Visualization Tool for Machine Learning Training Data

Working with the PAIR initiative, Google has released Facets, an open source visualization tool that helps with understanding and analyzing ML datasets.

Summary: Facets Overview automatically gives users a quick understanding of the distribution of values across the features of their datasets. Multiple datasets, such as a training set and a test set, can be compared on the same visualization. It pushes common data issues that can hamper machine learning to the forefront, such as unexpected feature values, features with high percentages of missing values, features with unbalanced distributions, and feature distribution skew between datasets.
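
To make two of those checks concrete, here is a minimal sketch in plain Python of what "percentage of missing values" and "distribution skew between datasets" can mean for a single feature. This is not the Facets API and not how Facets computes its statistics: the function names (`missing_fraction`, `value_shares`, `distribution_gap`) and the toy rows are all made up for illustration, and the skew measure (largest difference in value share) is just one simple choice among many.

```python
from collections import Counter


def missing_fraction(rows, feature):
    """Fraction of rows where the feature is absent or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(feature) is None)
    return missing / len(rows)


def value_shares(rows, feature):
    """Share of each observed (non-missing) value of a categorical feature."""
    values = [row[feature] for row in rows if row.get(feature) is not None]
    if not values:
        return {}
    counts = Counter(values)
    return {value: count / len(values) for value, count in counts.items()}


def distribution_gap(train_rows, test_rows, feature):
    """Largest absolute difference in value share between two datasets.

    Hypothetical skew measure chosen for simplicity; 0.0 means the
    shares match exactly, 1.0 means the datasets share no values.
    """
    train = value_shares(train_rows, feature)
    test = value_shares(test_rows, feature)
    all_values = set(train) | set(test)
    return max(
        (abs(train.get(v, 0.0) - test.get(v, 0.0)) for v in all_values),
        default=0.0,
    )


# Toy data, invented purely for illustration.
train_rows = [
    {"color": "red", "size": 1},
    {"color": "red", "size": None},
    {"color": "blue", "size": 3},
    {"color": "red", "size": None},
]
test_rows = [
    {"color": "blue", "size": 2},
    {"color": "blue", "size": 2},
    {"color": "red", "size": 5},
    {"color": "blue", "size": 4},
]

if __name__ == "__main__":
    print("missing 'size' in train:", missing_fraction(train_rows, "size"))
    print("missing 'size' in test: ", missing_fraction(test_rows, "size"))
    print("'color' gap train vs test:", distribution_gap(train_rows, test_rows, "color"))
```

In this toy data, half of the training rows are missing `size` while the test rows have none missing, and `color` is mostly red in training but mostly blue in test. Those are the kinds of issues the summary says Facets Overview brings to the forefront, only there you see them visually instead of hunting for them by hand.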

Facets Visualisation

The full post published by the Google Big Picture team is available here.

I will do a deep dive on Facets in the coming weeks. Follow me on Medium if you want to know more!