Technology Friday: Trifacta Wrangler
Today’s Technology Friday takes us to the data quality management space and one of my favorite products: Trifacta Wrangler. Data quality management sounds as boring as it is important. The thing is that, for the last couple of decades, data quality management has been just that: boring.
Have you ever tried to implement a data quality management solution using some of the platforms from traditional data incumbents such as Microsoft, Informatica, Oracle, etc.? The experience is nothing short of a nightmare. Hundreds of hardcoded data quality rules that constantly fall out of sync with the data and an archaic user experience that requires constant training are some of the challenges you can look forward to if you decide to pursue that path.
Due to some of those challenges, data quality management platforms have not been able to enjoy mainstream adoption in the enterprise. However, their relevance has only increased with the emergence of new trends such as big data and machine intelligence (MI). Currently, a new generation of startups is reimagining data quality management using new technologies such as machine learning (ML) and artificial intelligence (AI). Among those, Trifacta Wrangler stands out as one of the leaders in the space.
Trifacta Wrangler is a new-generation data quality management platform that enables data stewards to analyze, cleanse and transform datasets in order to power other data processes in the enterprise. Trifacta Wrangler addresses some of the limitations of its predecessors by leveraging advanced machine learning and data visualization models that provide an engaging experience for data stewards.
Trifacta Wrangler is a highly sophisticated technology stack, but it is guided by two simple principles: Predictive Transformation and Visual Profiling.
Based on strong academic research, Predictive Transformation is a group of design and interface principles that guide the user’s interaction with the data. Predictive Transformation combines domain knowledge of the data with an advanced transformation engine.
Machine learning and advanced statistics are at the core of Trifacta Wrangler’s Predictive Transformation processes. The stack includes sophisticated transformation routines such as data cleansing (standardization, data removal, etc.), statistical manipulation (outliers, profiling, etc.), enrichment (data joins, lookups, etc.), distillation (aggregation, sampling, filtering, etc.), restructuring (data extraction, pivot/unpivot, etc.) and several others.
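To make those categories a bit more concrete, here is a minimal sketch in plain Python of three of them: cleansing, enrichment and distillation. It is not Trifacta's API or engine; the sample records, the `region_names` lookup table and the function names are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sample records, invented for this illustration.
orders = [
    {"customer": " alice ", "region": "us-east", "amount": "120.50"},
    {"customer": "BOB", "region": "US-East", "amount": "80"},
    {"customer": "Carol", "region": "eu-west", "amount": ""},
    {"customer": "alice", "region": "US-EAST", "amount": "40.00"},
]

# Hypothetical lookup table used for enrichment.
region_names = {"US-EAST": "North America", "EU-WEST": "Europe"}


def cleanse(rows):
    """Cleansing: standardize text fields and remove rows with no amount."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # data removal: skip records with a missing amount
        cleaned.append({
            "customer": row["customer"].strip().title(),
            "region": row["region"].strip().upper(),
            "amount": float(row["amount"]),
        })
    return cleaned


def enrich(rows, lookup):
    """Enrichment: add a 'market' column by looking up the region code."""
    return [{**row, "market": lookup.get(row["region"], "Unknown")} for row in rows]


def distill(rows):
    """Distillation: aggregate the total amount per customer."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


enriched = enrich(cleanse(orders), region_names)
totals_by_customer = distill(enriched)
```

In a tool like Trifacta Wrangler these steps would be suggested and assembled interactively rather than hand-coded, which is precisely the point of the product.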
Trifacta Wrangler’s Visual Profiling is a series of techniques that provide real-time, highly interactive visualizations that can assist with the discovery and interpretation of datasets. Trifacta data visualizations highlight aspects such as statistical summaries or outlier patterns that can simplify the exploration of datasets.
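The kind of information a column profile surfaces can be sketched in a few lines of Python. This is only an assumption about what such a summary might contain, not a description of how Trifacta computes it; the `profile_column` function and the interquartile-range (IQR) outlier rule are choices made for this example.

```python
from statistics import mean, median, quantiles


def profile_column(values):
    """Summarize a numeric column: counts, central tendency and IQR outliers."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "mean": mean(present),
        "median": median(present),
        # Values outside 1.5 * IQR of the quartiles are flagged as outliers.
        "outliers": [v for v in present if v < low or v > high],
    }


# Hypothetical column with one missing value and one suspicious entry.
amounts = [10, 12, 11, 13, None, 12, 11, 95]
profile = profile_column(amounts)
```

A visual profiling tool would render the same summary as histograms and highlighted bars instead of a dictionary, so the suspicious value stands out at a glance.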
Machine learning is at the core of Trifacta Wrangler. The platform provides predictive models that suggest patterns and transformations to apply to specific datasets. Additionally, Trifacta Wrangler enables the preview of transformations before they are actually applied.
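The preview idea can be illustrated with a small Python sketch: run a candidate transformation on copies of a few sample rows and show before/after pairs, leaving the original data untouched. The `preview` helper and the example transformation are hypothetical and say nothing about how Trifacta implements the feature.

```python
def preview(rows, transform, sample_size=3):
    """Return (before, after) pairs for a small sample without modifying rows."""
    sample = rows[:sample_size]
    # Each row is copied before transforming, so the source data stays intact.
    return [(row, transform(dict(row))) for row in sample]


def trim_and_uppercase_state(row):
    """Candidate transformation: normalize the 'state' field."""
    row["state"] = row["state"].strip().upper()
    return row


# Hypothetical records with inconsistent formatting.
rows = [{"state": " ca"}, {"state": "Ny "}, {"state": "tx"}, {"state": "wa"}]
changes = preview(rows, trim_and_uppercase_state, sample_size=2)

for before, after in changes:
    print(before["state"], "->", after["state"])
```

Only once the before/after pairs look right would the transformation be applied to the full dataset.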
Trifacta Wrangler is supported on both on-premise and cloud platforms. Notably, Trifacta has been able to establish robust strategic alliances with cloud platforms such as Google Cloud that can streamline the distribution of the platform.
Trifacta certainly brings a new approach to the very difficult challenge of data quality management in the enterprise. Steadily, Trifacta is becoming a key element of modern enterprise data pipelines.