CDAP Wrangler User Interface

CDAP Wrangler makes it delightful to transform, cleanse, standardize, harmonize, run data quality (DQ) checks on, and enrich data in a code-free manner within data pipelines[1]. While Wrangler provides a ton of built-in functions and directives to manipulate data, there will always be gaps. To fill those gaps, Wrangler provides an extensible framework, User Defined Directives (UDDs), that lets you define custom directives for manipulating data.

UDDs are similar to User-Defined Functions (UDFs), which have a long history of usefulness in SQL-derived languages and other data processing and query systems. While SQL can be rich in their expressiveness, there’s just no way they…


You often have to deal with incomplete or messy datasets. Data from varied sources can be unusable at first, but once it is transformed, mapped, and cleansed, it becomes usable. The majority of data scientists', data engineers', and analysts' time is devoted to transforming, cleansing, and mapping data rather than to their actual goal: extracting insights and building analytics pipelines and models that leverage that data.

In summary, to take messy, complex data and make it usable for further analysis, you need to wrangle the data. Furthermore, if you need to operationalize the…


Hello everyone! It’s nice to be back after a long pause. It has been a while since we blogged about CDAP. It was this month last year that Cask was acquired, and a lot has happened with CDAP, as well as around it, since then.

Before I get into the details of what we have been up to in the CDAP world, I would like to thank all our users, partners, and customers for their tremendous support during this transition. You all have been incredible during the last 12 months. …


It is no secret that traditional platforms for data analysis, like data warehouses, are difficult and expensive to scale to meet current demands for data storage and compute. And purpose-built platforms designed to process big data often require significant up-front and ongoing investment if deployed on-premises. Alternatively, cloud computing is the perfect vehicle to scale and accommodate such large volumes of data in an economical way. While the economics are right, enterprises migrating their on-premises data warehouses or building a new warehouse or data lake in the cloud face many challenges along the way. …

Nitin Motgi

Nitin Motgi is Founder and CTO of Cask, where he is responsible for developing the company’s long-term technology and driving company engineering initiatives.
