Announcing CDAP 6.2.0 Release

Edwin Elia
Jun 1, 2020 · 2 min read

On behalf of the CDAP community, it is my pleasure to announce the release of CDAP version 6.2.0. This release introduces Replication, an easy way to replicate changes from transactional databases into analytical data warehouses. It also enhances the Google Cloud Dataproc runtime provisioner to submit jobs through the native Google Cloud Dataproc job APIs. Additionally, it includes a few improvements to the Pipeline Studio that enhance the user experience of building pipelines.


Replication allows users to create replication pipelines easily. The user interface guides users through the steps of configuring the source database and then selecting the tables and columns from the database to be replicated. Once users have finished adding the target configuration, the system runs an assessment of the configuration to determine whether there are any potential issues that need to be addressed before deploying the pipeline. The assessment also reports on possible issues during replication, including data type mappings between the source and target databases.

Select tables and columns to replicate
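To make the idea of an assessment more concrete, here is a minimal sketch of a data type mapping check. It is not CDAP's implementation: the `TYPE_MAP` table, the `assess` function, and the column names are all hypothetical and chosen only for illustration.

```python
# Hypothetical source-to-target type mapping, for illustration only.
# This is NOT the mapping table CDAP Replication uses.
TYPE_MAP = {
    "int": "INT64",
    "varchar": "STRING",
    "datetime": "DATETIME",
    "decimal": "NUMERIC",
}


def assess(columns):
    """Check each selected column's source type against the mapping.

    columns: dict of column name -> source type name.
    Returns (mappings, issues): the resolved target types, plus a list
    of human-readable problems to fix before deploying.
    """
    mappings, issues = {}, []
    for name, src_type in columns.items():
        target = TYPE_MAP.get(src_type.lower())
        if target is None:
            issues.append(
                f"column '{name}': no target mapping for source type '{src_type}'"
            )
        else:
            mappings[name] = target
    return mappings, issues
```

Calling `assess({"id": "INT", "shape": "GEOMETRY"})` in this sketch would map `id` and flag `shape` as an issue, which is the kind of feedback the assessment step surfaces before a pipeline is deployed.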

Google Cloud Dataproc Runtime Improvement

Previously, the Google Cloud Dataproc runtime used SSH for job submission, which required port 22 to be open in the environment running CDAP. With this improvement, job submission uses the native Google Cloud Dataproc APIs, so port 22 no longer needs to be open.
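For readers unfamiliar with the Dataproc job APIs, the sketch below shows roughly what an API-based submission looks like: an HTTPS request to the Dataproc `jobs:submit` REST method rather than an SSH session. It only builds the request and does not send it. The project, region, cluster, class, and JAR names are placeholders, the `build_submit_request` helper is hypothetical, and CDAP's own provisioner may well construct its requests differently (for example, through a client library).

```python
import json

# Public REST endpoint for the Dataproc v1 API (served over HTTPS).
DATAPROC_ENDPOINT = "https://dataproc.googleapis.com/v1"


def build_submit_request(project_id, region, cluster_name, main_class, jar_uris):
    """Build the URL and JSON body for a Dataproc jobs:submit call.

    Hypothetical helper for illustration; it performs no network I/O
    and omits authentication.
    """
    url = (
        f"{DATAPROC_ENDPOINT}/projects/{project_id}"
        f"/regions/{region}/jobs:submit"
    )
    body = {
        "job": {
            # Which existing cluster should run the job.
            "placement": {"clusterName": cluster_name},
            # A Spark job described by its main class and JAR locations.
            "sparkJob": {
                "mainClass": main_class,
                "jarFileUris": list(jar_uris),
            },
        }
    }
    return url, json.dumps(body)
```

Because the request travels over HTTPS to the Dataproc service instead of over SSH to a cluster node, nothing in this flow depends on port 22 being reachable.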

Pipeline Studio Improvements

Users can now select multiple plugins by dragging to make a selection. Once the plugins are selected, users can move, copy, or delete them. Additionally, the Pipeline Studio canvas now supports right-clicking. From the right-click menu, users can add a new Wrangler connection or perform common actions such as zooming and aligning the plugins.

Right click on the canvas to open the menu

Download CDAP 6.2.0 today and take it for a spin! Also consider helping us develop the platform by reaching out to the community with any comments, feedback, suggestions, or improvements, or by creating and following JIRA issues and submitting pull requests.

You can build packages for Hadoop distributions from the following repositories:


