Building a Data Lake on Google Cloud Platform with CDAP

Nitin Motgi
May 26

Data Prep and Pipeline Integration with Google Cloud Storage

Google BigQuery Integration

Google PubSub Integration

Use cases

EDW Offload | Oracle CDC to Google BigTable

Moving between Clouds | Amazon to Google and vice-versa

AI Integration | Translating audio files using Google Speech Translator

{
  "path": "/audio/raw/audio.raw",
  "speeches": [
    {
      "confidence": 0.9876289963722229,
      "transcript": "how old is the Brooklyn Bridge"
    }
  ]
}
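As a rough illustration of how a downstream step might consume a record shaped like the one above, here is a minimal Python sketch. It assumes each record carries a "path" string and a "speeches" list whose entries have "confidence" and "transcript" fields, as in the sample. The helper name best_transcript is hypothetical and is not part of CDAP or any Google API.

```python
import json

# Sample record copied from the output shown above.
RECORD_JSON = """
{
  "path": "/audio/raw/audio.raw",
  "speeches": [
    {
      "confidence": 0.9876289963722229,
      "transcript": "how old is the Brooklyn Bridge"
    }
  ]
}
"""


def best_transcript(record):
    """Return the highest-confidence transcript, or None if there are no speeches.

    Hypothetical helper for illustration only; the field names are assumed
    to match the sample record.
    """
    speeches = record.get("speeches", [])
    if not speeches:
        return None
    # Pick the alternative the recognizer was most confident about.
    best = max(speeches, key=lambda speech: speech["confidence"])
    return best["transcript"]


record = json.loads(RECORD_JSON)
print(record["path"], "->", best_transcript(record))
```

Selecting by confidence is only one possible policy; a pipeline could just as well keep every alternative and filter on a threshold later.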

Conclusion

cdapio

CDAP is a 100% open-source framework for building data analytics applications

Written by Nitin Motgi

Nitin Motgi is Founder and CTO of Cask, where he is responsible for developing the company’s long-term technology and driving company engineering initiatives.
