The Cloudera Data Platform (CDP) Public Cloud provides the foundation upon which full-featured data lakes are created.
In a previous article, we introduced CDP. This article is the second in a series of six that shows how to build end-to-end big data architectures with CDP:
- CDP part 1: introduction to end-to-end data lakehouse architecture with CDP
- CDP part 2: CDP Public Cloud deployment on AWS
- CDP part 3: Data Services activation on CDP Public Cloud environment
- CDP part 4: user management on CDP Public Cloud with Keycloak
- CDP part 5: user permission management on CDP Public Cloud
- CDP part 6: end-to-end data lakehouse use case with CDP
More specifically, in this article we are going to:
- Create a credential that permits CDP to manage resources on AWS
- Configure an AWS CloudFormation stack that serves as the root of our deployment
- Deploy a CDP Environment including a Data Lake to AWS
The configuration and deployment can be accomplished via the web interfaces of Cloudera and Amazon, generally referred to as the CDP console and the AWS console, or via their respective CLI tools. We cover both…