CDAP in Kubernetes Deployment Guide

Terence Yim
Jul 8

Start Minikube

$ minikube start --cpus 4 --memory 8192
$ minikube ssh
$ sudo ip link set docker0 promisc on
$ exit

Install CDAP Controller

$ git clone https://github.com/cdapio/cdap-operator.git
$ kubectl apply -f cdap-operator/config/crds
$ kubectl apply -f cdap-operator/config/default/rbac
$ kubectl apply -f cdap-operator/config/default/manager
$ kubectl get pod --namespace=system
NAME                READY   STATUS    RESTARTS   AGE
cdap-controller-0   1/1     Running   0          30s
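
The controller pod may take a little while to reach Running. Instead of re-running `kubectl get pod` by hand, the wait can be scripted. This is a minimal sketch, assuming the pod name `cdap-controller-0` and the `system` namespace shown in the output above. The helper name `wait_for_pod_ready` and the 120-second default timeout are illustrative choices, not part of the CDAP operator.

```shell
# Hypothetical helper: block until a pod reports the Ready condition.
# Usage: wait_for_pod_ready POD NAMESPACE [TIMEOUT_SECONDS]
wait_for_pod_ready() {
  pod="$1"
  namespace="$2"
  timeout="${3:-120}"   # assumed default; tune for your machine
  # `kubectl wait` exits non-zero if the condition is not met in time.
  kubectl wait --for=condition=Ready "pod/${pod}" \
    --namespace="${namespace}" --timeout="${timeout}s"
}
```

For example, `wait_for_pod_ready cdap-controller-0 system` should return once the controller is ready, or fail after the timeout.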

Install Supporting Services

Installing PostgreSQL

$ helm init
$ helm install --name postgres stable/postgresql --set postgresqlPassword=secretpass,postgresqlDatabase=cdap

Installing Elasticsearch

$ helm repo add elastic https://helm.elastic.co
$ helm install --name elasticsearch elastic/elasticsearch --version 6.5.3-alpha1 --set replicas=1 --set minimumMasterNodes=1 --set resources.requests.memory=500Mi

Installing Single Node Apache Hadoop

$ cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: hadoop
  labels:
    app: hadoop
spec:
  containers:
  - name: hadoop
    image: sequenceiq/hadoop-docker:2.7.1
---
apiVersion: v1
kind: Service
metadata:
  name: hadoop
spec:
  selector:
    app: hadoop
  ports:
  - protocol: TCP
    port: 9000
    targetPort: 9000
EOF
$ kubectl get pod/hadoop
NAME     READY   STATUS    RESTARTS   AGE
hadoop   1/1     Running   0          88s
$ kubectl exec -it hadoop -- /usr/local/hadoop/bin/hdfs dfs -mkdir /cdap
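
`hdfs dfs -mkdir` prints nothing on success, so it can be useful to confirm that the directory exists before moving on. This is a minimal sketch, assuming the `hadoop` pod and the `/usr/local/hadoop/bin/hdfs` path from the command above. The helper name `hdfs_dir_exists` is hypothetical.

```shell
# Hypothetical helper: succeed only if PATH_IN_HDFS is a directory.
# Usage: hdfs_dir_exists POD PATH_IN_HDFS
hdfs_dir_exists() {
  pod="$1"
  dir="$2"
  # `hdfs dfs -test -d` exits 0 if the path exists and is a directory;
  # `kubectl exec` passes that exit status back to the caller.
  kubectl exec "${pod}" -- /usr/local/hadoop/bin/hdfs dfs -test -d "${dir}"
}
```

For example, `hdfs_dir_exists hadoop /cdap && echo "ready"` prints `ready` only when `/cdap` is present.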

Creating CDAP instance

Create Secret for CDAP

$ export CDAP_SECURITY=$(cat << EOF | base64 | tr -d '\n'
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>data.storage.sql.jdbc.username</name>
    <value>postgres</value>
  </property>
  <property>
    <name>data.storage.sql.jdbc.password</name>
    <value>$(kubectl get secret postgres-postgresql -o 'jsonpath={.data.postgresql-password}' | base64 --decode)</value>
  </property>
</configuration>
EOF
)
$ cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
  name: cdap-secret
type: Opaque
data:
  cdap-security.xml: $CDAP_SECURITY
EOF
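
Because the XML is built by a heredoc with a nested `kubectl` call and then base64-encoded, a quick read-back can catch an empty password or a truncated document. This is a minimal sketch, assuming the `cdap-secret` name and the `cdap-security.xml` key used above. The helper name `show_security_xml` is hypothetical. Note that it prints the database password in clear text.

```shell
# Hypothetical helper: print the decoded cdap-security.xml from a Secret.
# Usage: show_security_xml [SECRET_NAME]
show_security_xml() {
  secret="${1:-cdap-secret}"
  # The dot in the key name must be escaped inside the jsonpath expression.
  kubectl get secret "${secret}" \
    -o 'jsonpath={.data.cdap-security\.xml}' | base64 --decode
}
```

Running `show_security_xml` should print the same `<configuration>` document that was encoded into `$CDAP_SECURITY`.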

Service Account and Cluster Role Binding

$ kubectl create serviceaccount cdap
$ kubectl create clusterrolebinding cdap --clusterrole=edit --serviceaccount=default:cdap
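
To check that the binding took effect, the API server can be asked what the service account is allowed to do. This is a minimal sketch, assuming the `cdap` service account in the `default` namespace created above. The helper name `sa_can_i` is hypothetical, and `create pods` in the usage note is only an example verb and resource pair, not a statement of what CDAP requires.

```shell
# Hypothetical helper: ask whether a service account may perform
# VERB on RESOURCE. kubectl prints "yes" or "no".
# Usage: sa_can_i NAMESPACE SERVICE_ACCOUNT VERB RESOURCE
sa_can_i() {
  # Impersonate the service account via its full user name.
  kubectl auth can-i "$3" "$4" \
    --as="system:serviceaccount:$1:$2" --namespace="$1"
}
```

For example, `sa_can_i default cdap create pods` queries whether the `cdap` account can create pods in `default`.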

Deploy a new CDAP instance

$ cat << EOF | kubectl apply -f -
apiVersion: cdap.cdap.io/v1alpha1
kind: CDAPMaster
metadata:
  name: test
spec:
  locationURI: hdfs://hadoop:9000
  serviceAccountName: cdap
  securitySecret: cdap-secret
  config:
    enable.preview: "true"
    data.storage.implementation: postgresql
    data.storage.sql.jdbc.connection.url: jdbc:postgresql://postgres-postgresql:5432/cdap
    data.storage.sql.jdbc.driver.name: org.postgresql.Driver
    metadata.storage.implementation: elastic
    metadata.elasticsearch.cluster.hosts: elasticsearch-master
    hdfs.user: root
EOF
$ minikube service cdap-test-userinterface --url
