Deploy Rook/Ceph on ICP

Zhimin Wen
Aug 8, 2018

Recently, IBM API Connect v2018 has identified "severe performance degradation" when using GlusterFS. Please check out the notes on the IBM Knowledge Center.

It's time to explore storage options other than GlusterFS. One option is Ceph storage. In this story, we will explore Rook, which

runs as a cloud-native service for optimal integration with applications in need of storage, and handles the heavy-lifting behind the scenes such as provisioning and management.

Rook uses the Kubernetes Operator pattern to automate and manage storage for the cluster.

Let's see how I deployed Rook onto ICP 2.1.0.3.

Let's first add an additional raw disk to each worker's VM. As usual, I use the govc command line to manage vCenter.

govc vm.disk.create -ds=ICP_datastore -name ceph_disk_dev-worker1 -size=100GB -vm dev-worker1

A new raw disk, /dev/sdc, is then created on each of the three worker nodes.
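The same create can be scripted for all three workers; a minimal sketch, assuming the worker VMs are named dev-worker1 through dev-worker3 and all sit on the same ICP_datastore:

# Create a 100GB raw disk for each worker VM (adjust names to your vCenter inventory)
for vm in dev-worker1 dev-worker2 dev-worker3; do
  govc vm.disk.create -ds=ICP_datastore -name "ceph_disk_${vm}" -size=100GB -vm "${vm}"
done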

Let's clone the Git repo first: git clone https://github.com/rook/rook.git

The Rook operator YAML file is located at rook/cluster/examples/kubernetes/ceph/operator.yaml

Deploy it with kubectl apply -f operator.yaml

After deploying the operator, the expected rook-ceph-agent and rook-discover pods are not running.

$ kubectl -n rook-ceph-system get all
NAME                                      READY   STATUS    RESTARTS   AGE
pod/rook-ceph-operator-86776bbc44-vxsl6   1/1     Running   0          4h

NAME                              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/rook-ceph-agent    0         0         0       0            0           <none>          3h
daemonset.apps/rook-discover      0         0         0       0            0           <none>          3h

NAME                                 DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/rook-ceph-operator   1         1         1            1           4h

NAME                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/rook-ceph-operator-86776bbc44   1         1         1       4h

Check the rook-ceph-agent daemonset, which has 0 pods running.

kubectl describe daemonset.apps/rook-ceph-agent -n rook-ceph-system

I found the following events,

Events:
  Type     Reason        Age                 From                  Message
  ----     ------        ----                ----                  -------
  Warning  FailedCreate  19s (x101 over 4h)  daemonset-controller  Error creating: pods "rook-ceph-agent-" is forbidden: unable to validate against any pod security policy: [spec.securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed spec.containers[0].securityContext.hostNetwork: Invalid value: true: Host network is not allowed to be used]

In ICP 2.1.0.3, the Pod Security Policy feature is turned on by default, so we need to bind the privileged ClusterRole (which allows hostNetwork and privileged containers) to the service account rook-ceph-system,

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: sc-rook-ceph-system-privileged
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: privileged
subjects:
- kind: ServiceAccount
  name: rook-ceph-system
  namespace: rook-ceph-system

Save the above as a file, then run

kubectl apply -f rook-ceph-system_bind.yaml

Now the agent and discover pods are running,

$ k get -n rook-ceph-system pods
NAME                                  READY   STATUS    RESTARTS   AGE
rook-ceph-agent-6khsn                 1/1     Running   0          15s
rook-ceph-agent-qj7sn                 1/1     Running   0          15s
rook-ceph-agent-s8kdw                 1/1     Running   0          15s
rook-ceph-operator-86776bbc44-248vj   1/1     Running   0          2m
rook-discover-bgsqn                   1/1     Running   0          15s
rook-discover-dkk4k                   1/1     Running   0          15s
rook-discover-gfp9x                   1/1     Running   0          15s

Modify the rook/cluster/examples/kubernetes/ceph/cluster.yaml file, using the node names and the new disk just added,

apiVersion: v1
kind: Namespace
metadata:
  name: rook-ceph
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: [ "get", "list", "watch", "create", "update", "delete" ]
---
# Allow the operator to create resources in this cluster's namespace
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-cluster-mgmt
  namespace: rook-ceph
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rook-ceph-cluster-mgmt
subjects:
- kind: ServiceAccount
  name: rook-ceph-system
  namespace: rook-ceph-system
---
# Allow the pods in this namespace to work with configmaps
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rook-ceph-cluster
subjects:
- kind: ServiceAccount
  name: rook-ceph-cluster
  namespace: rook-ceph
---
apiVersion: ceph.rook.io/v1beta1
kind: Cluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  dataDirHostPath: /var/lib/rook
  serviceAccount: rook-ceph-cluster
  mon:
    count: 3
    allowMultiplePerNode: true
  dashboard:
    enabled: true
  network:
    hostNetwork: false
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: role
              operator: In
              values:
              - storage-node
      podAffinity:
      podAntiAffinity:
      tolerations:
      - key: storage-node
        operator: Exists
  resources:
  storage:
    useAllNodes: false
    useAllDevices: false
    deviceFilter:
    location:
    config:
      databaseSizeMB: "1024"
      journalSizeMB: "1024"
    nodes:
    - name: "dev-worker1"
      devices:
      - name: "sdc"
    - name: "dev-worker2"
      devices:
      - name: "sdc"
    - name: "dev-worker3"
      devices:
      - name: "sdc"

Before the deploy, let's label all three worker nodes so that Rook will run only on the worker nodes.

kubectl label node dev-worker1 role=storage-node
kubectl label node dev-worker2 role=storage-node
kubectl label node dev-worker3 role=storage-node
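To double-check the labels before deploying,

kubectl get nodes -l role=storage-node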

Deploy the cluster yaml file, kubectl apply -f cluster.yaml
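While the cluster comes up, you can watch the pods in the rook-ceph namespace; until the role binding below is in place, the pod security policy will keep some of them from being created:

kubectl -n rook-ceph get pods -w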

Since the Rook jobs require privileged pods, we also need to bind the privileged ClusterRole to this cluster's service account, as shown in the following YAML file,

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: rook-ceph-cluster-privileged
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: privileged
subjects:
- kind: ServiceAccount
  name: rook-ceph-cluster
  namespace: rook-ceph
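Save this to a file and apply it (the file name rook-ceph-cluster_bind.yaml here is just an example):

kubectl apply -f rook-ceph-cluster_bind.yaml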

Check all the pods again; all the OSDs are running now,

$ k -n rook-ceph get pods
NAME                                      READY   STATUS      RESTARTS   AGE
rook-ceph-mgr-a-6fcd7c87c9-6g76j          1/1     Running     0          30s
rook-ceph-mon0-frnb4                      1/1     Running     0          1m
rook-ceph-mon1-rqz9j                      1/1     Running     0          52s
rook-ceph-mon2-nctft                      1/1     Running     0          44s
rook-ceph-osd-id-0-779449457c-hxj89       1/1     Running     0          12s
rook-ceph-osd-id-1-567bd4468f-w5dqh       1/1     Running     0          10s
rook-ceph-osd-id-2-6849fd79df-x5d6d       1/1     Running     0          9s
rook-ceph-osd-prepare-dev-worker1-b2v24   0/1     Completed   0          26s
rook-ceph-osd-prepare-dev-worker2-lj527   0/1     Completed   0          26s
rook-ceph-osd-prepare-dev-worker3-z2btm   0/1     Completed   0          25s

Now we are ready to create the StorageClass. Create the following file,

apiVersion: ceph.rook.io/v1beta1
kind: Pool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: ceph.rook.io/block
parameters:
  pool: replicapool
  clusterNamespace: rook-ceph
  fstype: xfs

Deploy it, kubectl apply -f storage_class.yaml
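To confirm the new StorageClass is registered,

kubectl get storageclass rook-ceph-block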

As the last step, we can test PVC provisioning. Create the following PVC request file,

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-rook-ceph-test
spec:
  storageClassName: rook-ceph-block
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Apply it, and watch the PVC become bound.
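For example, assuming the PVC manifest above was saved as pvc_test.yaml (a placeholder name):

kubectl apply -f pvc_test.yaml
# Watch until the STATUS column shows Bound
kubectl get pvc pvc-rook-ceph-test -w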

In case you need to redeploy Rook, remove the Kubernetes objects in the reverse order of their creation.

You will also need to remove the files and clean up the partitions created by the last deployment, on all three worker nodes,

sudo rm -rf /var/lib/rook/*
sudo wipefs -a /dev/sdc
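If you have SSH access to the workers, a small loop can run the cleanup on all three nodes at once; a sketch, assuming passwordless sudo and the hostnames used above:

# Wipe Rook's data directory and the Ceph signatures on the raw disk of every worker
for node in dev-worker1 dev-worker2 dev-worker3; do
  ssh "${node}" 'sudo rm -rf /var/lib/rook/* && sudo wipefs -a /dev/sdc'
done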