CP4D-WKC (3.5.6) Installation Issue: Zookeeper Stuck

Introduction

IBM® Cloud Pak for Data (CP4D) is a cloud-native solution that enables you to put your data to work quickly and efficiently. It does so by enabling you to connect to your data, govern it, find it, and use it for analysis. Cloud Pak for Data also enables all your data users to collaborate from a single, unified interface that supports many services designed to work together.

Watson Knowledge Catalog (WKC) is one of the components in CP4D that provides a secure enterprise Catalog management platform that is supported by a data governance framework. A Catalog connects people to the data and knowledge that they need. The data governance framework ensures that data access and data quality are compliant with your business rules and standards.

This blog outlines an issue with the Zookeeper component that surfaced while installing Watson Knowledge Catalog (WKC) on Cloud Pak for Data 3.5.6 for a customer, along with the workaround that was required to get past it. The environment is described in the Environment section below.

The blog aims to provide a prescriptive guide to follow if a similar issue arises in a comparable environment.

The Environment

Platform

Red Hat OpenShift Container Platform (version 4.6.39)

Cluster

  • 3 master nodes each having 8 CPU cores, 32 GB RAM
  • 6 worker nodes each having 16 CPU cores, 64 GB RAM

Virtualization

VMware vSphere (Client v7.0.2)

Cloud Pak

Cloud Pak for Data, version 3.5.6

NFS

Nutanix Files on CentOS 7

Ownership

Exports Configuration

OpenShift Project

Name: cpd35

ID Ranges

Zookeeper Issue

Symptoms

The iis module (release name: 0072-iis) remained in Failed status because the Zookeeper pod was not coming up. As a result, downstream pods such as Kafka remained in the PodInitializing state.
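To see at a glance which pods are stuck, the pod list can be filtered by status. The following is a minimal sketch, assuming the `oc` CLI is logged in to the cluster and that the project is `cpd35`. The helper name `not_ready_pods` is hypothetical and is not part of any IBM or Red Hat tooling.

```shell
# Hypothetical helper: reads `oc get pods --no-headers` output on stdin and
# prints the name and status of every pod whose STATUS column (third field)
# is neither Running nor Completed.
not_ready_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage (assumes an active `oc` login and the cpd35 project from this post):
#   oc get pods -n cpd35 --no-headers | not_ready_pods
```

The filter looks only at the STATUS column, so a pod that is Running but not yet ready would not be listed.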

Cause

The Zookeeper pod was not coming up because of the following error:

/usr/bin/zkGenConfig.sh: line 276: /var/lib/zookeeper/data/myid: Permission denied

/var/lib/zookeeper is mounted through an NFS-based PV.

The pod script was able to create directories (data and log) inside the mount point; however, it was not able to write files inside those directories.
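One way to confirm this from the NFS side is to inspect the mode and ownership of the two directories. Below is a minimal sketch, assuming GNU coreutils `stat` (as shipped with CentOS 7) and that the share is mounted on the host where it runs. The helper name `show_zk_perms` is hypothetical.

```shell
# Hypothetical helper: print octal mode, numeric owner and group, and path
# for the data and log directories under the given PV directory.
show_zk_perms() {
  pv_dir=$1
  for d in data log; do
    # %a = octal permissions, %u:%g = numeric UID and GID, %n = file name
    stat -c '%a %u:%g %n' "$pv_dir/$d"
  done
}

# Usage: show_zk_perms /path/to/nfs/share/<pv-directory>
```

Comparing the numeric owner and mode against the UID range assigned to the project can indicate whether the user the pod runs as would have write access to these directories.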

Workaround

The permissions of the data and log directories were changed manually in the NFS share location on the NFS server. Fortunately, the export share was mounted on the Bastion host, and root had the privilege to update the permissions.

So the workaround is to manually change the permissions of the data and log directories in the NFS share location for the PVC. This has to be repeated every time a new PVC is created, which happens whenever a new StatefulSet (STS) is created.

cd /var/nfsshare/cpd35-zookeeper-data-zookeeper-0-pvc-4a28baa2-8b11-4033-84de-c8b808753e33/

chmod 777 -R data/

chmod 777 -R log/
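Because the fix has to be reapplied for every new PVC, it can be wrapped in a small function. This is a sketch under stated assumptions, not a tested part of the original installation: the helper name `fix_zk_perms` is hypothetical, and it assumes it is run as root on a host where the NFS share is mounted.

```shell
# Hypothetical helper: apply the workaround from this post to one PV directory.
fix_zk_perms() {
  pv_dir=$1
  for d in data log; do
    # Stop early if the expected directory is not there, rather than
    # changing permissions on the wrong path.
    if [ ! -d "$pv_dir/$d" ]; then
      echo "missing directory: $pv_dir/$d" >&2
      return 1
    fi
    # Same effect as the manual chmod commands above.
    chmod -R 777 "$pv_dir/$d"
  done
}

# Usage (run as root on the host where the share is mounted):
#   fix_zk_perms /var/nfsshare/<pv-directory>
```

Mode 777 grants write access to every user. It is what the workaround used; a narrower mode may be sufficient, but that was not verified in this environment.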
