Published in The Startup

Build Ceph and Kubernetes Based Distributed File Storage System

In this article, we are going to build a Ceph and Kubernetes based distributed file storage system and integrate it into our Java platform.

Install

Download rook

Download the Rook Ceph code from GitHub (release-1.2 branch).

git clone --single-branch --branch release-1.2 https://github.com/rook/rook.git

Copy yaml

Copy the common.yaml, operator.yaml and toolbox.yaml files from the rook/cluster/examples/kubernetes/ceph/ directory into the working directory.

cp ../rook/cluster/examples/kubernetes/ceph/common.yaml ../rook/cluster/examples/kubernetes/ceph/operator.yaml ../rook/cluster/examples/kubernetes/ceph/toolbox.yaml .

Create rook containers

Create the Rook Ceph containers, and wait for the operator pod to reach Running status.

kubectl apply -f common.yaml
kubectl apply -f operator.yaml
kubectl get pod -n rook-ceph

Create volumes

Create three 20 GB volumes and attach them to three Linux instances. Format the disks, create the /mnt/ceph-storage directory on each instance, and mount the disks there.

sudo fdisk -l
sudo mkfs.ext4 /dev/vdb
sudo mkdir -p /mnt/ceph-storage
sudo mount /dev/vdb /mnt/ceph-storage -t auto
df -h
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 30G 19G 11G 65% /
devtmpfs 2.9G 0 2.9G 0% /dev
tmpfs 2.9G 12K 2.9G 1% /dev/shm
tmpfs 2.9G 298M 2.6G 11% /run
tmpfs 2.9G 0 2.9G 0% /sys/fs/cgroup
tmpfs 581M 0 581M 0% /run/user/1000
/dev/vdb 20G 45M 19G 1% /mnt/ceph-storage
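The mount above does not survive a reboot. A minimal sketch of making it persistent, assuming /dev/vdb is the data disk as in the df output above (device names vary per instance):

```shell
# Look up the UUID of the formatted disk and register a boot-time mount.
UUID=$(sudo blkid -s UUID -o value /dev/vdb)
echo "UUID=${UUID} /mnt/ceph-storage ext4 defaults,nofail 0 2" | sudo tee -a /etc/fstab
# mount -a re-reads fstab, so a typo shows up now rather than at the next boot.
sudo mount -a
```

The nofail option keeps the instance bootable even if the volume is later detached.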

Create Ceph cluster

Create the cluster from cluster.yaml. Make sure the selected nodes have the disks attached, and that each nodes/name field matches the node's kubernetes.io/hostname label.

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
  cephVersion:
    image: ceph/ceph:v14.2.4-20190917
    allowUnsupported: false
  dashboard:
    enabled: false
  network:
    hostNetwork: false
  storage:
    useAllNodes: false
    useAllDevices: false
    config:
      metadataDevice:
      databaseSizeMB: "1024" # this value can be removed for environments with normal sized disks (100 GB or larger)
      journalSizeMB: "1024" # this value can be removed for environments with normal sized disks (20 GB or larger)
    nodes:
    - name: "slave.novalocal"
      directories: # specific directories to use for storage can be specified for each node
      - path: "/mnt/ceph-storage"
    - name: "static.novalocal"
      directories:
      - path: "/mnt/ceph-storage"
    - name: "db.novalocal"
      directories:
      - path: "/mnt/ceph-storage"

Note that the dashboard is disabled for now; it will be enabled in a later step.

kubectl apply -f cluster.yaml
kubectl get pod -n rook-ceph --watch

Re-create operator and check osd

If an error occurs, delete the dataDirHostPath folder on all nodes and delete the operator, then create the cluster again. Attach to the operator pod to view its logs. When three mon pods and three osd pods are Running, the cluster has been created successfully.

sudo rm -rf /var/lib/rook
kubectl delete -f operator.yaml
kubectl logs -f rook-ceph-operator-6b79d99f5c-9564s -n rook-ceph
kubectl get pod -n rook-ceph
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-265lj 3/3 Running 0 37m
csi-cephfsplugin-2b47v 3/3 Running 0 37m
csi-cephfsplugin-2dw6z 3/3 Running 0 37m
csi-cephfsplugin-2xgns 3/3 Running 0 37m
csi-cephfsplugin-54hd5 3/3 Running 0 37m
csi-cephfsplugin-9rgr8 3/3 Running 0 37m
csi-cephfsplugin-provisioner-5d999b68d6-8vc2k 4/4 Running 0 37m
csi-cephfsplugin-provisioner-5d999b68d6-xjg4z 4/4 Running 0 37m
csi-cephfsplugin-sp289 3/3 Running 0 37m
csi-rbdplugin-7gk9s 3/3 Running 0 37m
csi-rbdplugin-9sbwh 3/3 Running 0 37m
csi-rbdplugin-jwvxx 3/3 Running 0 37m
csi-rbdplugin-n9tcd 3/3 Running 0 37m
csi-rbdplugin-provisioner-69b7d7887-4l9l9 5/5 Running 0 37m
csi-rbdplugin-provisioner-69b7d7887-g48rb 5/5 Running 0 37m
csi-rbdplugin-q9q5b 3/3 Running 0 37m
csi-rbdplugin-snhqj 3/3 Running 0 37m
csi-rbdplugin-v6jb8 3/3 Running 0 37m
rook-ceph-crashcollector-db.novalocal-fd7dcc457-qd7tx 1/1 Running 0 11m
rook-ceph-crashcollector-deamon.novalocal-7789457f5-v6vnc 1/1 Running 0 11m
rook-ceph-crashcollector-slave.novalocal-5d698c7d7b-hgbnw 1/1 Running 0 9m59s
rook-ceph-crashcollector-static.novalocal-58f769ccc-lsf5b 1/1 Running 0 9m54s
rook-ceph-crashcollector-test.novalocal-6fdc8dbc4f-ksk75 1/1 Running 0 10m
rook-ceph-mgr-a-7898b59757-84tbd 1/1 Running 0 10m
rook-ceph-mon-a-7676f96769-5mhc6 1/1 Running 0 12m
rook-ceph-mon-b-79c9c9b59d-g82sc 1/1 Running 0 11m
rook-ceph-mon-c-7b679d7497-x2hjg 1/1 Running 0 11m
rook-ceph-operator-6b79d99f5c-9564s 1/1 Running 0 14m
rook-ceph-osd-0-5b59576bb4-rgsx8 1/1 Running 0 9m57s
rook-ceph-osd-1-74ff9d79c6-hpkc2 1/1 Running 0 9m55s
rook-ceph-osd-2-57748ff6bf-vf8jl 1/1 Running 0 9m59s
rook-ceph-osd-prepare-db.novalocal-jf84v 0/1 Completed 0 10m
rook-ceph-osd-prepare-slave.novalocal-j457l 0/1 Completed 0 10m
rook-ceph-osd-prepare-static.novalocal-jt4vj 0/1 Completed 0 10m
rook-discover-hwffd 1/1 Running 0 13m
rook-discover-jh8j5 1/1 Running 0 13m
rook-discover-mcdjn 1/1 Running 0 13m
rook-discover-mmzcb 1/1 Running 0 13m
rook-discover-mppgh 1/1 Running 0 13m
rook-discover-tz5gm 1/1 Running 0 13m
rook-discover-vj7tg 1/1 Running 0 13m

Create toolbox and test

Create the toolbox pod from toolbox.yaml; we can then attach to the pod and check the Ceph status.

kubectl apply -f toolbox.yaml
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
ceph status
ceph osd status
ceph df
rados df
[root@rook-ceph-tools-787dc6b944-spsjh /]# ceph status
  cluster:
    id:     2095eaca-93b3-4365-a5c8-9b05269821a9
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 26m)
    mgr: a(active, since 26m)
    osd: 3 osds: 3 up (since 25m), 3 in (since 25m)

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   6.4 GiB used, 52 GiB / 59 GiB avail
    pgs:

Enable dashboard

Modify cluster.yaml to set dashboard enabled to true, then apply it to create the dashboard service.

vim cluster.yaml
kubectl apply -f cluster.yaml
kubectl get svc -n rook-ceph |grep mgr-dashboard
rook-ceph-mgr-dashboard ClusterIP 10.36.19.173 <none> 7000/TCP 66s

Create dashboard ingress

The dashboard service is currently of type ClusterIP, which means it can only be visited from inside the cluster. Create a Traefik ingress for the dashboard from the yaml file dashboard-ingress.yaml.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ceph-dashboard-ingress
  namespace: rook-ceph
spec:
  rules:
  - host: dashboard.*.*
    http:
      paths:
      - path: /
        backend:
          serviceName: rook-ceph-mgr-dashboard
          servicePort: 7000

Now the dashboard can be visited publicly.

kubectl apply -f dashboard-ingress.yaml

Inspect dashboard password

Inspect the dashboard secret to get the password. The username is admin.

kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo

Create object gateway

Create object gateway by the object.yaml.

apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: my-store
  namespace: rook-ceph
spec:
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
  dataPool:
    failureDomain: host
    replicated:
      size: 3
  preservePoolsOnDelete: false
  gateway:
    type: s3
    sslCertificateRef:
    port: 80
    securePort:
    instances: 1
    placement:
    annotations:
    resources:

Check the pods after the creation.

kubectl apply -f object.yaml
kubectl -n rook-ceph get pod -l app=rook-ceph-rgw

Create radosgw user

Create a radosgw user in the toolbox, then register the generated access keys with the dashboard.

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
radosgw-admin user create --uid=myuser --display-name=test-user --system
ceph dashboard set-rgw-api-user-id myuser
ceph dashboard set-rgw-api-access-key 32APIT3RA29JCO6OCR8P
ceph dashboard set-rgw-api-secret-key 2ioxTu6iBFkYP8UKiycS90A2DFwRBklSI8Bp3iPQ
{
    "user_id": "myuser",
    "display_name": "test-user",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "subusers": [],
    "keys": [
        {
            "user": "myuser",
            "access_key": "32APIT3RA29JCO6OCR8P",
            "secret_key": "2ioxTu6iBFkYP8UKiycS90A2DFwRBklSI8Bp3iPQ"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "system": "true",
    "default_placement": "",
    "default_storage_class": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}
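Copying the keys out of that JSON by hand is error-prone; they can be parsed out instead. A sketch, with the radosgw-admin output stubbed using the values above (inside the toolbox you would capture the real output with radosgw-admin user create ... > user.json):

```shell
# Stub of the radosgw-admin JSON, trimmed to the fields we need (example values from above).
cat > user.json <<'EOF'
{"user_id": "myuser", "keys": [{"user": "myuser", "access_key": "32APIT3RA29JCO6OCR8P", "secret_key": "2ioxTu6iBFkYP8UKiycS90A2DFwRBklSI8Bp3iPQ"}]}
EOF
# Pull the first key pair out of the JSON.
access_key=$(python3 -c "import json; print(json.load(open('user.json'))['keys'][0]['access_key'])")
secret_key=$(python3 -c "import json; print(json.load(open('user.json'))['keys'][0]['secret_key'])")
# These feed straight into the dashboard commands shown above:
#   ceph dashboard set-rgw-api-access-key "$access_key"
#   ceph dashboard set-rgw-api-secret-key "$secret_key"
echo "$access_key"
```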

Test S3 service in cluster

Attach to the toolbox and test the object storage from inside the cluster.

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
yum --assumeyes install s3cmd
# The .s3cfg content is as below. host_base is where the rgw service listens: run kubectl -n rook-ceph get svc rook-ceph-rgw-my-store and combine the ClusterIP with the port.
vi .s3cfg
s3cmd mb s3://test-bucket
s3cmd ls
[default]
access_key = Y14QX4KYOBCdvwMU6E5R
secret_key = AbeWMQPzpGhZPCMOq9IEkZSxLIgtooQsdvx4Cb4v
host_base = 10.100.191.33
host_bucket = 10.100.191.33/%(bucket)
use_https = False
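The .s3cfg above can also be generated rather than edited by hand. A sketch, using the example ClusterIP and keys shown (make_s3cfg is a hypothetical helper, not part of s3cmd):

```shell
# Emit a minimal s3cmd config for a given rgw endpoint and key pair.
make_s3cfg() {
  host="$1"; access="$2"; secret="$3"
  printf '[default]\naccess_key = %s\nsecret_key = %s\nhost_base = %s\nhost_bucket = %s/%%(bucket)\nuse_https = False\n' \
    "$access" "$secret" "$host" "$host"
}
# The host would come from: kubectl -n rook-ceph get svc rook-ceph-rgw-my-store
make_s3cfg 10.100.191.33 Y14QX4KYOBCdvwMU6E5R AbeWMQPzpGhZPCMOq9IEkZSxLIgtooQsdvx4Cb4v > s3cfg.generated
cat s3cfg.generated
```

Copy s3cfg.generated over ~/.s3cfg, or pass it explicitly with s3cmd -c s3cfg.generated, before running the bucket commands.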

Create S3 external service

Create the external service for the object store by using NodePort in the rgw-external.yaml.

apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-rgw-my-store-external
  namespace: rook-ceph
  labels:
    app: rook-ceph-rgw
    rook_cluster: rook-ceph
    rook_object_store: my-store
spec:
  ports:
  - name: rgw
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: rook-ceph-rgw
    rook_cluster: rook-ceph
    rook_object_store: my-store
  sessionAffinity: None
  type: NodePort

We cannot use Traefik here because it automatically redirects HTTP to HTTPS, which s3cmd cannot handle.

kubectl apply -f rgw-external.yaml

Test S3 service outside cluster

Test the object storage outside the cluster. Remember to replace the credentials and endpoint.

# On Windows, download the s3cmd code from GitHub, run "python s3cmd --configure" to save the configuration, then edit it with the content below.
# Run kubectl -n rook-ceph get service rook-ceph-rgw-my-store-external, then combine a node's public IP and the external port as host_base below.
# Run "python s3cmd ls" to test on Windows.
[default]
access_key = Y14Qa4KYOBC83Vsev6E5R
secret_key = AbeWMQPzeGhZPCMOq9IEkZSLIsetooQcUfx4Cb4v
host_base = *:*
host_bucket = *:*/%(bucket)
use_https = False

Test S3 service in Java

Test the object storage with Java code. Remember to replace the credentials and endpoint. The code works for both Amazon S3 and Ceph S3 except for the conn part.

//The conn here is for Ceph S3
AWSCredentials credentials = new BasicAWSCredentials("***", "***");
ClientConfiguration clientConfig = new ClientConfiguration();
clientConfig.setProtocol(Protocol.HTTP);
AmazonS3 conn = AmazonS3Client.builder()
.withCredentials(new AWSStaticCredentialsProvider(credentials))
.withClientConfiguration(clientConfig) //Important for Ceph
.withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration("*:*", null)) //Important for Ceph
.enablePathStyleAccess() //Important for Ceph
.build();

//The conn here is for Amazon S3
//AWSCredentials credentials = new BasicAWSCredentials("***", "***");
//AmazonS3 conn = AmazonS3Client.builder()
// .withRegion("ap-northeast-1") //Important for Amazon
// .withCredentials(new AWSStaticCredentialsProvider(credentials))
// .build();

File file = new File("C:\\Users\\fanf\\Pictures\\test.jpg"); // backslashes must be escaped in Java string literals
FileInputStream fis = new FileInputStream(file);
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(file.length());
metadata.setContentType("image/jpg");
conn.putObject("test-bucket", "test.jpg", fis, metadata);
conn.setObjectAcl("test-bucket", "test.jpg", CannedAccessControlList.PublicRead);

ListObjectsRequest listObjectsRequest =
new ListObjectsRequest().withBucketName("test-bucket").withDelimiter("test-bucket/");
ObjectListing objects2 = conn.listObjects(listObjectsRequest);
Helper.println(objects2);

conn.setBucketAcl("test-bucket", CannedAccessControlList.PublicReadWrite); // setBucketPolicy expects a JSON policy document, not a canned ACL name
Bucket bucket2 = conn.createBucket("new-bucket");
ByteArrayInputStream input = new ByteArrayInputStream("Hello World!".getBytes());
conn.putObject(bucket2.getName(), "hello.txt", input, new ObjectMetadata());
conn.setObjectAcl(bucket2.getName(), "hello.txt", CannedAccessControlList.PublicRead);

List<Bucket> buckets = conn.listBuckets();
for (Bucket bucket : buckets) {
Helper.println(bucket.getName() + "\t" +
StringUtils.fromDate(bucket.getCreationDate()));
ObjectListing objects = conn.listObjects(bucket.getName());
do {
for (S3ObjectSummary objectSummary : objects.getObjectSummaries()) {
Helper.println(objectSummary.getKey() + "\t" +
objectSummary.getSize() + "\t" +
StringUtils.fromDate(objectSummary.getLastModified()));
}
objects = conn.listNextBatchOfObjects(objects);
} while (objects.isTruncated());
}

Frankie Fan

Researcher | Architect | Full-Stack | @hustakin