Ceph S3 Object Storage from Fluentd (EFK Stack)

Vijay Nallagatla
Feb 15 · 3 min read

I found the Fluentd documentation hard to follow when I tried to push logs from Fluentd to Ceph’s S3-compatible object storage. This post shows how to store logs in Ceph’s S3 object storage using Fluentd.

Ceph Storage with Rook

Follow the steps in Rook’s GitHub documentation to set up Rook with Ceph storage:
https://github.com/rook/rook/blob/master/Documentation/ceph-quickstart.md

Setting Up EFK stack on Kubernetes cluster

The easiest way is to clone the official Kubernetes Git repository:
git clone https://github.com/kubernetes/kubernetes.git

Navigate to kubernetes/cluster/addons/fluentd-elasticsearch/ to find the deployment YAML files for:
* Elasticsearch (StatefulSet)
* Fluentd
* Kibana

cd kubernetes/cluster/addons/fluentd-elasticsearch/
kubectl create -f es-service.yaml
kubectl create -f es-statefulset.yaml
kubectl create -f fluentd-es-configmap.yaml
kubectl create -f fluentd-es-ds.yaml
# fluentd-es-image holds the image build files rather than a manifest, so there is nothing to create from it
kubectl create -f kibana-deployment.yaml
kubectl create -f kibana-service.yaml

Note: For development or testing, you can change the ‘type’ field in kibana-service.yaml to NodePort to make the Kibana dashboard reachable from outside the cluster.
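
If you would rather not edit the manifest, the same change can be sketched as a patch against the live Service. The helper name expose_as_nodeport is hypothetical, and the Service name and namespace in the example call (kibana-logging in kube-system) are assumptions about what kibana-service.yaml defines in your checkout, so confirm them with kubectl get svc first.

```shell
# Hypothetical helper: switch an existing Service to type NodePort.
# $1 = namespace, $2 = Service name (both are values you must supply).
expose_as_nodeport() {
  kubectl -n "$1" patch service "$2" -p '{"spec":{"type":"NodePort"}}'
}

# Example call (assumed names, verify them in your cluster):
#   expose_as_nodeport kube-system kibana-logging
```

After patching, kubectl get svc shows the node port that was assigned to the Service.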

The out_s3 output plugin writes records to the Amazon S3 cloud object storage service. By default, it creates files on an hourly basis.

Fluentd’s out_s3 also works with S3-compatible object storage implementations. Ceph provides object storage with an interface that is compatible with a large subset of the Amazon S3 RESTful API.

Installation

out_s3 is included in td-agent by default.

Note: Fluentd gem users will need to install the fluent-plugin-s3 gem. In order to install it, please refer to the Plugin Management article.

Example Configuration

This configuration pushes the logs of all services running in the cluster to Ceph’s S3 object storage in JSON format:

<match **>
  @type s3
  aws_key_id CEPH_S3_KEY_ID
  aws_sec_key CEPH_S3_SECRET_KEY
  s3_bucket CEPH_S3_BUCKET_NAME
  s3_endpoint CEPH_S3_URL_WITH_STORE_NAME
  path logs
  # objects are gzipped by default; store_as json writes plain JSON instead
  store_as json
  <buffer tag,time>
    @type file
    path /var/log/fluent/s3
    timekey 3600 # 1 hour partition
    timekey_wait 10m
    timekey_use_utc true # use UTC
    chunk_limit_size 256m
  </buffer>
</match>
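
To make the uploaded objects easier to find later, here is a small sketch of how their names are put together. It assumes the plugin’s documented default s3_object_key_format of %{path}%{time_slice}_%{index}.%{file_extension}, an hourly time slice of the form YYYYMMDDHH for a timekey of 3600, and a json extension for store_as json. The function name s3_object_key is hypothetical; it only mimics the naming and does not call Fluentd.

```shell
# Hypothetical helper that mimics the assumed default key format:
#   %{path}%{time_slice}_%{index}.%{file_extension}
# $1 = path, $2 = time slice (YYYYMMDDHH), $3 = index, $4 = extension
s3_object_key() {
  printf '%s%s_%s.%s\n' "$1" "$2" "$3" "$4"
}

# "path logs" has no trailing slash, so the prefix runs into the timestamp.
s3_object_key logs 2019021510 0 json

# With "path logs/" the objects would sit under a logs/ prefix instead.
s3_object_key logs/ 2019021510 0 json
```

Under these assumptions the first call prints logs2019021510_0.json and the second prints logs/2019021510_0.json, so a trailing slash on path is worth considering if you want the logs grouped under one prefix.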

You can connect to Ceph’s S3 storage with the s3cmd tool. To install s3cmd:

sudo apt-get update
sudo apt-get install s3cmd

To consume the S3 storage, export the connection details:

export AWS_HOST=<host>
export AWS_ENDPOINT=<endpoint>
export AWS_ACCESS_KEY_ID=<accessKey>
export AWS_SECRET_ACCESS_KEY=<secretKey>
  • Host: The DNS host name where the rgw service is found in the cluster. Assuming you are using the default rook-ceph cluster, it will be rook-ceph-rgw-my-store.rook-ceph.
  • Endpoint: The endpoint where the rgw service is listening. Run kubectl -n rook-ceph get svc rook-ceph-rgw-my-store, then combine the clusterIP and the port.
  • Access key: kubectl -n rook-ceph get secret rook-ceph-object-user-my-store-my-user -o yaml | grep AccessKey | awk '{print $2}' | base64 --decode
  • Secret key: kubectl -n rook-ceph get secret rook-ceph-object-user-my-store-my-user -o yaml | grep SecretKey | awk '{print $2}' | base64 --decode
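
The two grep/awk pipelines above can also be sketched with kubectl’s jsonpath output, which avoids parsing YAML by hand. The helper name rook_s3_credential is hypothetical; the Secret name and namespace are the ones used above and assume the default my-store and my-user from the Rook example.

```shell
# Hypothetical helper: print one decoded field of the Rook object-user Secret.
# $1 = field name in the Secret's data map (AccessKey or SecretKey)
rook_s3_credential() {
  kubectl -n rook-ceph get secret rook-ceph-object-user-my-store-my-user \
    -o "jsonpath={.data.$1}" | base64 --decode
}

# Usage (requires access to the cluster):
#   export AWS_ACCESS_KEY_ID=$(rook_s3_credential AccessKey)
#   export AWS_SECRET_ACCESS_KEY=$(rook_s3_credential SecretKey)
```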

Listing files in the bucket with s3cmd

s3cmd ls s3://S3_BUCKET_NAME --no-ssl --host=$AWS_HOST
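
Since every s3cmd call against Ceph needs the same connection flags, a small wrapper can save some typing. This is only a sketch: the function name ceph_s3cmd is hypothetical, and it assumes AWS_HOST, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are exported as shown earlier.

```shell
# Hypothetical wrapper: run any s3cmd subcommand against the Ceph endpoint
# in $AWS_HOST over plain HTTP, using the same flags as the listing command.
ceph_s3cmd() {
  s3cmd --no-ssl --host="$AWS_HOST" "$@"
}

# Usage:
#   ceph_s3cmd ls s3://S3_BUCKET_NAME
```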

Summary

We deployed the EFK stack and Rook with a Ceph storage cluster on Kubernetes, created a Ceph object store, and referenced it in the Fluentd configuration so that the out_s3 plugin writes logs to it. Finally, we accessed the S3 storage with the s3cmd tool.
