MinIO in Kubernetes (Minikube)

kurangdoa
5 min read · Aug 26, 2024

Intro

A nest is crucial for an ant because it keeps its food safe and easy to access when needed. Just like an ant's nest, the crucial part of a data lakehouse is the object storage where we can store all of the data.

https://www.woodants.org.uk/species/biology

I decided to use MinIO because I want to stick with a $0-cost data lakehouse. To get to know MinIO better, you can read this link.

To be honest, I used MinIO because there are abundant articles to help me set up the connection between MinIO and Iceberg. Also, MinIO is an S3-compatible object store, which makes it easier to switch in the future.

Deployment

If you haven't created a Minikube cluster yet, please create one as explained in the link below.

Now, there are several files that need to be created; they will be explained later. Let's focus on the deployment first. You might want to see the structure and each file in this GitHub repo: https://github.com/kurangdoa/lakehouse_iceberg/tree/main/psql
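
For orientation, these are the manifests used throughout this post (the file names below are the ones used in the detailed explanation; adjust them if your repo layout differs):

minio-pv.yaml                 # PersistentVolumes
minio-pvclaim.yaml            # PersistentVolumeClaims
minio-secret.yaml             # root password secret
minio-sts.yaml                # StatefulSet
minio-headless-service.yaml   # headless Service
minio-service.yaml            # LoadBalancer Service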

A simplified diagram can be seen below, where the StatefulSet works together with the headless Service so the load balancer knows which Pod to target (Pod 1, Pod 2, or Pod 3).

Simplified Diagram

First, we need to create the namespace

kubectl delete namespace minio-dev
kubectl create namespace minio-dev

Afterwards, we need to create the PVs and PVCs.

kubectl delete pv minio-volume-0
kubectl delete pv minio-volume-1
kubectl apply -f minio-pv.yaml -n minio-dev
kubectl apply -f minio-pvclaim.yaml -n minio-dev
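
Before moving on, you can check that each PV and PVC ends up Bound to its counterpart (kubectl get pv is cluster-scoped, so it does not need the namespace flag):

kubectl get pv
kubectl get pvc -n minio-dev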

The secret needs to be added to the cluster as well. In this tutorial, we will use "minio123" as the password. The output of the command below will be used inside minio-secret.yaml.

echo -n 'minio123' | base64
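# expected output: bWluaW8xMjM=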


kubectl apply -f minio-secret.yaml -n minio-dev
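
If you want to double-check that the secret landed correctly, you can decode it back out of the cluster (this simply reverses the base64 step above):

kubectl get secret minio-secret -n minio-dev -o jsonpath='{.data.minio-password}' | base64 --decode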

The main deployment consists of three parts: the StatefulSet, the headless Service, and the Service deployment. The special part of this section is the headless Service, which you can learn more about in this link.

kubectl apply -f minio-sts.yaml -n minio-dev
kubectl apply -f minio-headless-service.yaml -n minio-dev
kubectl apply -f minio-service.yaml -n minio-dev

You can check the deployment by typing the command below.

kubectl get all -n minio-dev

If you can see the services and pods running, you are good to go.
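
Because minio-service is of type LoadBalancer, on Minikube you also need minikube tunnel running in a separate terminal so that the 127.0.0.1 ports used below become reachable (more on this in the minio-service.yaml section):

minikube tunnel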

Now, you can open your browser and go to 127.0.0.1:6543. You can log in with "minio" as the username and "minio123" as the password.

For future deployments, you can create one bucket called "iceberg", either from the console or from the command line as sketched below.
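
A minimal command-line sketch using the MinIO client (mc), assuming you have mc installed locally and the tunnel from the previous step is still running; the alias name "local" is arbitrary:

mc alias set local http://127.0.0.1:6544 minio minio123
mc mb local/iceberg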

Detailed Explanation

Let's dive into the files one by one.

minio-secret.yaml

The minio-password value below is the base64-encoded form of "minio123".

apiVersion: v1
kind: Secret
metadata:
  name: minio-secret
type: Opaque
data:
  minio-password: bWluaW8xMjM=
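
As a side note, the same secret can be generated without hand-editing the base64 value; this is a sketch of the equivalent imperative command (kubectl does the encoding for you):

kubectl create secret generic minio-secret \
  --from-literal=minio-password=minio123 \
  -n minio-dev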

minio-pv.yaml

There is nothing special in this file except the claimRef part, which will be explained later.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: minio-volume-0
  labels:
    app: minio-app
spec:
  storageClassName: minio-manual
  claimRef:
    name: data-datasaku-minio-0
    namespace: minio-dev
  capacity:
    storage: 8Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: /data/minio/minio0
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: minio-volume-1
  labels:
    app: minio-app
spec:
  storageClassName: minio-manual
  claimRef:
    name: data-datasaku-minio-1
    namespace: minio-dev
  capacity:
    storage: 8Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: /data/minio/minio1

minio-pvclaim.yaml

The metadata.name such as "data-datasaku-minio-0" has a special pattern: it follows [spec.volumeClaimTemplates.metadata.name]-[metadata.name]-[ordinal based on the replica] from minio-sts.yaml, as spelled out below.
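
Written out for the values used in this post (two replicas, so ordinals 0 and 1):

# [volumeClaimTemplates.metadata.name]-[StatefulSet metadata.name]-[ordinal]
# data-datasaku-minio-0   -> claimed by pod datasaku-minio-0
# data-datasaku-minio-1   -> claimed by pod datasaku-minio-1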

# why we name the metadata like below, https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/preexisting-pd

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-datasaku-minio-0
  labels:
    app: minio-app
spec:
  storageClassName: minio-manual
  volumeName: minio-volume-0
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 8Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-datasaku-minio-1
  labels:
    app: minio-app
spec:
  storageClassName: minio-manual
  volumeName: minio-volume-1
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 8Gi

minio-sts.yaml

Continuing from the above, [spec.volumeClaimTemplates.metadata.name] will be "data" and [metadata.name] will be "datasaku-minio", while the ordinal follows the replica number (0 and 1, since there are two replicas).


apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: datasaku-minio
  namespace: minio-dev
spec:
  serviceName: minio
  replicas: 2
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
      labels:
        app: minio
    spec:
      containers:
        - name: minio
          env:
            - name: MINIO_ACCESS_KEY
              value: "minio"
            - name: MINIO_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: minio-secret
                  key: minio-password
          image: minio/minio
          args:
            - server
            - http://datasaku-minio-{0...1}.minio.minio-dev.svc/data/minio
            - --console-address # which port for console
            - ":9001"
          ports:
            - containerPort: 9000
          volumeMounts:
            - name: data
              mountPath: /data/minio

  volumeClaimTemplates:
    - apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: data
        namespace: minio-dev
      spec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: "8Gi"
        selector:
          matchLabels:
            app: minio-app
        storageClassName: minio-manual
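
One caveat worth flagging: recent minio/minio images use MINIO_ROOT_USER and MINIO_ROOT_PASSWORD instead of the older MINIO_ACCESS_KEY and MINIO_SECRET_KEY. If the credentials are not picked up with the env block above, a sketch of the equivalent block using the newer variable names (same secret, only the names change) would be:

env:
  - name: MINIO_ROOT_USER
    value: "minio"
  - name: MINIO_ROOT_PASSWORD
    valueFrom:
      secretKeyRef:
        name: minio-secret
        key: minio-password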

minio-headless-service.yaml

Because we are deploying a StatefulSet, a headless Service is needed, for the reasons mentioned in https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#limitations and https://stackoverflow.com/questions/52707840/what-is-a-headless-service-what-does-it-do-accomplish-and-what-are-some-legiti

apiVersion: v1
kind: Service
metadata:
  name: minio
  labels:
    app: minio
  namespace: minio-dev
spec:
  clusterIP: None
  ports:
    - port: 9000
      name: minio
  selector:
    app: minio
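
The headless Service is what gives each pod a stable DNS name of the form <pod>.<service>.<namespace>.svc, which is exactly what the server argument in minio-sts.yaml relies on. A quick way to check that the records resolve, using a throwaway busybox pod (assuming the image can be pulled in your cluster):

kubectl run dns-check --rm -it --restart=Never --image=busybox:1.36 -n minio-dev \
  -- nslookup datasaku-minio-0.minio.minio-dev.svc.cluster.local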

minio-service.yaml

It exposes the ports through the Minikube tunnel, so you can access the GUI at 127.0.0.1:6543 and the API at 127.0.0.1:6544.

apiVersion: v1
kind: Service
metadata:
  name: minio-service
  namespace: minio-dev
spec:
  type: LoadBalancer
  ports:
    - name: api
      port: 6544
      targetPort: 9000
      protocol: TCP
    - name: web
      port: 6543
      targetPort: 9001
  selector:
    app: minio
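
With minikube tunnel running in a separate terminal, a quick sanity check of the API port is MinIO's liveness endpoint; an HTTP 200 means the server is reachable through the tunnel:

curl -i http://127.0.0.1:6544/minio/health/live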

Closing Remark

Deploying MinIO with a StatefulSet on Kubernetes is straightforward, but it needs special attention to what the PVC names will be. Now, you can play around with MinIO according to your needs.

Links

https://github.com/minio/minio/issues/6775
