How to List the Jobs Running on Flink
Apache Flink is an open-source framework for data streaming. It can handle both batch and real-time streaming, offers rich API libraries, and can scale horizontally by adjusting the parallelism of its operators, allowing it to handle large-scale data processing workloads.
Kubernetes doesn't support Flink directly, so new Custom Resource Definitions (CRDs) and Custom Resources can be created to handle all the requirements of a Flink deployment.
In Apache Flink, there are two primary modes for submitting and executing jobs: FlinkDeployment and FlinkSessionJob.
Here is the link to set up the Flink cluster, which will create the flinkdeployments and flinksessionjobs resources in Kubernetes. Once that is done, let's verify them with the command kubectl api-resources | grep flink
With a simple YAML manifest, I'll show how to create a FlinkDeployment in Minikube.
```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: demo
  labels:
    flink-deployment: app-test
spec:
  image: flink:1.16
  flinkVersion: v1_16
  jobManager:
    resource:
      memory: "1200m"
      cpu: 0.3
  taskManager:
    resource:
      memory: "1200m"
      cpu: 0.3
  serviceAccount: flink
```
Now port-forward the service to the host so that the Flink dashboard can be accessed. The command for this is: kubectl port-forward svc/{flinkdeploymentname}-rest 8081:8081
Note: for every FlinkDeployment, one service named {flinkdeploymentname}-rest will be created, exposing port 8081.
No jobs are running for now, so create a YAML file with the kind "FlinkSessionJob" and set deploymentName under spec as shown below, so that the job runs on a particular FlinkDeployment.
```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
  name: test
spec:
  deploymentName: demo
  job:
    jarURI: https://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming_2.12/1.16.1/flink-examples-streaming_2.12-1.16.1-TopSpeedWindowing.jar
    parallelism: 1
    upgradeMode: stateless
```
Apply this YAML to Kubernetes with the command kubectl apply -f filename.yaml
Now check the dashboard again; there will be one job running.
To see more details about the jobs, there is a REST endpoint, /jobs/overview.
This is how we can list the jobs present on any FlinkDeployment; they might be in a RUNNING or CANCELED state.
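As a quick sketch, the /jobs/overview response can be fetched and summarized in a few lines of Python. This assumes the rest service is port-forwarded to localhost:8081 as above; the parse_jobs_overview helper and the sample payload are illustrative, not part of Flink itself.

```python
import json
from urllib.request import urlopen

# Hypothetical helper: reduce a /jobs/overview payload to (name, state) pairs.
def parse_jobs_overview(payload: dict) -> list[tuple[str, str]]:
    return [(job["name"], job["state"]) for job in payload.get("jobs", [])]

def list_jobs(base_url: str = "http://localhost:8081") -> list[tuple[str, str]]:
    # Assumes the {flinkdeploymentname}-rest service is port-forwarded to 8081.
    with urlopen(f"{base_url}/jobs/overview") as resp:
        return parse_jobs_overview(json.load(resp))

if __name__ == "__main__":
    # Sample payload shaped like a /jobs/overview response (illustrative only).
    sample = {"jobs": [{"jid": "a1b2", "name": "CarTopSpeedWindowingExample",
                        "state": "RUNNING"}]}
    print(parse_jobs_overview(sample))
```

Running list_jobs() against the port-forwarded dashboard should return one entry for the session job created above.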
Conclusion:-
1. To get all the jobs, we can use Kubernetes to list all the flinksessionjobs that have been created for a particular flinkdeployment.
2. Or, since the rest service holds all the job information, we can either query the service or use the Flink dashboard to list all the jobs.
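The first approach can be sketched in Python over the JSON that kubectl get flinksessionjobs -o json prints. The summarize_session_jobs helper and the status.jobStatus.state field it reads are assumptions about the operator's CR layout, so treat this as a sketch rather than a definitive implementation.

```python
import json
import subprocess

# Hypothetical helper: map each FlinkSessionJob CR in a kubectl JSON list
# to a (job name, target deployment, job state) tuple.
def summarize_session_jobs(cr_list: dict) -> list[tuple[str, str, str]]:
    out = []
    for item in cr_list.get("items", []):
        name = item["metadata"]["name"]
        deployment = item.get("spec", {}).get("deploymentName", "")
        # Assumed layout: the operator records the state under status.jobStatus.state.
        state = item.get("status", {}).get("jobStatus", {}).get("state", "UNKNOWN")
        out.append((name, deployment, state))
    return out

def list_session_jobs() -> list[tuple[str, str, str]]:
    # Shells out to kubectl; requires a cluster with the flink CRDs installed.
    raw = subprocess.check_output(["kubectl", "get", "flinksessionjobs", "-o", "json"])
    return summarize_session_jobs(json.loads(raw))
```

With the manifests above applied, list_session_jobs() should report the "test" job targeting the "demo" deployment.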
In the next part, I'll show how to do the same with the Gin framework using the client-go library.