Kubernetes Mastery: Day 8: How to Resolve It?

Prakhar Gandhi
Google Cloud - Community
3 min read · Jun 2, 2024

So, in this article, I will focus on how to resolve the most common resource-utilization and performance issues in Kubernetes.
For instance:
1. Monitoring Resource Utilization:

Hands-on Exercise: Set up Prometheus and Grafana for monitoring resource utilization in a Kubernetes cluster.

Steps:

  1. Install Prometheus and Grafana in your Kubernetes cluster (a sample Helm-based install is sketched after these steps).
  2. Configure Prometheus to scrape metrics from Kubernetes components (nodes, pods, API server).
  3. Create Grafana dashboards to visualize CPU, memory, disk I/O, and network metrics.
  4. Explore the dashboards and identify resource utilization patterns.
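
As a minimal sketch for steps 1 and 2, assuming Helm is installed and your kubeconfig points at the cluster, the kube-prometheus-stack chart bundles Prometheus, Grafana, and scrape configs for the core Kubernetes components (the release name monitoring is an assumption here):

# Add the community chart repo and install Prometheus + Grafana as one bundle
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Installs Prometheus, Grafana, and default scrape configs for nodes, pods, and the API server
helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace

# Port-forward Grafana locally; the service name follows the <release>-grafana pattern
kubectl port-forward svc/monitoring-grafana 3000:80 -n monitoring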

2. Identifying Resource Bottlenecks:

Hands-on Exercise: Use Prometheus and Grafana to identify a resource bottleneck in a sample application.

Steps:

  1. Monitor CPU and memory utilization metrics of pods using Grafana (example commands and a PromQL query follow these steps).
  2. Identify pods with consistently high resource usage.
  3. Check if these pods have appropriate resource requests and limits set.
  4. Analyze logs and events to pinpoint the cause of resource contention.
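
For a quick first pass on steps 1–3, you can also work from the command line. The PromQL query below is a sketch and assumes the default cAdvisor metric names shipped with kube-prometheus-stack; <pod-name> is a placeholder:

# Show the pods currently consuming the most CPU / memory
kubectl top pods --all-namespaces --sort-by=cpu
kubectl top pods --all-namespaces --sort-by=memory

# Check what requests/limits a suspect pod actually has set
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[*].resources}'

# PromQL: per-pod CPU usage over the last 5 minutes
sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod)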

3. Optimizing Resource Requests and Limits:

Hands-on Exercise: Optimize resource requests and limits for a deployment based on observed usage patterns.

Steps:

  1. Analyze resource usage patterns of your application using monitoring tools.
  2. Adjust resource requests and limits for pods to match actual usage (see the sample manifest after these steps).
  3. Deploy the updated configuration and observe the impact on resource utilization.
  4. Repeat the process iteratively to fine-tune resource allocation.
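
As a concrete example for step 2, here is what a tuned resources stanza might look like. The deployment name, image, and numbers are hypothetical; the real values should come from your observed usage:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app              # hypothetical deployment name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web-app
        image: nginx:1.25    # placeholder image
        resources:
          requests:
            cpu: 200m        # roughly the observed steady-state usage
            memory: 256Mi
          limits:
            cpu: 500m        # headroom for spikes without starving neighbors
            memory: 512Mi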

4. Scaling Applications for Performance Improvement:

Hands-on Exercise: Set up a Horizontal Pod Autoscaler (HPA) for a deployment and observe its behavior under a varying workload.

Steps:

  1. Define resource utilization thresholds (e.g., CPU) for autoscaling.
  2. Configure HPA for the deployment with appropriate scaling parameters.
  3. Generate load on the application to trigger autoscaling (a quick load generator is sketched after these steps).
  4. Monitor HPA events and observe how it dynamically scales the number of pod replicas.
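
For steps 3 and 4, a throwaway busybox pod is a quick way to generate load. The service name web-app and HPA name web-app-hpa are assumptions for this sketch:

# Hammer the service in a loop to drive CPU usage up
kubectl run load-generator --image=busybox:1.36 --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://web-app; done"

# Watch the autoscaler add replicas as utilization crosses the threshold
kubectl get hpa web-app-hpa --watch

# Clean up afterwards
kubectl delete pod load-generator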

Now, some issues based on real-life examples and how to resolve them in one go:

Scenario 1: E-commerce Application Performance:

Scenario: An e-commerce application experiences slow response times during peak traffic hours due to CPU bottlenecks.

Task: Use HPA to automatically scale the number of pod replicas based on CPU utilization to handle increased traffic load.

Solution:
HPA automatically adjusts the number of pod replicas based on CPU utilization, ensuring that your application can handle increased traffic load efficiently. Here’s a step-by-step guide on how to set up HPA:

  1. Ensure Metrics Server is Installed: HPA relies on Metrics Server to gather metrics such as CPU utilization. Make sure Metrics Server is installed and running in your Kubernetes cluster (an install command is given after these steps).
  2. Set Resource Requests and Limits: Define appropriate CPU resource requests and limits for your application pods in the Kubernetes deployment or pod specification. This helps Kubernetes understand how much CPU each pod requires and limits excessive CPU usage.
  3. Create a HorizontalPodAutoscaler: Define an HPA object that specifies the scaling behavior based on CPU utilization. You can specify the target CPU utilization percentage and the minimum/maximum number of replicas for your pods. Note that autoscaling/v2 is the stable API; the older autoscaling/v2beta2 version was removed in Kubernetes 1.26.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: your-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-app-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  4. Replace your-app-hpa with a suitable name for your HPA object, your-app-deployment with the name of your deployment, and adjust the averageUtilization value according to your requirements.
  5. Apply the HPA Configuration: Apply the HPA configuration to your Kubernetes cluster using kubectl apply:
kubectl apply -f your-hpa-config.yaml
  6. Monitor HPA: Monitor the HPA's behavior using the kubectl get hpa command. You can see the current CPU utilization, the target utilization, and the number of replicas being managed by the HPA.
  7. Test and Iterate: After applying the HPA, simulate peak traffic or gradually increase traffic to your application to observe how the HPA scales the number of pod replicas in response to CPU utilization. You may need to adjust the HPA configuration parameters based on your application's behavior and performance requirements.
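
For step 1, if Metrics Server is not already present, the upstream manifest installs it in one command; and once the HPA is live, the watch below shows utilization and replica counts changing (your-app-hpa matches the sketch above):

# Install Metrics Server (official manifest from the kubernetes-sigs project)
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

# Confirm the HPA can read metrics; TARGETS should show e.g. 42%/50% rather than <unknown>
kubectl get hpa your-app-hpa --watch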

Scenario 2: Database Performance Degradation:

Scenario: A database pod struggles to handle a sudden increase in read/write operations, impacting overall application performance.

Task: Implement custom metrics to monitor database performance indicators (e.g., query latency) and scale the database pod horizontally based on these metrics.

Solution: This can be resolved with an approach similar to the one above; we just need to change the metric type to External and target our latency metric instead.
For example:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: database-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: database-deployment
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metric:
        name: custom_metric_query_latency
      target:
        type: AverageValue
        averageValue: "100"  # assumes the metric is exported in milliseconds

Replace database-hpa with an appropriate name for your HPA object, database-deployment with the name of your database deployment, and adjust the averageValue according to your performance thresholds. Note that Kubernetes quantities do not accept a ms suffix, so the value above assumes the latency metric is already exported in milliseconds.
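
One caveat: external metrics are not served by Metrics Server. The HPA can only see custom_metric_query_latency if an external metrics adapter (for example, prometheus-adapter) is installed and exposing it. A quick way to check whether the external metrics API is serving anything:

# Lists the external metrics currently exposed (an empty resource list means no adapter is serving them)
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"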

And it's done.
Hope you liked it!
