Enhancing Kubernetes security while deploying a cloud native social media application on GCP.

Joshua Oji
12 min read · Sep 26, 2023


In this project, a popular social media application was experiencing scaling difficulties due to its rapid growth and increasing user base. The application started with a small user base, but as it gained popularity, more and more users joined and actively used the platform. As a result, the application began to struggle to scale its infrastructure to handle the growing number of users and their activities.

One specific scaling difficulty involved the server infrastructure. Initially, the application was hosted on a single server, but as the user base increased, that server struggled to handle the high volume of incoming requests. This led to slow response times, frequent crashes and an overall poor user experience. To address these scaling difficulties, the company decided to adopt a distributed architecture. They introduced load balancers to distribute incoming traffic across multiple servers and implemented horizontal scaling by adding more servers to the infrastructure. This allowed the application to handle a larger number of concurrent users and improved its overall performance.

However, as the user base continued to grow, the application faced a new challenge related to its database. The company decided to implement database sharding, partitioning the database into smaller shards. This approach helped distribute the database load and improved query response times. Additionally, the application’s caching strategy was optimized to reduce the load on the database: by implementing an efficient caching mechanism, frequently accessed data such as user profiles, posts and images were stored in memory, reducing the need for repeated database queries. Despite all these efforts, the application still encountered occasional scaling difficulties during peak usage periods, such as major events or viral trends.

As a recently hired DevOps engineer at the company, I was given the responsibility of addressing this growing issue affecting the social media application. To solve it I will be leveraging Kubernetes, an open source container orchestration platform. I will migrate the application to a Kubernetes cluster on Google Cloud, which will provide a scalable and resilient environment for managing containers. With Kubernetes, the application can be divided into smaller self-contained units called containers. Each container encapsulates a specific component or service of the application, such as the frontend, backend and database. I will be taking advantage of some of the features Kubernetes provides, such as:

  • Built-in load balancing capabilities.
  • Self-healing features: if a pod fails, Kubernetes automatically restarts or reschedules the affected containers onto healthy nodes, minimizing downtime and maintaining the application’s availability.

This project will be divided into multiple stages, and we will go through the various steps in each stage.

  • Stage 1: This is where we will develop the social media application from scratch using Flask and HTML, Dockerize the application with Docker, push the image to Google Container Registry and finally deploy it on a Kubernetes cluster in Google Kubernetes Engine.
  • Stage 2: This is the main focus of the project. In this stage we will go through the steps to implement Kubernetes security hardening.
  • Stage 3: In this stage we will implement security automation for Kubernetes by writing automation scripts that enhance the security of our Kubernetes environment.
  • Stage 4: We will get an insight into collaborating with cross-functional teams to understand their specific requirements and concerns.
  • Stage 5: This stage is where we will test and validate our security components by writing unit tests.
  • Stage 6: In this stage we will extend our security enhancements to our cloud provider (GCP).
  • Stage 7: Continuous monitoring.

Stage 1: Create the cloud native social media application with a Python (Flask) backend and an HTML front end, containerize it with Docker and deploy it on GKE

Prerequisites

  • GCP account
  • Programmatic access with the gcloud CLI configured
  • Python 3 installed
  • Docker and kubectl installed
  • Flask installed
  • Code editor (VS Code)

Step 1: Create our application from scratch

  • Create a new folder for our application.
  • Create a new file and name it with the .py extension
  • Install Python, Docker and Flask
Folder and app.py created. Python 3 and Docker installed.
kubectl also installed.
Flask code and installation
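The application code itself is shown in the screenshots; as a rough, minimal sketch of what app.py could look like (the route, the index.html template and the in-memory post list are assumptions for illustration, not the exact code):

# app.py - minimal Flask backend sketch (illustrative only)
from flask import Flask, render_template, request, redirect, url_for

app = Flask(__name__)

# In-memory store standing in for the real database layer
posts = []

@app.route("/", methods=["GET", "POST"])
def home():
    # On POST, store the submitted post; on GET, render the feed
    if request.method == "POST":
        posts.append({"user": request.form.get("user"), "text": request.form.get("text")})
        return redirect(url_for("home"))
    return render_template("index.html", posts=posts)

if __name__ == "__main__":
    # Listen on all interfaces so the app is reachable inside a container
    app.run(host="0.0.0.0", port=5000)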
  • Now that we have the code for our application, it’s time to run it on our local machine.
  • Run the command python3 app.py in your terminal and access the application on localhost:5000.
app.py successfully running on our local machine
  • Type localhost:5000 in your browser’s address bar to access the application.
app.py successfully running in our browser 🎉

Step 2: Dockerize our application

  • Create a Dockerfile in your code editor (a sample Dockerfile is sketched at the end of this step).
  • Go to Docker Hub and select the Python base image version we will be using.
  • Run docker build -t flask-social-media-app .
  • Once the build succeeds, run the docker images command to get a list of the images.
Fantastic: Docker image successfully built
  • The next step is to run the image and create a container.
  • Run docker run -p 5000:5000 <image-id>
Our application is successfully running in a Docker container 🥳
  • Type localhost:5000 in your browser to view the application running in the Docker container.
Fantastic🤩
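The Dockerfile contents are only shown in a screenshot; a minimal sketch of what it might contain (the base image tag and the requirements.txt file are assumptions) is:

# Dockerfile - minimal sketch for the Flask app (illustrative)
FROM python:3.9-slim

WORKDIR /app

# Install dependencies first to take advantage of layer caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY . .

EXPOSE 5000

CMD ["python3", "app.py"]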

Step 3: Create a container registry on Google Cloud and push our image to GCR.

  • Create a Google Cloud account.
  • Navigate to the Container Registry section and enable the API.
  • Activate the Google Cloud Shell.
  • Use the upload button to upload our Cloud_Native_Social_Media_Project folder containing all the files for our project.
  • Then run cd ~/Cloud_Native_Social_Media_Project to go inside the folder.
  • Run docker build -t gcr.io/<project-id>/python-image:<tag> . in your Cloud Shell.
  • Run docker images to check for images.
Images are present in our Cloud Shell.
  • We will also push our Docker image from the GCP Cloud Shell (the full sequence is sketched after this step).
  • Run docker push gcr.io/<project-id>/python-image:<tag>
Fantastic: Our image has been successfully pushed to the GCR🤩👏🏾
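Putting the Cloud Shell commands for this step together, the build-and-push sequence looks roughly like this (the project ID and tag are placeholders):

# Build the image with a GCR-style name, then push it (run from the project folder)
cd ~/Cloud_Native_Social_Media_Project
docker build -t gcr.io/<project-id>/python-image:<tag> .
docker images                                    # confirm the image exists
docker push gcr.io/<project-id>/python-image:<tag>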

Step 4: Create a Kubernetes cluster on GKE

  • Navigate to the Kubernetes Engine section and enable the Kubernetes Engine API.
  • We will use the Autopilot cluster creation mode so GKE handles the scaling of our application and helps control cost.
Cluster created successfully.

Step 5: Deploy our container image from GCR to GKE.

  • Click on the deploy button and deploy your image
  • Fill in the details and select the image you want to deploy.
Kubernetes manifest files.
  • deployment name: nginx-1
image successfully deployed.
  • Now confirm that our application has deployed successfully by connecting to our cluster from Cloud Shell.
  • Run kubectl get pods in your console.
Fantastic: We can see our pods up and running in our cloud shell
  • Now, to access our deployment from the browser, we have to expose it by creating a Service (an equivalent manifest is sketched at the end of this step).
  • Select the port number from our Dockerfile, which is 5000.
manifest file to expose our deployment service
  • Click on expose.
Application successfully exposed
  • Click on the external IP address to view our application running on the Kubernetes cluster.
Application successfully running on our cluster in GKE 🥳
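Behind the scenes the console generates Kubernetes manifests (shown in the screenshots above). Written by hand, an equivalent Deployment and Service would look roughly like this (the labels, replica count and Service port are assumptions; the deployment name, image and container port match the values used earlier):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-1                       # deployment name chosen in the console
spec:
  replicas: 3                         # replica count is an assumption
  selector:
    matchLabels:
      app: nginx-1
  template:
    metadata:
      labels:
        app: nginx-1
    spec:
      containers:
      - name: python-image
        image: gcr.io/<project-id>/python-image:<tag>
        ports:
        - containerPort: 5000
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-1-service               # exposes the deployment externally
spec:
  type: LoadBalancer
  selector:
    app: nginx-1
  ports:
  - port: 80
    targetPort: 5000                  # container port from the Dockerfile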

Congrats, we have successfully built our application from scratch, containerized it with Docker and deployed it in a Kubernetes cluster on GKE!!! 🥳🤩🎉

Stage 2: Implement Kubernetes security hardening.

Our social media application has been growing rapidly, resulting in increased traffic and a larger attack surface, and therefore creating a need to secure our Kubernetes cluster. Kubernetes isn’t secure by default; it has to be configured to improve its security posture. To enhance our Kubernetes cluster security we will implement security in the following ways.

1. Etcd Security / Secret Management: etcd is a highly available key-value store for cluster data, including your Secrets, so securing etcd is critical to cluster security. If someone obtains read or write access to etcd, it is effectively the same as giving them root access to our cluster. So how do we do this?

  • Encrypt etcd data at rest: go to the Kubernetes documentation and navigate to the “Encrypting Confidential Data at Rest” section.
  • Create an EncryptionConfiguration object.
  • Create a new file in your cluster folder and paste in the example below from the Kubernetes documentation.
---
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
      - configmaps
      - pandas.awesome.bears.example
    providers:
      - aescbc:
          keys:
            - name: key1
              # See the following text for more details about the secret value
              secret: <BASE 64 ENCODED SECRET>
      - identity: {} # this fallback allows reading unencrypted secrets;
                     # for example, during initial migration
  • Generate a secret.
  • Run the following command in your terminal:
head -c 32 /dev/urandom | base64
  • This will generate an output. Copy the result and paste it into the secret field in our YAML file.
  • The next step is to add the --encryption-provider-config flag to our kube-apiserver manifest and point it at the EncryptionConfiguration file we created. We also add the folder containing the file as a volume and mount it into the kube-apiserver pod (see the sketch below).
- --encryption-provider-config=<path-to-encryption-configuration-file>
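As a sketch of the volume part, assuming the EncryptionConfiguration file is saved as /etc/kubernetes/enc/enc.yaml on the control-plane node (the path and volume name here are assumptions), the kube-apiserver static pod manifest would gain entries like:

# excerpt from /etc/kubernetes/manifests/kube-apiserver.yaml (paths are assumptions)
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --encryption-provider-config=/etc/kubernetes/enc/enc.yaml
    volumeMounts:
    - name: enc
      mountPath: /etc/kubernetes/enc
      readOnly: true
  volumes:
  - name: enc
    hostPath:
      path: /etc/kubernetes/enc
      type: DirectoryOrCreate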
  • Once this is done, the kube-apiserver pod will be recreated. When it comes back up, any Secret we create will be encrypted.
  • Verify that a Secret is encrypted:
kubectl create secret generic secret1 -n default --from-literal=mykey=mydata
  • Read the secret out of etcd.
ETCDCTL_API=3 etcdctl \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
get /registry/secrets/default/secret1 | hexdump -C
  • Alternatively, if you saved the etcdctl output to a file named secret, run
cat secret | hexdump -C
  • Either way, the output should look something like this, with the k8s:enc:aescbc:v1: prefix confirming that the stored data is encrypted:
00000000  2f 72 65 67 69 73 74 72  79 2f 73 65 63 72 65 74  |/registry/secret|
00000010 73 2f 64 65 66 61 75 6c 74 2f 73 65 63 72 65 74 |s/default/secret|
00000020 31 0a 6b 38 73 3a 65 6e 63 3a 61 65 73 63 62 63 |1.k8s:enc:aescbc|
00000030 3a 76 31 3a 6b 65 79 31 3a c7 6c e7 d3 09 bc 06 |:v1:key1:.l.....|
00000040 25 51 91 e4 e0 6c e5 b1 4d 7a 8b 3d b9 c2 7c 6e |%Q...l..Mz.=..|n|
00000050 b4 79 df 05 28 ae 0d 8e 5f 35 13 2c c0 18 99 3e |.y..(..._5.,...>|
[...]
00000110 23 3a 0d fc 28 ca 48 2d 6b 2d 46 cc 72 0b 70 4c |#:..(.H-k-F.r.pL|
00000120 a5 fc 35 43 12 4e 60 ef bf 6f fe cf df 0b ad 1f |..5C.N`..o......|
00000130 82 c4 88 53 02 da 3e 66 ff 0a |...S..>f..|
0000013a
  • We can also encrypt our etcd data using Google Cloud KMS; this is the strongest option and has the most benefits.
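On GKE this is called application-layer secrets encryption and is enabled with a Cloud KMS key; as a sketch (the cluster name, region and key resource path are placeholders):

# Enable application-layer secrets encryption on an existing GKE cluster
gcloud container clusters update <cluster-name> \
  --region <region> \
  --database-encryption-key projects/<project-id>/locations/<region>/keyRings/<ring>/cryptoKeys/<key>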

ii. Another way to secure etcd is to restrict access to it. This involves isolating our etcd server, which we can achieve by placing a firewall between etcd and the API server so that only the API server can talk to etcd.

2. Network policies: these restrict unnecessary namespace and pod-to-pod access at the network and transport OSI layers.

  • Set a default deny-all policy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: mynamespace
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
  • Add allow rules as needed.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: mypolicy            # policy name
  namespace: mynamespace    # namespace
spec:
  podSelector:
    matchLabels:
      role: db              # applied to pods with this label
  policyTypes:
  - Ingress                 # ingress or egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          color: backend    # allowed ingress from pods with this label
    ports:
    - protocol: TCP
      port: 5000            # port allowed
  • We can also use Istio which supports networking policies.

3. Pod to Pod Communication

  • Pod-to-pod traffic can also be secured by introducing Istio to our cluster and enforcing mutual TLS (mTLS) between services (see the sketch below).
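As a sketch, assuming Istio is already installed in the cluster, a mesh-wide PeerAuthentication resource can require mutual TLS for all pod-to-pod traffic:

# Require mutual TLS for all workloads in the mesh (applied in istio-system)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT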

4. Secure Access

  • Use RBAC to limit what anonymous users can do.
  • Disable the insecure port.
  • Maintain a well-scoped, well-maintained set of RBAC roles and bindings.
  • Access-management solutions: an open-source application like Teleport provides secure access to Kubernetes clusters by way of short-lived kubeconfig files and certificates issued via single sign-on.
  • Teleport ensures that only authorized users can run commands like kubectl get pods; you connect Teleport to your cluster through the Teleport console.
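On a self-managed control plane, the first three points map to kube-apiserver flags roughly like the following sketch (on GKE the control plane is managed by Google, so these are handled for us; flag availability depends on the Kubernetes version):

# kube-apiserver flags related to secure access (self-managed clusters)
- --anonymous-auth=false          # reject unauthenticated (anonymous) requests
- --authorization-mode=Node,RBAC  # enforce RBAC authorization
- --insecure-port=0               # disable the insecure HTTP port (removed entirely in newer releases)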

5. Pod security

  • We can use vulnerability scanners like kube-score to look for potential holes in our YAML and containers and find problems before they are deployed to our ecosystem (see the example command after the manifest below).
  • Security context below: with runAsNonRoot set to true, the kubelet will refuse to start any image that defaults to the root user.
apiVersion: v1
kind: Pod
metadata:
  name: security-context-test
spec:
  securityContext:
    runAsNonRoot: true
  containers:
  - name: security-context-container
    image: busybox:1.28
    command: ["sh", "-c", "sleep 1h"]
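As a quick usage example for the kube-score suggestion above, the manifest can be scanned before it is applied (the file name is a placeholder):

# Static analysis of a manifest with kube-score
kube-score score security-context-test.yaml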

Stage 3: Security Automation for Kubernetes

  • We will design automation scripts in Python to enhance the security of our Kubernetes environment.
  • Automate routine security tasks, such as patch management, vulnerability scanning, and access control.
  • Create a workflow for automated incident response and remediation within Kubernetes clusters.

Below is a Python script that uses the Kubernetes Python client library (the kubernetes package) to automate a Kubernetes security configuration task: setting up RBAC (Role-Based Access Control) rules.

from kubernetes import client, config
from kubernetes.client.rest import ApiException

def configure_rbac():
    try:
        # Load Kubernetes configuration from the default location (~/.kube/config)
        config.load_kube_config()

        # Initialize Kubernetes API clients
        v1 = client.CoreV1Api()
        rbac_api = client.RbacAuthorizationV1Api()

        # Define RBAC rules
        namespace = 'default'
        role_name = 'example-role'
        role_binding_name = 'example-role-binding'
        api_group = ''  # core API group, which is where Pods live
        resource = 'pods'
        verbs = ['get', 'list']

        # Create the Namespace if it doesn't exist (ignore the 409 Conflict if it already does)
        namespace_obj = client.V1Namespace(metadata=client.V1ObjectMeta(name=namespace))
        try:
            v1.create_namespace(namespace_obj)
        except ApiException as e:
            if e.status != 409:
                raise

        # Create a Role
        role = client.V1Role(
            metadata=client.V1ObjectMeta(name=role_name),
            rules=[client.V1PolicyRule(api_groups=[api_group], resources=[resource], verbs=verbs)]
        )
        rbac_api.create_namespaced_role(namespace, role)

        # Create a RoleBinding (on newer versions of the client this class is named RbacV1Subject)
        role_binding = client.V1RoleBinding(
            metadata=client.V1ObjectMeta(name=role_binding_name),
            subjects=[client.V1Subject(kind='ServiceAccount', name='default', namespace=namespace)],
            role_ref=client.V1RoleRef(api_group='rbac.authorization.k8s.io', kind='Role', name=role_name)
        )
        rbac_api.create_namespaced_role_binding(namespace, role_binding)

        print("RBAC configuration applied successfully.")
    except Exception as e:
        print(f"An error occurred: {str(e)}")

if __name__ == "__main__":
    print("Starting Kubernetes security configuration automation...")

    try:
        configure_rbac()
    except ImportError:
        print("Please install the 'kubernetes' Python package to use this script.")
    except Exception as e:
        print(f"An error occurred: {str(e)}")

This script performs the following tasks:

  1. Loads the Kubernetes configuration from the default location (~/.kube/config).
  2. Initializes the Kubernetes API client.
  3. Creates the Namespace if it doesn't already exist.
  4. Defines RBAC rules, including a Role and RoleBinding.
  5. Applies RBAC configuration to the Kubernetes cluster.
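After running the script, the result can be verified with kubectl (the names match those used in the script):

# Confirm the Role and RoleBinding were created
kubectl get role example-role -n default -o yaml
kubectl get rolebinding example-role-binding -n default -o yaml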

Python script to perform basic etcd security configuration:

import etcd3

def configure_etcd_security():
    try:
        # Define the etcd endpoint and secure channel credentials
        etcd_host = 'your-etcd-server'
        etcd_port = 2379
        etcd_cert_file = '/path/to/client.crt'
        etcd_key_file = '/path/to/client.key'
        etcd_ca_cert = '/path/to/ca.crt'

        # Credentials for etcd authentication (used if auth is enabled on the cluster)
        username = 'your-username'
        password = 'your-password'

        # Connect to etcd securely using SSL/TLS client certificates and authenticate
        etcd = etcd3.client(
            host=etcd_host,
            port=etcd_port,
            ca_cert=etcd_ca_cert,
            cert_key=etcd_key_file,
            cert_cert=etcd_cert_file,
            user=username,
            password=password,
        )

        # Define and create a new etcd key
        key = '/example/key'
        value = 'example-value'
        etcd.put(key, value)

        # Retrieve the value of the key (etcd returns bytes)
        retrieved_value, metadata = etcd.get(key)

        if retrieved_value is not None and retrieved_value.decode() == value:
            print("Etcd security configuration applied successfully.")
        else:
            print("Etcd security configuration failed.")

    except Exception as e:
        print(f"An error occurred: {str(e)}")

if __name__ == "__main__":
    print("Starting etcd security configuration automation...")

    try:
        configure_etcd_security()
    except ImportError:
        print("Please install the 'etcd3' Python package to use this script.")
    except Exception as e:
        print(f"An error occurred: {str(e)}")

This script performs the following functions:

  1. Establishes a secure connection to etcd using SSL/TLS certificates.
  2. Authenticates to etcd using a username and password.
  3. Puts a key-value pair into etcd.
  4. Retrieves and verifies the value of the key.

Stage 5: Testing and Validation

  • We will write comprehensive unit tests for the security components and enhancements using Python.
import unittest
from kubernetes import client, config
from your_security_module import create_network_policy

class TestKubernetesSecurity(unittest.TestCase):

    def setUp(self):
        # Load Kubernetes configuration from the default location (~/.kube/config)
        config.load_kube_config()

    def test_create_network_policy(self):
        # Define the NetworkPolicy configuration
        policy_name = "test-network-policy"
        namespace = "default"
        pod_selector = {"matchLabels": {"app": "my-app"}}
        policy_types = ["Ingress"]
        ingress_rules = [
            {
                "from": [{"podSelector": {"matchLabels": {"role": "frontend"}}}],
                "ports": [{"protocol": "TCP", "port": 5000}],
            }
        ]

        # Create the NetworkPolicy
        created_policy = create_network_policy(
            name=policy_name,
            namespace=namespace,
            pod_selector=pod_selector,
            policy_types=policy_types,
            ingress_rules=ingress_rules,
        )

        # Retrieve the created NetworkPolicy from Kubernetes
        api_instance = client.NetworkingV1Api()
        retrieved_policy = api_instance.read_namespaced_network_policy(
            name=policy_name, namespace=namespace
        )

        # Assert that the created policy matches the retrieved policy
        self.assertEqual(created_policy.metadata.name, retrieved_policy.metadata.name)
        self.assertEqual(
            created_policy.spec.pod_selector, retrieved_policy.spec.pod_selector
        )
        self.assertEqual(created_policy.spec.policy_types, retrieved_policy.spec.policy_types)
        self.assertEqual(created_policy.spec.ingress, retrieved_policy.spec.ingress)

if __name__ == "__main__":
    unittest.main()

In this test:

  • We import the necessary modules and set up the Kubernetes configuration using config.load_kube_config() in the setUp method.
  • The test_create_network_policy method defines the desired NetworkPolicy configuration and calls create_network_policy. It then retrieves the created NetworkPolicy from Kubernetes.
  • The self.assertEqual assertions compare the expected and actual NetworkPolicy configurations to determine if they match.
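Assuming the test file is saved as test_kubernetes_security.py (the file name is a placeholder) and the local kubeconfig points at a reachable cluster, the tests can be run with:

# Run the unit tests
python3 -m unittest test_kubernetes_security -v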

Congrats, we have implemented the necessary stages to deploy a cloud native Python (Flask) application on GCP and enhance Kubernetes security within our cluster!!! 🤩🎉🙏🏽
