Enhancing Kubernetes security while deploying a cloud native social media application on GCP.

Joshua Oji
12 min read · Sep 26, 2023


In this project, a popular social media application was experiencing scaling difficulties due to its rapid growth and increasing user base. The application started with a small user base, but as it gained popularity, more and more users joined and actively used the platform. As a result, the application began to struggle to scale its infrastructure to handle the growing number of users and their activities.

One specific scaling difficulty involved the server infrastructure. Initially, the application was hosted on a single server, but as the user base increased, that server struggled to handle the high volume of incoming requests. This led to slow response times, frequent crashes and an overall poor user experience. To address these scaling difficulties, the company decided to adopt a distributed architecture. They introduced load balancers to distribute incoming traffic across multiple servers and implemented horizontal scaling by adding more servers to the infrastructure. This allowed the application to handle a larger number of concurrent users and improved its overall performance.

However, as the user base continued to grow, the application faced a new challenge related to its database. The company decided to implement database sharding, partitioning the database into smaller shards. This approach helped distribute the database load and improved query response times. Additionally, the application’s caching strategy was optimized to reduce the load on the database: by implementing an efficient caching mechanism, frequently accessed data such as user profiles, posts and images were stored in memory, reducing the need for repeated database queries. Despite all these efforts, the application still encountered occasional scaling difficulties during peak usage periods, such as major events or viral trends.

As a recently hired DevOps engineer at the company, I was given the responsibility of addressing this growing issue affecting the social media application. To solve it I will be leveraging Kubernetes, an open source container orchestration platform. I will migrate the application to a Kubernetes cluster on Google Cloud, which will provide a scalable and resilient environment for managing containers. With Kubernetes, the application can be divided into smaller self-contained units called containers. Each container encapsulates a specific component or service of the application, such as the frontend, backend and database. I will be taking advantage of some of the features Kubernetes provides, such as:

  • Built-in load balancing capabilities.
  • Self-healing features: if a pod fails, Kubernetes automatically restarts or reschedules the affected containers onto healthy nodes, minimizing downtime and maintaining the application’s availability.

This project will be divided into multiple stages, and we will go through the various steps in each stage.

  • Stage 1: This is where we will develop the social media application from scratch using Flask and HTML, Dockerize the application with Docker, push the image to Google Container Registry and finally deploy it on a Kubernetes cluster in Google Kubernetes Engine.
  • Stage 2: This is the main focus of the project. In this stage we will go through the steps to implement Kubernetes security hardening.
  • Stage 3: In this stage we will implement security automation for Kubernetes by writing automation scripts that enhance the security of our Kubernetes environment.
  • Stage 4: We will get an insight into collaborating with cross-functional teams to understand their specific requirements and concerns.
  • Stage 5: This stage is where we will test and validate our security components by writing unit tests.
  • Stage 6: In this stage we will extend our security enhancements to our cloud provider (GCP).
  • Stage 7: Continuous monitoring.

Stage 1: Create the cloud native social media application with a Python (Flask) backend and an HTML front end, containerize it with Docker and deploy it on GKE

Prerequisites

  • GCP account
  • Programmatic access with the gcloud CLI configured
  • Python 3 installed
  • Docker and kubectl installed
  • Flask installed
  • Code editor (VS Code)

Step 1: Create our application from scratch

  • Create a new folder for our application.
  • Create a new file and name it with the .py extension
  • Install Python, Docker and Flask
Folder and app.py created. Python 3 and Docker installed.
kubectl also installed.
Flask code and installation
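The application code itself is shown in the screenshots; as a rough, minimal sketch of what app.py could look like (the route, the index.html template and the in-memory post list are assumptions for illustration, not the exact code):

# app.py - minimal Flask backend sketch (illustrative only)
from flask import Flask, render_template, request, redirect, url_for

app = Flask(__name__)

# In-memory store standing in for the real database layer
posts = []

@app.route("/", methods=["GET", "POST"])
def home():
    # On POST, store the submitted post; on GET, render the feed
    if request.method == "POST":
        posts.append({"user": request.form.get("user"), "text": request.form.get("text")})
        return redirect(url_for("home"))
    return render_template("index.html", posts=posts)

if __name__ == "__main__":
    # Listen on all interfaces so the app is reachable inside a container
    app.run(host="0.0.0.0", port=5000)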
  • Now that we have the code for our application, it’s time to run it on our local machine.
  • Run the command python3 app.py in your terminal and access the application on localhost:5000.
app.py successfully running on our local machine
  • Type localhost:5000 in your browser’s address bar to access the application.
app.py successfully running in our browser 🎉

Step 2: Dockerize our application

  • Create a Dockerfile in your code editor (a sample Dockerfile is sketched at the end of this step).
  • Go to Docker Hub and select the Python base image version we will be using.
  • Run docker build -t flask-social-media-app .
  • Once the build succeeds, run the docker images command to get a list of the images.
Fantastic: Docker image successfully built
  • The next step is to run the image and create a container.
  • Run docker run -p 5000:5000 <image-id>
Our application is successfully running in a Docker container 🥳
  • Type localhost:5000 in your browser to view the application running in the Docker container.
Fantastic🤩
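The Dockerfile contents are only shown in a screenshot; a minimal sketch of what it might contain (the base image tag and the requirements.txt file are assumptions) is:

# Dockerfile - minimal sketch for the Flask app (illustrative)
FROM python:3.9-slim

WORKDIR /app

# Install dependencies first to take advantage of layer caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code
COPY . .

EXPOSE 5000

CMD ["python3", "app.py"]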

Step 3: Create a container registry on Google Cloud and push our image to GCR.

  • Create a Google Cloud account.
  • Navigate to the Container Registry section and enable the API.
  • Activate the Google Cloud Shell.
  • Use the upload button to upload our Cloud_Native_Social_Media_Project folder containing all the files for our project.
  • Then run cd ~/Cloud_Native_Social_Media_Project to go inside the folder.
  • Run docker build -t gcr.io/<project-id>/python-image:<tag> . in your Cloud Shell.
  • Run docker images to check for images.
Images are present in our Cloud Shell.
  • We will also push our Docker image from the GCP Cloud Shell (the full sequence is sketched after this step).
  • Run docker push gcr.io/<project-id>/python-image:<tag>
Fantastic: Our image has been successfully pushed to the GCR🤩👏🏾
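Putting the Cloud Shell commands for this step together, the build-and-push sequence looks roughly like this (the project ID and tag are placeholders):

# Build the image with a GCR-style name, then push it (run from the project folder)
cd ~/Cloud_Native_Social_Media_Project
docker build -t gcr.io/<project-id>/python-image:<tag> .
docker images                                    # confirm the image exists
docker push gcr.io/<project-id>/python-image:<tag>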

Step 4: Create a Kubernetes cluster on GKE

  • Navigate to the Kubernetes Engine section and enable the Kubernetes Engine API.
  • We will use the Autopilot cluster creation mode so GKE handles the scaling of our application and helps control cost.
Cluster created successfully.

Step 5: Deploy our container image from GCR to GKE.

  • Click on the deploy button and deploy your image
  • Fill in the details and select the image you want to deploy.
Kubernetes manifest files.
  • deployment name: nginx-1
image successfully deployed.
  • Now confirm that our application has deployed successfully by connecting to our cluster from Cloud Shell.
  • Run kubectl get pods in your console.
Fantastic: We can see our pods up and running in our cloud shell
  • Now, to access our deployment from the browser, we have to expose it by creating a Service (an equivalent manifest is sketched at the end of this step).
  • Select the port number from our Dockerfile, which is 5000.
manifest file to expose our deployment service
  • Click on expose.
Application successfully exposed
  • Click on the external IP address to view our application running on the Kubernetes cluster.
Application successfully running on our cluster in GKE 🥳
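Behind the scenes the console generates Kubernetes manifests (shown in the screenshots above). Written by hand, an equivalent Deployment and Service would look roughly like this (the labels, replica count and Service port are assumptions; the deployment name, image and container port match the values used earlier):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-1                       # deployment name chosen in the console
spec:
  replicas: 3                         # replica count is an assumption
  selector:
    matchLabels:
      app: nginx-1
  template:
    metadata:
      labels:
        app: nginx-1
    spec:
      containers:
      - name: python-image
        image: gcr.io/<project-id>/python-image:<tag>
        ports:
        - containerPort: 5000
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-1-service               # exposes the deployment externally
spec:
  type: LoadBalancer
  selector:
    app: nginx-1
  ports:
  - port: 80
    targetPort: 5000                  # container port from the Dockerfile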

Congrats, we have successfully built our application from scratch, containerized it with Docker and deployed it in a Kubernetes cluster on GKE!!! 🥳🤩🎉

Stage 2: Implement Kubernetes security hardening.

Our social media application has been growing rapidly, resulting in increased traffic and a larger attack surface, and therefore creating a need to secure our Kubernetes cluster. Kubernetes isn’t secure by default; it has to be configured to improve its security posture. To enhance our Kubernetes cluster security we will implement security in the following ways.

1. Etcd Security / Secret Management: etcd is a highly available key-value store for cluster data, including your Secrets, so securing etcd is critical to cluster security. If someone obtains read or write access to etcd, it is effectively the same as giving them root access to our cluster. So how do we do this?

  • Encrypt etcd data at rest: go to the Kubernetes documentation and navigate to the “Encrypting Confidential Data at Rest” section.
  • Create an EncryptionConfiguration object.
  • Create a new file in your cluster folder and paste in the example below from the Kubernetes documentation.
---
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
      - configmaps
      - pandas.awesome.bears.example
    providers:
      - aescbc:
          keys:
            - name: key1
              # See the following text for more details about the secret value
              secret: <BASE 64 ENCODED SECRET>
      - identity: {} # this fallback allows reading unencrypted secrets;
                     # for example, during initial migration
  • Generate a secret.
  • Run the following command in your terminal:
head -c 32 /dev/urandom | base64
  • This will generate an output. Copy the result and paste it into the secret field in our YAML file.
  • The next step is to add the --encryption-provider-config flag to our kube-apiserver manifest and point it at the EncryptionConfiguration file we created. We also add the folder containing the file as a volume and mount it into the kube-apiserver pod (see the sketch below).
- --encryption-provider-config=<path-to-encryption-configuration-file>
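As a sketch of the volume part, assuming the EncryptionConfiguration file is saved as /etc/kubernetes/enc/enc.yaml on the control-plane node (the path and volume name here are assumptions), the kube-apiserver static pod manifest would gain entries like:

# excerpt from /etc/kubernetes/manifests/kube-apiserver.yaml (paths are assumptions)
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --encryption-provider-config=/etc/kubernetes/enc/enc.yaml
    volumeMounts:
    - name: enc
      mountPath: /etc/kubernetes/enc
      readOnly: true
  volumes:
  - name: enc
    hostPath:
      path: /etc/kubernetes/enc
      type: DirectoryOrCreate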
  • Once this is done, the kube-apiserver pod will be recreated. When it comes back up, any Secret we create will be encrypted.
  • Verify that a Secret is encrypted:
kubectl create secret generic secret1 -n default --from-literal=mykey=mydata
  • Read the secret out of etcd.
ETCDCTL_API=3 etcdctl \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
get /registry/secrets/default/secret1 | hexdump -C
  • Alternatively, if you saved the etcdctl output to a file named secret, run
cat secret | hexdump -C
  • Either way, the output should look something like this, with the k8s:enc:aescbc:v1: prefix confirming that the stored data is encrypted:
00000000  2f 72 65 67 69 73 74 72  79 2f 73 65 63 72 65 74  |/registry/secret|
00000010 73 2f 64 65 66 61 75 6c 74 2f 73 65 63 72 65 74 |s/default/secret|
00000020 31 0a 6b 38 73 3a 65 6e 63 3a 61 65 73 63 62 63 |1.k8s:enc:aescbc|
00000030 3a 76 31 3a 6b 65 79 31 3a c7 6c e7 d3 09 bc 06 |:v1:key1:.l.....|
00000040 25 51 91 e4 e0 6c e5 b1 4d 7a 8b 3d b9 c2 7c 6e |%Q...l..Mz.=..|n|
00000050 b4 79 df 05 28 ae 0d 8e 5f 35 13 2c c0 18 99 3e |.y..(..._5.,...>|
[...]
00000110 23 3a 0d fc 28 ca 48 2d 6b 2d 46 cc 72 0b 70 4c |#:..(.H-k-F.r.pL|
00000120 a5 fc 35 43 12 4e 60 ef bf 6f fe cf df 0b ad 1f |..5C.N`..o......|
00000130 82 c4 88 53 02 da 3e 66 ff 0a |...S..>f..|
0000013a
  • We can also encrypt our etcd data using Google Cloud KMS; this is the strongest option and has the most benefits.
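On GKE this is called application-layer secrets encryption and is enabled with a Cloud KMS key; as a sketch (the cluster name, region and key resource path are placeholders):

# Enable application-layer secrets encryption on an existing GKE cluster
gcloud container clusters update <cluster-name> \
  --region <region> \
  --database-encryption-key projects/<project-id>/locations/<region>/keyRings/<ring>/cryptoKeys/<key>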

ii. Another way to secure etcd is to restrict access to it. This involves isolating our etcd server, which we can achieve by placing a firewall between etcd and the API server so that only the API server can talk to etcd.

2. Network policies: these restrict unnecessary namespace and pod-to-pod access at the network and transport OSI layers.

  • Set a default deny-all policy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: mynamespace
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
  • Add allow rules as needed.
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: mypolicy            # policy name
  namespace: mynamespace    # namespace
spec:
  podSelector:
    matchLabels:
      role: db              # applied to pods with this label
  policyTypes:
  - Ingress                 # ingress or egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          color: backend    # allowed ingress from pods with this label
    ports:
    - protocol: TCP
      port: 5000            # port allowed
  • We can also use Istio which supports networking policies.

3. Pod to Pod Communication

  • Pod-to-pod traffic can also be secured by introducing Istio to our cluster and enforcing mutual TLS (mTLS) between services (see the sketch below).
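As a sketch, assuming Istio is already installed in the cluster, a mesh-wide PeerAuthentication resource can require mutual TLS for all pod-to-pod traffic:

# Require mutual TLS for all workloads in the mesh (applied in istio-system)
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT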

4. Secure Access

  • Use RBAC to limit what anonymous users can do.
  • Disable the insecure port.
  • Maintain a well-scoped, well-maintained set of RBAC roles and bindings.
  • Access-management solutions: an open-source application like Teleport provides secure access to Kubernetes clusters by way of short-lived kubeconfig files and certificates issued via single sign-on.
  • Teleport ensures that only authorized users can run commands like kubectl get pods; you connect Teleport to your cluster through the Teleport console.
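On a self-managed control plane, the first three points map to kube-apiserver flags roughly like the following sketch (on GKE the control plane is managed by Google, so these are handled for us; flag availability depends on the Kubernetes version):

# kube-apiserver flags related to secure access (self-managed clusters)
- --anonymous-auth=false          # reject unauthenticated (anonymous) requests
- --authorization-mode=Node,RBAC  # enforce RBAC authorization
- --insecure-port=0               # disable the insecure HTTP port (removed entirely in newer releases)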

5. Pod security

  • We can use vulnerability scanners like kube-score to look for potential holes in our YAML and containers and find problems before they are deployed to our ecosystem (see the example command after the manifest below).
  • Security context below: with runAsNonRoot set to true, the kubelet will refuse to start any image that defaults to the root user.
apiVersion: v1
kind: Pod
metadata:
  name: security-context-test
spec:
  securityContext:
    runAsNonRoot: true
  containers:
  - name: security-context-container
    image: busybox:1.28
    command: ["sh", "-c", "sleep 1h"]
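As a quick usage example for the kube-score suggestion above, the manifest can be scanned before it is applied (the file name is a placeholder):

# Static analysis of a manifest with kube-score
kube-score score security-context-test.yaml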

Stage 3: Security Automation for Kubernetes

  • We will design automation scripts in Python to enhance the security of our Kubernetes environment.
  • Automate routine security tasks, such as patch management, vulnerability scanning, and access control.
  • Create a workflow for automated incident response and remediation within Kubernetes clusters.

Below is a Python script that uses the Kubernetes Python client library (the kubernetes package) to automate a Kubernetes security configuration task: setting up RBAC (Role-Based Access Control) rules.

from kubernetes import client, config
from kubernetes.client.rest import ApiException

def configure_rbac():
    try:
        # Load Kubernetes configuration from the default location (~/.kube/config)
        config.load_kube_config()

        # Initialize Kubernetes API clients
        v1 = client.CoreV1Api()
        rbac_api = client.RbacAuthorizationV1Api()

        # Define RBAC rules
        namespace = 'default'
        role_name = 'example-role'
        role_binding_name = 'example-role-binding'
        api_group = ''  # core API group, which is where Pods live
        resource = 'pods'
        verbs = ['get', 'list']

        # Create the Namespace if it doesn't exist (ignore the 409 Conflict if it already does)
        namespace_obj = client.V1Namespace(metadata=client.V1ObjectMeta(name=namespace))
        try:
            v1.create_namespace(namespace_obj)
        except ApiException as e:
            if e.status != 409:
                raise

        # Create a Role
        role = client.V1Role(
            metadata=client.V1ObjectMeta(name=role_name),
            rules=[client.V1PolicyRule(api_groups=[api_group], resources=[resource], verbs=verbs)]
        )
        rbac_api.create_namespaced_role(namespace, role)

        # Create a RoleBinding (on newer versions of the client this class is named RbacV1Subject)
        role_binding = client.V1RoleBinding(
            metadata=client.V1ObjectMeta(name=role_binding_name),
            subjects=[client.V1Subject(kind='ServiceAccount', name='default', namespace=namespace)],
            role_ref=client.V1RoleRef(api_group='rbac.authorization.k8s.io', kind='Role', name=role_name)
        )
        rbac_api.create_namespaced_role_binding(namespace, role_binding)

        print("RBAC configuration applied successfully.")
    except Exception as e:
        print(f"An error occurred: {str(e)}")

if __name__ == "__main__":
    print("Starting Kubernetes security configuration automation...")

    try:
        configure_rbac()
    except ImportError:
        print("Please install the 'kubernetes' Python package to use this script.")
    except Exception as e:
        print(f"An error occurred: {str(e)}")

This script performs the following tasks:

  1. Loads the Kubernetes configuration from the default location (~/.kube/config).
  2. Initializes the Kubernetes API client.
  3. Creates the Namespace if it doesn't already exist.
  4. Defines RBAC rules, including a Role and RoleBinding.
  5. Applies RBAC configuration to the Kubernetes cluster.
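After running the script, the result can be verified with kubectl (the names match those used in the script):

# Confirm the Role and RoleBinding were created
kubectl get role example-role -n default -o yaml
kubectl get rolebinding example-role-binding -n default -o yaml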

Python script to perform basic etcd security configuration:

import etcd3

def configure_etcd_security():
    try:
        # Define the etcd endpoint and secure channel credentials
        etcd_host = 'your-etcd-server'
        etcd_port = 2379
        etcd_cert_file = '/path/to/client.crt'
        etcd_key_file = '/path/to/client.key'
        etcd_ca_cert = '/path/to/ca.crt'

        # Credentials for etcd authentication (used if auth is enabled on the cluster)
        username = 'your-username'
        password = 'your-password'

        # Connect to etcd securely using SSL/TLS client certificates and authenticate
        etcd = etcd3.client(
            host=etcd_host,
            port=etcd_port,
            ca_cert=etcd_ca_cert,
            cert_key=etcd_key_file,
            cert_cert=etcd_cert_file,
            user=username,
            password=password,
        )

        # Define and create a new etcd key
        key = '/example/key'
        value = 'example-value'
        etcd.put(key, value)

        # Retrieve the value of the key (etcd returns bytes)
        retrieved_value, metadata = etcd.get(key)

        if retrieved_value is not None and retrieved_value.decode() == value:
            print("Etcd security configuration applied successfully.")
        else:
            print("Etcd security configuration failed.")

    except Exception as e:
        print(f"An error occurred: {str(e)}")

if __name__ == "__main__":
    print("Starting etcd security configuration automation...")

    try:
        configure_etcd_security()
    except ImportError:
        print("Please install the 'etcd3' Python package to use this script.")
    except Exception as e:
        print(f"An error occurred: {str(e)}")

This script performs the following functions:

  1. Establishes a secure connection to etcd using SSL/TLS certificates.
  2. Authenticates to etcd using a username and password.
  3. Puts a key-value pair into etcd.
  4. Retrieves and verifies the value of the key.

Stage 5: Testing and Validation

  • We will write comprehensive unit tests for the security components and enhancements using Python.
import unittest
from kubernetes import client, config
from your_security_module import create_network_policy

class TestKubernetesSecurity(unittest.TestCase):

    def setUp(self):
        # Load Kubernetes configuration from the default location (~/.kube/config)
        config.load_kube_config()

    def test_create_network_policy(self):
        # Define the NetworkPolicy configuration
        policy_name = "test-network-policy"
        namespace = "default"
        pod_selector = {"matchLabels": {"app": "my-app"}}
        policy_types = ["Ingress"]
        ingress_rules = [
            {
                "from": [{"podSelector": {"matchLabels": {"role": "frontend"}}}],
                "ports": [{"protocol": "TCP", "port": 5000}],
            }
        ]

        # Create the NetworkPolicy
        created_policy = create_network_policy(
            name=policy_name,
            namespace=namespace,
            pod_selector=pod_selector,
            policy_types=policy_types,
            ingress_rules=ingress_rules,
        )

        # Retrieve the created NetworkPolicy from Kubernetes
        api_instance = client.NetworkingV1Api()
        retrieved_policy = api_instance.read_namespaced_network_policy(
            name=policy_name, namespace=namespace
        )

        # Assert that the created policy matches the retrieved policy
        self.assertEqual(created_policy.metadata.name, retrieved_policy.metadata.name)
        self.assertEqual(
            created_policy.spec.pod_selector, retrieved_policy.spec.pod_selector
        )
        self.assertEqual(created_policy.spec.policy_types, retrieved_policy.spec.policy_types)
        self.assertEqual(created_policy.spec.ingress, retrieved_policy.spec.ingress)

if __name__ == "__main__":
    unittest.main()

In this test:

  • We import the necessary modules and set up the Kubernetes configuration using config.load_kube_config() in the setUp method.
  • The test_create_network_policy method defines the desired NetworkPolicy configuration and calls create_network_policy. It then retrieves the created NetworkPolicy from Kubernetes.
  • The self.assertEqual assertions compare the expected and actual NetworkPolicy configurations to determine if they match.
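Assuming the test file is saved as test_kubernetes_security.py (the file name is a placeholder) and the local kubeconfig points at a reachable cluster, the tests can be run with:

# Run the unit tests
python3 -m unittest test_kubernetes_security -v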

Congrats, we have implemented the necessary stages to deploy a cloud native Python (Flask) application on GCP and enhance Kubernetes security within our cluster!!! 🤩🎉🙏🏽
