Security Fundamentals

Santosh P.
31 min read · Aug 23, 2024


Security is one of the most critical components of a distributed architecture. Its intent is to secure the infrastructure, the data and other workloads, and the services, allowing only authorized users to access resources in a large-scale distributed system. Most distributed systems today follow a microservice-based model, and microservice applications have a much larger attack surface than monolithic applications. Microservices are therefore susceptible to cyberattacks such as man-in-the-middle, injection attacks, cross-site scripting, DDoS, and many more. That is why a microservice-based architecture needs security at every layer, and as system designers we should have a basic understanding of security terminology.

Security is a vast and complex topic, and it is hard to cover everything in a single blog. Here I have covered the basics; as a reader, if you are interested, you can dig deeper for a complete understanding of security infrastructure.

With that in mind, I have divided this security insight blog into the following sections. These are just the basics; advanced topics are covered elsewhere.

  • Common security terminology, covering cryptography, keys and tokens, etc.
  • Authentication and authorization, to ensure users are who they say they are and have permission to access resources.
  • Encryption, to protect sensitive data at rest and in transit.
  • Firewalls and intrusion detection, network security measures that detect and prevent unauthorized access. You can find this in the system design of the load balancer.
  • Vulnerabilities and their mitigation: more information about vulnerabilities and their mitigation in a separate blog.

Common Security Terminology

Cryptography

Cryptography is the technique of securing information by transforming a request or data into a form that is unreadable by unauthorized users. Cryptography ensures secure communication, protects data integrity, and guarantees the authenticity of information in applications ranging from online banking to secure messaging.

The key areas cryptography applies to are:

1. Key.

2. Tokens.

3. Hashing.

4. Digital signatures.

You will come across different cryptographic algorithms throughout this blog.

# Key

A key is cryptographically generated information used in various algorithms to perform encryption and decryption, as well as other cryptographic functions like signing and verifying data.

The security of cryptographic systems depends heavily on the secrecy and management of these keys. The basic workflow of key management is:

  1. Key generation
  2. Key storage
  3. Key distribution
  4. Key usability
  5. Key rotation and expiry

Effective key management is vital for maintaining the security of cryptographic systems. It ensures that keys are handled securely throughout their lifecycle, from generation to destruction, minimizing the risk of key compromise and ensuring compliance with security policies.

## Key Types

### Symmetric key: Symmetric key cryptography uses a single key for both encryption and decryption. Symmetric keys are generated by random number generators (RNGs), such as those provided by cryptographic libraries (e.g., /dev/urandom on Unix-like systems).

The key length is crucial for security. Common symmetric key lengths are 128, 192, or 256 bits (as used in AES). The longer the key, the harder it is to exploit; brute-force attacks against AES-256 are significantly more difficult than against shorter keys.

### Asymmetric Key: Asymmetric cryptography uses a key pair: a public key for encryption and a private key for decryption. In RSA, a pair is generated by choosing two large prime numbers, computing their product (n), and finding an exponent (e) that is relatively prime to (p-1)(q-1), where p and q are the prime numbers.

Asymmetric keys are generally longer than symmetric keys, with RSA keys often being 2048 or 4096 bits; ECC keys are shorter (e.g., 256 or 384 bits) but provide equivalent security.

# Key generation:

Key generation ensures the creation of secure, unique keys that can be used for encryption, decryption, authentication, and more.

Keys can be generated in many ways and forms, with different lengths, using different algorithms; it all depends on what kind of key we are looking for. So let's discuss how some commonly used keys are generated.

## Symmetric Key with AES (Advanced Encryption Standard):

AES key generation is a critical step in the encryption process, as the 
security of the encrypted data depends heavily on the quality and secrecy of
the key.

The key is a randomly generated binary string of a specific length (128, 192,
or 256 bits).

The first step in generating an AES key is to produce a random sequence of
bits. A Cryptographically Secure Pseudo-Random Number Generator (CSPRNG)
is often used to generate these random bits.

The randomness is typically derived from sources of entropy, such as hardware-based random number generators or operating system entropy pools. Low-entropy sources can result in predictable keys.

Depending on the desired level of security, the key length can be 128, 192,
or 256 bits. For example, in AES-128, a 128-bit key is used, which is 16 bytes
long. In AES-256, a 256-bit key is used, which is 32 bytes long.

There are other symmetric algorithms by which a key can be generated, each with its own pros and cons, but AES is the most widely used:

- DES (Data Encryption Standard)
- 3DES (Triple DES)
- RC4
- Blowfish
- ChaCha20

## Asymmetric security algorithm with RSA

Public and private keys in asymmetric cryptography are generated as a pair
using complex mathematical algorithms. The process ensures that the keys are mathematically related but cannot be derived from one another easily.

RSA is the most common algorithm for generating these keys. RSA (Rivest-Shamir-Adleman) is a public-key cryptosystem used for secure data transmission.


Both the public and private keys are based on two large prime numbers, p and q. The product of these primes, n = p × q, forms the modulus for both the public and private keys.

* Public Key: Composed of the modulus n and a public exponent e.
* Private Key: Consists of the modulus n and a private exponent d, which
is derived using p, q, and e.

- Select Two Large Prime Numbers (p and q)
- Compute the Modulus (n)
- Calculate the Totient (φ(n))
- Choose the Public Exponent (e)
- Calculate the Private Exponent (d)
- Form the Public and Private Keys

RSA keys are typically represented in several different formats, depending on their intended use.

PEM (Privacy-Enhanced Mail) is the most commonly used RSA key format. It is a Base64-encoded format wrapped with header and footer lines.

DER (Distinguished Encoding Rules) format is the binary format for RSA keys. The DER format does not have headers/footers because it is binary, not text.

The following example shows how an RSA key pair can be generated in golang.

package main

import (
	"crypto/rand"
	"crypto/rsa"
	"fmt"
)

func main() {
	// Generate RSA keys
	privateKey, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		fmt.Println("Error generating RSA key:", err)
		return
	}
	// Extract public key from the generated private key
	publicKey := &privateKey.PublicKey
	// Print the keys
	fmt.Println("Private Key:", privateKey)
	fmt.Println("Public Key:", publicKey)
}

Some other options of asymmetric key cryptography are:

ECC (Elliptic Curve Cryptography)
* Public Key: A point on the elliptic curve P, derived from the private
key and a base point G on the curve.
* Private Key: A random number d within a specific range.

DSA (Digital Signature Algorithm)
* Public Key: Derived from the private key using a public parameter (often
a generator value).
* Private Key: A random number within a given range.

# Key Storage.

After a key is generated, it must be securely stored and managed to protect it from unauthorized access, tampering, or theft.

Keys are often stored in secure hardware modules like Hardware Security Modules (HSMs), which hold the private keys for digital certificates in a public key infrastructure (PKI), or in secure enclaves. In software, keys might be stored in secure key vaults or protected memory areas. HashiCorp Vault is one of the most widely used tools for securely managing and storing secrets, such as security keys, API tokens, passwords, and certificates.

Vault also encrypts data, ensuring that secrets like security keys are protected both at rest and in transit. It uses a policy-based access control system to enforce strict controls on who can access or manage each key, and it offers robust key management capabilities, including key generation, rotation, and revocation. Vault supports multiple secret engines, which are plugins that enable it to manage different types of secrets: the kv (key-value) secret engine is commonly used for storing static secrets like security keys, while other secret engines can generate dynamic credentials for databases or cloud services.

package main

import (
	"fmt"
	"log"

	vault "github.com/hashicorp/vault/api"
)

// Assumes a Vault server at 127.0.0.1:8200 with userpass auth enabled
// and a dev_user account bound to a developer policy.
func main() {
	// Initialize Vault client
	config := vault.DefaultConfig()
	config.Address = "http://127.0.0.1:8200"
	client, err := vault.NewClient(config)
	if err != nil {
		log.Fatalf("Error creating Vault client: %v", err)
	}
	// Authenticate using the userpass method
	userpassData := map[string]interface{}{
		"password": "devpassword",
	}
	secret, err := client.Logical().Write("auth/userpass/login/dev_user", userpassData)
	if err != nil {
		log.Fatalf("Error authenticating with Vault: %v", err)
	}
	// Set the token received from authentication
	client.SetToken(secret.Auth.ClientToken)
	// Try to access a secret allowed by the developer policy
	secretData, err := client.Logical().Read("secret/data/dev/example")
	if err != nil {
		log.Fatalf("Error reading secret: %v", err)
	}
	if secretData != nil {
		fmt.Println("Secret data:", secretData.Data["data"].(map[string]interface{}))
	} else {
		fmt.Println("No secret found")
	}
}

Vault supports automatic key rotation, which is crucial for maintaining the security of encryption keys over time.

package main

import (
	"fmt"
	"log"
	"os"

	vault "github.com/hashicorp/vault/api"
)

// Connect to Vault and set up the initial key
func main() {
	// Initialize the Vault client
	config := vault.DefaultConfig()
	config.Address = "http://127.0.0.1:8200"
	client, err := vault.NewClient(config)
	if err != nil {
		log.Fatalf("Error creating Vault client: %v", err)
	}
	// Set Vault token
	client.SetToken(os.Getenv("VAULT_TOKEN"))
	// Store the initial key (the KV v2 engine expects the payload
	// to be nested under a "data" field)
	initialKey := "supersecretkey123"
	secretData := map[string]interface{}{
		"data": map[string]interface{}{
			"key": initialKey,
		},
	}
	_, err = client.Logical().Write("secret/data/my-app/encryption-key", secretData)
	if err != nil {
		log.Fatalf("Error writing secret to Vault: %v", err)
	}
	fmt.Println("Initial key stored in Vault")
}


// key rotation logic

package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"log"
	"os"
	"time"

	vault "github.com/hashicorp/vault/api"
)

func main() {
	// Initialize the Vault client
	config := vault.DefaultConfig()
	config.Address = "http://127.0.0.1:8200"
	client, err := vault.NewClient(config)
	if err != nil {
		log.Fatalf("Error creating Vault client: %v", err)
	}
	// Set Vault token
	client.SetToken(os.Getenv("VAULT_TOKEN"))
	// Rotate the key every 24 hours
	for {
		rotateKey(client)
		time.Sleep(24 * time.Hour)
	}
}

// rotateKey generates a new encryption key and stores it in Vault,
// replacing the existing key
func rotateKey(client *vault.Client) {
	// Generate a new key
	newKey := generateRandomKey()
	// Store the new key in Vault (KV v2 expects the payload under "data")
	secretData := map[string]interface{}{
		"data": map[string]interface{}{
			"key": newKey,
		},
	}
	_, err := client.Logical().Write("secret/data/my-app/encryption-key", secretData)
	if err != nil {
		log.Fatalf("Error rotating key in Vault: %v", err)
	}
	fmt.Println("Key rotated and stored in Vault")
}

// generateRandomKey creates a random 32-byte encryption key
func generateRandomKey() string {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		log.Fatalf("Error generating random key: %v", err)
	}
	return base64.StdEncoding.EncodeToString(key)
}

// To retrieve the key from Vault (KV v2 nests the secret under "data")
secret, err := client.Logical().Read("secret/data/my-app/encryption-key")
if err != nil {
	log.Fatalf("Error reading key from Vault: %v", err)
}
data := secret.Data["data"].(map[string]interface{})
currentKey := data["key"].(string)
fmt.Println("Current key:", currentKey)

Vault keeps track of different versions of secrets, allowing you to roll back to a previous version if necessary.

# Key Distribution

Key distribution ensures that cryptographic keys are securely shared between parties that need to communicate securely.

There are many key distribution methods available: Public Key Infrastructure (PKI), used in SSL/TLS for secure web browsing; Diffie-Hellman, used in SSL/TLS to securely establish session keys; and the Key Distribution Center (KDC) of Kerberos, an authentication protocol that uses symmetric key cryptography for secure key distribution.

Diffie–Hellman (DH) Algorithm: The Diffie–Hellman (DH) algorithm is a key-exchange protocol that enables two parties communicating over a public, insecure channel to establish a shared secret that can then be used for symmetric encryption of subsequent communications.

Both parties agree on a large prime number p and a generator g, which is a primitive root modulo p; that is, the powers of g generate all numbers from 1 to p−1. The values p and g are public and can be shared over an insecure channel. A random private key is then generated for the client (a) and for the server (b), and these private keys are never shared.

Generate public key for client: A = g ^ a mod p

Generate public key for server: B = g ^ b mod p

These public keys A and B are exchanged over the insecure channel. The shared secret is then computed on each side from one party's private key and the other's public key: the client computes s = B^a mod p, the server computes s = A^b mod p, and both arrive at the same value. Even though A and B are transmitted over an insecure channel, computing the shared secret from them requires solving the discrete logarithm problem, which is computationally infeasible for large values of p.

This makes the Diffie-Hellman key exchange secure, as an eavesdropper cannot feasibly determine the shared secret.

# Key Usability

Keys are most commonly used for encrypting plaintext into ciphertext and decrypting it back. Encryption is the process that converts plaintext (readable data) into ciphertext (unreadable data) using an algorithm and an encryption key. The purpose of encryption is to protect the confidentiality of the data so that it cannot be read or understood by unauthorized parties.

Symmetric encryption uses the same key for both encryption and decryption; this is the basic workflow with a symmetric key. The following section shows how plaintext is encrypted into ciphertext.

AES is a block cipher that encrypts data in fixed-size 128-bit blocks.

The key size determines the number of rounds of encryption: 10 rounds for 128-bit keys, 12 rounds for 192-bit keys, and 14 rounds for 256-bit keys.

* Substitution-Permutation Network: AES uses a combination of substitution (replacing bytes with others) and permutation (shuffling bytes) to create complex transformations.
* Key schedule: the original encryption key (128, 192, or 256 bits) is expanded into a series of round keys, one for each round of encryption.
* Initial round: to mix the key with the data, the plaintext block is XORed with the first round key.

Main Rounds (9, 11, or 13 rounds):

Each round consists of four steps:
SubBytes: Each byte in the block is replaced with a corresponding byte from
a fixed substitution table (S-box). This introduces non-linearity to the
encryption.

ShiftRows: The rows of the block are shifted cyclically by different offsets.
This step ensures that columns of the block are mixed with each other.

MixColumns: The columns of the block are mixed using a linear transformation.
This step spreads the influence of each byte over the entire block.

AddRoundKey: The block is XORed with the round key generated in the Key
Schedule step.

Final Round:
The final round is similar to the main rounds but omits the MixColumns step. The block undergoes SubBytes, ShiftRows, and AddRoundKey.

Output:
After the final round, the output is the ciphertext, a scrambled version of the original plaintext.

Similarly the decryption process of AES is essentially the reverse of the
encryption process, using the same round keys but in reverse order. The steps
involved are:

* Inverse AddRoundKey
* Inverse MixColumns
* Inverse ShiftRows
* Inverse SubBytes

# Key Rotation and Key Expiry

Key rotation is an essential practice in maintaining the security and integrity of cryptographic systems. By regularly updating encryption keys, organizations reduce the risk of key compromise and ensure that their data remains protected. But key rotation has its own challenges:

Data Re-encryption: Re-encrypting large amounts of data can be resource-intensive and time-consuming. Planning is required to avoid impacting system performance.

Compatibility Issues: Older systems or applications may not support key rotation or may require updates to handle new keys.

Security Risks During Rotation: The process of key rotation itself can introduce security risks if not handled correctly. For example, if keys are exposed during distribution, the new key could be compromised.

The following example shows how key rotation can be implemented in golang. The workflow of key rotation is:

  • Generate a New Key
  • Re-encrypt Existing Data (Optional)
  • Update Key Storage
  • Update Encryption/Decryption Processes
  • Key Versioning
  • Securely Retire Old Key
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"fmt"
	"io"
)

// Generate a new AES key
func generateKey() ([]byte, error) {
	key := make([]byte, 32) // AES-256 requires a 32-byte key
	_, err := rand.Read(key)
	if err != nil {
		return nil, err
	}
	return key, nil
}

// Encrypt data with the given key using AES-GCM
func encrypt(data string, key []byte) (string, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return "", err
	}

	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}

	nonce := make([]byte, gcm.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return "", err
	}

	ciphertext := gcm.Seal(nonce, nonce, []byte(data), nil)
	return base64.StdEncoding.EncodeToString(ciphertext), nil
}

// Decrypt data with the given key
func decrypt(data string, key []byte) (string, error) {
	ciphertext, err := base64.StdEncoding.DecodeString(data)
	if err != nil {
		return "", err
	}

	block, err := aes.NewCipher(key)
	if err != nil {
		return "", err
	}

	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}

	nonceSize := gcm.NonceSize()
	nonce, ciphertext := ciphertext[:nonceSize], ciphertext[nonceSize:]
	plaintext, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return "", err
	}

	return string(plaintext), nil
}

func main() {
	// Step 1: Generate a new key
	oldKey, err := generateKey()
	if err != nil {
		fmt.Println("Error generating key:", err)
		return
	}

	// Step 2: Encrypt some data
	data := "Sensitive information"
	encryptedData, err := encrypt(data, oldKey)
	if err != nil {
		fmt.Println("Error encrypting data:", err)
		return
	}
	fmt.Println("Encrypted data:", encryptedData)

	// Step 3: Rotate the key
	newKey, err := generateKey()
	if err != nil {
		fmt.Println("Error generating new key:", err)
		return
	}

	// Step 4: Re-encrypt the data with the new key
	decryptedData, err := decrypt(encryptedData, oldKey)
	if err != nil {
		fmt.Println("Error decrypting data:", err)
		return
	}
	newEncryptedData, err := encrypt(decryptedData, newKey)
	if err != nil {
		fmt.Println("Error re-encrypting data:", err)
		return
	}
	fmt.Println("New encrypted data:", newEncryptedData)
}

OpenSSL is a powerful tool used for generating cryptographic keys, certificates, and performing various other security-related tasks.

# Generate an RSA key pair (encrypted private key: private_key.pem)
openssl genpkey -algorithm RSA -out private_key.pem -aes256 -pass pass:your_password

# Generate the public key
openssl rsa -in private_key.pem -pubout -out public_key.pem -passin pass:your_password

# Generate an Elliptic Curve (EC) key pair
# Elliptic Curve Cryptography (ECC) provides a higher security level with
# smaller key sizes compared to RSA.

# private key
openssl ecparam -genkey -name prime256v1 -noout -out ec_private_key.pem

# public key
openssl ec -in ec_private_key.pem -pubout -out ec_public_key.pem

Security Tokens

Security keys and security tokens are both used for authentication and securing access to systems, but they serve different purposes and function in distinct ways. A security key is a physical device used for two-factor authentication (2FA) or multi-factor authentication (MFA) with strong cryptography. The security key generates and stores a cryptographic key pair. During authentication, the public key is shared with the service, and the private key remains on the device, signing authentication requests.

A security token is a digital artifact used to authenticate a user or device. It can be a software token, a hardware token, or a session token, and it typically represents a user’s credentials or permissions in a system. Security tokens are used to authenticate users by validating the token against a server. Tokens are essential in security and authentication, serving various purposes such as granting access, maintaining sessions, or securely exchanging information.

The structures of some commonly used tokens are:

# API Token: An API token, also known as an access token, is a unique string sent along with API requests that carries user-specific information. It typically contains essential information that identifies the user or application and grants access to the services.

Token ID or prefix: the type or scope of the token.

User or application identifier: a unique identifier for the user, application, or client making the request.

Encrypted random string: a hashed string.

Scope or permissions: the token may include encoded information about the permissions or scope of access it grants.

API tokens are often structured as JSON Web Tokens (JWTs), which have a defined structure.

## JWT Tokens:

A JWT consists of three Base64URL-encoded parts separated by dots: a header, a payload, and a signature. The payload contains the claims. Claims are statements about an entity (typically, the user) and additional data. A claim can be a registered claim, a public claim, or a private claim.

To create the signature part, you have to take the encoded header, the encoded payload, a secret, and the algorithm specified in the header, and sign that. If we sign the above data using a secret with HMAC SHA256, the resulting JWT might look like this:

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c

This token can now be sent to the client and used to verify the identity of the user or any other claims included in the payload. Access tokens do not have to be JWTs; the structure of a non-JWT access token is:

  1. Token Identifier: A random string or GUID that uniquely identifies the token.
  2. User/Application ID: An identifier linking the token to a specific user or application.
  3. Expiration: Information about when the token expires, often stored in the server.
  4. Signature: A cryptographic signature to prevent tampering.

While non-JWT tokens might be simpler and sufficient for certain applications, JWT tokens offer some advantages, making them more suitable for modern, distributed, and scalable applications.

JWT tokens are self-contained, meaning they include all the necessary information (claims) about the user or session within the token itself. This eliminates the need to store session data on the server, allowing for stateless authentication. JWT tokens can be signed, ensuring that the token's contents have not been tampered with; the signature also allows the server to verify the authenticity of the token. Because JWT tokens are stateless, they are easier to scale horizontally: servers don't need to share session state, reducing the need for centralized session stores or sticky sessions. Finally, JWT tokens decouple authentication and authorization from the underlying infrastructure. This is particularly useful in microservices architectures, where services can independently verify the token without needing to communicate with a central authentication service.

More information about JWT access tokens can be found here.

API keys define the source of the requesting entity, whereas API tokens 
identify the user and their rights.


NOTE:
How to identify a user, device, or service in a computing environment:

An API key ties a device or caller to a specific entity. It is the value provided by code when calling an API to identify and authorize the caller, and it serves as an identifier that can connect to other security layers, like encryption/decryption, identification routines, and rate-limiting approaches.

A security token is a digital or physical object employed to authenticate a user's identity when accessing systems, applications, or services: a piece of data that represents a user session or specific privileges, used by an individual user for a limited period of time.

Tokens are more secure than API keys, as API keys are long-lived and can be compromised; the only mitigation is rotating them.

Certificates

A digital certificate is a file that uses a digital signature to bind a public key to an identity, such as the name of a person, organization, or device, certifying ownership of that public key. Certificates are an essential component of public key infrastructure (PKI), enabling secure communication, authentication, and data integrity across networks, typically via SSL/TLS protocols.

The most common standard for digital certificates is X.509.

Key Components of a Certificate: 

Subject:

The entity (person, organization, or device) identified by the certificate.
This typically includes details like the common name (CN), organization (O),
organizational unit (OU), and country (C).

Issuer:

The certificate authority (CA) that issued and signed the certificate. The
CA's role is to validate the identity of the subject and ensure the integrity
of the certificate.

Public Key:

The public key that belongs to the subject. This key is used by others to
encrypt data or verify a digital signature.

Serial Number:
A unique identifier assigned by the issuer to the certificate. It helps
distinguish the certificate from others issued by the same CA.

Validity Period:
The time frame during which the certificate is valid. It includes a start
date (valid from) and an end date (valid until). After the end date, the
certificate is considered expired.

Signature Algorithm:
The algorithm used by the CA to sign the certificate, ensuring its authenticity. Common algorithms include SHA-256 with RSA encryption.

Signature:

The digital signature created by the CA using its private key. This signature
can be verified using the CA's public key to ensure the certificate has not been tampered with.

Extensions (Optional):

Additional information that may include usage restrictions, alternate subject
names, or other attributes related to the certificate's usage.

Hashing

Hashing is a cryptographic process that transforms input data of any size into a fixed-size string of characters, typically represented as a hexadecimal number. It uses a hash function, a mathematical algorithm designed to produce a unique output for a given input. Common hash functions:

MD5 (Message Digest Algorithm 5): Produces a 128-bit hash value, often represented as a 32-character hexadecimal number. Though widely used, MD5 is considered cryptographically broken due to vulnerabilities that allow hash collisions.

SHA-1 (Secure Hash Algorithm 1): Produces a 160-bit hash value, represented as a 40-character hexadecimal number. Like MD5, SHA-1 has been found to have vulnerabilities and is not recommended for secure applications.

SHA-256 (Secure Hash Algorithm 256-bit): Part of the SHA-2 family, it produces a 256-bit hash value and is widely used in security applications, including blockchain technology.

SHA-3: The latest member of the Secure Hash Algorithm family, offering a different internal structure from SHA-2 and providing additional security properties.

Some of the key applications of these hash functions are:

1) Data integrity: hashing ensures that data has not been altered. By comparing hash values before and after transmission or storage, you can verify that the data remains unchanged.

2) Password storage: when a user logs in, their password is hashed, and the resulting hash is compared to the stored hash.

3) Digital signatures: hashing is used in digital signatures to create a unique representation of a message, ensuring its authenticity and integrity.

Input: "hello"
SHA-256 Hash: 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

Digital Signatures

Digital signatures are a cryptographic mechanism used to verify the authenticity and integrity of digital messages, documents, or software. They serve as a digital equivalent of a handwritten signature or a stamped seal, but they offer far more security and are based on mathematical algorithms.

Signing Process:

  • The signer creates a hash (a fixed-size string of characters) of the message or document using a hash function (like SHA-256).
  • The hash is then encrypted with the signer’s private key, creating the digital signature.
  • The digital signature is appended to the document or message.

Verification Process:

  • The recipient decrypts the digital signature using the signer’s public key to retrieve the original hash.
  • The recipient also creates a new hash of the received message or document using the same hash function.
  • If the decrypted hash matches the newly computed hash, the signature is valid, indicating that the message has not been altered and is indeed from the signer.

Binary-To-Text Encoding Scheme

Binary-to-text encoding schemes are methods used to represent binary data (which consists of bytes) as text, typically in a format that can be easily transmitted over media that handles text data. These encoding schemes ensure that binary data is safely encoded in a text format that won’t be corrupted or misinterpreted by systems that expect textual data.

The most commonly used encoding scheme is Base64, which encodes binary data (like images or files) into a text format that can be safely transmitted over media that only supports text content, such as email or JSON. Base64 is also used to encode data for URL parameters, or for embedding in web pages or scripts, so the data remains intact without corruption. Base64 can also encode a username and password pair, which is then sent as a header in HTTP requests.

Base64 encoding works on binary data, which is essentially a series of bytes. Each byte consists of 8 bits. Base64 groups the input bits into chunks of 6 bits each.

Text: Hello
Binary form: 01001000 01100101 01101100 01101100 01101111
Chunks of 6 bits each: 010010 000110 010101 101100 011011 000110 111100
(the final group is padded with two zero bits to reach 6 bits)

Each 6-bit group is then converted into a decimal value (ranging from 0 to 63) and mapped to a corresponding Base64 character from the following set:

A-Z (0-25), a-z (26-51), 0-9 (52-61), + (62), / (63)

The encoded string for "Hello" after converting the 6-bit groups is:
`SGVsbG8=`

The `=` at the end is padding, added to make the final encoded string's length a multiple of 4.

The decoding process reverses the steps: converting Base64 characters back
to 6-bit binary groups, combining them to form the original 8-bit bytes, and
finally restoring the original binary data.

Authentication and Authorization

Authentication means verifying the identity of a client. The following steps outline the authentication procedure:

1) Credential submission: the user presents credentials such as a username/password, a token, or a digital certificate.

2) Credential transmission: the credentials are sent from the user’s device to the authentication server. This transmission should be secure, typically using protocols like HTTPS to encrypt the data.

3) Credential verification: the server receives the credentials and compares them against stored data to verify their authenticity. This can also be token verification, or verification of a certificate signed by a trusted certificate authority.

4) Authentication decision: the server accepts or rejects the identity claim.

5) Session creation: on success, a session is established for the communication channel between user and server.

OpenID Connect

OpenID Connect (OIDC) is an authentication protocol that establishes and verifies a user’s identity for services. It extends OAuth 2.0 by adding an identity layer, thus combining authorization and authentication in a single protocol. Features like token revocation, token introspection, and PKCE (Proof Key for Code Exchange) in the authorization code flow provide enhanced security.

Authorization determines whether a user or system has the right to access a resource or perform a specific action. It’s a critical part of security in software systems, ensuring that users only have access to the data and functionalities they’re permitted to use.

Open Authorization or OAuth:

OAuth (Open Authorization) is a protocol that allows third-party applications to access user data without exposing user credentials, such as passwords. The data on the server is accessed through well-known APIs, for which access must be granted. OAuth 2.0 defines authorization flows (such as the authorization code flow) that let applications access APIs on behalf of a user.

More information about OAuth 2.0 can be found here.

OAuth 2.0 allows applications to obtain limited access to user resources, but it doesn’t specify how to authenticate a user, meaning it doesn’t verify the identity of the user who is granting access. With OpenID Connect, the application can verify the user’s identity and obtain basic profile information. It enables the use of OAuth 2.0 not just for authorizing access to resources but also for authenticating users. OpenID Connect introduces the ID token, a standardized token (usually a JWT) that contains user identity information.

Kerberos

Kerberos is a network authentication and authorization protocol designed to provide secure authentication for users and services in a distributed network environment. It uses secret-key cryptography and a trusted third party (Key Distribution Center or KDC) to authenticate users and services.

By using strong cryptography and a trusted third party, Kerberos ensures that both users and services can trust each other, and it minimizes the risk of unauthorized access.

For implementing Kerberos in a distributed architecture, the client makes the following requests before being granted access to a server:

  1. The client requests a Ticket Granting Ticket (TGT) from the Authentication Server (AS).
  2. The client requests a service ticket from the Ticket Granting Server (TGS) using the TGT.
  3. The client uses the service ticket to authenticate to the application server.
package main

import (
    "fmt"
    "log"
    "net/http"

    "gopkg.in/jcmturner/gokrb5.v7/client"
    "gopkg.in/jcmturner/gokrb5.v7/config"
    "gopkg.in/jcmturner/gokrb5.v7/keytab"
    "gopkg.in/jcmturner/gokrb5.v7/spnego"
)

func main() {
    // Load the Kerberos configuration
    krbConf, err := config.Load("/etc/krb5.conf")
    if err != nil {
        log.Fatalf("could not load krb5.conf: %v", err)
    }

    // Load the keytab file for the client principal
    kt, err := keytab.Load("/path/to/service.keytab")
    if err != nil {
        log.Fatalf("could not load keytab: %v", err)
    }

    // Create a Kerberos client; the principal name and realm are
    // passed as separate arguments
    krbClient := client.NewWithKeytab("user", "YOUR.REALM", kt, krbConf)

    // Login to obtain a TGT from the Authentication Server
    if err = krbClient.Login(); err != nil {
        log.Fatalf("Kerberos login failed: %v", err)
    }

    // Prepare the HTTP client with SPNEGO (Simple and Protected
    // GSS-API Negotiation Mechanism)
    spnegoClient := spnego.NewClient(krbClient, nil, "")

    // Send a request to a service that requires Kerberos authentication
    req, err := http.NewRequest("GET", "http://your.service.example.com", nil)
    if err != nil {
        log.Fatalf("could not build request: %v", err)
    }
    resp, err := spnegoClient.Do(req)
    if err != nil {
        log.Fatalf("failed to authenticate: %v", err)
    }
    defer resp.Body.Close()

    // Process the response
    fmt.Printf("Response code: %v\n", resp.StatusCode)
}

Note these important steps when implementing Kerberos:

1) Implement thorough error checks, especially around Kerberos ticket acquisition and HTTP request handling.

2) Ensure that your application handles sensitive data such as keytabs and credentials securely.

Access Control

When it comes to access control, there are multiple ways it can be performed: either by using an ACL (access control list), where you define a whitelist (to allow) and a blacklist (to deny), or by using a role model.

In the case of the role model, the system retrieves the roles or permissions associated with the authenticated user. These could come from a database, an external service, or a configuration file. Different models help to evaluate whether the user’s roles or permissions allow them to perform the requested action or access the requested resource:

  1. Establish the user’s identity.
  2. Authenticate the user.
  3. Enforce access control with IAM.

The API gateway authenticates and authorizes each incoming request and makes the allow/deny decisions. API access management products offer custom authorization servers, which allow you to adjust audience parameters, custom scopes, and access policies. Common mechanisms include:

1. IAM policies for authentication.

2. Access control using public/private keys or JWT tokens.

3. Securing API servers with RBAC.

Role-Based Access Control (RBAC): Access is granted based on the roles assigned to a user. For example, an “Admin” role might have full access, while a “User” role has restricted access. A role can be assigned to an individual client or to a group.

Policy-Based Access Control (PBAC): Access is controlled based on policies that define rules for who can access what under specific conditions.

Mandatory Access Control (MAC): Access is enforced by the system according to a strict policy, often used in government or military systems.

If the user has the necessary permissions, access is granted, and the requested action is allowed. If not, the system denies access and might return an appropriate error message or status code (e.g., HTTP 403 Forbidden).
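A minimal RBAC check can be sketched as follows; the role and action names are illustrative, and a denied request would typically map to HTTP 403 Forbidden:

```go
package main

import "fmt"

// roles maps a role name to the set of actions it may perform.
var roles = map[string]map[string]bool{
	"admin": {"read": true, "write": true, "delete": true},
	"user":  {"read": true},
}

// authorize returns true when any of the user's roles permits the action.
func authorize(userRoles []string, action string) bool {
	for _, r := range userRoles {
		if roles[r][action] {
			return true
		}
	}
	return false // the caller would respond with HTTP 403 Forbidden
}

func main() {
	fmt.Println(authorize([]string{"user"}, "read"))   // true
	fmt.Println(authorize([]string{"user"}, "delete")) // false
}
```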

Similarly, an Access Control List (ACL) is a set of rules that define what actions different users or systems can perform on specific resources, such as files, directories, or network devices. ACLs define which operations (e.g., read, write, execute) specific users or groups can perform on a resource. Access control for directories can be done using LDAP; for files and network devices, a whitelisting and blacklisting model is commonly used.

LDAP (Lightweight Directory Access Protocol) is a protocol used to access and manage directory information, such as user credentials and organizational data. Access control in LDAP is managed through a combination of Access Control Lists (ACLs) and security mechanisms built into the directory server.
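The whitelist/blacklist model can be sketched as two sets, with an explicit deny taking precedence over an allow (user names are illustrative):

```go
package main

import "fmt"

// ACL holds an allow list (whitelist) and a deny list (blacklist).
type ACL struct {
	allow map[string]bool
	deny  map[string]bool
}

// Permitted checks the deny list first: an explicit deny always wins.
func (a ACL) Permitted(user string) bool {
	if a.deny[user] {
		return false
	}
	return a.allow[user]
}

func main() {
	acl := ACL{
		allow: map[string]bool{"alice": true, "bob": true},
		deny:  map[string]bool{"bob": true},
	}
	fmt.Println(acl.Permitted("alice")) // true
	fmt.Println(acl.Permitted("bob"))   // false (denied despite being allowed)
}
```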

Identity and Access Management

Finally, when a security professional wants a single framework that addresses identity, authentication, authorization, and access management as a whole, Identity and Access Management (IAM) fills that role. IAM manages digital identities and controls access to resources. IAM workflows involve processes for creating, managing, and securing user identities, along with assigning and enforcing access rights.

An IAM workflow addresses, in a single framework: identity collection (in the form of a username and password), storing the identity in a directory service (e.g., Active Directory, LDAP), user authentication (whether single sign-on or multi-factor authentication), authorization (evaluating a user’s permissions to determine what resources they can access, using RBAC or PBAC), provisioning of resources, monitoring and auditing, and finally user identity management (such as user access reviews).

So IAM has components such as:

  • User Directory: A database of user identities, often implemented using LDAP or Active Directory.
  • Authentication Server: Handles the authentication process, possibly using protocols like Kerberos or SAML.
  • Access Management: Enforces access policies, often integrated with the application or service requesting access.
  • Audit and Monitoring Tools: Tracks user activity and enforces compliance with security policies.

IAM is provided by all major cloud vendors as a separate service, together with service accounts. A service account is a type of account used by applications, services, or virtual machines (VMs) to interact with other services and resources in a secure and controlled manner. Unlike a regular user account, a service account is not tied to a human user but is instead intended for machine-to-machine interactions.

Service accounts often have specific, limited permissions tailored to the tasks the associated service needs to perform. Service accounts are typically isolated from user accounts to ensure that actions performed by services are auditable and traceable separately from those performed by users. Service accounts can be assigned roles that define what resources they can access and what actions they can perform. This allows administrators to precisely control the level of access granted.

Service accounts are a crucial component of secure, automated, and scalable cloud environments, enabling services to interact with each other and access necessary resources in a controlled and secure manner.

Securing the Network

By encrypting data in transit, we can maintain data privacy and security in various applications. Encryption scrambles data so that only authorized parties can understand the information: it is the process of converting human-readable plaintext into incomprehensible text, known as ciphertext.

TLS provides a robust framework for securing network communications, ensuring that data transmitted between clients and servers remains private and protected against tampering and eavesdropping. The encrypted data is sent over HTTPS. To use HTTPS, you need a certificate, which grants permission to use encrypted communication via public key infrastructure (PKI) and also authenticates the identity of the certificate holder.

How does TLS work?

# TLS Handshake

The TLS handshake is the initial process where the client and server establish a secure connection.

# Session Encryption

Once the handshake is complete, both parties use the derived session keys to encrypt and decrypt the data transmitted during the session. This ensures the confidentiality and integrity of the data.

Session keys are temporary symmetric keys used to encrypt data during a communication session. These keys are typically generated for the duration of a session and then discarded. More information about sessions later. Examples include TLS session keys and VPN session keys.

Based on the type of keys, there are two different types of encryption: symmetric-key encryption and asymmetric-key encryption. With symmetric-key encryption, the message is encrypted using a key and the same key is used to decrypt the message. Asymmetric-key encryption is based on public- and private-key techniques.

# Session resumption

To improve performance, TLS supports session resumption. This allows a client and server to reuse previously established session parameters instead of performing a full handshake again. This can be done using session IDs or session tickets.

# Session/TLS termination

When the communication is finished, either the client or the server can terminate the session; in practice, TLS termination is often done by an API gateway. The session ends after a “close_notify” alert message is sent, which indicates that no more data will be sent.

In a microservice-based architecture, the user’s browser connects to web services using the HTTP or HTTPS protocol. HTTPS is the secured version of HTTP: it uses port 443 over TCP and transports encrypted packets over TCP/IP networks. With Transport Layer Security (TLS), HTTPS ensures privacy and data integrity for network traffic by encrypting communication over HTTP. The key components of TLS that secure data in transit are:

  • Certificates and Public Key Infrastructure (PKI): TLS relies on certificates issued by trusted CAs to authenticate the identity of the server (and optionally the client). The public key contained in the certificate is used for the initial key exchange process.
  • Cipher Suites: A cipher suite is a set of algorithms that help secure a network connection. A cipher suite usually contains a key exchange algorithm, a bulk encryption algorithm, and a message authentication code (MAC) algorithm. The bulk encryption algorithm is used to encrypt the data being sent. The MAC algorithm provides data integrity checks to ensure that the data sent does not change in transit. In addition, cipher suites can include signature and authentication algorithms to help authenticate the server and/or client. Examples include TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384.
  • Handshaking Protocol: The handshake protocol is used to negotiate the security parameters of the connection and establish a shared secret.
  • Record Protocol: The record protocol is responsible for encapsulating and encrypting application data. It ensures the data is transmitted securely and in the correct order.

Key functions of the TLS record protocol:

  • Fragmentation: breaks the data into smaller, manageable pieces (fragments).
  • Compression (optional): compresses the data to reduce the amount transmitted (often disabled in modern TLS implementations due to security concerns).
  • Encryption: encrypts the fragmented data using a symmetric encryption algorithm.
  • Message authentication: ensures the integrity and authenticity of the message by adding a Message Authentication Code (MAC).
  • Encapsulation: combines the encrypted data and MAC into a record format, which is then transmitted.

TLS may use different encryption modes such as CBC (Cipher Block Chaining) or GCM (Galois/Counter Mode). GCM is preferred in modern implementations because it provides both encryption and integrity protection (authenticated encryption).

A separate web application firewall (WAF) filters, monitors, and blocks HTTP traffic to and from web services. A WAF is an application-level firewall that protects web applications by targeting HTTP traffic, whereas a standard firewall provides a barrier between external and internal network traffic. More information about WAFs is in the load balancer design section.

Securing File Data in Motion

Securing file data in motion (also known as data in transit) involves protecting data as it travels across networks or between systems, ensuring that it cannot be intercepted, read, or altered by unauthorized parties. Use a checksum algorithm such as SHA-256 (MD5 is no longer collision-resistant and should be avoided for security purposes) to calculate a unique checksum value for each file before and after transfer, then compare the checksum values to verify the integrity of the transferred file.
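A SHA-256 checksum helper might look like the following sketch; sender and receiver each compute the digest and compare the hex strings:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// checksum returns the hex-encoded SHA-256 digest of data.
// For large files, stream the contents into sha256.New() via io.Copy
// instead of reading everything into memory.
func checksum(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

func main() {
	before := checksum([]byte("file contents"))
	after := checksum([]byte("file contents"))
	// Matching digests indicate the file was not altered in transit.
	fmt.Println(before == after) // true
}
```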

Encrypting data in transit is the primary method of securing it.

  1. Using TLS/SSL (Transport Layer Security/Secure Sockets Layer).
  2. Using a VPN (Virtual Private Network). VPNs encrypt the entire communication channel between two points and are often used to secure connections over untrusted networks (like the internet). VPNs internally use IPsec (Internet Protocol Security) or other protocols such as VXLAN, GRE, etc.; more information is in the networking section.
  3. Using SSH (Secure Shell). SSH encrypts the session between the client and server, protecting data in transit from being intercepted.

To ensure data has not been tampered with during transit, we can use message integrity codes (MICs) generated by a Hash-based Message Authentication Code (HMAC).

A Hash-based Message Authentication Code (HMAC) is a specific type of message authentication code (MAC) that involves a cryptographic hash function combined with a secret cryptographic key. HMAC is used to ensure the integrity and authenticity of a message, meaning it verifies that the message has not been altered in transit and confirms the identity of the sender. HMAC works by taking a message and a secret key, processing them through a cryptographic hash function like SHA-256, SHA-1, or MD5, and producing a fixed-size output known as the HMAC or MAC value.

Securing Data At rest

Securing file data at rest involves protecting the confidentiality, integrity, and availability of data stored on a disk. Separate encryption keys are used to encrypt and decrypt data at rest.

Disk Encryption Key: Used to encrypt data on a hard drive. The entire disk or a partition is encrypted, making all files on it secure.

File Encryption Key: Individual files are encrypted.

The encryption algorithms are the standard ones, typically AES or RSA. AES is widely used for encrypting data at rest due to its efficiency and security. Hash functions like SHA-256 (or the legacy MD5) can generate a hash of a file’s contents; any change in the file will result in a different hash value, alerting you to tampering.
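As a minimal sketch of file-level encryption at rest, the following uses AES-256-GCM from Go's standard library. The key is generated locally for illustration; in production it would come from a KMS or a key-derivation function:

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
	"io"
)

// encrypt seals plaintext with AES-256-GCM, returning a fresh nonce
// and the ciphertext (which includes the authentication tag).
func encrypt(key, plaintext []byte) (nonce, ciphertext []byte, err error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, nil, err
	}
	nonce = make([]byte, gcm.NonceSize())
	if _, err = io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, nil, err
	}
	return nonce, gcm.Seal(nil, nonce, plaintext, nil), nil
}

// decrypt opens the ciphertext; it fails if the data was tampered with,
// because GCM is an authenticated encryption mode.
func decrypt(key, nonce, ciphertext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	key := make([]byte, 32) // 256-bit key; in production from a KMS
	if _, err := io.ReadFull(rand.Reader, key); err != nil {
		panic(err)
	}
	nonce, ct, err := encrypt(key, []byte("sensitive file data"))
	if err != nil {
		panic(err)
	}
	pt, err := decrypt(key, nonce, ct)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(pt)) // sensitive file data
}
```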

There are two ways the disk data or data at rest can be encrypted.

Full Disk Encryption (FDE): Encrypts everything on the disk, including the operating system, system files, and user data. This is transparent to the user and ensures that all data is protected. It typically ties into the boot process: during boot, the encryption key is required to decrypt the operating system and other necessary files to start the system.

File-Level Encryption: Only specific files or directories are encrypted. This offers more granularity but may leave some parts of the disk unencrypted.

Disk encryption is a crucial security measure for protecting sensitive data at rest. By converting data into unreadable code, it ensures that even if a disk falls into the wrong hands, the information stored on it remains secure. Whether implemented through software or hardware, disk encryption is a fundamental component of modern data security strategies.

Similarly, storage devices also need to be secured, with steps such as:

  1. Configure the security group.
  2. Update the access control that defines which users have access to stored data and keys, default bucket encryption, and encryption status in inventory.

More information about security group can be found in separate blog with advanced topics.

More information about DNS Security (DNSSEC) here.
More information about Web Application Firewall can be found here.

More information about security vulnerability and it’s mitigations, here.
