Apache Kafka Series [Part 3]: Securing our Kafka cluster
In the last article, we looked at some key concepts of Apache Kafka: partitioning and offset management. If you're not familiar with these concepts, take a look at the previous article first.
In this final part of the series, we'll look at the options for securing Apache Kafka, and at how to decide which option to use and why. The article focuses on some specific and widely used security options from the point of view of AWS MSK.
There are three parts to Kafka Security:
- SSL/TLS encryption: This encrypts data between your clients and Kafka, as well as broker-to-broker communication. It's the same pattern you encounter every day on the web in the form of HTTPS.
- Authentication with SSL, SASL, or IAM (MSK-specific): This enables your producers and consumers to prove their identity to your Kafka cluster, and gives your clients a safe way to verify the brokers' identity in turn.
- Authorization using ACLs or IAM policies (MSK-specific): Once your clients have been authenticated, your Kafka brokers check them against access control lists (ACLs), or against IAM policies for IAM-authenticated users, to decide whether they are allowed to write to or read from a given topic.
Encryption
Kafka uses TLS, the most common encryption protocol in use today; its most familiar application is HTTPS on the web.
Kafka's encryption options are fairly standard: you declare a truststore and a keystore, along with their corresponding passwords, in server.properties on the broker and client.properties on the client.
Example config looks like the following:
security.protocol=SSL
ssl.truststore.location=/tmp/kafka_2.12/kafka.client.truststore.jks
ssl.keystore.location=/tmp/kafka_2.12/kafka.client.keystore.jks
ssl.keystore.password=Your-Store-Pass
ssl.key.password=Your-Key-Pass
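If you're wondering where these stores come from: they are standard JKS files created with the JDK's keytool. A minimal sketch of building a client truststore, assuming the cluster's CA certificate is saved locally as ca-cert (the file names and password here are placeholders):
# Import the cluster's CA certificate into a new client truststore
keytool -keystore /tmp/kafka_2.12/kafka.client.truststore.jks -alias CARoot -import -file ca-cert -storepass Your-Store-Pass -noprompt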
MSK specific
You can use Amazon MSK's data encryption features to meet strict data management needs. The encryption certificates used by Amazon MSK must be refreshed every 13 months. For all clusters, Amazon MSK renews these certificates automatically. When it starts the certificate-update operation, it sets the cluster's state to MAINTENANCE, and when the update is finished, it returns the cluster to ACTIVE. You can continue to produce and consume data while a cluster is in the MAINTENANCE state, but you won't be able to update it.
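If you want to check whether your cluster is currently under maintenance before attempting an update, something like the following AWS CLI call should work (the cluster ARN is a placeholder):
# Prints the cluster state, e.g. ACTIVE or MAINTENANCE
aws kafka describe-cluster --cluster-arn arn:aws:kafka:us-east-1:0123456789012:cluster/MyTestCluster/abcd1234-0123-abcd-5678-1234abcd-1 --query 'ClusterInfo.State'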
Encryption can be configured with the following settings during cluster creation:
- Encryption at Rest: AWS Key Management Service (KMS) interfaces with Amazon MSK to provide transparent server-side encryption. At rest, Amazon MSK encrypts all of your data. You can select the AWS KMS customer master key (CMK) that Amazon MSK will use to encrypt your data at rest when you construct an MSK cluster. If you don’t provide a CMK, Amazon MSK creates and uses an AWS managed CMK on your behalf. For more information about CMKs, see Customer Master Keys (CMKs) in the AWS Key Management Service Developer Guide.
- Encryption in Transit: Amazon MSK uses TLS 1.2. By default, it encrypts data in transit between the brokers of your MSK cluster. This default can be overridden when the cluster is created (see the CLI sketch after the list below).
You must choose one of three options for communication between clients and brokers:
- Only TLS-encrypted data is permitted. This is the default configuration.
- Both plaintext and TLS-encrypted data are accepted.
- Only plaintext data is permitted.
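As a rough sketch, both encryption settings can be supplied at cluster creation with the AWS CLI; the KMS key ARN and the broker-node file below are placeholders, and ClientBroker accepts TLS, TLS_PLAINTEXT, or PLAINTEXT:
aws kafka create-cluster \
  --cluster-name MyTestCluster \
  --kafka-version 2.4.1.1 \
  --number-of-broker-nodes 3 \
  --broker-node-group-info file://brokernodegroupinfo.json \
  --encryption-info '{"EncryptionAtRest": {"DataVolumeKMSKeyId": "arn:aws:kms:us-east-1:0123456789012:key/your-cmk-id"}, "EncryptionInTransit": {"ClientBroker": "TLS", "InCluster": true}}'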
AWS Certificate Manager certificates are used by Amazon MSK brokers. As a result, every truststore that trusts Amazon Trust Services also trusts Amazon MSK brokers’ certificates.
An important point to note: enabling encryption reduces performance by approximately 30%, though the exact percentage depends on the configuration of your cluster and clients.
Authentication
There are three primary ways to implement authentication in Kafka:
Kafka Native:
1. Mutual TLS Authentication: In mutual TLS authentication, the server also verifies the client's certificate against its truststore, just as the client verifies the server's. The configuration looks as follows on both the server and the client side:
security.protocol=SSL
ssl.truststore.location=/tmp/kafka_2.12/kafka.client.truststore.jks
ssl.keystore.location=/tmp/kafka_2.12/kafka.client.keystore.jks
ssl.keystore.password=Your-Store-Pass
ssl.key.password=Your-Key-Pass
# On the broker side, client verification also has to be switched on:
ssl.client.auth=required
In MSK, the same is achieved by providing a private certificate authority (CA) when the cluster is created; clients then authenticate with certificates issued by that CA.
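A sketch of what that can look like at creation time with the AWS CLI, where the ACM Private CA ARN and the other placeholder options are yours to fill in:
aws kafka create-cluster \
  --cluster-name MyTestCluster \
  --kafka-version 2.4.1.1 \
  --number-of-broker-nodes 3 \
  --broker-node-group-info file://brokernodegroupinfo.json \
  --client-authentication '{"Tls": {"CertificateAuthorityArnList": ["arn:aws:acm-pca:us-east-1:0123456789012:certificate-authority/your-ca-id"]}}'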
Note: TLS authentication is generally not recommended, as maintaining user roles becomes troublesome: access rules are stored against the CN (common name) of the provided certificate.
2. SASL: Several mechanisms can be used with SASL, but the most common one is SCRAM, which is also the one offered by MSK, so that is what we'll look at. In SASL/SCRAM (Simple Authentication and Security Layer / Salted Challenge Response Authentication Mechanism), user credentials are used to verify the client's identity. The credentials are salted, hashed, and stored in ZooKeeper itself. The configuration looks like the following:
sasl.mechanism=SCRAM-SHA-512
# Configure SASL_SSL if SSL encryption is enabled, otherwise configure SASL_PLAINTEXT
security.protocol=SASL_SSL
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
username="kafkaclient1" \
password="kafkaclient1-secret";
You can add the users in the following way:
bin/kafka-configs.sh --zookeeper localhost:2181 --alter --add-config 'SCRAM-SHA-256=[iterations=8192,password=alice-secret],SCRAM-SHA-512=[password=alice-secret]' --entity-type users --entity-name alice
In AWS MSK, users are managed via AWS Secrets Manager; the name of each secret must begin with the prefix AmazonMSK_ (what follows the prefix is up to you).
The credentials are stored in JSON format as follows:
{
"username": "alice",
"password": "alice-secret"
}
These secrets are then linked to a cluster using the AWS CLI or the cluster's web console. The client configuration remains the same as before. You can read about the other SASL mechanisms here.
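As a sketch of the CLI route (the ARNs are placeholders; note that MSK requires the secret to be encrypted with a customer-managed KMS key rather than the default one):
# Create the secret; the name must start with AmazonMSK_
aws secretsmanager create-secret \
  --name AmazonMSK_alice \
  --kms-key-id arn:aws:kms:us-east-1:0123456789012:key/your-cmk-id \
  --secret-string '{"username": "alice", "password": "alice-secret"}'
# Associate the secret with the cluster
aws kafka batch-associate-scram-secret \
  --cluster-arn arn:aws:kafka:us-east-1:0123456789012:cluster/MyTestCluster/abcd1234-0123-abcd-5678-1234abcd-1 \
  --secret-arn-list arn:aws:secretsmanager:us-east-1:0123456789012:secret:AmazonMSK_alice-abc123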
Note: This method is a significant improvement over TLS authentication, as it supports encryption as well as easily manageable user roles. But that holds only if your ZooKeeper ensemble is secured, in a separate VPC or by other means.
MSK Specific:
3. IAM: IAM access control for Amazon MSK lets you handle both authentication and authorization for your MSK cluster, eliminating the need for two separate mechanisms. When a client wants to write to your cluster, for example, Amazon MSK uses IAM to check whether the client is an authenticated identity and whether it is authorized to write to the cluster.
Clients for this type of authentication can be configured in the following way:
ssl.truststore.location=<PATH_TO_TRUST_STORE_FILE>
security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required awsProfileName="your profile name";
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
The callback handler and the login module referenced in this config are part of a JAR that AWS provides for IAM authentication. The JAR needs to be added to the classpath, or declared as a dependency, so that these classes can be found at runtime:
https://github.com/aws/aws-msk-iam-auth/releases/download/1.0.0/aws-msk-iam-auth-1.0.0-all.jar
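For the Kafka CLI tools, a quick way to do that is to download the JAR and export it on the classpath before running the tool (the path below is a placeholder; kafka-run-class.sh picks up the CLASSPATH environment variable):
wget https://github.com/aws/aws-msk-iam-auth/releases/download/1.0.0/aws-msk-iam-auth-1.0.0-all.jar
export CLASSPATH=/home/ec2-user/aws-msk-iam-auth-1.0.0-all.jar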
By default, IAM users and roles do not have permission to execute Amazon MSK API actions. An IAM administrator must create IAM policies that grant users and roles permission to perform specific API operations on the resources they need, and then attach those policies to the IAM users or groups that require those permissions.
Note: This method is preferred if your Kafka deployment runs on AWS, since everything can then be controlled through IAM policies, which are easy to configure.
Authorization
ACL (Access Control List): ACLs restrict which operations an authenticated user can perform. ACLs can only be used with the Kafka-native authentication options.
Apache Kafka provides a pluggable authorizer and ships with an out-of-the-box implementation that stores all ACLs in Apache ZooKeeper. Amazon MSK enables this authorizer in the server.properties file on the brokers. For Apache Kafka version 2.4.1 the authorizer is AclAuthorizer; for earlier versions of Apache Kafka it is SimpleAclAuthorizer.
Apache Kafka ACLs have the following format.
Principal P is [Allowed/Denied] Operation O From Host H on any Resource R matching ResourcePattern RP
If RP does not match a specific resource R, then R has no associated ACLs, and therefore no one other than super users is allowed to access R.
The following command is an example of how these rules are created using kafka-acls:
$ kafka-acls.sh --authorizer-properties zookeeper.connect=z-1.demo-cluster1.0dgmkx.c3.kafka.us-east-1.amazonaws.com:2181,z-3.demo-cluster1.0dgmkx.c3.kafka.us-east-1.amazonaws.com:2181,z-2.demo-cluster1.0dgmkx.c3.kafka.us-east-1.amazonaws.com:2181 --add --allow-principal "User:bob" --operation Read --topic AnyTopic
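One detail that often trips people up: granting Read on a topic is not enough for a consumer that is part of a consumer group; the principal also needs Read on the group. A sketch with a placeholder group name:
$ kafka-acls.sh --authorizer-properties zookeeper.connect=z-1.demo-cluster1.0dgmkx.c3.kafka.us-east-1.amazonaws.com:2181 --add --allow-principal "User:bob" --operation Read --group my-consumer-group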
When an unauthorized principal tries to access a resource, the broker rejects the request with an authorization error (for example, a TopicAuthorizationException when reading from or writing to a topic without the required ACL).
More information about ACLs can be found here.
IAM Policies (MSK-specific): For IAM authentication, IAM policies are used instead of ACLs. In an authorization policy, you describe which actions to allow or deny for the role. If your client runs on an Amazon EC2 instance, attach the authorization policy to the instance's IAM role. Alternatively, you can set up your client to use a named profile, and then attach the authorization policy to that profile.
These policies are controlled with the Action and Resource elements.
These policies can be added using the web console's checkboxes or by editing the policy JSON itself. An MSK policy looks like the following:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kafka-cluster:Connect",
                "kafka-cluster:AlterCluster",
                "kafka-cluster:DescribeCluster"
            ],
            "Resource": [
                "arn:aws:kafka:us-east-1:0123456789012:cluster/MyTestCluster/abcd1234-0123-abcd-5678-1234abcd-1"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kafka-cluster:*Topic*",
                "kafka-cluster:WriteData",
                "kafka-cluster:ReadData"
            ],
            "Resource": [
                "arn:aws:kafka:us-east-1:0123456789012:topic/MyTestCluster/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kafka-cluster:AlterGroup",
                "kafka-cluster:DescribeGroup"
            ],
            "Resource": [
                "arn:aws:kafka:us-east-1:0123456789012:group/MyTestCluster/*"
            ]
        }
    ]
}
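As a sketch, a policy document like the one above can be attached to the client's IAM role via the AWS CLI (the role name, policy name, and file path are placeholders):
# Attach the policy document to the role assumed by the Kafka client
aws iam put-role-policy \
  --role-name my-kafka-client-role \
  --policy-name msk-cluster-access \
  --policy-document file://msk-policy.json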
Conclusion
In this article, we looked at how to secure a Kafka cluster with the most commonly and widely used methods, and at which choices best fit which requirements.
With this article, the series is complete. You now have a solid understanding of Apache Kafka and can use the knowledge you've gained to take it from very basic to advanced usage.
Thank you for enduring this long series with me ;)
Good luck and adios!