Spark on K8s — Perform a Spark-Submit to Amazon EKS Cluster With IRSA

Anh Tu (James) Nguyen · Published in The Startup · 7 min read · Aug 14, 2020


In a previous article, I introduced how to submit a Spark job to an EKS cluster. Since our pipelines interact with other AWS services such as S3 and DynamoDB, we need to assign IAM permissions to the Spark driver and executor pods.
In this tutorial, I will show you how to submit a Spark 2.4.4 job with IRSA (IAM Roles for Service Accounts).

What is IRSA?

According to the official AWS documentation and blog:

Our approach, IAM Roles for Service Accounts (IRSA), however, is different: we made pods first class citizens in IAM. Rather than intercepting the requests to the EC2 metadata API to perform a call to the STS API to retrieve temporary credentials, we made changes in the AWS identity APIs to recognize Kubernetes pods. By combining an OpenID Connect (OIDC) identity provider and Kubernetes service account annotations, you can now use IAM roles at the pod level.

With IAM roles for service accounts on Amazon EKS clusters, you can associate an IAM role with a Kubernetes service account. This service account can then provide AWS permissions to the containers in any pod that uses that service account. With this feature, you no longer need to provide extended permissions to the node IAM role so that pods on that node can call AWS APIs.
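In practice, the AWS SDK inside the pod exchanges a projected service-account token for temporary credentials via STS. The snippet below is a rough, hand-rolled equivalent of that exchange (just a sketch for illustration; the session name is arbitrary, and the role ARN and token path match the ones used later in this post):

# What the SDK does for you when AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE are set
# (illustration only; "irsa-demo" is an arbitrary session name)
aws sts assume-role-with-web-identity \
  --role-arn arn:aws:iam::<aws_account_id>:role/spark-irsa-test-role \
  --role-session-name irsa-demo \
  --web-identity-token "$(cat /var/run/secrets/eks.amazonaws.com/serviceaccount/token)"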

AWS also mentions the following benefits of IRSA compared with community tools like kiam or kube2iam:

  • Least privilege — By using the IAM roles for service accounts feature, you no longer need to provide extended permissions to the node IAM role so that pods on that node can call AWS APIs. You can scope IAM permissions to a service account, and only pods that use that service account have access to those permissions. This feature also eliminates the need for third-party solutions such as kiam or kube2iam.
  • Credential isolation — A container can only retrieve credentials for the IAM role that is associated with the service account to which it belongs. A container never has access to credentials that are intended for another container that belongs to another pod.
  • Auditability — Access and event logging is available through CloudTrail to help ensure retrospective auditing.

Why spark-submit with IRSA?

We were running Spark jobs with kube2iam annotations, and we got a lot of random Spark job failures due to throttling of API calls to the EC2 metadata endpoint:

Exception in thread "main" org.apache.hadoop.fs.s3a.AWSClientIOException: doesBucketExist on sample-s3-bucket: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: The requested metadata is not found at http://169.254.169.254/latest/meta-data/iam/security-credentials/: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: The requested metadata is not found at http://169.254.169.254/latest/meta-data/iam/security-credentials/
at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:144)
at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:328)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:270)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3242)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:121)
... 35 more
Caused by: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: The requested metadata is not found at http://169.254.169.254/latest/meta-data/iam/security-credentials/
at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:151)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1166)
... 27 more
Caused by: com.amazonaws.SdkClientException: The requested metadata is not found at http://169.254.169.254/latest/meta-data/iam/security-credentials/
at com.amazonaws.internal.EC2CredentialsUtils.readResource(EC2CredentialsUtils.java:125)
at com.amazonaws.internal.EC2CredentialsUtils.readResource(EC2CredentialsUtils.java:87)
at com.amazonaws.auth.InstanceProfileCredentialsProvider$InstanceMetadataCredentialsEndpointProvider.getCredentialsEndpoint(InstanceProfileCredentialsProvider.java:189)
at com.amazonaws.auth.EC2CredentialsFetcher.fetchCredentials(EC2CredentialsFetcher.java:122)
at com.amazonaws.auth.EC2CredentialsFetcher.getCredentials(EC2CredentialsFetcher.java:82)
at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:164)
at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:129)
... 43 more

Besides that, we wanted to explore another way to isolate IAM permissions for different Spark jobs at the pod and serviceAccount/namespace level, and to remove one component (kube2iam) from our K8s setup.

Requirements:

  • EKS cluster version 1.14 or above.
  • Administrative permission on a running AWS EKS cluster.
  • Administrative permission to create or update IAM roles/policies in AWS account.
  • awscli version 1.18.110+ or 2.0.36+ (see the quick version checks below).
  • kubectl to manage the K8s cluster.
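A quick way to sanity-check the tooling before starting (a sketch; replace <your_eks_cluster_name> with your own cluster name):

# Verify CLI versions and the EKS cluster version
➜ aws --version
➜ kubectl version
➜ aws eks describe-cluster --name <your_eks_cluster_name> --query "cluster.version" --output text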

Below is general information about my EKS cluster (the IP addresses are fake):

➜ kubectl cluster-info
Kubernetes master is running at https://4A5<i_am_tu>545E6.sk1.ap-southeast-1.eks.amazonaws.com
CoreDNS is running at https://4A5<i_am_tu>545E6.sk1.ap-southeast-1.eks.amazonaws.com/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
➜ ~ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-0-3.ap-southeast-1.compute.internal Ready <none> 134d v1.15.11
ip-10-0-0-5.ap-southeast-1.compute.internal Ready <none> 91d v1.15.11
ip-10-0-0-7.ap-southeast-1.compute.internal Ready <none> 134d v1.15.11

Create IAM role with OIDC enabled

Create IAM OIDC provider

  • Get the OIDC issuer URL of your EKS cluster:
➜ export OIDC_ISSUER_URL=$(aws eks describe-cluster --name <your_eks_cluster_name> --query "cluster.identity.oidc.issuer" --output text)
  • Get the thumbprint of EKS in your AWS region:
➜ export THUMBPRINT=$(echo | openssl s_client -servername oidc.eks.ap-southeast-1.amazonaws.com -showcerts -connect oidc.eks.ap-southeast-1.amazonaws.com:443 2>&- | tac | sed -n '/-----END CERTIFICATE-----/,/-----BEGIN CERTIFICATE-----/p; /-----BEGIN CERTIFICATE-----/q' | tac | openssl x509 -fingerprint -noout | sed 's/://g' | awk -F= '{print tolower($2)}')
  • Create IAM OIDC provider:
➜  aws iam create-open-id-connect-provider --url $OIDC_ISSUER_URL --client-id-list "sts.amazonaws.com" --thumbprint-list $THUMBPRINT

The IAM OIDC provider will be created with an ARN that looks like: arn:aws:iam::<aws_account_id>:oidc-provider/oidc.eks.ap-southeast-1.amazonaws.com/id/4A5<i_am_tu>545E6
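You can optionally verify that the provider was registered (a quick sanity check; the second command simply echoes back the provider configuration):

➜ aws iam list-open-id-connect-providers
➜ aws iam get-open-id-connect-provider --open-id-connect-provider-arn arn:aws:iam::<aws_account_id>:oidc-provider/oidc.eks.ap-southeast-1.amazonaws.com/id/4A5<i_am_tu>545E6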

Create IAM role and policy

  • Generate Trust Relationship policy for your IAM role:
➜ export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
➜ export OIDC_PROVIDER=$(aws eks describe-cluster --name <your_eks_cluster_name> --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///")
➜ export NAMESPACE="spark-pi"
➜ export SERVICE_ACCOUNT="spark-pi"
➜ cat <<EOF > ./trust.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"
        }
      }
    }
  ]
}
EOF
  • Create IAM role with relevant policies:
➜ aws iam create-role --role-name spark-irsa-test-role --assume-role-policy-document file://trust.json --description "IAM role to test spark-submit with IRSA"
➜ aws iam attach-role-policy --role-name spark-irsa-test-role --policy-arn=<IAM_POLICY_ARN>
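For reference, <IAM_POLICY_ARN> can point to any policy that grants the permissions your job needs. Below is a minimal, hypothetical example of a read-only S3 policy for the sample-s3-bucket bucket seen in the error above (the policy and file names are arbitrary; adjust the actions and resources to your own pipeline):

# Hypothetical read-only S3 policy; attach the returned ARN as <IAM_POLICY_ARN>
➜ cat <<EOF > ./s3-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::sample-s3-bucket",
        "arn:aws:s3:::sample-s3-bucket/*"
      ]
    }
  ]
}
EOF
➜ aws iam create-policy --policy-name spark-irsa-test-policy --policy-document file://s3-policy.json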

Create spark-pi RBAC

  • Create a file named spark_role.yaml as below; remember to update the serviceAccount annotation with the IAM role ARN created in the previous step:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark-pi
  namespace: spark-pi
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<aws_account_id>:role/spark-irsa-test-role
automountServiceAccountToken: true
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spark-pi-role
  namespace: spark-pi
rules:
- apiGroups: [""]
  resources: ["pods", "services", "configmaps"]
  verbs: ["get", "list", "watch", "create", "delete", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: spark-pi-role-binding
  namespace: spark-pi
subjects:
- kind: ServiceAccount
  name: spark-pi
  namespace: spark-pi
roleRef:
  kind: Role
  name: spark-pi-role
  apiGroup: rbac.authorization.k8s.io
  • Run kubectl to create the namespace and apply the service account and RBAC objects:
➜  kubectl create namespace spark-pi
namespace/spark-pi created
➜ kubectl apply -f ~/lab/k8s/spark_role.yaml
serviceaccount/spark-pi created
role.rbac.authorization.k8s.io/spark-pi-role created
rolebinding.rbac.authorization.k8s.io/spark-pi-role-binding created
  • Verify that the new service account has permission to create pods:
➜  kubectl auth can-i create pod --as=system:serviceaccount:spark-pi:spark-pi -n spark-pi
yes
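You can also double-check that the IAM role ARN annotation landed on the service account (a quick sanity check; you should see the role ARN from the previous step):

➜ kubectl get serviceaccount spark-pi -n spark-pi -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
arn:aws:iam::<aws_account_id>:role/spark-irsa-test-role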

Build spark-2.4.4 Docker image with IRSA patches

According to ticket SPARK-27872, in Spark 2.4.4 spark-submit only supports the parameter spark.kubernetes.authenticate.driver.serviceAccountName=<service_account_name> to assign a serviceAccount to the Spark driver pod, while Spark executor pods use the default serviceAccount of the namespace, which means the executors don't have the necessary permissions to access AWS resources.

There was a pull request that added one more parameter, spark.kubernetes.authenticate.executor.serviceAccountName, but the change only applied to Spark 3.x.

Since most of our pipelines still use Spark 2.4.4 and it is not easy to upgrade all of them to Spark 3.x, I decided to tweak the code and back-port the change to Spark 2.4.4.

Download my patch file and save it as spark-2.4.4-irsa.patch:

I tweaked the script from my previous post to apply the patch when building the Spark Docker image:

I’m using the docker:dind image to build the Spark Docker image inside a Docker container:

➜  docker container run \
--privileged -it \
--name spark-build \
-v /var/run/docker.sock:/var/run/docker.sock \
-v ${PWD}:/tmp \
-e USER=<your_docker_username> \
-e PASSWORD=<your_docker_password> \
-w /opt \
docker:dind \
sh /tmp/spark_docker_build_with_irsa_patch.sh

You can use the Docker image I built with the above script, tagged vitamingaugau/spark:spark-2.4.4-irsa.
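If you want to sanity-check the image locally before using it in the cluster, something like this should work (a sketch; it overrides the image entrypoint just to print the Spark version):

# Pull the image and print the Spark version it ships
➜ docker pull vitamingaugau/spark:spark-2.4.4-irsa
➜ docker run --rm --entrypoint /opt/spark/bin/spark-submit vitamingaugau/spark:spark-2.4.4-irsa --version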

Perform sample spark-submit

  • Create a jump-pod to run spark-submit:
➜  cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: tmp
  name: tmp
  namespace: spark-pi
spec:
  containers:
  - image: vitamingaugau/spark:spark-2.4.4-irsa
    imagePullPolicy: Always
    name: tmp
    args:
    - sleep
    - "1000000"
    resources: {}
  serviceAccountName: spark-pi
EOF
➜ kubectl exec -it tmp -n spark-pi -- bash
  • Run the spark-submit command inside the jump pod:
/opt/spark/bin/spark-submit \
--master=k8s://https://4A5<i_am_tu>545E6.sk1.ap-southeast-1.eks.amazonaws.com \
--deploy-mode cluster \
--name spark-pi \
--class org.apache.spark.examples.SparkPi \
--conf spark.kubernetes.driver.pod.name=spark-pi-driver \
--conf spark.kubernetes.container.image=vitamingaugau/spark:spark-2.4.4-irsa \
--conf spark.kubernetes.namespace=spark-pi \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-pi \
--conf spark.kubernetes.authenticate.executor.serviceAccountName=spark-pi \
--conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider \
--conf spark.kubernetes.authenticate.submission.caCertFile=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
--conf spark.kubernetes.authenticate.submission.oauthTokenFile=/var/run/secrets/kubernetes.io/serviceaccount/token \
local:///opt/spark/examples/target/scala-2.11/jars/spark-examples_2.11-2.4.4.jar 20000

Note: as you can see in the above command, the following parameters are needed for IRSA to work:

--conf spark.kubernetes.namespace=spark-pi
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-pi
--conf spark.kubernetes.authenticate.executor.serviceAccountName=spark-pi
--conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider
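If you run many jobs with the same setup, you could also keep these settings in a properties file and pass it to spark-submit with --properties-file instead of repeating the --conf flags (a sketch; the file name irsa.conf is arbitrary):

# irsa.conf (arbitrary file name) -- pass with: spark-submit --properties-file irsa.conf ...
spark.kubernetes.namespace=spark-pi
spark.kubernetes.authenticate.driver.serviceAccountName=spark-pi
spark.kubernetes.authenticate.executor.serviceAccountName=spark-pi
spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider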
  • After running the spark-submit command, verify that the Spark driver and executor pods have the correct serviceAccount defined:
➜ SPARK_APP=$(kubectl get pods -o jsonpath='{.metadata.labels.spark-app-selector}' -n spark-pi spark-pi-driver)
➜ kubectl get pods -l spark-app-selector=$SPARK_APP -n spark-pi
NAME READY STATUS RESTARTS AGE
spark-pi-1597431578845-exec-1 1/1 Running 0 11s
spark-pi-1597431578845-exec-2 1/1 Running 0 11s
spark-pi-driver 1/1 Running 0 17s
➜ kubectl get pods -o jsonpath='{.spec.serviceAccount}' spark-pi-driver -n spark-pi
spark-pi
➜ kubectl get pods -o jsonpath='{.spec.serviceAccount}' spark-pi-1597431578845-exec-1 -n spark-pi
spark-pi
  • If you also check the environment of those pods, you will see the environment variable AWS_ROLE_ARN with the value arn:aws:iam::<aws_account_id>:role/spark-irsa-test-role, which comes from the spark-pi serviceAccount’s annotation:
➜  kubectl get pods -o jsonpath='{.spec.containers[0].env}' spark-pi-driver -n spark-pi
[map[name:SPARK_DRIVER_BIND_ADDRESS valueFrom:map[fieldRef:map[apiVersion:v1 fieldPath:status.podIP]]] map[name:SPARK_LOCAL_DIRS value:/var/data/spark-d90a2560-dbe3-4cb5-bc1f-8a0bba614b55] map[name:SPARK_CONF_DIR value:/opt/spark/conf] map[name:AWS_ROLE_ARN value:arn:aws:iam::<aws_account_id>:role/spark-irsa-test-role] map[name:AWS_WEB_IDENTITY_TOKEN_FILE value:/var/run/secrets/eks.amazonaws.com/serviceaccount/token]]
➜ kubectl get pods -o jsonpath='{.spec.containers[0].env}' spark-pi-1597431578845-exec-1 -n spark-pi
[map[name:SPARK_DRIVER_URL value:spark://CoarseGrainedScheduler@spark-pi-1597431578845-driver-svc.spark-pi.svc:7078] map[name:SPARK_EXECUTOR_CORES value:1] map[name:SPARK_EXECUTOR_MEMORY value:1g] map[name:SPARK_APPLICATION_ID value:spark-439e399c63e242dcb995c4ec0384ab36] map[name:SPARK_CONF_DIR value:/opt/spark/conf] map[name:SPARK_EXECUTOR_ID value:1] map[name:SPARK_EXECUTOR_POD_IP valueFrom:map[fieldRef:map[apiVersion:v1 fieldPath:status.podIP]]] map[name:SPARK_LOCAL_DIRS value:/var/data/spark-d90a2560-dbe3-4cb5-bc1f-8a0bba614b55] map[name:AWS_ROLE_ARN value:arn:aws:iam::<aws_account_id>:role/spark-irsa-test-role] map[name:AWS_WEB_IDENTITY_TOKEN_FILE value:/var/run/secrets/eks.amazonaws.com/serviceaccount/token]]
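While the driver is still running, you can also confirm that the web-identity token was actually projected into the pod (a quick check; it only relies on ls being available in the image):

➜ kubectl exec -it spark-pi-driver -n spark-pi -- ls -l /var/run/secrets/eks.amazonaws.com/serviceaccount/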

Now you can configure your Spark jobs to interact with AWS resources such as S3 or DynamoDB with proper IAM policies. You can also restrict an IAM role to a specific serviceAccount in a specific namespace using OIDC and IRSA :D

Oct 14th 2020 Update: A pull request has back-ported the changes from #24748 to Spark 2.4.x, so you can use the latest Spark 2.4 release without applying my patch file.
