Active Directory Setup with Kerberized Dataproc Cluster

Jordan Hambleton
Google Cloud - Community
12 min read · Feb 16, 2021

In a prior post we provided a Terraform module for deploying Dataproc clusters with Kerberos and a trust with an MIT KDC for managing user and service account principals. While an MIT KDC gets you running quickly, enterprises more commonly have an existing identity management system, such as Active Directory, for managing the user and service account principals used to authenticate with Hadoop.

While this deployment can also be automated with Terraform, in this post we will manually walk through the steps below to become familiar with the concepts.

  1. Setup Active Directory Domain Services (Domain Controller)
  2. Create a Dataproc cluster with one-way trust to the AD DS
  3. Create a user with a password and a service account principal with a keytab (stored in Secret Manager) for authentication
  4. Run Hadoop command in Dataproc that demonstrates authentication

Important Note: This guide does not cover deployment best practices, including security, high availability, hybrid architectures, VPC, and other essentials which are necessary for production environments. See Best Practices for running Active Directory on Google Cloud and Deploying a fault-tolerant Microsoft Active Directory environment to get started.

Overview

The diagram and configuration below depict the deployment for this example: a sandbox Active Directory server with a cross-realm trust to Dataproc, plus Secret Manager for storing keytabs for non-human accounts. We simplify the example by using a single GCP project for all components, but separate projects, private networks, and firewalls should be set up in a real-world scenario.

The domain for AD and Dataproc realm:

Active Directory Domain: FOO.INTERNAL
Server Instance: active-directory-2016
Dataproc Cluster Realm: ANALYTICS.FOO.INTERNAL
Dataproc Cluster: analytics-cluster
[Dataproc one-way trust with AD Kerberos Authentication]

The above architecture incorporates the following key aspects:

  • Users / Service accounts in Active Directory Corporate Server
  • Dataproc service principals are managed by the on-cluster KDC; no service principals need to be created in AD.
  • One-way trust can be created in Active Directory prior to Dataproc cluster creation. This provides flexibility for ephemeral clusters to join AD dynamically.
  • Secret Manager for storing/rotating keytabs, added by the Active Directory admin and accessible only by the analytics Dataproc cluster admin.

Prerequisites

The following prerequisites are required for setting up the Windows and Dataproc GCE instances in our project. As mentioned earlier, we won’t cover setting up the VPC and other networking aspects. These steps can be executed in Cloud Shell or a local terminal.

1. Set up environment / accounts (modify as needed):
Create your GCP project (ad-kerberos is used in our example), initialize the gcloud SDK, set the default region/zone, and let’s get started.

# properties
export PROJECT=$(gcloud info --format='value(config.project)')
export ZONE=$(gcloud info --format='value(config.properties.compute.zone)')
export REGION=${ZONE%-*}
export SUBNETWORK=default
export BUCKET_SECRETS=${PROJECT}-us-dataproc-secrets
# gcp admin / service accounts
export GCP_ADMIN=$(gcloud info --format='value(config.account)')
export SERVICE_ACCOUNT_AD=active-directory-sa
export SERVICE_ACCOUNT_DL=dataproc-analytics-cluster-sa
# create gcp service account for dataproc cluster
gcloud iam service-accounts create "${SERVICE_ACCOUNT_DL}" \
--description="dataproc sa" \
--display-name=${SERVICE_ACCOUNT_DL}
# create gcp service account for active directory gce instance
gcloud iam service-accounts create ${SERVICE_ACCOUNT_AD} \
--description="active directory sa" \
--display-name=${SERVICE_ACCOUNT_AD}
# enable private google access on subnet
gcloud compute networks subnets update ${SUBNETWORK} --region=${REGION} --enable-private-ip-google-access
# IAM role for IAP tunnel to AD and Dataproc (if needed)
gcloud projects add-iam-policy-binding ${PROJECT} \
--member=user:${GCP_ADMIN} \
--role=roles/iap.tunnelResourceAccessor
# IAM role for dataproc cluster
gcloud projects add-iam-policy-binding ${PROJECT} --member=serviceAccount:${SERVICE_ACCOUNT_DL}@${PROJECT}.iam.gserviceaccount.com --role=roles/dataproc.worker
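
Optionally, you can sanity-check this setup before moving on. The commands below are a minimal verification sketch: they list the service accounts in the project and show the roles granted to the Dataproc cluster service account.

# optional: confirm the service accounts exist
gcloud iam service-accounts list --project ${PROJECT}
# optional: show roles granted to the dataproc cluster service account
gcloud projects get-iam-policy ${PROJECT} \
--flatten="bindings[].members" \
--filter="bindings.members:serviceAccount:${SERVICE_ACCOUNT_DL}@${PROJECT}.iam.gserviceaccount.com" \
--format="table(bindings.role)"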

2. Set up Cloud KMS, Secret Manager, and secrets
In the latter half of our example we will use Cloud KMS for the Dataproc Kerberos setup and Secret Manager for storing an Active Directory keytab for the core-data-svc account. We set these up now for later use.

2.1. Create Cloud KMS Key Ring, Cloud KMS Key, and assign IAM Roles.

# properties
export KMS_KEY_RING_LOCATION=us
export KMS_KEY_RING_NAME=dataproc-key-ring
export KMS_KEY_NAME=dataproc-key
export KMS_KEY_URI=projects/${PROJECT}/locations/${KMS_KEY_RING_LOCATION}/keyRings/${KMS_KEY_RING_NAME}/cryptoKeys/${KMS_KEY_NAME}
# create kms keyring
gcloud kms keyrings create ${KMS_KEY_RING_NAME} \
--project ${PROJECT} \
--location ${KMS_KEY_RING_LOCATION}
# create kms key
gcloud kms keys create ${KMS_KEY_NAME} \
--project ${PROJECT} \
--location ${KMS_KEY_RING_LOCATION} \
--purpose encryption \
--keyring ${KMS_KEY_RING_NAME}
# allow dataproc decrypt IAM role on kms key for cluster creation
gcloud kms keys add-iam-policy-binding ${KMS_KEY_NAME} --location ${KMS_KEY_RING_LOCATION} --keyring ${KMS_KEY_RING_NAME} --member=serviceAccount:${SERVICE_ACCOUNT_DL}@${PROJECT}.iam.gserviceaccount.com --role=roles/cloudkms.cryptoKeyDecrypter
# allow AD decrypt IAM role on kms key for shared secret
gcloud kms keys add-iam-policy-binding ${KMS_KEY_NAME} --location ${KMS_KEY_RING_LOCATION} --keyring ${KMS_KEY_RING_NAME} --member=serviceAccount:${SERVICE_ACCOUNT_AD}@${PROJECT}.iam.gserviceaccount.com --role=roles/cloudkms.cryptoKeyDecrypter
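
As an optional check, you can confirm the key exists and review who is allowed to decrypt with it before encrypting any secrets.

# optional: list keys in the key ring and review the key's IAM policy
gcloud kms keys list --keyring ${KMS_KEY_RING_NAME} --location ${KMS_KEY_RING_LOCATION}
gcloud kms keys get-iam-policy ${KMS_KEY_NAME} \
--keyring ${KMS_KEY_RING_NAME} --location ${KMS_KEY_RING_LOCATION}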

2.2. Generate random secrets for the Dataproc root principal and the shared principal with Active Directory.

# create secret
openssl rand -base64 32 | sed 's/\///g' > cluster_secret.txt
openssl rand -base64 32 | sed 's/\///g' > trust_secret.txt
# create bucket in project for secrets (restrict access)
gsutil mb gs://$BUCKET_SECRETS
# encrypt cluster secret and store in restricted bucket
cat cluster_secret.txt | gcloud kms encrypt \
--ciphertext-file - --plaintext-file - --key "${KMS_KEY_URI}" \
| gsutil cp - gs://${BUCKET_SECRETS}/analytics-cluster_principal.encrypted
# encrypt trust secret and store in restricted bucket
cat trust_secret.txt | gcloud kms encrypt \
--ciphertext-file - --plaintext-file - --key "${KMS_KEY_URI}" \
| gsutil cp - gs://${BUCKET_SECRETS}/trust_analytics-cluster_ad_principal.encrypted
# allow active directory access to shared secrets
gsutil iam ch serviceAccount:${SERVICE_ACCOUNT_AD}@${PROJECT}.iam.gserviceaccount.com:objectViewer gs://${BUCKET_SECRETS}
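
Once the encrypted copies are in the bucket, it is good practice to confirm the objects landed and, if you no longer need them, remove the local plaintext files; a quick sketch:

# optional: confirm encrypted secrets landed in the bucket, then remove local plaintext copies
gsutil ls gs://${BUCKET_SECRETS}/
rm cluster_secret.txt trust_secret.txt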

2.3. Create a secret in Secret Manager and assign IAM roles for storing the core-data-svc keytab.

# properties
export KEYTAB_SECRET=keytab-core-data-svc
# create secret manager secret for storing Kerberos keytab
gcloud secrets create ${KEYTAB_SECRET}
# role for AD host account secrets adder for keytab
gcloud secrets add-iam-policy-binding ${KEYTAB_SECRET} --member=serviceAccount:${SERVICE_ACCOUNT_AD}@${PROJECT}.iam.gserviceaccount.com --role=roles/secretmanager.secretVersionAdder
# role for dataproc host account secrets accessor for keytab
gcloud secrets add-iam-policy-binding ${KEYTAB_SECRET} --member=serviceAccount:${SERVICE_ACCOUNT_DL}@${PROJECT}.iam.gserviceaccount.com --role=roles/secretmanager.secretAccessor
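
You can optionally verify that the secret and its IAM bindings were created as expected.

# optional: review the secret and its IAM policy
gcloud secrets describe ${KEYTAB_SECRET}
gcloud secrets get-iam-policy ${KEYTAB_SECRET}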

Setup Active Directory Domain Controller

In Google Cloud, Windows Server base images are available in Compute Engine. We will create the Domain Controller with the Windows Server 2016 Datacenter image for our example and walk through a lightweight installation. Refer to the Deploying a fault-tolerant Microsoft Active Directory environment guide for a more comprehensive setup that includes VPC, DNS, multi-zone deployment, etc.

Create Active Directory Instance. Navigate to Compute Engine > VM instances > Create Instance.

[Google Compute Engine instance with Windows Image]

Configure the Boot disk with Operating system Windows Server and version Windows Server 2016 Datacenter.

[Windows Server 2016 Datacenter public image]

For the GCE instance, we will use the active-directory-sa service account that has specific IAM roles configured above.

[GCE Instance Service account]

In the Network interface settings, set External IP to None so the instance uses only an internal IP. If using an internal IP only, refer to the steps later in this guide for configuring RDP through an IAP tunnel.

[Network Interface with No External IP]
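
If you prefer the CLI to the console, the instance can also be created with gcloud. The following is a minimal sketch assuming the default subnetwork, the zone set in the prerequisites, and an n1-standard-2 machine type; adjust the machine type and disk size to your needs.

# create the windows server 2016 instance with internal IP only
gcloud compute instances create active-directory-2016 \
--zone ${ZONE} \
--machine-type n1-standard-2 \
--image-family windows-2016 \
--image-project windows-cloud \
--boot-disk-size 50GB \
--no-address \
--subnet ${SUBNETWORK} \
--service-account ${SERVICE_ACCOUNT_AD}@${PROJECT}.iam.gserviceaccount.com \
--scopes https://www.googleapis.com/auth/cloud-platform
# once the instance is ready, generate the initial windows password
gcloud compute reset-windows-password active-directory-2016 --zone ${ZONE}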

After the server has finished initializing, select Set Windows password.

[Windows Server instance active-directory-2016]

Set new password for user.

[Windows password reset prompt]

Login to Windows Server using Remote Desktop

This next section covers accessing the Windows Server using internal IP, but multiple options are available.

  • If external IP is configured (not recommended), then install the Chrome RDP for Google Cloud Platform extension, and an RDP session can be created directly from console through the active-directory-2016 instance drop-down.
  • If internal IP is configured, then set up IAP for TCP forwarding, create a tunnel, and use an RDP client (e.g., the Mac client) following the steps below.

Initialize Tunnel for RDP over IAP

gcloud compute start-iap-tunnel active-directory-2016 3389 --local-host-port=localhost:3389 --zone=us-central1-a
Testing if tunnel connection works.
Listening on port [3389].

Launch Microsoft Remote Desktop Client

Add Host localhost and connect using the username / password set above.

[Microsoft Remote Desktop for Mac]

Set Administrator Password

After connecting to the Windows Server, set the Administrator password. This step is required for promoting the server to a domain controller.

Navigate to user administration through Tools > Computer Management

[Active Directory Server Manager Computer Management]

Next, select Local Users and Groups > Users > Administrator > Right click and select Set Password

[Computer management set administrator password]

Setup Domain Services and Domain Controller

Select Add roles and features to begin setup.

[Active Directory Server Manager Add roles and features]

Next, select the Role-based installation type.

[Add Roles and Features Wizard]

Add Active Directory Domain Services Role.

[Add Roles and Features Wizard]

Continue with defaults for Features and Active Directory Domain Services and then Install services.

[Add Roles and Features Wizard]

Next, Promote this server to a domain controller from Server Manager Dashboard.

[Promote server to domain controller]

Select Add a new forest and add the root domain. In our example, we use foo.internal.

[Active Directory Domain Services Configuration Wizard]

Set the password for Directory Services Restore Mode (DSRM) and continue.

[Active Directory Domain Services Configuration Wizard]

Continue with the default FOO for the NetBIOS domain name in DNS Options and continue.

[Active Directory Domain Services Configuration Wizard]

Continue with the install after review. The server will automatically restart to complete the installation.

[Active Directory Domain Services Configuration Wizard]

Congratulations, the Domain Controller has now been installed. Next, we will use the encrypted secrets that we created in the prerequisite section for setting up trust from the Dataproc cluster KDC to Active Directory.

Add external KDC to Active Directory Server

Next, we set up the external Dataproc KDC realm, ANALYTICS.FOO.INTERNAL, in Active Directory to establish the one-way trust, even though we have not created the Dataproc cluster yet.

Open PowerShell (right-click & Run as administrator) and run the following:

# set powershell variables
$KMS_KEY_URI="projects/ad-kerberos/locations/us/keyRings/dataproc-key-ring/cryptoKeys/dataproc-key"
$BUCKET_SECRETS="ad-kerberos-us-dataproc-secrets"
# add external KDC realm to Active Directory
ksetup /addkdc ANALYTICS.FOO.INTERNAL analytics-cluster-m.us-central1-a.c.ad-kerberos.internal
# verify realm was added
PS C:\Users\jhambleton> ksetup
default realm = foo.internal (NT Domain)
ANALYTICS.FOO.INTERNAL:
kdc = analytics-cluster-m.us-central1-a.c.ad-kerberos.internal
# retrieve trust secret
gsutil cp gs://$BUCKET_SECRETS/trust_analytics-cluster_ad_principal.encrypted .
$secret=gcloud kms decrypt --ciphertext-file .\trust_analytics-cluster_ad_principal.encrypted --plaintext-file - --key "${KMS_KEY_URI}"
# create one-way trust
netdom trust ANALYTICS.FOO.INTERNAL /Domain FOO.INTERNAL /add /realm /passwordt:$secret
# Configure other domain support for Kerberos AES Encryption
ksetup /setenctypeattr ANALYTICS.FOO.INTERNAL AES256-CTS-HMAC-SHA1-96 AES128-CTS-HMAC-SHA1-96

Create Dataproc cluster with one-way trust to AD Domain Controller

In your local terminal or Cloud Shell, create a YAML file with the Dataproc Kerberos configuration (verify the values match your project).

#####
# kerberos_dataproc.yaml config
#
enable_kerberos: true
realm: ANALYTICS.FOO.INTERNAL
root_principal_password_uri: gs://ad-kerberos-us-dataproc-secrets/analytics-cluster_principal.encrypted
kms_key_uri: projects/ad-kerberos/locations/us/keyRings/dataproc-key-ring/cryptoKeys/dataproc-key
cross_realm_trust:
  realm: FOO.INTERNAL
  kdc: active-directory-2016.us-central1-a.c.ad-kerberos.internal
  admin_server: active-directory-2016.us-central1-a.c.ad-kerberos.internal
  shared_password_uri: gs://ad-kerberos-us-dataproc-secrets/trust_analytics-cluster_ad_principal.encrypted
tgt_lifetime_hours: 1

Create Dataproc Cluster.

gcloud dataproc clusters create analytics-cluster \
--enable-component-gateway \
--no-address \
--region ${REGION} \
--zone ${ZONE} \
--subnet default \
--image-version 1.5-debian \
--service-account ${SERVICE_ACCOUNT_DL}@${PROJECT}.iam.gserviceaccount.com \
--scopes 'https://www.googleapis.com/auth/cloud-platform' \
--project ${PROJECT} \
--kerberos-config-file ./kerberos_dataproc.yaml
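
After creation, you can optionally confirm the Kerberos and cross-realm trust settings the cluster was created with.

# optional: inspect the cluster's kerberos / security configuration
gcloud dataproc clusters describe analytics-cluster \
--region ${REGION} \
--format="yaml(config.securityConfig)"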

Create Alice and the core-data-svc service principal

Next, let’s create our first test user Alice in Active Directory. Navigate to
Tools > Active Directory Users and Computers.

[Active Directory Users and Computers]

Create New Organizational Unit.

[New Organizational Unit]

Create GCP Data Lake OU and Users OU within it.

[New Organizational Unit — GCP Data Lake OU]

The Users OU within GCP Data Lake will contain special user accounts specific to the data lake only (e.g., AD service accounts for running scheduled jobs).

[GCP Data Lake OU]

Create new user under top-level Users.

[Create New User]

Add Alice as the new user (we set the insecure password alic123! for this demo).

[New User — alice@FOO.INTERNAL]

Next, we create the core-data-svc service account in the GCP Data Lake OU and generate a keytab for authentication.

[New account for core-data-svc in GCP Data Lake OU]
[New object for core-data-svc]

Create the keytab for this non-human core-data-svc account. Open PowerShell (right-click & Run as administrator) and run the following:

$KEYTAB_SECRET="keytab-core-data-svc"
# keytab generation
ktpass /out "C:\Users\jhambleton\core-data-svc.keytab" /mapuser core-data-svc /princ core-data-svc@FOO.INTERNAL +rndpass /ptype KRB5_NT_PRINCIPAL /target FOO.INTERNAL /kvno 0 /crypto AES256-SHA1
# export keytab to secret manager
gcloud secrets versions add ${KEYTAB_SECRET} --data-file="C:\Users\jhambleton\core-data-svc.keytab"
# remove secret
rm "C:\Users\jhambleton\core-data-svc.keytab"

Verify Kerberos Authentication with Active Directory

Log in, authenticate as alice, and run Hadoop commands on analytics-cluster.

$ gcloud compute ssh alice@analytics-cluster-m --tunnel-through-iap
$ kinit alice@FOO.INTERNAL # remember insecure pwd alic123!
Password for alice@FOO.INTERNAL:
$ klist
$ hadoop fs -ls /user/

Application Service Account Test: core-data-svc@FOO.INTERNAL

The steps below kinit using the keytab and then remove the local keytab copy for safety. They then run a Hadoop command (or could launch a Spark job) on the analytics cluster. The command can only be executed by a Google identity with authorization to the cluster instance.

Run from local terminal or cloud shell (ssh isn’t required):

# properties
export KEYTAB_SECRET=keytab-core-data-svc
export AD_SVC_ACCNT=core-data-svc
export DOMAIN=FOO.INTERNAL
## ## ## ##
## commands to be executed on remote cluster
# cmd 1 - get keytab from secret manager
export cmd_get_secret="gcloud secrets versions access latest --secret ${KEYTAB_SECRET} --format='get(payload.data)' | tr '_-' '/+' | base64 -d > ./${AD_SVC_ACCNT}.keytab"
# cmd 2 - kinit
export cmd_kinit="kinit -kt ./${AD_SVC_ACCNT}.keytab ${AD_SVC_ACCNT}@${DOMAIN}; rm ./${AD_SVC_ACCNT}.keytab"
# cmd 3 - execute hadoop cmd or spark-submit
export cmd_hadoop="hadoop fs -ls /user/"
## ## ## ##
# execute cmd - run hadoop/spark-submit command with kerberos auth
gcloud compute ssh core-data-svc@analytics-cluster-m \
--command="${cmd_get_secret}; ${cmd_kinit}; ${cmd_hadoop};" --tunnel-through-iap

Troubleshooting

This section captures a few commands that are useful for debugging if needed.

Command to test the default username mapping rules (hadoop.security.auth_to_local)

$ hadoop org.apache.hadoop.security.HadoopKerberosName alice@FOO.INTERNAL
Name: alice@FOO.INTERNAL to alice
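
This mapping is driven by the hadoop.security.auth_to_local rules that Dataproc generates for the trusted realm. If the mapping does not behave as expected, you can inspect the active rules directly on the cluster master.

# print the auth_to_local rules in effect (run on the cluster master node)
hdfs getconf -confKey hadoop.security.auth_to_local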

Debug mode for Hadoop and Kerberos Handshake

$ HADOOP_OPTS="-Dsun.security.krb5.debug=true" hadoop fs -ls /
...
>>> KrbKdcReq send: kdc=active-directory-2016.us-central1-a.c.ad-kerberos.internal UDP:88, timeout=30000, number of retries =3, #bytes=1378
>>> KDCCommunication: kdc=active-directory-2016.us-central1-a.c.ad-kerberos.internal UDP:88, timeout=30000,Attempt =1, #bytes=1378
>>> KrbKdcReq send: #bytes read=1377
>>> KdcAccessibility: remove active-directory-2016.us-central1-a.c.ad-kerberos.internal
>>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
>>> TGS credentials serviceCredsSingle:
...

Enable trace-level Hadoop logging and JAAS debug logging

$ HADOOP_ROOT_LOGGER=TRACE,console HADOOP_JAAS_DEBUG=true hdfs dfs -ls /
...
21/02/05 12:32:54 DEBUG security.UserGroupInformation: hadoop login commit
21/02/05 12:32:54 DEBUG security.UserGroupInformation: using kerberos user:alice@FOO.INTERNAL
21/02/05 12:32:54 DEBUG security.UserGroupInformation: Using user: "alice@FOO.INTERNAL" with name alice@FOO.INTERNAL
...

Kerberos level trace

$ env KRB5_TRACE=/dev/stdout kinit alice@FOO.INTERNAL
[14178] 1612528239.343161: Getting initial credentials for alice@FOO.INTERNAL
[14178] 1612528239.343163: Sending unauthenticated request
[14178] 1612528239.343164: Sending request (192 bytes) to FOO.INTERNAL
[14178] 1612528239.343165: Resolving hostname active-directory-2016.us-central1-a.c.ad-kerberos.internal
[14178] 1612528239.343166: Sending initial UDP request to dgram 10.128.0.55:88
...

Summary

In this post, we walked through setting up Active Directory Domain Services and a Domain Controller as an identity provider for authentication with a Kerberized Dataproc cluster. While this setup was a simple deployment with a single Dataproc cluster, it can be expanded to incorporate more complex data lake architectures on Google Cloud.

Additional References

By Wilson Liu & Jordan Hambleton
