Running Jobs in AKS with Virtual Kubelet

Most Kubernetes sample applications demonstrate how to run web applications or databases in the orchestrator. Those workloads run until terminated.

However, Kubernetes can also run jobs: containers that run for a finite amount of time and then exit.

Jobs in Kubernetes fall into two categories: Run to Completion jobs, which execute once until they finish, and Cron jobs, which run on a schedule.

The repository located at https://github.com/fbeltrao/aksjobscheduler contains a demo application that schedules Run to Completion jobs in a Kubernetes cluster, taking advantage of Virtual Kubelet to preserve the cluster's allocated resources. It can optionally integrate with Event Grid to send a notification once a job has been completed.

Consider a company using a Kubernetes cluster to host a few applications. Cluster resource utilisation is high and the team wants to avoid oversizing. One of the teams has a new requirement to run jobs with unpredictable loads. At peak, the required compute resources will exceed what is available in the cluster.

A solution to this problem is to leverage Virtual Kubelet, scheduling jobs outside the cluster whenever the workload would starve the cluster of available resources.

In my experience, running the sample application using Virtual Kubelet produced the following results:

Disclaimer

There are other options to run jobs in Azure (e.g. Azure Functions, Azure Batch, WebJobs). The option presented here might better suit a team that wishes to leverage its experience with containers and orchestrators.

Running containers with Virtual Kubelet in Azure

Virtual Kubelet is an open source project that allows Kubernetes to schedule workloads outside of the cluster's provisioned nodes.

For more information, refer to the Virtual Kubelet documentation.
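As a quick reference, at the time of writing the ACI connector for Virtual Kubelet could be installed into an AKS cluster with the Azure CLI. A sketch with placeholder resource names (this command has since been superseded by virtual nodes, so check the current documentation):

# installs the Virtual Kubelet / ACI connector into the AKS cluster
az aks install-connector \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --connector-name virtual-kubelet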

Virtual Kubelet pulling private images from ACR

According to the documentation, it should be possible to configure a service principal with access to ACR. However, it does not seem to work; you will see the following message on the job pod:

Status:    Pending
Reason:    ProviderFailed
Message:   api call to https://management.azure.com/subscriptions/xxxxxx/resourceGroups/xxxxx/providers/Microsoft.ContainerInstance/containerGroups/xxxxxx?api-version=2018-09-01: got HTTP response status code 400 error code "InaccessibleImage": The image 'xxxxx.azurecr.io/aksjobscheduler-worker-dotnet:1.0' in container group 'xxxx' is not accessible. Please check the image and registry credential.

The workaround described in the GitHub issue requires us to create a secret and pass it through imagePullSecrets. Creating the secret is documented in the Kubernetes documentation:

$ kubectl create secret docker-registry my-acr-auth --docker-server xxx.azurecr.io --docker-username "<service principal id>" --docker-password "<service principal password>" --docker-email "<email address>"

Additionally, deploy the Jobs API with the correct JOBIMAGEPULLSECRET value (the name of the secret created above).
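For reference, a job template that consumes the secret references it under imagePullSecrets in the pod spec. A minimal sketch (the job name is a placeholder; the image and secret names match the examples above):

apiVersion: batch/v1
kind: Job
metadata:
  name: acr-job
spec:
  template:
    spec:
      containers:
      - name: worker
        image: xxx.azurecr.io/aksjobscheduler-worker-dotnet:1.0
      # references the docker-registry secret created with kubectl above
      imagePullSecrets:
      - name: my-acr-auth
      restartPolicy: Never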

Running jobs in Azure Container Instances

Running a job in Kubernetes is similar to deploying a web application; the difference lies in how the YAML manifest is defined.

Take for example the Kubernetes demo job defined by the YAML below:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4

Running the job in Kubernetes with kubectl is demonstrated below:

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/controllers/job.yaml

Viewing running jobs:

$ kubectl get jobs
NAME      DESIRED   SUCCESSFUL   AGE
pi        1         1            15s

Viewing the job result:

# find first pod where the job name is "pi", then list the logs
POD_NAME=$(kubectl get pods -l "job-name=pi" -o jsonpath='{.items[0].metadata.name}') && clear && kubectl logs $POD_NAME
3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679821480865132823066470938446095505822317253594081284811174502841027019385 REMOVED-TO-SAVE-SPACE

To run the same job using Virtual Kubelet, add the node selection and toleration information shown in the YAML below:

apiVersion: batch/v1
kind: Job
metadata:
  name: aci-pi
spec:
  template:
    spec:
      containers:
      - name: aci-pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
        resources:
          requests:
            cpu: "1"
            memory: "1Gi"
      restartPolicy: Never
      nodeSelector:
        beta.kubernetes.io/os: linux
        kubernetes.io/role: agent
        type: virtual-kubelet
      tolerations:
      - key: virtual-kubelet.io/provider
        operator: Equal
        value: azure
        effect: NoSchedule
  backoffLimit: 4
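After applying this manifest, you can verify that the pod was scheduled on the virtual node rather than on a regular agent node by checking the NODE column (the exact node name depends on how the connector was installed):

$ kubectl get pods -l job-name=aci-pi -o wide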

Sample application

The code found in the repository has two components:

  • Jobs API (/server)
    Web API to manage jobs. Execution of new jobs is redirected to the local cluster or to Virtual Kubelet, depending on the thresholds defined for local cluster workloads.
    The implementation uses the Kubernetes API to schedule jobs. State is obtained from native Kubernetes objects and metadata.
    The job input file is copied to Azure Storage, and index files are created to enable parallelism in job execution.
  • Job Worker (/workers/dotnet)
    Sample implementation of a job worker.

Starting a new job requires sending a file through the Jobs API. A POST request to http://api-url/jobs with the input file (uploaded as multipart form data, as in the example below) creates a job:

{ "id": "1", "value1": 3123, "value2": 321311, "op": "+" }
{ "id": "2", "value1": 3123, "value2": 321311, "op": "-" }
{ "id": "3", "value1": 3123, "value2": 321311, "op": "/" }
{ "id": "4", "value1": 3123, "value2": 321311, "op": "*" }

The result of the previous request is the job ID. The output produced by the workers for this input is the following:

{"id":"1","value1":3123.0,"value2":321311.0,"op":"+","result":324434.0}
{"id":"2","value1":3123.0,"value2":321311.0,"op":"-","result":-318188.0}
{"id":"3","value1":3123.0,"value2":321311.0,"op":"/","result":0.009719555197301057}
{"id":"4","value1":3123.0,"value2":321311.0,"op":"*","result":1003454253.0}

As mentioned before, the Jobs API creates index files that identify where each batch of N lines starts in the input file. These small index files are used by workers to lease a range of lines for processing, preventing parallel workers from processing the same content.
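To illustrate the idea (this is not the actual worker implementation): if an index entry records that a leased range starts at a given byte offset and spans N lines, a worker only needs to seek to that offset and read N lines. A shell sketch with made-up values:

# hypothetical index entry: leased range starts at byte offset 2048
OFFSET=2048
LINES=100
# tail -c +N starts reading at byte N (1-based); head keeps the leased lines
tail -c +$((OFFSET + 1)) sample_large_work.json | head -n $LINES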

Running the sample application in AKS

  1. Create a new storage account (the sample application will store input and output files there); a CLI sketch follows this list
  2. Deploy the application to your AKS cluster with the provided YAML file. Keep in mind that the deployment will create a new public IP, since the service is of type LoadBalancer. Modify the deployment YAML file to contain your storage credentials.
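For step 1, a minimal Azure CLI sketch (the resource group, account name and location are placeholders):

# create a resource group and a general-purpose storage account in it
az group create --name jobscheduler-rg --location westeurope
az storage account create \
  --name jobschedulerstorage \
  --resource-group jobscheduler-rg \
  --sku Standard_LRS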

If the target Kubernetes cluster has role-based access control (RBAC), we need to grant the Jobs API permission to manage batch jobs. The file deployment-rbac.yaml creates the required service account, role and role binding, in addition to the deployment and service:

# with rbac (service account assignment on pod)
kubectl apply -f deployment-rbac.yaml
# without rbac
kubectl apply -f deployment.yaml

3. Watch the Jobs API logs

POD_NAME=$(kubectl get pods -l "app=jobscheduler" -o jsonpath='{.items[0].metadata.name}') && clear && kubectl logs $POD_NAME -f

4. Create a new job by uploading an input file (modify the file path)

# -F sets the multipart/form-data Content-Type (with boundary) automatically
curl -X POST \
  http://{location-of-jobs-api}/jobs \
  -H 'cache-control: no-cache' \
  -F file=@/path/to/sample_large_work.json

5. Look at the job status

curl http://{location-of-jobs-api}/jobs/{job-id}

{
  "id": "2018-10-4610526630846599105",
  "status": "Complete",
  "startTime": "2018-10-31T12:31:52Z",
  "completionTime": "2018-10-31T12:33:57Z",
  "succeeded": 13,
  "parallelism": 4,
  "parts": 13,
  "completions": 13,
  "storageContainer": "jobs",
  "storageBlobPrefix": "2018-10/4610526630846599105",
  "runningOnAci": true
}

6. Once the job has finished, download the result in parts or as a single file (remove the part query string parameter to retrieve everything at once)

curl http://{location-of-jobs-api}/jobs/{job-id}/results?part=13 --output results-part-13.json
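For example, to download the complete result as a single file:

curl http://{location-of-jobs-api}/jobs/{job-id}/results --output results.json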

More information

For the complete source code, documentation and additional information, please check the GitHub repository.
