Injecting Faults in Azure Cache for Redis using Azure Chaos Studio through REST API (Part 2)

Pradip VS
Published in Microsoft Azure
Apr 26, 2022

This post continues from the previous part, where I demonstrated how to create a chaos experiment that restarts shards in Azure Cache for Redis through the portal, and how to monitor it using the portal as well as Redis CLI commands.

In this blog, I will cover how to run the same experiment using the REST API, step by step: enabling the target with the necessary capabilities, creating an experiment against the target where the fault needs to be injected, assigning an appropriate role, and invoking the experiment.

Note: I built these demos using Postman. The same calls can be made with any other REST client and can be automated.
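For readers who would rather script the calls than click through Postman, here is a minimal Python sketch of a request helper. It is only an illustration under stated assumptions: the names build_arm_request and send are hypothetical, and it assumes you already hold a bearer token for Azure Resource Manager (for example from `az account get-access-token`).

```python
import json
import urllib.request

ARM_ENDPOINT = "https://management.azure.com"


def build_arm_request(method, path, api_version, token, body=None):
    """Build (but do not send) an authenticated Azure Resource Manager request.

    `path` is the resource path starting with /subscriptions/...
    `token` is a bearer token the caller obtained elsewhere (assumption).
    """
    url = f"{ARM_ENDPOINT}{path}?api-version={api_version}"
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Authorization": f"Bearer {token}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request):
    """Send the request and return the decoded JSON response, or None if empty."""
    with urllib.request.urlopen(request) as response:
        payload = response.read()
        return json.loads(payload) if payload else None
```

Every PUT, GET, DELETE and POST in the steps that follow can be expressed as one build_arm_request call followed by send.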


Let us go through the steps one by one.

In my experiment, svdchaosredis is the Redis cache name and chaosrg is the resource group name. The subscription ID is masked with a dummy one; please use your own subscription ID to run the test successfully.

1. Enable the target, Azure Cache for Redis, in Azure Chaos Studio (currently only service-direct faults are supported for Azure Redis)
Currently, Azure Redis is not enabled for fault injection

PUT
https://management.azure.com/subscriptions/fbe563b4-c548-11ec-9d64-0242ac120002/resourceGroups/chaosrg/providers/microsoft.cache/redis/svdchaosredis/providers/Microsoft.Chaos/targets/microsoft-azureclusteredcacheforredis?api-version=2021-09-15-preview

body (JSON)

{
  "properties": {
    "identities": [
      {
        "type": "",
        "subject": "Redis Restart"
      }
    ]
  }
}

Invoking the above enables service-direct fault injection for Azure Redis

Azure Redis is now enabled as a target

If you want to disable fault injection on the Redis cache after experimentation, until the next experiment run, send the request below.
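The enable-target call can also be assembled in code. The sketch below only builds the URL and body shown in this step; the function names are hypothetical, and the body simply mirrors the one I sent from Postman.

```python
API_VERSION = "2021-09-15-preview"
TARGET_TYPE = "microsoft-azureclusteredcacheforredis"


def redis_target_id(subscription_id, resource_group, cache_name):
    """Return the Chaos Studio target ID nested under the Redis cache resource."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/microsoft.cache/redis/{cache_name}"
        f"/providers/Microsoft.Chaos/targets/{TARGET_TYPE}"
    )


def enable_target_request(subscription_id, resource_group, cache_name):
    """Return (method, url, body) for the PUT that enables the target."""
    target_id = redis_target_id(subscription_id, resource_group, cache_name)
    url = f"https://management.azure.com{target_id}?api-version={API_VERSION}"
    # Body copied from the Postman request in this step.
    body = {"properties": {"identities": [{"type": "", "subject": "Redis Restart"}]}}
    return "PUT", url, body


# Example with the dummy subscription ID used in this post.
method, url, body = enable_target_request(
    "fbe563b4-c548-11ec-9d64-0242ac120002", "chaosrg", "svdchaosredis"
)
```

Keeping the target ID in its own function is handy because the same ID is reused for the capability calls and inside the experiment selectors.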

DELETE
https://management.azure.com/subscriptions/fbe563b4-c548-11ec-9d64-0242ac120002/resourceGroups/chaosrg/providers/microsoft.cache/redis/svdchaosredis/providers/Microsoft.Chaos/targets/microsoft-azureclusteredcacheforredis?api-version=2021-09-15-preview

2. Create a capability on the target that needs to be invoked in the experiment

Another important factor is that a given target can have multiple capabilities enabled (e.g., a VM or AKS cluster can be shut down, restarted, delayed, congested, etc.). For Redis, the only capability available as of today is restarting the shards. However, if you are interested in seeing the capabilities, send the request below.

GET
https://management.azure.com/subscriptions/fbe563b4-c548-11ec-9d64-0242ac120002/resourceGroups/chaosrg/providers/microsoft.cache/redis/svdchaosredis/providers/Microsoft.Chaos/targets/microsoft-azureclusteredcacheforredis/capabilities?api-version=2021-09-15-preview

Only the Reboot-1.0 capability is available in Azure Chaos Studio for Azure Redis

So now, we have to enable the capability so that we can create an experiment and apply that capability to the target.

PUT
https://management.azure.com/subscriptions/fbe563b4-c548-11ec-9d64-0242ac120002/resourceGroups/chaosrg/providers/microsoft.cache/redis/svdchaosredis/providers/Microsoft.Chaos/targets/microsoft-azureclusteredcacheforredis/capabilities/Reboot-1.0?api-version=2021-09-15-preview

body (JSON)

{
  "properties": {}
}

The capability is now enabled for the target
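The two capability calls in this step (list, then enable) differ only in the last URL segment, which the following sketch makes explicit. The helper names are hypothetical, and capability_names assumes the list response follows the usual Azure Resource Manager shape of a "value" array whose items carry a "name"; please check that against the response you actually get.

```python
API_VERSION = "2021-09-15-preview"


def capability_url(target_id, capability=None, api_version=API_VERSION):
    """URL to list all capabilities (capability=None) or address a single one."""
    suffix = f"/{capability}" if capability else ""
    return f"https://management.azure.com{target_id}/capabilities{suffix}?api-version={api_version}"


def capability_names(list_response):
    """Pull capability names out of a list response (assumed shape, see above)."""
    return [item["name"] for item in list_response.get("value", [])]


TARGET_ID = (
    "/subscriptions/fbe563b4-c548-11ec-9d64-0242ac120002/resourceGroups/chaosrg"
    "/providers/microsoft.cache/redis/svdchaosredis"
    "/providers/Microsoft.Chaos/targets/microsoft-azureclusteredcacheforredis"
)

list_url = capability_url(TARGET_ID)                  # GET
enable_url = capability_url(TARGET_ID, "Reboot-1.0")  # PUT with {"properties": {}}
```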

3. Create an experiment in the West US location with:

a. One Step
b. Under that Step, add a Branch
c. In the branch, add an Action (Fault or Delay), a parameter for it and the target where the Action has to be applied

In my experiment (name: chaosredisRestAPI), I restart shard 0 and then, after a five-minute delay, restart shard 1, both on the primary node.

PUT
https://management.azure.com/subscriptions/fbe563b4-c548-11ec-9d64-0242ac120002/resourceGroups/chaosrg/providers/Microsoft.Chaos/experiments/chaosredisRestAPI?api-version=2021-09-15-preview

Body (JSON)

{
  "type": "Microsoft.Chaos/experiments",
  "id": "/subscriptions/fbe563b4-c548-11ec-9d64-0242ac120002/resourceGroups/chaosrg/providers/Microsoft.Chaos/experiments/chaosredisRestAPI",
  "name": "chaosredisRestAPI",
  "identity": {
    "type": "SystemAssigned"
  },
  "location": "westus",
  "properties": {
    "selectors": [
      {
        "type": "List",
        "id": "1c97f8f5-6304-4e8c-843e-9ac545175ad3",
        "targets": [
          {
            "id": "/subscriptions/fbe563b4-c548-11ec-9d64-0242ac120002/resourceGroups/chaosrg/providers/microsoft.cache/redis/svdchaosredis/providers/Microsoft.Chaos/targets/microsoft-azureclusteredcacheforredis",
            "type": "ChaosTarget"
          }
        ]
      },
      {
        "type": "List",
        "id": "57e89a62-62da-424a-b09e-7f17ca843ac7",
        "targets": [
          {
            "id": "/subscriptions/fbe563b4-c548-11ec-9d64-0242ac120002/resourceGroups/chaosrg/providers/microsoft.cache/redis/svdchaosredis/providers/Microsoft.Chaos/targets/microsoft-azureclusteredcacheforredis",
            "type": "ChaosTarget"
          }
        ]
      }
    ],
    "steps": [
      {
        "name": "Step 1: Redis",
        "branches": [
          {
            "name": "Branch 1: Redis Restart",
            "actions": [
              {
                "selectorId": "1c97f8f5-6304-4e8c-843e-9ac545175ad3",
                "type": "discrete",
                "parameters": [
                  {
                    "key": "rebootType",
                    "value": "PrimaryNode"
                  },
                  {
                    "key": "shardId",
                    "value": "0"
                  }
                ],
                "name": "urn:csci:microsoft:azureClusteredCacheForRedis:reboot/1.0"
              },
              {
                "type": "delay",
                "duration": "PT5M",
                "name": "urn:csci:microsoft:chaosStudio:TimedDelay/1.0"
              },
              {
                "selectorId": "57e89a62-62da-424a-b09e-7f17ca843ac7",
                "type": "discrete",
                "parameters": [
                  {
                    "key": "rebootType",
                    "value": "PrimaryNode"
                  },
                  {
                    "key": "shardId",
                    "value": "1"
                  }
                ],
                "name": "urn:csci:microsoft:azureClusteredCacheForRedis:reboot/1.0"
              }
            ]
          }
        ]
      }
    ]
  }
}

Experiment created successfully
You can open the experiment and check that it was created with the correct steps, tagging the right target resource.

(In the same branch, other faults can be added, or one can add additional steps or branches with a different set of actions/faults within the same experiment.)
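Since the experiment body is long and repetitive, it may be easier to generate it than to hand-edit it. The sketch below rebuilds the same one-step, one-branch structure for any list of shard IDs, with a delay between consecutive reboots. The function names are hypothetical, and generating selector IDs with uuid4 is my assumption here; the IDs only need to match between each selector and the action that references it.

```python
import json
import uuid

REBOOT_URN = "urn:csci:microsoft:azureClusteredCacheForRedis:reboot/1.0"
DELAY_URN = "urn:csci:microsoft:chaosStudio:TimedDelay/1.0"


def reboot_action(selector_id, shard_id, reboot_type="PrimaryNode"):
    """A discrete reboot fault for one shard, pointing at one selector."""
    return {
        "selectorId": selector_id,
        "type": "discrete",
        "parameters": [
            {"key": "rebootType", "value": reboot_type},
            {"key": "shardId", "value": str(shard_id)},
        ],
        "name": REBOOT_URN,
    }


def delay_action(duration="PT5M"):
    """A timed delay between faults, as an ISO 8601 duration."""
    return {"type": "delay", "duration": duration, "name": DELAY_URN}


def build_experiment(experiment_id, name, target_id, shard_ids,
                     location="westus", delay="PT5M"):
    """Build the experiment body: one selector and one reboot per shard."""
    selectors, actions = [], []
    for index, shard_id in enumerate(shard_ids):
        selector_id = str(uuid.uuid4())
        selectors.append({
            "type": "List",
            "id": selector_id,
            "targets": [{"id": target_id, "type": "ChaosTarget"}],
        })
        if index > 0:
            actions.append(delay_action(delay))  # wait before the next shard
        actions.append(reboot_action(selector_id, shard_id))
    return {
        "type": "Microsoft.Chaos/experiments",
        "id": experiment_id,
        "name": name,
        "identity": {"type": "SystemAssigned"},
        "location": location,
        "properties": {
            "selectors": selectors,
            "steps": [{
                "name": "Step 1: Redis",
                "branches": [{"name": "Branch 1: Redis Restart", "actions": actions}],
            }],
        },
    }
```

Calling build_experiment(experiment_id, "chaosredisRestAPI", target_id, [0, 1]) and passing the result through json.dumps gives a body with the same shape as the one above, apart from the freshly generated selector IDs.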

4. Assign the Redis Cache Contributor role to the chaos experiment created

PUT
https://management.azure.com/{{redisscope}}/providers/Microsoft.Authorization/roleAssignments/7805844c-c53a-11ec-9d64-0242ac120002?api-version=2015-07-01

{{redisscope}} is a Postman variable that expands to:
/subscriptions/fbe563b4-c548-11ec-9d64-0242ac120002/resourceGroups/chaosrg/providers/Microsoft.Cache/Redis/svdchaosredis

body (JSON)

{
  "properties": {
    "roleDefinitionId": "/subscriptions/fbe563b4-c548-11ec-9d64-0242ac120002/resourceGroups/chaosrg/providers/Microsoft.Authorization/roleDefinitions/e0f68234-74aa-48ed-b826-c38b57376e17",
    "principalId": "ece445b4-c548-11ec-9d64-0242ac123456"
  }
}
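Two values in this request are easy to get wrong when automating it: the GUID at the end of the URL, which is a new name you choose for the role assignment, and the principalId, which should be the object ID of the experiment's system-assigned identity. The sketch below shows one way to produce both. The helper names are hypothetical, and reading the ID from identity.principalId in the experiment resource is an assumption based on how Azure Resource Manager normally reports system-assigned identities.

```python
import uuid

# Role definition GUID taken from the request body in this step.
REDIS_CACHE_CONTRIBUTOR = "e0f68234-74aa-48ed-b826-c38b57376e17"


def principal_id_of(experiment_resource):
    """Read the managed identity's object ID from an experiment GET/PUT response."""
    return experiment_resource["identity"]["principalId"]


def role_assignment_request(scope, role_definition_scope, principal_id,
                            assignment_name=None):
    """Return (url, body) for the role assignment PUT on `scope`."""
    name = assignment_name or str(uuid.uuid4())  # each assignment needs its own GUID
    url = (
        f"https://management.azure.com{scope}"
        f"/providers/Microsoft.Authorization/roleAssignments/{name}"
        "?api-version=2015-07-01"
    )
    body = {
        "properties": {
            "roleDefinitionId": (
                f"{role_definition_scope}/providers/Microsoft.Authorization"
                f"/roleDefinitions/{REDIS_CACHE_CONTRIBUTOR}"
            ),
            "principalId": principal_id,
        }
    }
    return url, body
```

Here scope is the Redis cache resource ID (the value of {{redisscope}}), and role_definition_scope is the prefix used in roleDefinitionId, which in my request was the resource group path.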

Experiment verified

5. Start/Invoke the experiment

POST
https://management.azure.com/subscriptions/fbe563b4-c548-11ec-9d64-0242ac120002/resourceGroups/chaosrg/providers/Microsoft.Chaos/experiments/chaosredisRestAPI/start?api-version=2021-09-15-preview

The experiment is invoked successfully.

6. Monitor the experiment and verify the results

You can verify whether the experiment succeeded in Chaos Studio by checking the experiment details and by reviewing the metrics. If you would like to see the metrics through the Redis CLI, please follow the steps given in the previous part.

To conclude, this blog described how to create and run a chaos experiment for the Azure Redis restart scenario end to end using the REST API. Kindly let me know your thoughts and any feedback for improvement.

Thank you and stay tuned.
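If you automate the run, starting the experiment and waiting for it to finish can be scripted too. The sketch below builds the start URL from step 5 and adds a small generic polling loop. It is deliberately neutral about the status API: fetch_status and is_done are hypothetical callables that you supply, because the exact status endpoint and the status values it returns are not covered in this post.

```python
import time

API_VERSION = "2021-09-15-preview"


def start_url(subscription_id, resource_group, experiment_name,
              api_version=API_VERSION):
    """URL for the POST that starts the experiment."""
    return (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}/providers/Microsoft.Chaos"
        f"/experiments/{experiment_name}/start?api-version={api_version}"
    )


def wait_until(fetch_status, is_done, interval_seconds=30.0, max_polls=40,
               sleep=time.sleep):
    """Call fetch_status() until is_done(status) is true or max_polls is reached.

    Returns the last status seen, which may still be a non-final one
    if the poll budget ran out.
    """
    status = None
    for attempt in range(max_polls):
        status = fetch_status()
        if is_done(status):
            return status
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    return status
```

With a five-minute delay inside the experiment, a 30-second interval and 40 polls leave a comfortable margin; adjust both to the length of your own experiment.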

Pradip VS

Cloud Solution Architect — Microsoft
