Simply Put: Deploying Microsoft Orleans To Docker Swarm

Scott Rangeley
Sep 29, 2017


Docker is something that I just can't get enough of. It's taken up all my time since I saw how easy it was to deploy third-party dependencies such as RabbitMQ (yes, I am lazy, I hate installing things!). I eventually progressed to moving my own applications into containers. This was incredibly fun: being able to run an application, with its external dependencies, in one command? Madness. Naturally, the next step was to move to something more distributed, with Microsoft Orleans being my first choice.

So what are we going to attempt to do here? This is what I am aiming to explain.

So how are we going to do this? We're going to do a few things: create a swarm cluster of 4 nodes, define our docker-compose file, and trigger that deployment against the nodes in the swarm cluster.

Creating A Swarm Cluster Locally

The first step is to create some virtual machines to simulate a clustered environment. We're going to use Docker Machine to do this. Docker Machine is a Docker tool that lets you create virtual machines with Docker installed; it also provides tools to manage those virtual machines. For these examples I'll be using Hyper-V, and cmd for running all the commands, so please adjust to your environment. It's worth following this guide on creating a virtual switch; whenever I reference "dSwitch" in the following commands, it's the name of the switch I created in Hyper-V!

Let's first create our manager node; this will schedule our services/containers across the other swarm nodes.

docker-machine create -d hyperv --hyperv-virtual-switch "dSwitch" managerNode

Now let's verify that it was created, and grab the IP address for that node, by running

docker-machine ls
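If you only need the IP address, Docker Machine can also print it directly (a handy shortcut rather than something the rest of this post relies on):

docker-machine ip managerNode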

Before we continue, let's point our shell at managerNode, so we can execute docker commands as if they were local while they actually run on the managerNode VM. This can be done by running

docker-machine env managerNode

Then copy the command from the output of that and run it.
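On cmd, the output of docker-machine env ends with a hint along these lines, which evaluates the variables for you in one go (the exact wording can vary between Docker Machine versions):

@FOR /f "tokens=*" %i IN ('docker-machine env managerNode') DO @%i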

Now we're going to use the IP address returned for our "managerNode" instance in our next command:

docker swarm init --advertise-addr {ip}

This will initialise docker swarm and make this VM a manager node. Good job! Let's now create the other virtual machines, the workers, which will join the docker swarm cluster we've just created.

for /l %x in (1, 1, 3) do docker-machine create -d hyperv --hyperv-virtual-switch "dSwitch" node%x

We've simply looped in cmd, running docker-machine create (with our defined switch) for each worker. Easy enough; let's now make these nodes join the swarm cluster.

for /l %x in (1, 1, 3) do docker-machine ssh node%x "docker swarm join --token {token} {managerNode ip}"

It's important to note that the token we're passing here is the one returned when we ran docker swarm init on the managerNode!
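If you've lost that output, you can ask the manager for the worker join token (and the full join command) again at any time:

docker swarm join-token worker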

Let's now check that all these nodes have been created and are part of the swarm by running…

docker node ls

Orleans, Membership And ZooKeeper

For this deployment, we need some way of maintaining a list of the silos in our Orleans cluster. We will be using ZooKeeper here. If you're a bit unsure about how membership works, there's some great information in the Orleans documentation. Needless to say, ZooKeeper is a dependency for making this work, so let's configure our silo to use it and tell it where ZooKeeper can be reached.

I'll be using the amended ToasterService template to achieve this; it can be found here. The main change is how the silo is configured, which you can see here.
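For reference, a minimal sketch of what that configuration might look like using the Orleans 1.x programmatic API; the deployment id, ports and package name below are illustrative assumptions rather than the exact code from the template:

using System.Net;
using System.Threading;
using Orleans.Runtime.Configuration;   // ClusterConfiguration, GlobalConfiguration
using Orleans.Runtime.Host;            // SiloHost

class SiloProgram
{
    static void Main()
    {
        // ZooKeeper-based membership needs the Microsoft.Orleans.OrleansZooKeeperUtils package referenced.
        var config = new ClusterConfiguration();
        config.Globals.DeploymentId = "toaster-cluster";  // hypothetical id; must match on every silo
        config.Globals.LivenessType = GlobalConfiguration.LivenessProviderType.ZooKeeper;
        config.Globals.DataConnectionString = "zookeeper:2181";  // "zookeeper" is resolved by Docker's DNS

        // Standard silo endpoints; the values are illustrative.
        config.Defaults.HostNameOrIPAddress = Dns.GetHostName();
        config.Defaults.Port = 11111;
        config.Defaults.ProxyGatewayEndpoint = new IPEndPoint(IPAddress.Any, 30000);

        var silo = new SiloHost(Dns.GetHostName(), config);
        silo.InitializeOrleansSilo();
        silo.StartOrleansSilo();

        // Block forever so the silo keeps running inside the container.
        Thread.Sleep(Timeout.Infinite);
    }
}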

It's clear that we're saying that the LivenessType is ZooKeeper and that the endpoint to reach it is "zookeeper:2181". The "zookeeper" hostname will be resolved by Docker's built-in DNS.

Prep For Deployment

Before we can start the fun part (deploying), we need to do some work so that all the swarm nodes know where to get our ToasterService Docker image from. To do this, we're going to publish our application, build our image, create a registry, and push our image to that registry.

Firstly, let's publish our application, so Docker knows what to copy into the container (what we're copying is defined in our Dockerfile).
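Assuming the project is a .NET Core app, publishing from the command line might look something like this (the output path is whatever your Dockerfile expects to copy from, so adjust it to match your project):

dotnet publish -c Release -o ./publish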

Once your application has been published, run the following command to build the image.

docker-compose build

Now that we have our image, we need to publish it to a registry. This registry needs to be available to all nodes in the swarm, meaning we have to host a registry within our swarm cluster. Let's create this registry by running

docker service create --name registry --publish 5000:5000 registry

Now let's tag the toasterservice image we just built so we can push it to our registry.

docker tag toasterservice localhost:5000/toasterservice

Followed by this command to actually push it to the registry

docker push localhost:5000/toasterservice

Let's Start Deploying!

So we have our Orleans silo configured for a clustered deployment (through membership with ZooKeeper), and we have our docker swarm set up, waiting for work. First off, let's define our docker-compose.yml file, so we know what we're deploying.
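A sketch of what that file could look like is below; the service names and the image tag are assumptions pieced together from the rest of this post, not the exact file from the repository:

version: "3"
services:
  zookeeper:
    image: zookeeper                        # provides the Orleans membership store, reachable as zookeeper:2181
  toasterservice:
    image: localhost:5000/toasterservice    # pulled from the registry service running in the swarm
    deploy:
      replicas: 5                           # a cluster of 5 silos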

The main differences from the existing compose file are that the image points to the registry we deployed to the swarm, and that we're asking for 5 replicas of our toasterservice container — a cluster of 5 silos! I call this file docker-stack.yml just to differentiate between compose and stack deployments.

Sweet, so we're good to go. Let's kick this deployment off with the following command

docker stack deploy -c docker-stack.yml toasterservice

Here we're using our docker-stack.yml file, which will deploy all the services it defines under the stack name "toasterservice" — that's it, your service is now deployed! But how can we check that it deployed successfully and that I am not just lying to you? Let's run a command to check the state of our toasterservice deployment…

docker stack ps toasterservice

This will list all containers under this deployment, including failed ones. If you do have a failed container, it is most likely because ZooKeeper was not yet available at the time the silo started. In this scenario, the container will be redeployed a few more times in an attempt to reconcile the "desired state" we defined in our compose file (5 replicas).
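You can also get a quick summary of whether the stack has converged on that desired state; the REPLICAS column of the following command shows how many of the requested containers are currently running:

docker service ls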

Bonus: Let's Prove It's Actually Clustered…

I mean, I could easily deploy all of this and claim it's a valid Orleans cluster when it isn't — but let's prove it. How? We will kill a silo and see if the other silos notice this and mark it as dead…

Let's choose our victim… I'll list all the running tasks and choose which silo to kill.
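This is presumably just the same listing from the previous section; any task in a Running state will do:

docker stack ps toasterservice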

I like the look of toasterservice.3, which is sitting on swarm node1. Let's ssh into node1 and stop that container.

docker-machine ssh node1

Now we can list all the containers on node1 and grab the ContainerId for toasterservice.3.
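From inside node1, something like this lists the toasterservice containers and (as the next step needs) pulls the IP address of a given container out of docker inspect; the filter and format strings are just one way of doing it:

docker ps --filter "name=toasterservice"
docker inspect -f "{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}" [ContainerId]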

If I inspect that container, I can grab the IP address for it; for me, the IP address is "10.0.0.5". Let's now kill it by stopping the container, by running

docker stop [ContainerId]

(type exit to get out of node1)

Now, if I grep the logs on one of the surviving silos, I can see that they voted for the silo with that IP address to be marked as DEAD.
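One way to do that grep, assuming the silos write their logs to the container's console and that node2 is hosting one of the surviving silos (both of these are assumptions; adjust for your cluster):

docker-machine ssh node2
docker logs [ContainerId] 2>&1 | grep "10.0.0.5"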
