A walkthrough of the DC/OS container cluster manager
Would it be possible for an entire datacenter to run on a single operating system? Generally, the role of an operating system is to give software applications access to hardware resources and system software. Technically, DC/OS is a container cluster manager that provides a platform for deploying software applications in containers. More importantly, it can run on physical or virtual machines, abstracting the underlying infrastructure away from the application layer. DC/OS can be installed in any datacenter across hundreds or thousands of machines and act as an operating system for the applications running there. For these reasons, DC/OS is considered a datacenter operating system.
Nevertheless, at a glance, "datacenter operating system" may sound more like a marketing term than a description of what DC/OS actually offers. DC/OS is very similar to other container cluster managers such as Kubernetes and Docker Swarm, which provide almost the same set of features. However, there is a clear difference between DC/OS and the other container cluster managers when it comes to deploying Big Data and Analytics solutions: the scheduler can be extended to provide dedicated container scheduling capabilities. For example, systems such as Apache Spark, Apache Storm, Hadoop and Cassandra have implemented their own Mesos schedulers, so that container scheduling on Mesos is optimized for their specific needs. More importantly, such complex distributed systems can be deployed on Mesos with a few clicks. Additional information on that can be found here.
Apache Mesos, the DC/OS Kernel
The core of DC/OS is Mesos, a cluster manager that was initially developed at the University of California, Berkeley around 2009 and later donated to Apache. It is now used by many large organizations, including Twitter, Airbnb and Apple. As shown in the above figure, Mesos provides an extension point for plugging in task schedulers as Mesos frameworks. The schedulers receive resource offers and use them to schedule tasks for end user applications. Tasks are scheduled on Mesos slave nodes via executors, and can be executed using either the Mesos containerizer or the Docker containerizer. The Mesos containerizer was the first container runtime supported by Mesos; it uses Linux kernel features such as cgroups and namespaces. Later, with the introduction of Docker, Mesos added support for running Docker containers on its cluster manager.
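The resource offer idea can be illustrated with a small toy model. The sketch below is not the Mesos API: the `Offer` and `Task` classes, the `schedule` function and the first-fit policy are all assumptions made for illustration. It only shows the general shape of the interaction, in which a framework scheduler looks at the resources offered by each node and decides where its pending tasks should go.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """Hypothetical resource offer from one node (not a Mesos type)."""
    agent: str
    cpus: float
    mem: int  # MiB


@dataclass
class Task:
    """Hypothetical task waiting to be scheduled."""
    name: str
    cpus: float
    mem: int  # MiB


def schedule(offers, pending):
    """First-fit placement: put each task on the first offer with enough room."""
    # Track what is still free on each node as tasks are placed.
    remaining = {o.agent: [o.cpus, o.mem] for o in offers}
    placements = {}
    for task in pending:
        for agent, free in remaining.items():
            if free[0] >= task.cpus and free[1] >= task.mem:
                free[0] -= task.cpus
                free[1] -= task.mem
                placements[task.name] = agent
                break
    return placements


offers = [Offer("agent-1", 2.0, 2048), Offer("agent-2", 4.0, 8192)]
pending = [
    Task("web", 1.0, 1024),
    Task("spark-executor", 3.0, 4096),
    Task("cache", 1.0, 1024),
]
placements = schedule(offers, pending)
print(placements)
```

A real framework such as Spark would replace the first-fit policy with its own logic (for example, preferring nodes that hold the data it needs), which is exactly the kind of specialization the pluggable scheduler model allows.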
Marathon, the PaaS Framework
Marathon comes bundled with DC/OS. It is the core Mesos framework that provides platform as a service features for DC/OS. End user applications are deployed on DC/OS as Marathon applications. DC/OS also provides a catalog of well-known software systems, called DC/OS Universe. A Marathon application specifies the resource requirements (CPU, memory), the Docker image ID, container ports, service ports (ports to be exposed by the load balancer), networking model (bridge/host), startup parameters, labels and health checks of the software system.
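To make the list of settings above concrete, here is a sketch of an application definition built as a Python dictionary and serialized to JSON. The field names follow the Marathon application JSON format as I understand it for the Marathon versions shipped around DC/OS 1.7, so check them against the documentation for your version. The application ID, image, port numbers, label values and hostname are all hypothetical.

```python
import json

# Hypothetical Marathon application definition; all values are illustrative.
app = {
    "id": "/my-web-app",
    "cpus": 0.5,          # CPU share required per container
    "mem": 256,           # memory in MiB required per container
    "instances": 2,
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "nginx:1.11",   # Docker image ID
            "network": "BRIDGE",     # networking model (BRIDGE or HOST)
            "portMappings": [
                {
                    "containerPort": 80,    # port inside the container
                    "hostPort": 0,          # 0 asks for a dynamically assigned host port
                    "servicePort": 10000,   # port exposed by the load balancer
                    "protocol": "tcp",
                }
            ],
        },
    },
    "labels": {
        # Labels read by the Marathon load balancer (assumed conventions).
        "HAPROXY_GROUP": "external",
        "HAPROXY_0_VHOST": "my-web-app.example.com",
    },
    "healthChecks": [
        {
            "protocol": "HTTP",
            "path": "/",
            "portIndex": 0,
            "gracePeriodSeconds": 30,
            "intervalSeconds": 10,
            "timeoutSeconds": 5,
            "maxConsecutiveFailures": 3,
        }
    ],
}

app_json = json.dumps(app, indent=2)
print(app_json)
```

The resulting JSON document is what would be submitted to Marathon, either through its web UI or its REST API.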
Once an application is deployed, Marathon first checks the available resources against the application's requirements and then schedules containers accordingly. Afterwards, it uses the given health checks to verify the status of the containers and automatically heals them if the system is not functioning properly.
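The auto-healing behavior can be pictured as a simple reconciliation step. The following is a toy model rather than Marathon's actual implementation: the `reconcile` function, the instance dictionaries and the failure threshold are assumptions chosen for illustration. Instances that have failed too many consecutive health checks are dropped, and enough replacements are requested to get back to the target instance count.

```python
def reconcile(instances, target, max_failures=3):
    """Return (instances to keep, number of replacements to launch).

    Toy model: an instance is considered unhealthy once it has failed
    `max_failures` consecutive health checks.
    """
    healthy = [i for i in instances if i["consecutive_failures"] < max_failures]
    to_launch = max(0, target - len(healthy))
    return healthy, to_launch


# Hypothetical state of a three-instance application.
instances = [
    {"id": "web-1", "consecutive_failures": 0},
    {"id": "web-2", "consecutive_failures": 3},  # exceeded the threshold
    {"id": "web-3", "consecutive_failures": 1},
]

kept, to_launch = reconcile(instances, target=3)
print([i["id"] for i in kept], to_launch)
```

Here `web-2` is discarded and one replacement is requested, while `web-3` is kept because a single failed check is still below the threshold.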
Marathon applications use the Marathon load balancer (which is based on HAProxy) to expose service ports and load balance containers. The load balancer is deployed as a separate Marathon application and uses the Marathon API to update its configuration dynamically. Marathon applications can define hostnames for their clusters to enable hostname-based routing. Marathon applications also get DNS names via the Mesos DNS server.
The above diagram illustrates the high-level architecture of DC/OS. DC/OS was initially implemented as a commercial offering and was open sourced later, in April 2016. I believe this was a critical business decision taken by Mesosphere in order to compete with Kubernetes. Had it not been open sourced, it might not have gained as much attention and traction as an open source project.
As of DC/OS 1.7, the following limitations were identified:
- No overlay network support. As a result, each container port needs to be exposed via a host port, and internal communication gets proxied.
- No service concept similar to Kubernetes services. Therefore, Marathon applications need to be load balanced via the Marathon load balancer even when session affinity is not required. Kubernetes services do network-level routing using iptables rules, which is much faster than a traditional load balancer.
I worked on designing and implementing several production-grade middleware deployments on DC/OS 1.7, and they were quite stable. DC/OS provides a Vagrant deployment script and many other installation guides for setting it up without much effort. One of the key reasons for choosing DC/OS over Kubernetes in those projects was its support for Big Data and Analytics platforms. At the time of writing, Mesosphere is implementing a layer 4 load balancer called Minuteman for DC/OS, which should overcome the limitations mentioned above in future DC/OS releases.