Containerizing WSO2 Middleware
Deploying WSO2 Middleware on Containers in Production
A heavier car needs more fuel to reach higher speeds than a car of the same spec with less weight. Sports car manufacturers adhere to this concept and use lightweight materials such as aluminum and carbon fiber to improve fuel efficiency. The same theory applies to software systems: the heavier the software components, the more computing power they need. Traditional virtual machines use a dedicated operating system instance to provide an isolated environment for software applications. This operating system instance needs memory, disk, and processing power in addition to the computing power needed by the applications themselves. Linux containers solve this problem by reducing the weight of the isolated unit of execution: hundreds of containers share the host operating system kernel. The following diagram illustrates, in a sample scenario, how many resources containers could save compared to virtual machines:
The Containerization Process
The process of deploying WSO2 middleware on a container is quite straightforward; it is a matter of following a few steps:
- Download a Linux OS container image.
- Create a new container image from the above OS image by adding the Oracle JDK, the product distribution, and the required configurations.
- Export the JAVA_HOME environment variable.
- Make the entry point invoke the wso2server.sh bash script found inside the $CARBON_HOME/bin folder.
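The steps above can be sketched as a minimal Dockerfile. The base image, file names, and product version below are assumptions for illustration; adjust them to your environment:

```dockerfile
# Assumed base image, JDK archive and product distribution names.
FROM ubuntu:16.04

RUN apt-get update && apt-get install -y unzip && rm -rf /var/lib/apt/lists/*

# Add the Oracle JDK and the WSO2 product distribution.
COPY jdk-8u144-linux-x64.tar.gz /opt/
COPY wso2esb-5.0.0.zip /opt/
RUN cd /opt \
 && tar -xzf jdk-8u144-linux-x64.tar.gz \
 && unzip -q wso2esb-5.0.0.zip \
 && rm jdk-8u144-linux-x64.tar.gz wso2esb-5.0.0.zip

# Export JAVA_HOME and point CARBON_HOME at the product distribution.
ENV JAVA_HOME=/opt/jdk1.8.0_144
ENV CARBON_HOME=/opt/wso2esb-5.0.0

# Make the entry point invoke the wso2server.sh startup script.
ENTRYPOINT ["/opt/wso2esb-5.0.0/bin/wso2server.sh"]
```

Any product- or environment-specific configuration files could be copied in with additional COPY instructions before the ENTRYPOINT.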
That’s it! Start a container from the above container image on a container host with a set of host port mappings, and you will have a WSO2 server running on a container in no time. Use the container host IP address and the host ports for accessing the services. If high availability is needed, create a few more container instances following the same approach on multiple container hosts and front them with a load balancer. If clustering is required, either use the Well Known Address (WKA) membership scheme, with a few limitations, or implement your own membership scheme for automatically discovering the WSO2 server cluster. The main problem with the WKA membership scheme is that if all of the well-known member containers get terminated, the entire WSO2 server cluster may fail. Even though this may look like a major drawback for high-intensity, medium- and large-scale deployments, it works well for low-intensity, small-scale deployments that are not mission critical. If containers go down for some reason, either a human or an automated process can bring them back to the proper state. This has been the practice for many years with VMs.
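As a sketch, assuming the image above was built and tagged as the hypothetical name wso2/wso2esb:5.0.0, starting containers with host port mappings could look like this:

```shell
# Start a WSO2 ESB container, mapping the default Carbon HTTPS and HTTP
# ports (9443, 9763) to ports on the container host. Image name assumed.
docker run -d --name wso2esb-1 -p 9443:9443 -p 9763:9763 wso2/wso2esb:5.0.0

# For HA, start more instances (typically on other container hosts, so
# different host ports are only needed when sharing a host) and front
# them with a load balancer such as NGINX or HAProxy.
docker run -d --name wso2esb-2 -p 9444:9443 -p 9764:9763 wso2/wso2esb:5.0.0
```

The services would then be reachable via the container host IP address and the mapped host ports, for example https://&lt;host-ip&gt;:9443/carbon for the first instance.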
Nevertheless, if the requirement is the other way around and a high-intensity, completely automated, large-scale deployment is needed, a container cluster manager and a configuration manager (CM) would provide additional advantages. Let’s see how these can improve the overall productivity of the project and the final outcome:
Ideally, software systems that run on containers should have two types of configuration parameters, according to the twelve-factor app methodology:
- Product specific and global configurations
- Environment specific configurations
The product-specific configurations can be burned into the container image itself, and environment-specific values can be passed in via environment variables. That way, a single container image can be used in all environments. Currently, however, Carbon 4 based WSO2 products can only be configured via configuration files and Java system properties; they do not support configuration via environment variables. Nevertheless, if anyone is willing to put some effort into this, an init script can be written to read environment variables at container startup and update the required configuration files. Generally, Carbon 4 based WSO2 middleware has a considerable number of configuration files and parameters inside them, so this might be a tedious task. According to the current design discussions, in Carbon 5 there will be only one configuration file in each product, and environment variables will be supported out of the box.
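Such an init script could be sketched as below. The parameter name (WSO2_HOSTNAME) and the carbon.xml fragment are assumptions for illustration, and for demonstration purposes the script creates a sample configuration file in a temporary directory instead of operating on a real product distribution:

```shell
#!/bin/sh
# Sketch: read an environment variable at container startup and rewrite
# the matching configuration parameter before starting the server.

# For illustration only: create a sample carbon.xml fragment. In a real
# image this file would already exist at
# $CARBON_HOME/repository/conf/carbon.xml.
CARBON_HOME=$(mktemp -d)
mkdir -p "$CARBON_HOME/repository/conf"
CONFIG_FILE="$CARBON_HOME/repository/conf/carbon.xml"
printf '<Server><HostName>localhost</HostName></Server>\n' > "$CONFIG_FILE"

# WSO2_HOSTNAME would normally be passed in via `docker run -e WSO2_HOSTNAME=...`.
WSO2_HOSTNAME=${WSO2_HOSTNAME:-esb.example.com}
sed -i "s|<HostName>.*</HostName>|<HostName>${WSO2_HOSTNAME}</HostName>|" "$CONFIG_FILE"

cat "$CONFIG_FILE"
# A real entry point would now exec $CARBON_HOME/bin/wso2server.sh.
```

The same pattern extends to any other parameter: one sed (or templating) step per environment variable, run once before handing over to the regular startup script.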
What If a CM Is Run in the Container, Similar to VMs?
Yes, technically that would work, and most people would tend to do this given their experience with VMs. However, in my opinion, containers are designed to work slightly differently from VMs. For example, if we compare the time it takes to apply a new configuration or a software update by running a CM inside a container versus starting a new container with the new configuration or update baked in, the latter is extremely fast. It would take around 20 to 30 seconds to configure a WSO2 product using a CM, whereas it would only take a few milliseconds to bring up a new container. The server startup/restart time would be the same in both approaches. Since the container image creation process with layered container images is efficient and fast, this works very well in most scenarios. Therefore the total configuration and software update propagation time would be much shorter with the second approach.
Choosing a Configuration Manager
There are many different configuration management systems available for applying configurations at container build time; to name a few: Ansible, Puppet, Chef, and Salt. WSO2 has been using Puppet for many years now and currently uses it for containers as well. We have simplified the way we use Puppet by incorporating Hiera to separate configuration data from manifests. Most importantly, the container image build process does not use a Puppet master; instead, it runs Puppet in masterless mode (puppet apply). Therefore these modules can be used easily, even without much knowledge of Puppet.
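To illustrate the Hiera approach, a data file for a WSO2 product might look like the sketch below. The keys and values are purely illustrative assumptions; the actual WSO2 Puppet modules define their own parameter names:

```yaml
# Hypothetical Hiera data file (e.g. wso2esb.yaml) that keeps
# environment-specific configuration data out of the Puppet manifests.
wso2::hostname: esb.example.com
wso2::ports:
  offset: 0
  http: 9763
  https: 9443
wso2::clustering:
  enabled: true
  membership_scheme: wka
```

At image build time the module is then applied in masterless mode, for example with `puppet apply -e 'include wso2esb'`, so no Puppet master needs to be running. Changing an environment only means swapping the Hiera data file, not editing manifests.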
Container Image Build Automation
WSO2 has built an automated process for building container images for WSO2 middleware based on the standard Docker image build process and Puppet provisioning. This simplifies managing configuration data with Puppet, executing the build, defining ports and environment variables, applying product-specific runtime configurations, and finally optimizing the container image size. WSO2 ships these artifacts for each product release, so it is much more productive to use them than to create Dockerfiles on your own.
Choosing a Container Cluster Manager
The above figure illustrates how most software products are deployed on a container cluster manager in general. The same applies to WSO2 middleware at a high level. At the time this article is being written, there are only a few container cluster managers available: Kubernetes, DC/OS, Docker Swarm, Nomad, AWS ECS, GKE, and ACE. Out of these, WSO2 has used Kubernetes and DC/OS in production for many deployments.
Strengths of Kubernetes
Kubernetes was born out of Google’s experience of running containers at scale for more than a decade. It covers almost all the key requirements of container cluster management in depth. Therefore it is my first preference as of today:
- Container grouping
- Container orchestration
- Container to container routing
- Load balancing
- Auto healing
- Horizontal autoscaling
- Rolling updates
- Mounting volumes
- Distributing secrets
- Application health checking
- Resource monitoring and log access
- Identity and authorization
Please read this article for more detailed information on Kubernetes.
Reasons to Choose DC/OS
As far as I understand, at the moment DC/OS has fewer features compared to Kubernetes. However, it is a production-grade container cluster manager that has been around for some time now. The major advantage I see with DC/OS is its custom scheduler support for Big Data and Analytics platforms; this feature is not yet available in Kubernetes. Many major Big Data and Analytics platforms, such as Spark, Kafka, and Cassandra, can be deployed on DC/OS with a single CLI command or via the UI.
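On DC/OS, a WSO2 server deployment is described as a Marathon application. The sketch below shows the general shape of such a definition; the application id, image name, resource limits, and ports are assumptions for illustration (the HAPROXY_GROUP label is what exposes the application through the Marathon load balancer):

```json
{
  "id": "/wso2esb",
  "instances": 2,
  "cpus": 1,
  "mem": 2048,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "wso2/wso2esb:5.0.0",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 9443, "hostPort": 0, "protocol": "tcp" }
      ]
    }
  },
  "labels": {
    "HAPROXY_GROUP": "external"
  }
}
```

Marathon keeps the requested number of instances running and restarts failed containers, which provides the auto-healing behavior discussed earlier.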
The Deployment Process
WSO2 has released the artifacts required for completely automating containerized deployments on Kubernetes and DC/OS. These include Puppet modules for configuration management, Dockerfiles for building Docker images, container orchestration artifacts for each platform, and WSO2 Carbon membership schemes for auto-discovering the clusters. The Kubernetes artifacts include replication controllers for container orchestration and services for load balancing. For DC/OS, we have built Marathon applications for orchestration and parameters for the Marathon load balancer. These artifacts are used in many WSO2 production deployments.
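As a rough sketch of the Kubernetes side, a replication controller plus a service for a WSO2 product would take the following shape. The names, labels, image, and ports here are illustrative assumptions, not the exact contents of the WSO2 artifacts:

```yaml
# Hypothetical replication controller: keeps two WSO2 server pods running.
apiVersion: v1
kind: ReplicationController
metadata:
  name: wso2esb
spec:
  replicas: 2
  selector:
    app: wso2esb
  template:
    metadata:
      labels:
        app: wso2esb
    spec:
      containers:
      - name: wso2esb
        image: wso2/wso2esb:5.0.0
        ports:
        - containerPort: 9443
---
# Service: load-balances traffic across the pods selected by the label.
apiVersion: v1
kind: Service
metadata:
  name: wso2esb
spec:
  type: NodePort
  selector:
    app: wso2esb
  ports:
  - port: 9443
    targetPort: 9443
```

The replication controller restarts failed pods (auto healing), while the service gives the cluster a stable endpoint for load balancing across them.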
Please refer to the following presentations for detailed information on deploying WSO2 middleware on each platform:
Documentation on WSO2 Puppet Modules, Dockerfiles, Kubernetes and Mesos Artifacts can be found at .