How to achieve 100% availability on WSO2 product deployments

Chanaka Fernando
WSO2 Best Practices
Oct 9, 2017

WSO2 products come with several different components, each of which can be configured in its own way. Once the system moves into production, it will inevitably go through various updates and upgrades during its lifetime. There are three main areas of a WSO2 deployment that can change:

  • Database configurations
  • Server configurations
  • Implementation code

Any one or all of the above can change during an update or upgrade of the production system. To keep the system 100% available, we need to make sure that the update and upgrade processes do not impact the availability of the production system. Several scenarios can challenge the availability of the system. In each of them, users can follow the guidelines below so that the system keeps running without interruption.

During outage of server(s)

  • We need redundancy (HA) in the system, ideally in active/active mode. In a 2-node setup, if one node goes down, the remaining node must be able to hold the traffic of both nodes for some time. Users may experience some slowness, but the system will remain available. During capacity planning, we must make sure that a single active node can handle at least 75% of the overall load.
  • If we use active/passive mode in a 2-node setup, each node should be capable of handling the full load on its own, and the passive node should be in hot-standby mode. This means the passive node must keep running even though it does not receive traffic.
  • If an entire data center goes down, we should have a Disaster Recovery (DR) site in a separate data center with the same setup. This can be in cold-standby mode, since outages of this type are very rare. With cold standby, however, there will be a window of service unavailability while the DR site is brought up.
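
The 75% guideline above can be checked during capacity planning with a few lines of arithmetic. The following is a minimal sketch only: the function names, the throughput unit (requests per second) and the example numbers are assumptions for illustration, not WSO2 tooling.

```python
def surviving_capacity_ratio(node_capacities, peak_load, failed_nodes=1):
    """Fraction of the peak load the cluster can serve after losing nodes.

    node_capacities: per-node throughput (e.g. requests/second), assumed measured.
    peak_load: expected peak throughput for the whole system, same unit.
    """
    if peak_load <= 0:
        raise ValueError("peak_load must be positive")
    # Worst case: assume the highest-capacity nodes are the ones that fail.
    keep = max(len(node_capacities) - failed_nodes, 0)
    survivors = sorted(node_capacities)[:keep]
    return sum(survivors) / peak_load


def meets_ha_target(node_capacities, peak_load, min_ratio=0.75):
    """True if one node failure still leaves at least min_ratio of peak load served."""
    return surviving_capacity_ratio(node_capacities, peak_load) >= min_ratio
```

For example, two nodes sized at 800 requests/second each against a peak of 1000 give a ratio of 0.8 after one failure, which satisfies the 75% target; two nodes sized at 500 each do not.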

Adding a new service (API)

  • Database sharing needs to be set up properly through the master-datasources.xml file and through registry sharing
  • File system sharing needs to be set up so that an artifact is deployed only once and the other nodes receive it through file sharing
  • Service deployments need to be done from one node (the manager node), and the other nodes need to be configured in read-only mode (to avoid conflicts)
  • Use the passive node as the manager node (if you have active/passive mode)
  • Once the service is deployed on all the nodes, test it and then expose the service (API) to the consumers
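
Before exposing a new service, it helps to confirm that every node really received the artifacts deployed on the manager node. The sketch below compares checksums of the manager's artifacts with those on each worker. It is an illustration under assumptions: artifacts are modelled as in-memory name-to-bytes dictionaries, and the names `digest` and `out_of_sync` are hypothetical. A real check would read the files from each node's deployment directory.

```python
import hashlib


def digest(content: bytes) -> str:
    """SHA-256 hex digest of an artifact's content."""
    return hashlib.sha256(content).hexdigest()


def out_of_sync(manager: dict, workers: dict) -> dict:
    """Return {node: [artifact names that are missing or differ from the manager]}.

    manager: {artifact_name: content_bytes} as deployed on the manager node.
    workers: {node_name: {artifact_name: content_bytes}} for the read-only nodes.
    """
    expected = {name: digest(data) for name, data in manager.items()}
    report = {}
    for node, artifacts in workers.items():
        stale = sorted(
            name
            for name, want in expected.items()
            if name not in artifacts or digest(artifacts[name]) != want
        )
        if stale:
            report[node] = stale
    return report
```

An empty result means all nodes match the manager, so testing and exposing the API can proceed.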

Updating an existing service (fixing a bug)

  • Bring one additional passive node into the system with the existing version of the services. This covers the case where the active node goes down while the service is being updated on the first passive node (the system will be 1 active / 2 passive)
  • Disable file sharing (rsync) on the first passive node.
  • Deploy the patched version to this passive node and test it there
  • Once testing has passed, allow traffic into the passive node and stop traffic to the active node.
  • Enable file sharing and allow the previously active node to sync up with the patched version. If you don’t have file sharing, you need to deploy the service manually.
  • Carry out testing on that node and, once it has passed, allow traffic into it (if required)
  • Remove the secondary passive node from the system (the system will be 1 active / 1 passive)

Applying a patch to the server (needs a restart)

  • Bring one additional passive node into the system with the existing version of the services. This covers the case where the active node goes down while the patch is being applied on the first passive node (the system will be 1 active / 2 passive)
  • Apply the patch on the first passive node and carry out testing
  • Once testing is done, enable traffic into this node and remove traffic from the active node
  • Apply the patch on the previously active node and carry out testing
  • Once testing is done, enable traffic into this node and remove traffic from the other node (or you can keep the other node as the active one)
  • Remove the secondary passive node from the system (the system will be 1 active / 1 passive)
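
The patching steps above rely on one invariant: at every step, at least one running node is receiving traffic. The sketch below is a small in-memory simulation of that sequence, not a deployment script. The `Node` class, the `rolling_patch` function and the step labels are hypothetical names, and the load balancer is reduced to a `serving` flag.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    version: str
    serving: bool = False  # True if the load balancer routes traffic to this node
    running: bool = True


def rolling_patch(active, passive, spare, new_version):
    """Walk through the patch steps and fail loudly if any step causes an outage."""
    cluster = [active, passive, spare]
    log = []

    def checkpoint(label):
        if not any(n.serving and n.running for n in cluster):
            raise RuntimeError(f"outage after step: {label}")
        log.append(label)

    # 1. Bring in the extra passive node on the existing version.
    spare.version, spare.running = active.version, True
    checkpoint("spare added")
    # 2. Patch and restart the first passive node (it takes no traffic yet).
    passive.running = False
    passive.version = new_version
    passive.running = True
    checkpoint("passive patched")
    # 3. After testing, move traffic from the active node to the patched node.
    passive.serving, active.serving = True, False
    checkpoint("traffic switched")
    # 4. Patch and restart the previously active node.
    active.running = False
    active.version = new_version
    active.running = True
    checkpoint("old active patched")
    # 5. Remove the extra passive node again.
    spare.running = False
    checkpoint("spare removed")
    return log
```

Testing between steps is left out of the sketch; in practice each checkpoint would also be gated on the test results for the node that was just patched.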

Doing a version upgrade to the server

  • Bring one additional passive node into the system with the existing version of the services. This covers the case where the active node goes down while the first passive node is being upgraded (the system will be 1 active / 2 passive)
  • Execute the migration scripts provided in the WSO2 documentation to move the databases to the new version on the passive node
  • Deploy the artifacts on the new version on the passive node
  • Test this passive node and, once testing has passed, expose traffic to it
  • Follow the same steps on the previously active node
  • Once testing is done, direct traffic to this node (if required)

Instead of maintaining the production system through manual processes, you can use the artifacts WSO2 provides to automate the deployment and scaling of the production system with Docker and Kubernetes.

Deployment automation

  • WSO2 provides Puppet scripts that can automate server deployments in a matter of a few seconds
  • WSO2 provides artifacts for containerized deployments
  • Kubernetes can be used to autoscale the system and to automate the deployment of new instances

https://github.com/wso2/puppet-apim
https://github.com/wso2/docker-apim
https://github.com/wso2/kubernetes-apim/tree/2.1.0
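
To give a feel for how Kubernetes autoscaling decides on the number of instances, the sketch below reproduces the core formula used by the Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). It is a simplified illustration. The real autoscaler also applies a tolerance band and stabilization windows, and the default minimum and maximum used here are assumptions.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=2, max_replicas=10):
    """Core HPA scaling formula, clamped to the configured replica bounds.

    current_metric / target_metric: e.g. average CPU utilization in percent.
    min_replicas=2 is an assumed floor that keeps redundancy in the system.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

With 2 replicas at 90% average CPU and a 60% target, the formula asks for 3 replicas. With 4 replicas at 20%, the raw result is 2, which is also the assumed minimum.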

Chanaka Fernando
Writes about Microservices, APIs, and Integration. Author of “Designing Microservices Platforms with NATS” and "Solution Architecture Patterns for Enterprise"