EXPEDIA GROUP TECHNOLOGY — SOFTWARE

Kafka Blue-Green Deployment

How to deploy a new version of a Kafka cluster in a production environment using a blue-green deployment

David Virgil
Expedia Group Technology

--

Image source: https://kafka.apache.org

One of the most challenging and error-prone tasks I have faced as a developer is performing software or infrastructure upgrades on a production Apache Kafka cluster. This post explores two different approaches that we have used while working on the Expedia Partner Solutions™ (part of Expedia Group™) data lake.

Batch deployment

The first approach I took was to make partial updates of the cluster and then reassign partitions. It updates the production cluster in batches:

  1. Moving all the traffic from one batch of nodes to the remaining, untouched nodes. To do this you use Kafka's partition reassignment command (see the sketch after this list).
  2. Updating the batch while the rest of the nodes stay as they are and all the traffic is redirected to them.
  3. Repeating the process for the remaining batches, reassigning partitions and installing the new software/hardware.
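
For reference, step 1 uses Kafka's partition reassignment tool. A minimal sketch, assuming an older ZooKeeper-based cluster; the hostnames, file names and broker ids below are placeholders:

# Generate a reassignment plan that moves the listed topics onto the surviving brokers
kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --topics-to-move-json-file topics.json \
  --broker-list "1,2,3,4,5" --generate

# Execute the generated plan (and check it later with --verify)
kafka-reassign-partitions.sh --zookeeper zk1:2181 \
  --reassignment-json-file reassignment.json --execute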

As you can see, this process is quite manual and risky. One consequence of this approach is that the Kafka cluster carries extra load and goes through several topic rebalances.

For example, imagine a production Kafka cluster of 100 brokers. You might have to update in batches of 10 nodes to avoid affecting the incoming Kafka traffic. That means 10 partition reassignments and 10 batch updates.

If you’re interested in learning more about Kafka batch deployment and topic rebalancing, I have covered it in detail in another blog post.

Blue-green Kafka deployment

The blue-green deployment technique, on the other hand, minimizes the risks of the batching approach described above. It consists of creating a group of new instances and, once they are ready, switching the producer traffic over to the new cluster.

The following diagram shows the architecture to implement.

Kafka blue-green architecture in an AWS environment.

The main idea is to run two Kafka clusters in parallel and switch the traffic by updating the Route53 DNS record to point to the new cluster. The old cluster is not removed immediately: it is necessary to wait until all the producer traffic has been switched and all the consumers have processed every topic, i.e. every topic in the old Kafka cluster has zero lag.
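
As an illustration, the switch itself boils down to a single Route53 record change. A hedged sketch using the AWS CLI, where the hosted zone ID, record name and load balancer DNS name are placeholders:

aws route53 change-resource-record-sets \
  --hosted-zone-id Z1EXAMPLE \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "kafka.datalake.example.com",
        "Type": "CNAME",
        "TTL": 60,
        "ResourceRecords": [{"Value": "green-kafka-nlb-1234.elb.us-east-1.amazonaws.com"}]
      }
    }]
  }'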

Why we did not use AWS CodeDeploy

AWS CodeDeploy has an option to perform blue-green deployments. It lets you deploy a new version of your code/application without affecting users: it deploys the code to new instances and, once they are all ready, deregisters the old instances from the load balancer and registers the new ones.

Creation of a CodeDeploy deployment in AWS.

Once the old instances have been deregistered from the load balancer, you can choose to terminate them or keep them.

CodeDeploy works well when a new version of the application is being deployed; the blue-green deployment does exactly what is expected. For infrastructure changes, however, CodeDeploy's blue-green deployment does not help much. When AWS CloudFormation applies a launch configuration update to an existing Auto Scaling group, the existing EC2 instances are terminated and new ones are created. That is obviously not acceptable for a stateful service like Kafka, so we do not use CodeDeploy for Kafka deployments.

It’s worth mentioning, though, that CodeDeploy has many interesting features, including all-at-once and one-by-one deployments, rollbacks and more. A full discussion is beyond the scope of this article, so I will write about it in a future post.

Blue-green implementation

Blue-green deployment requires us to deploy a new Kafka cluster in production. To be able to deploy two different Kafka clusters in production from the same CloudFormation codebase, we modified our CloudFormation code to include a suffix in the names of all of our resources.
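
As a minimal sketch of what this looks like (the parameter and resource names here are hypothetical, not our actual templates), the suffix is passed in as a CloudFormation parameter and interpolated into every resource name:

Parameters:
  ClusterSuffix:
    Type: String
    AllowedValues: [blue, green]

Resources:
  KafkaBrokerAutoScalingGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      AutoScalingGroupName: !Sub "kafka-brokers-${ClusterSuffix}"
      # ... remaining broker properties (launch configuration, subnets, scaling limits)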

Kafka has a direct dependency on ZooKeeper, which stores information about partitions, brokers and partition leaders. Obviously the ZooKeeper connection URL has to be different for the two Kafka clusters. The only thing that needs to change is the zookeeper.connect property inside server.properties:

zookeeper.connect=${Comma separated list of hostnames}/${zook_path}

Different Kafka clusters can share the same ZooKeeper ensemble. The only thing that has to change is the zook_path (the chroot path) at the end of the ZooKeeper connection string.
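
For example (the ZooKeeper hostnames are placeholders), the blue and green clusters can share one ZooKeeper ensemble under different chroot paths:

# server.properties on the blue cluster
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka-blue

# server.properties on the green cluster
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka-green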

The following diagram shows the architecture of a Kafka cluster.

Multi AZ Kafka Cluster Architecture in AWS

As you can see, there is a load balancer that distributes traffic across all the Kafka brokers, and a Route53 DNS record that points to the Network Load Balancer. To switch the traffic from one cluster to the other, just change the Route53 DNS record to point to the new Kafka cluster's load balancer. Check out the following diagram:

It is important to set the Route53 TTL accordingly. The Time To Live property lets clients cache the resolved IP address of the Route53 record for a period of time. If the TTL is set to 10 minutes, a DNS record change won't take effect for a client until its cached entry has expired. The next image shows how to configure the TTL for a record in AWS Route 53:

Route53 CNAME form.

It is important to understand how the TTL works and how the Kafka client library is affected by it. If the TTL is 100 seconds, the operating system caches the resolved IP address and keeps answering with the cached address until the TTL expires.
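
Bear in mind that the JVM keeps its own DNS cache on top of the operating system's. As a hedged sketch (the 60-second value is only an example, not a recommendation from our setup), the JVM cache can be capped via the networkaddress.cache.ttl security property so that clients notice the Route53 change reasonably quickly:

import java.security.Security;

public class DnsCacheConfig {
    public static void main(String[] args) {
        // Cap the JVM-level DNS cache at 60 seconds; by default successful
        // lookups may be cached for much longer (or forever with a security manager).
        Security.setProperty("networkaddress.cache.ttl", "60");
    }
}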

Producer code changes

Normally Kafka producers do not create a new connection every time they send a record to Kafka. They follow a singleton pattern: a single KafkaProducer instance is created when the producer service starts up. With this kind of implementation, even after the DNS endpoint has been switched to the new Kafka cluster, producers are not aware of the change until the service is redeployed or restarted. To solve this problem, we use a scheduled thread that periodically replaces the producer's Kafka client connection:

import java.util.Properties;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class BlueGreenKafkaProducer {

    private static final Logger logger = LoggerFactory.getLogger(BlueGreenKafkaProducer.class);

    public enum BlueGreenKafkaEnum { BLUE, GREEN }

    private final Properties producerProperties;
    private final ScheduledExecutorService scheduledService;

    // String key/value serialization is assumed here to keep the example simple
    private volatile KafkaProducer<String, String> blue;
    private volatile KafkaProducer<String, String> green;
    private volatile BlueGreenKafkaEnum inUse;

    public BlueGreenKafkaProducer(Properties producerProperties, long refreshTime) {
        this.producerProperties = producerProperties;
        switchClient(); // the first call creates the "blue" producer
        scheduledService = Executors.newScheduledThreadPool(1);
        scheduledService.scheduleAtFixedRate(new BlueGreenKafkaSwitcher(this),
                refreshTime, refreshTime, TimeUnit.MILLISECONDS);
    }

    protected synchronized void switchClient() {
        if (inUse == null) {
            logger.info("Initializing Blue Kafka Producer");
            blue = new KafkaProducer<>(producerProperties);
            inUse = BlueGreenKafkaEnum.BLUE;
        } else if (inUse == BlueGreenKafkaEnum.BLUE) {
            logger.info("Switching from Kafka Producer Blue to Green");
            green = new KafkaProducer<>(producerProperties); // resolves the DNS endpoint again
            inUse = BlueGreenKafkaEnum.GREEN;
            blue.close();                                    // close the old client to free its threads
            blue = null;
        } else {
            logger.info("Switching from Kafka Producer Green to Blue");
            blue = new KafkaProducer<>(producerProperties);
            inUse = BlueGreenKafkaEnum.BLUE;
            green.close();
            green = null;
        }
    }

    private KafkaProducer<String, String> getProducer() {
        switch (inUse) {
            case GREEN: return green;
            case BLUE:
            default:    return blue;
        }
    }

    public Future<RecordMetadata> send(ProducerRecord<String, String> record) {
        return getProducer().send(record);
    }

    public void send(Iterable<? extends ProducerRecord<String, String>> records) {
        for (ProducerRecord<String, String> record : records) {
            getProducer().send(record);
        }
    }
}

In the code snippet above:

  • A single-threaded ScheduledExecutorService is created when the wrapper is constructed.
  • The BlueGreenKafkaSwitcher Runnable contains the logic that triggers the Kafka client switch.
  • The runnable is scheduled to run at a fixed interval (refreshTime).
  • switchClient() creates a fresh KafkaProducer (alternating between blue and green), points inUse at it and only then closes the previous client.
  • It is important to close the previous client; otherwise every switch would leak the old producer's connections and threads.

The scheduled task itself is a small Runnable that simply delegates back to the wrapper:

public class BlueGreenKafkaSwitcher implements Runnable {
    private final BlueGreenKafkaProducer bluegreen;

    public BlueGreenKafkaSwitcher(BlueGreenKafkaProducer bluegreen) { this.bluegreen = bluegreen; }

    public void run() {
        bluegreen.switchClient(); // recreate the producer so the DNS name is resolved again
    }
}
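
A sketch of how a producer service might use the wrapper; the bootstrap endpoint, topic, serializers and 30-minute refresh period are illustrative values, not part of the original library:

Properties props = new Properties();
// The bootstrap endpoint is the Route53 CNAME, not an individual broker
props.put("bootstrap.servers", "kafka.datalake.example.com:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

// Recreate the underlying KafkaProducer every 30 minutes
BlueGreenKafkaProducer producer =
        new BlueGreenKafkaProducer(props, TimeUnit.MINUTES.toMillis(30));
producer.send(new ProducerRecord<>("bookings", "key-1", "value-1"));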

Configuration

Every n seconds our library has to create a new connection to Kafka so that any change to the DNS endpoint is picked up. The code above does this, but we need to choose the right scheduling period. I would say that every 30 minutes is fine, so every 30 minutes the switcher is called.

Remember to set the scheduler period to a value greater than the DNS TTL; otherwise the switcher could recreate the client while the old address is still cached and end up connected to the old cluster again.
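
As an illustration (the property names are hypothetical, the values are the ones discussed above), the two settings simply need to respect switcher period > DNS TTL:

# Route53 record TTL, in seconds
route53.record.ttl=60

# Producer client refresh period, in milliseconds (30 minutes)
bluegreen.producer.refresh.ms=1800000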

Deployment steps

Doing a Blue-green Kafka deployment requires a series of steps:

  1. Include the library changes and make sure that all the producers include this library in their classpath.
  2. Deploy a new Kafka cluster that contains some infrastructure/software upgrades.
  3. Deploy the connectors/consumers for the new Kafka cluster.
  4. Modify the original Route53 record to point to the load balancer of the upgraded Kafka cluster.
  5. Wait until all the producers have switched their traffic (depends on the switcher period property).
  6. Wait until all the consumers on the initial Kafka cluster have zero lag, meaning they have consumed all the data (see the example check after this list).
  7. At this point we can delete the old Kafka cluster.
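
A hedged example of the lag check in step 6, using Kafka's consumer groups tool (the bootstrap endpoint and group name are placeholders); it is safe to proceed once the LAG column shows 0 for every partition:

kafka-consumer-groups.sh --bootstrap-server old-kafka.example.com:9092 \
  --describe --group datalake-consumer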

Conclusion

Kafka blue-green deployments have a lot of moving parts, but they are still less complicated than Kafka batch deployments, since we don't have to worry about topic rebalancing or service availability. Because blue-green is easier to automate, our deployments are more reliable and there is less chance of human error. Following the approach above ensures zero downtime and zero data loss.

Originally published at https://dvirgiln.github.io.
