Apache Pulsar: Geo-replication — Synchronous Replication : Hybrid Deployment Model

Karthikeyan Palanivelu
May 2, 2019

This is the second part of the Apache Pulsar: Geo-replication series. It explains how to stand up a hybrid deployment model to achieve synchronous replication.

If you landed here directly, please first read about the features of Apache Pulsar and the Geo-replication article, which covers the out-of-the-box replication models in detail.

As a primer: Apache Pulsar supports geo-replication out of the box, with the following n-mesh replication patterns:

· Asynchronous Replication

· Synchronous Replication

Asynchronous Replication

As the name suggests, asynchronous replication copies data published to a topic across the configured clusters in different regions asynchronously. It is configured at the namespace level, provided the tenant has access to the clusters involved. Producers are acknowledged as soon as the message is persisted in the local cluster; the broker then replicates the data to the configured remote clusters. Replication preserves message order but not cursor position: the cursor position is maintained only within the local cluster.
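
To make the acknowledgement and ordering behaviour concrete, here is a toy Python model. It is only a sketch: it does not use the Pulsar client API, and the class, method and cluster names are all hypothetical. It shows a producer being acknowledged after the local write, replication to the remote cluster happening later and in publish order, and each cluster keeping its own cursor.

```python
from collections import deque


class ToyCluster:
    """Hypothetical stand-in for one Pulsar cluster holding a single topic."""

    def __init__(self, name):
        self.name = name
        self.log = []          # messages persisted in this cluster
        self.cursor = 0        # consumer position, local to this cluster
        self.outbox = deque()  # messages waiting to be replicated

    def publish(self, msg):
        # Persist locally, queue for replication, and ack right away.
        self.log.append(msg)
        self.outbox.append(msg)
        return "ack"

    def replicate_to(self, remote):
        # Runs later, in the background; FIFO order keeps publish order.
        while self.outbox:
            remote.log.append(self.outbox.popleft())

    def consume(self):
        msg = self.log[self.cursor]
        self.cursor += 1
        return msg


east, west = ToyCluster("us-east"), ToyCluster("us-west")
acks = [east.publish(m) for m in ("m1", "m2", "m3")]
pending_before = list(west.log)  # nothing has reached us-west yet
east.consume()
east.consume()                   # cursor advances only in us-east
east.replicate_to(west)          # us-west now holds m1, m2, m3 in order
```

In this model a consumer that fails over to us-west starts from that cluster's own cursor, which is why the pattern suits applications that can tolerate duplicate messages.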

Asynchronous Replication (Active-Active)

This deployment pattern is recommended for:

1) Applications with idempotent data.

2) Applications whose backend data is not replicated to maintain a replica in a different region.

3) Applications that can accept duplicate messages until recovery completes.

Synchronous Replication

Synchronous replication is provided by Apache BookKeeper. It gives the enterprise the flexibility of running a single cluster. The main configuration step is to enable the rack-aware placement policy on the broker (bookkeeperClientRackawarePolicyEnabled in the broker configuration), so that data is persisted on bookies in different datacenters across geographical locations. Pulsar supports this out of the box.

To configure synchronous replication, one option is to provision ZooKeeper and Pulsar on bare metal, following the instructions at http://pulsar.apache.org/docs/en/deploy-bare-metal-multi-cluster/.

The shortcomings of this approach are:

  • It gives up containerization and the flexibility of using containers.
  • Clusters have to be maintained according to each organization's needs.
  • Cloud governance and agility suffer (a rigid model).
  • Resource utilization is poor.
  • It is not cost-effective.

Synchronous Replication — Hybrid Model

To overcome these maintenance and resource-sharing problems, a new deployment model is proposed.

Hybrid model: provisioning happens on both bare metal and a Kubernetes cluster (EKS, GKE or a standalone Kubernetes cluster).

Assumptions:

· ZooKeeper can be installed using the Apache Pulsar binary on bare metal.

· Container images for the broker and bookie are available.

This model proposes to provision the components as described below:

· ZooKeeper on bare metal/EC2 (AWS)

· Bookies and brokers on Kubernetes

Provision ZooKeeper in three regions, for example us-east, us-west and us-central, with a total of 3 ZooKeeper nodes forming a single cluster, or ZooKeeper quorum.
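
As a small worked example of why three nodes are enough to survive the loss of one region, the sketch below computes the majority a ZooKeeper ensemble needs and builds a connection string. The host names are hypothetical placeholders; 2181 is ZooKeeper's default client port.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of ZooKeeper servers that forms a majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can be lost while a majority still remains."""
    return ensemble_size - majority(ensemble_size)


# Hypothetical host names, one ZooKeeper node per region.
zk_nodes = {
    "us-east": "zk-east.example.com",
    "us-west": "zk-west.example.com",
    "us-central": "zk-central.example.com",
}

# Comma-separated host:port list, the usual ZooKeeper connection string form.
zk_connect = ",".join(f"{host}:2181" for host in zk_nodes.values())

print(majority(len(zk_nodes)), tolerated_failures(len(zk_nodes)))
print(zk_connect)
```

With one node per region, a 3-node ensemble keeps its majority of 2 when any single region becomes unreachable.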

Synchronous Replication — Hybrid

The following configuration is required to achieve synchronous replication:

  • Initialize the Pulsar cluster metadata from one of the nodes using the initialize-cluster-metadata command, with a cluster name such as “hybrid”.
  • Provision a Kubernetes cluster in each of the regions (us-east, us-west and us-central). Deploy bookies and brokers to each Kubernetes cluster, for example 3 replicas of each.
  • Configure the ZooKeeper nodes as the ZooKeeper servers in the bookie configuration.
  • Configure the ZooKeeper nodes as both the ZooKeeper servers and the configuration store servers in the broker configuration.
  • Enable the rack-aware placement policy.
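
The settings in this list can be sketched as configuration overrides. The Python snippet below only renders key=value lines; the ZooKeeper host names are hypothetical, and the property names are the ones used in Pulsar 2.x-era bookkeeper.conf and broker.conf files, so treat them as an assumption and check them against the version you deploy.

```python
# Hypothetical ZooKeeper endpoints, one per region (names are assumptions).
ZK_CONNECT = (
    "zk-east.example.com:2181,"
    "zk-west.example.com:2181,"
    "zk-central.example.com:2181"
)


def render_properties(settings: dict) -> str:
    """Render a dict as key=value lines, as used in .conf files."""
    return "\n".join(f"{key}={value}" for key, value in settings.items())


# Bookies point at the shared ZooKeeper quorum.
bookie_conf = {"zkServers": ZK_CONNECT}

# Brokers use the same quorum for local metadata and the configuration store,
# and enable rack-aware bookie placement.
broker_conf = {
    "clusterName": "hybrid",
    "zookeeperServers": ZK_CONNECT,
    "configurationStoreServers": ZK_CONNECT,
    "bookkeeperClientRackawarePolicyEnabled": "true",
}

print(render_properties(bookie_conf))
print(render_properties(broker_conf))
```

The same values would be applied to the broker and bookie deployments in every regional Kubernetes cluster. For rack-aware placement to have any effect, each bookie also needs a rack or region label assigned, which this sketch does not cover.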

With the above configuration, and with the write quorum set to ack quorum + 1, BookKeeper replicates the data across the different geographical locations/regions.
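
The effect of the quorum settings can be illustrated with a toy model. This is not BookKeeper's actual placement algorithm: the bookie names are hypothetical, the round-robin selection is a simplification, and the ensemble size is assumed equal to the write quorum. The sketch spreads an ensemble across regions and checks whether an acknowledged entry still has a copy after any single region is lost.

```python
from itertools import combinations

# Hypothetical bookie inventory: 3 bookies in each of 3 regions.
BOOKIES = {
    "us-east": ["bk-east-0", "bk-east-1", "bk-east-2"],
    "us-west": ["bk-west-0", "bk-west-1", "bk-west-2"],
    "us-central": ["bk-central-0", "bk-central-1", "bk-central-2"],
}


def pick_ensemble(bookies_by_region, ensemble_size):
    """Round-robin over regions so the ensemble spans as many as possible."""
    regions = list(bookies_by_region)
    ensemble = []
    for i in range(ensemble_size):
        region = regions[i % len(regions)]
        bookie = bookies_by_region[region][i // len(regions)]
        ensemble.append((region, bookie))
    return ensemble


def survives_any_region_loss(ensemble, ack_quorum):
    """True if, whichever bookies acked, a copy remains after one region fails."""
    regions = {region for region, _ in ensemble}
    for ackers in combinations(ensemble, ack_quorum):
        acked_regions = {region for region, _ in ackers}
        for failed in regions:
            if not acked_regions - {failed}:
                return False
    return True


ack_quorum = 2
write_quorum = ack_quorum + 1  # the rule from the text
ensemble = pick_ensemble(BOOKIES, write_quorum)

print(ensemble)
print(survives_any_region_loss(ensemble, ack_quorum))
```

In this model an ack quorum of 2 means every acknowledged entry is on bookies in at least two regions, whereas an ack quorum of 1 would allow the only confirmed copy to sit in the region that fails.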

Advantages of using this model:

  • Resource sharing reduces cost compared to bare metal.
  • Behaves as a single logical cluster spanning as many physical clusters, in as many regions, as needed.
  • Any number of clusters can be started in any number of regions.
  • Zero data loss.
  • Resilient model (topics): if a cluster goes down, topics are transferred seamlessly to another cluster.
  • Resilient model (messages): if a cluster goes down, messages remain available because they are replicated to bookies in another cluster according to the write quorum.
  • Consumers can consume messages from any region/cluster.
  • No duplicate messages are produced during failover, unlike asynchronous replication.
  • Preserves the cursor position, unlike the other models, because this behaves like a single logical cluster.

The main disadvantages of this model are:

  • Two different provisioning architectures, EC2 and Kubernetes, have to be maintained. If EKS or GKE is used, the hybrid model becomes an easy way to achieve synchronous replication.
  • Latency from ZooKeeper hops, depending on where the leader is, during topic creation/deletion. This does not affect the message lifecycle.

Conclusion

Apache Pulsar comes with many features that should make it a strong player among streaming platforms for years to come. One strong feature is out-of-the-box replication for different needs. The hybrid approach aims to maximize resource utilization and reduce cost when running on public cloud. Please feel free to provide feedback to enhance this article.

Thanks to the Pulsar team for supporting us in exploring its features.

Happy Pulsaring!
