How to configure RGW multisite in Ceph

Avi Mor
5 min read · Mar 7, 2020

This article covers deploying multisite on ceph-nano. Some of the commands are dangerous, so make sure you understand what you are doing before you apply this to a production environment.

What is Multisite in Ceph Storage?

The multisite configuration gives us the ability to replicate object data between multiple Ceph clusters. Data written to the object store of the master cluster is replicated to the object store of the secondary cluster.

Let’s talk a bit about CRR (Cross-Region Replication), which is an AWS S3 feature. CRR is a self-service feature that AWS offers for replicating buckets. With CRR, by default, every S3 object that is uploaded to a bucket is replicated to a bucket in another region.

Basically, with CRR we can choose which bucket to replicate, while with Multisite we replicate, by default, the entire object store of the cluster to another site. However, Ceph also has an option to replicate only specific buckets (see below). The multisite configuration is done by the Ceph administrator, and for now it’s not self-service like in AWS.

Multisite is an interesting option; we can use it in two configurations:

  • Active/Active replication;
  • Failover and failback.

Things to understand before configuring Multisite:

It is important to note that by default the RGW does not replicate anything; it runs with the default realm, zonegroup and zone. To configure multisite, we first need to understand a few concepts.

What is a Realm, a Zonegroup and a Zone?

Realm: A realm represents the global namespace for all objects and buckets in the multisite replication space. It holds zonegroups, and each zonegroup holds zones. In practice, we take 2 clusters and place them in one shared namespace, which lets us work with both of them.

Zonegroup: The master zonegroup holds a few zones, for example a master zone and a secondary zone. The data replicates from the master zone to the secondary zone. Importantly, every zonegroup has one master zone.

Zone: Each zone is backed by at least one radosgw instance; there can be more.

Different Architectures:

Single zone: Contains one zonegroup and one zone in a realm.

Multizone: Contains one zonegroup, which can contain a few zones. One of them is the master zone, and from it we can replicate the data to the other zones.

Multi zone group: Contains multiple zonegroups. In every zonegroup there is one zone or more.

Multiregion: Contains multiple namespaces, each containing zonegroups and zones.

What are Periods and Epochs?

Every realm (namespace) has something called a period. Each period has an epoch, which represents the version of that period.

When does the epoch change? Every change that we commit to the cluster configuration generates a new epoch.
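To check the epoch on a running cluster, `radosgw-admin period get` prints the current period as JSON. The helper below is a minimal sketch that pulls the `epoch` value out of that JSON. The function name `period_epoch` is made up for this example, and it assumes the output contains an `"epoch": <number>` field shaped like the sample in the comment.

```shell
# period_epoch: read period JSON on stdin and print the first "epoch" value.
# Assumes output shaped like: { "id": "...", "epoch": 1, ... }
# A key such as "realm_epoch" is not matched, because the pattern
# requires the opening quote directly before the word epoch.
period_epoch() {
    sed -n 's/.*"epoch": *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage on a cluster node (not run here):
#   radosgw-admin period get | period_epoch
```

Running it before and after a `radosgw-admin period update --commit` is a quick way to see whether the epoch moved.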

Demo:

In our Demo, we will use the ceph-nano project for configuring the multisite. You can find more information here: https://github.com/ceph/cn.

The first step is to create 2 Ceph clusters: master and slave. Once the containers are created, we can enter a container with:

docker exec -it "Container ID" bash

Log in to the master cluster with the command above.

The first step is to configure a realm. By default, when the RGW service starts, it looks for the default realm, which is called “Default”. We will create a new realm and make it the default with a simple command:

radosgw-admin realm create --default --rgw-realm=gold

By using --default we set this realm as the default.

Create the master zonegroup. Before that, you can remove the default one with the command:

radosgw-admin zonegroup delete --rgw-zonegroup=default

Then just create:

radosgw-admin zonegroup create --rgw-zonegroup=us --master --default --endpoints=http://"master_ip":8000

Please note that we are using “--master” to set this zonegroup as master.

Create a master zone:

radosgw-admin zone create --rgw-zone=us-east --master --rgw-zonegroup=us --endpoints=http://"master_ip":8000 --access-key=1234567 --secret=098765 --default

Please note that our zone is called us-east. This zone is going to be the master because of the “--master” flag. We also set the endpoint that will serve the requests for this master zone. Finally, we pass the access key and secret of the replication user, which we create in the next step.

Let’s create the system user:

radosgw-admin user create --uid=repuser --display-name="Replication_user" --access-key=1234567 --secret=098765 --system

Remember to use the same access key and secret key that you set when creating the master zone.

Add the configuration into the /etc/ceph/ceph.conf file under the rgw section:

rgw zone = us-east

The final step is to commit the change with:

radosgw-admin period update --commit

Please note that we should get epoch number 1. Also, please restart the container so that the changes are applied.
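The master-side commands can be gathered into one function so they are easy to preview before touching a cluster. This is only a sketch: the names `run`, `setup_master`, `DRY_RUN` and `MASTER_IP` are made up for this example, the keys are the demo values from this article, and it assumes the `--endpoints` flag with port 8000 for both the zonegroup and the zone. The ceph.conf edit and the container restart are still done by hand.

```shell
# Master-side radosgw-admin steps from this article, in one function.
# MASTER_IP is a placeholder; the keys are the demo values used above.
MASTER_IP="${MASTER_IP:-master_ip}"
ACCESS_KEY=1234567
SECRET_KEY=098765

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

setup_master() {
    run radosgw-admin realm create --default --rgw-realm=gold || return 1
    # Optional step: remove the default zonegroup (a failure here is ignored).
    run radosgw-admin zonegroup delete --rgw-zonegroup=default
    run radosgw-admin zonegroup create --rgw-zonegroup=us --master --default \
        --endpoints="http://${MASTER_IP}:8000" || return 1
    run radosgw-admin zone create --rgw-zone=us-east --master --rgw-zonegroup=us \
        --endpoints="http://${MASTER_IP}:8000" \
        --access-key="$ACCESS_KEY" --secret="$SECRET_KEY" --default || return 1
    run radosgw-admin user create --uid=repuser --display-name="Replication_user" \
        --access-key="$ACCESS_KEY" --secret="$SECRET_KEY" --system || return 1
    run radosgw-admin period update --commit
}
```

With `DRY_RUN=1` set, calling `setup_master` only prints the six commands, which is a safe way to review them first.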

Configure Secondary Zone:

Please log in to the secondary container, which is the “Slave” one:

docker exec -it "Container ID" bash

Pull the realm configuration:

radosgw-admin realm pull --rgw-realm=gold --url=http://"master_ip:8000" --access-key=1234567 --secret=098765 --default

Please note, we set the Ceph Master IP, and also the credentials for the replication user that we created before.

Next step is to pull the period:

radosgw-admin period pull --url=http://"master_ip:8000" --access-key=1234567 --secret=098765

At this point, we are now in the same realm and the same namespace (Realm + Period).

Create a secondary zone:

radosgw-admin zone create --rgw-zone=us-west --rgw-zonegroup=us --endpoints=http://"secondary_ip:8001" --access-key=1234567 --secret=098765 --default

Optional: We can set the “--read-only” flag. It means that clients can upload objects only to the master zone, and the objects are then replicated to the secondary zone. Remember: metadata changes are also propagated from the master zone.

Add the configuration into the /etc/ceph/ceph.conf file under the rgw section:

rgw zone = us-west

Let’s commit the update with:

radosgw-admin period update --commit

Also, please restart the container so that the changes are applied.
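The secondary-side commands can be sketched the same way, this time with a switch for the optional read-only flag. The names `run`, `setup_secondary`, `DRY_RUN`, `READ_ONLY`, `MASTER_IP` and `SECONDARY_IP` are made up for this example, and the keys and ports are the demo values from this article. As before, the ceph.conf edit and the container restart are left as manual steps.

```shell
# Secondary-side radosgw-admin steps from this article, in one function.
# The IPs are placeholders; the keys are the demo values used above.
MASTER_IP="${MASTER_IP:-master_ip}"
SECONDARY_IP="${SECONDARY_IP:-secondary_ip}"
ACCESS_KEY=1234567
SECRET_KEY=098765

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

setup_secondary() {
    # Add --read-only to the zone only when READ_ONLY=1 is set.
    ro_flag=""
    if [ "${READ_ONLY:-0}" = "1" ]; then
        ro_flag="--read-only"
    fi
    run radosgw-admin realm pull --rgw-realm=gold --url="http://${MASTER_IP}:8000" \
        --access-key="$ACCESS_KEY" --secret="$SECRET_KEY" --default || return 1
    run radosgw-admin period pull --url="http://${MASTER_IP}:8000" \
        --access-key="$ACCESS_KEY" --secret="$SECRET_KEY" || return 1
    # $ro_flag is intentionally unquoted so an empty value adds no argument.
    run radosgw-admin zone create --rgw-zone=us-west --rgw-zonegroup=us \
        --endpoints="http://${SECONDARY_IP}:8001" \
        --access-key="$ACCESS_KEY" --secret="$SECRET_KEY" --default $ro_flag || return 1
    run radosgw-admin period update --commit
}
```

Setting `DRY_RUN=1` and `READ_ONLY=1` before calling `setup_secondary` shows exactly where the flag ends up without changing anything.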

Useful commands:

Check sync status:

radosgw-admin sync status
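When scripting around this command, it can help to turn its text output into a yes/no answer. The sketch below assumes that, on the secondary zone, a fully synced cluster reports the phrases "metadata is caught up with master" and "data is caught up with source". The function name `sync_caught_up` is made up for this example.

```shell
# sync_caught_up: read `radosgw-admin sync status` output on stdin and
# succeed only if both metadata and data sync report being caught up.
# The two phrases are assumptions about the command's output text.
sync_caught_up() {
    text=$(cat)
    printf '%s\n' "$text" | grep -q 'metadata is caught up with master' &&
        printf '%s\n' "$text" | grep -q 'data is caught up with source'
}

# Usage on the secondary zone (not run here):
#   radosgw-admin sync status | sync_caught_up && echo "zones are in sync"
```

On the master zone there is no metadata sync to report, so this check is only meaningful on the secondary.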

List all realms:

radosgw-admin realm list

List the periods:

radosgw-admin period list

List the zone groups:

radosgw-admin zonegroup list

List the zones:

radosgw-admin zone list

How to enable/disable replication of a specific bucket to another zone:

radosgw-admin bucket sync [enable/disable] --bucket=<bucket>

How to test?

First create a new user with the command:

radosgw-admin user create --uid "avi" --display-name "avi mor"

We use awscli to interact with the S3 protocol. Download awscli and configure it:

aws configure

Insert the access key and the secret key of the user that you created before.

Create a bucket through the master endpoint. Afterwards, switch to the secondary endpoint to check that the bucket was replicated.

aws s3 mb s3://avi --endpoint-url http://"master_ip:8000"

On the master cluster:

aws s3 ls s3:// --endpoint-url http://"master_ip:8000"

Output should be:

2020-03-06 20:25:12 avi

On the secondary cluster:

aws s3 ls s3:// --endpoint-url http://"secondary_ip:8001"

Output should be the same as the master cluster. The bucket should be replicated across the clusters.
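The comparison can also be scripted instead of done by eye. The sketch below assumes the `aws s3 ls` bucket listing has the date, time and bucket name as its three columns, as in the output shown above. The function name `bucket_names` is made up for this example, and the endpoints in the usage comment are the same placeholders used in this article.

```shell
# bucket_names: keep only the bucket name (third column) from the
# `aws s3 ls` bucket listing read on stdin, sorted for comparison.
bucket_names() {
    awk '{ print $3 }' | sort
}

# Usage (needs awscli and both endpoints reachable; not run here):
#   master=$(aws s3 ls s3:// --endpoint-url http://"master_ip:8000" | bucket_names)
#   secondary=$(aws s3 ls s3:// --endpoint-url http://"secondary_ip:8001" | bucket_names)
#   [ "$master" = "$secondary" ] && echo "bucket lists match"
```

Comparing names only, rather than whole lines, avoids false mismatches in case the creation timestamps are reported differently on each side.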
