Cloud-agnostic S3 Buckets using Ceph

Arko Basu
Jun 18, 2024


In the world of scalable object storage, AWS S3 stands out for its reliability and user-friendly design, making it a top choice for enterprises, developers, and system engineers alike. Its success has inspired similar services from other cloud providers, such as GCP’s Cloud Storage [1] and Cloudflare’s R2 [2] to name a few, all modeled on the S3 API.

But what if you could enjoy similar functionality with the freedom of being cloud-agnostic, without modifying your existing application code, learning new cloud-native services, or worrying about cross-cloud migration? Meet Ceph, a highly resilient, open-source storage solution capable of emulating S3 functionality. This guide will show you how to set up your own S3-compatible storage using Ceph in minimal steps, giving you a taste of what it is like to have complete control over every aspect of your data and storage layer.

Prelude

If you’ve read my other articles, you know I adore Ceph. It’s a versatile and reliable open-source data storage solution that’s become my go-to for every data need. Ceph is scalable, resilient, and unbelievably efficient, and it integrates seamlessly with existing infrastructure. I am not going to dive deep into a product review of Ceph; for that, check out my previous posts and other writers on Medium.

In earlier blogs, I covered single-node deployments using Ceph as a Kubernetes (K8s) native storage provider with Rook. I discussed using its Block Device capabilities for stateful applications like WordPress and Keycloak, but only in ReadWriteOnce access mode without any High Availability (HA).

In this article, we’ll explore a more robust, HA setup that matches a minimal “production” deployment. We’ll also delve into Ceph’s Object Gateway capability to emulate S3 infrastructure on a Ceph cluster comprising 3 bare-metal nodes, the minimum required for HA.

Objective

To provide the reader with a path of least resistance to build and deploy a “production”-grade HA Ceph environment with Object Gateway capability to emulate S3 infrastructure. At a high level, our deployment will look like the following:

An HA deployment of Ceph where admins can create S3-compatible resources

We are going to cover:

  1. The shortest path to deploying a high-availability, multi-node Ceph cluster comprising 3 bare-metal nodes.
  2. Create a Ceph Object Gateway that is AWS S3 compatible.
    Even though Ceph’s Object Gateway supports the Swift API, we are not going to cover that; we’ll focus only on a single-zone, S3-compatible gateway. I am not going to cover a multi-zone gateway either, since even a single-zone Object Gateway maintains data resiliency and high availability as long as the underlying Ceph cluster is HA. We are also not going to cover HTTPS or secure access from outside the private network boundary. Perhaps in a follow-up article later; we shall see.
  3. Create S3 resources inside the Ceph cluster using the AWS CLI.
    For example, user(s), access credential(s) for the Object Gateway, and S3 buckets. Please note that these users need to be within the network boundary where this Ceph cluster is deployed. You can use a VPN or even a Cloudflare Tunnel protected by WARP clients if you want to give them access from outside the network. But please do not expose your Object Gateway without the necessary security precautions, which are not covered in this article.
  4. (Bonus) Use S3 functionality to create time-limited pre-signed download links for files in the Object Gateway. I will use the AWS CLI to demonstrate it, but this should also show how you can use something like Boto3 to implement S3 functionality against your Ceph cluster’s Object Gateway.

This is not an actual “production” environment and should not be used as a reference for one.

I am using some inexpensive, low-power SBCs with NVMe drives and only a 1 Gig Ethernet uplink. The total all-inclusive cost of building this 3-node cluster is less than 400 USD, and it provides 1 TB+ of storage, 12 CPU cores, and 24 GB of LPDDR4 RAM. In the context of Ceph, the rule of thumb is roughly 1 GB of RAM for each TB of storage for good operational throughput, which gives our nodes enough headroom to expand should we want to swap out the NVMe drives for larger ones later. Since Ceph is self-healing, we can take out individual drives and upgrade them without ever having to worry about losing our data.

A real enterprise-grade “production” environment, just from the hardware perspective, would not look like this. One would expect no SBCs, at least 5 nodes, a significantly higher CPU core count per node (especially on the nodes running MON and MGR daemons) with Hyper-Threading support, and preferably a 5 or 10 Gig Ethernet uplink with multiple drives per node.

All the steps covered here should also work on virtual machines set up in environments like MicroCloud, Proxmox, and/or VMware ESXi.

So let’s get started.

Step 1: High-Availability Multi-Node Ceph Cluster

In this section I am mostly interested in forewarning you about the existing methods of deploying a Ceph cluster. I mentioned in the objective that I want to provide you with the shortest path to doing this, and I mean it. Just look at the available options for deploying a Ceph cluster. Having deployed over 50 clusters now, I recommend you do yourself a favor and not use any of them here. Below is an overview of my personal opinion on the three most common tools for Ceph cluster management that I have used myself:

  • Rook-Ceph is one of the most stable and well-tested methods for deploying and managing a Ceph cluster at any scale. It boasts a very helpful community, with maintainers actively participating in a Slack channel where you can get answers quickly, even on weekends. However, it is resource-intensive due to the addition of Kubernetes and many containers. For K8s beginners, the deployment process can be quite challenging.
  • Cephadm is complex and has a steep learning curve, requiring dependencies like Podman/Docker, LVM2, and Python, which makes upgrades challenging. While I have mixed feelings about Python, I can’t do without it. Don’t come at me.
    Cephadm, the official tool for managing Ceph clusters, works well but adds complexity, especially for initial testing on low-cost SBCs (like Raspberry and Orange Pis) and old desktops with low CPU power. Ceph needs dedicated computational resources, and these dependencies make it harder to run, maintain, and upgrade/patch on smaller devices.
  • Ansible playbooks are also a viable option if you have to provision large-scale production clusters that are hundreds of nodes wide and span different geographies. Maybe we can do a follow-up where we use Ansible and retrofit those playbooks with some custom tooling to adapt the method I am pitching in this article for provisioning larger-scale HA Ceph clusters. But trust me when I say this: you don’t want to maintain code for a 3-node HA Ceph deployment, which is what you would have to do if you used Ansible. I said the shortest path, not the hardest path.

There are some other tools as well, like Salt, Juju charms and controllers, and of course manual deployment, but they all have steep learning curves before one can leverage Ceph’s full capability.

Enter MicroCeph!

Image from official docs

MicroCeph is an opinionated Ceph deployment, with minimal setup and maintenance overhead, delivered as a Snap. Snaps provide a secure and scalable way to deploy applications on Linux. Any application, like Ceph, is containerised along with all of its dependencies and run fully sandboxed to minimise security risks. Software updates are hassle-free, and respect the operational requirements of a running Ceph cluster. — Docs

It’s a Canonical product, from the same folks behind Ubuntu, one of the largest Linux distributions on the planet. It’s absolutely beautiful, and you will see why in a follow-up article when we talk about using this HA cluster across different applications.

Follow the instructions from my GitHub gist to deploy an HA Ceph cluster in less than 10 minutes, assuming you have all the hardware ready. If you have past experience with Ansible, then by the end of the deployment you will see what I meant earlier about building your own playbook to automate deployment across a large number of nodes.
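
For reference, the heart of what the gist walks through boils down to a handful of MicroCeph commands. Treat the snippet below as a rough sketch under my assumptions, not a substitute for the gist: the node names (node-2, node-3) and the disk path are placeholders that will differ on your hardware, and the join token is printed for you by the cluster add step.

# On the first node: install MicroCeph and bootstrap the cluster
sudo snap install microceph
sudo microceph cluster bootstrap

# Still on the first node: generate a join token for each additional node
sudo microceph cluster add node-2
sudo microceph cluster add node-3

# On node-2 and node-3: install the snap and join with the token printed above
sudo snap install microceph
sudo microceph cluster join <token-from-node-1>

# On every node: hand the NVMe drive over to Ceph as an OSD (this wipes the drive!)
sudo microceph disk add /dev/nvme0n1 --wipe

# From any node: confirm all MONs/OSDs are up and the cluster reports HEALTH_OK
sudo microceph.ceph status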

Some post-deployment snapshots of an HA Ceph cluster

Now that we have spun up the HA Ceph cluster, let’s get on with our next step.

Step 2: Deploy an S3-compatible Ceph Object/RADOS Gateway

This is as simple as running 3 commands from any of your Ceph cluster nodes. Follow the instructions here. You can quickly run a local curl from the host you enabled it on to validate the S3-compatible gateway.
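
For MicroCeph, the gist essentially amounts to enabling the RGW service and poking it with curl. A minimal sketch, assuming the gateway listens on its default port 80 on the node where you enable it (check your own setup if you changed the port):

# Enable the Ceph Object Gateway (RGW) on this node
sudo microceph enable rgw

# Confirm the rgw service now shows up alongside mon/mgr/osd
sudo microceph status

# Validate locally: an anonymous request should return an S3-style XML response
curl http://localhost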

On to setting up some S3 resources.

Step 3: Create S3 resources in Ceph Cluster

As an administrator of the storage cluster, you have complete autonomy and control to implement fine-grained access control with Ceph across various parts of your infrastructure. But what’s even cooler is the ability to create S3-compatible resources for your Object/RADOS Gateway.

After you have enabled the S3-compatible Object/RADOS Gateway in your cluster, you can follow the few simple steps outlined in the instructions here to create an admin user, with which you can test the S3 capabilities of the Object Gateway using the AWS CLI.
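
The user-creation part of the gist is built around radosgw-admin. The sketch below uses a made-up uid and display name for illustration; depending on how the snap exposes the tooling on your nodes, you may need to invoke it as microceph.radosgw-admin instead.

# Create a gateway user; the JSON output includes an access_key and secret_key pair
sudo radosgw-admin user create --uid=s3-admin --display-name="S3 Admin"

# You can print the keys again later if you need them
sudo radosgw-admin user info --uid=s3-admin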

If you have followed all the instructions from the gists, congratulations: you now have fully S3-compatible object storage that you can use from your local system, or from applications running within the network boundary from which the Ceph cluster and the Object Gateway are reachable.
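
To sanity-check the gateway from your workstation, plug those keys into an AWS CLI profile and point the CLI at the RGW endpoint. The profile name, endpoint placeholder, bucket name, and file below are assumptions I picked for illustration.

# Store the access/secret keys from the previous step under a dedicated profile
# (any region value is fine; the RGW endpoint is what matters)
aws configure --profile ceph

# Create a bucket, upload a file, and list it back; --endpoint-url targets the RGW node
aws --profile ceph --endpoint-url http://<rgw-endpoint> s3 mb s3://demo-bucket
aws --profile ceph --endpoint-url http://<rgw-endpoint> s3 cp ./hello.txt s3://demo-bucket/
aws --profile ceph --endpoint-url http://<rgw-endpoint> s3 ls s3://demo-bucket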

In a follow-up article we will cover Ceph’s Object Gateway administration capabilities in more depth, using Rook-Ceph from a K8s cluster that connects externally to this Ceph cluster.

(Bonus) Step 4: Create Pre-signed URLs for S3 objects

Now that you have a functional S3 endpoint, you can use some of its more advanced capabilities should you like.

Suppose you are self-hosting a custom generative AI application and need to share temporary access to some important S3 objects without granting full access to your S3 bucket or creating new Ceph credentials. You can create pre-signed URLs for your S3 objects, just like on AWS S3, using CLI tools or SDKs. As long as users can reach the Ceph RADOS/Object Gateway over your network, they can access the files via a browser. You can even control how long the signed URL remains accessible. Simply run the following command:

# Create a presigned URL for a specific duration
# (pass --endpoint-url if your AWS profile does not already point at the RGW endpoint)
aws --profile <your-user> --endpoint-url http://<rgw-endpoint> s3 presign s3://<bucket-name>/<object> --expires-in <number-of-seconds>
Generate Presigned URLs and test timed access

In the snapshot above, you can see me generating a temporary pre-signed URL for an AI-generated image I had uploaded to the S3 bucket earlier; the URL expired in 5 minutes.
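
To make that concrete, here is the same flow end to end, reusing the placeholder profile, endpoint, and bucket from the earlier sketch: upload an object, mint a URL that lives for five minutes, and fetch it with curl. After the expiry window, the same URL should return an access-denied style error from the gateway.

# Upload an object, then generate a URL valid for 300 seconds (5 minutes)
aws --profile ceph --endpoint-url http://<rgw-endpoint> s3 cp ./image.png s3://demo-bucket/image.png
aws --profile ceph --endpoint-url http://<rgw-endpoint> s3 presign s3://demo-bucket/image.png --expires-in 300

# Anyone who can reach the gateway can fetch it with the signed URL until it expires
curl -o image.png "<presigned-url>"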

Conclusion

With that, we come to a close for this article. I hope you have enjoyed reading it as much as I enjoyed writing it. As you can see, with Ceph you get a turnkey data storage solution with minimal setup, capable of running on any commodity, cloud, or edge hardware, which you can then use for a vast array of things. In a future article we will cover in depth how you can utilize this HA Ceph cluster, with tightened access control policies, across a wide array of applications. I have only covered Block and the Object Gateway, leaving aside probably the most important piece: the Ceph File System (CephFS), which gives a user an NFS-capable, Kubernetes-native ReadWriteMany access mode storage provider. My hope is to build on this foundation and create a series that expands on Ceph to give the reader a comprehensive overview of the entire product and its vast capabilities. Please don’t hesitate to reach out with suggestions. I would really appreciate it.

References

[1] https://cloud.google.com/storage/docs/migrating

[2] https://www.cloudflare.com/developer-platform/r2/
