Copy files from Kubernetes to S3 and back

Maor Friedman · Nuvo Tech · Aug 27, 2018

There are plenty of reasons one would need to copy files between Kubernetes and AWS S3 (Simple Storage Service), or any other storage service. One such reason is backup (and restore). That doesn’t sound like a big challenge at first, since every storage service has a CLI (in AWS’ case, the AWS CLI). But adding the CLI to every container that needs to be backed up is not a real option, and some backup tasks can only be accomplished with access to the application’s file system. That is why we created a tool for this task.

Hello, Skbn

We call it Skbn, after the Japanese video game Sokoban, which means ‘Warehouse Keeper’. The objective of the game, much like our task, is to move blocks from one place to another. If you have never played it, now is a good time.

Skbn can copy files from Kubernetes to S3, and from S3 to Kubernetes. Skbn is written in Go, using Cobra for the CLI, Kubernetes’ client-go and the AWS SDK for Go, so no additional CLI tools are required.

Install

wget -qO- https://github.com/maorfr/skbn/releases/download/0.1.0/skbn.tar.gz | sudo tar xvz -C /usr/local/bin

Usage

Using Skbn is as easy as we could make it. Just set up your environment as you normally would (a Kubernetes context and AWS_PROFILE, for example), and Skbn will pick it up:

skbn cp \
--src k8s://namespace/pod/container/path/to/copy/from \
--dst s3://bucket/path/to/copy/to

Skbn can also copy files in parallel using the --parallel flag (implemented with goroutines). As an added bonus, since we use []byte to transfer data between locations, Skbn can also copy files K8s ↔ K8s and S3 ↔ S3.
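
For example, a bucket-to-bucket copy looks something like this (the bucket names, paths, and the value passed to --parallel below are placeholders; the flag controls how many files are copied concurrently):

skbn cp \
--src s3://source-bucket/path/to/copy/from \
--dst s3://target-bucket/path/to/copy/to \
--parallel 10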

You can use it as a tool for automating backup tasks which run as a Job inside a cluster:

apiVersion: batch/v1
kind: Job
metadata:
  labels:
    app: skbn
  name: skbn
spec:
  template:
    metadata:
      labels:
        app: skbn
      annotations:
        iam.amazonaws.com/role: skbn # We are using kube2iam
    spec:
      serviceAccountName: skbn
      restartPolicy: Never # Jobs require Never or OnFailure
      containers:
      - name: skbn
        image: maorfr/skbn
        command: ["skbn"]
        args:
        - cp
        - --src
        - k8s://namespace/pod/container/path/to/copy/from
        - --dst
        - s3://bucket/path/to/copy/to
        env:
        - name: AWS_REGION
          value: eu-central-1

This example requires additional RBAC resources; you can find the complete in-cluster example in the examples section of the repository.
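
As a rough orientation (not the repository’s exact manifest), the RBAC side would look something like the sketch below: since Skbn accesses container file systems through the Kubernetes API, the ServiceAccount typically needs get/list on pods and create on pods/exec, tied together with a RoleBinding. Treat the exact rules as an assumption and check the repository example:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: skbn
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: skbn
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: skbn
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: skbn
subjects:
- kind: ServiceAccount
  name: skbn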

We wrote Skbn with all functions exported so you can use it in your own code:

package main

import (
	"log"

	"github.com/maorfr/skbn/pkg/skbn"
)

func main() {
	src := "k8s://namespace/pod/container/path/to/copy/from"
	dst := "s3://bucket/path/to/copy/to"
	parallel := 0 // all at once

	if err := skbn.Copy(src, dst, parallel); err != nil {
		log.Fatal(err)
	}
}

Example use case — Cassandra backup

The main challenge with backing up Cassandra is that the backup is created in Cassandra’s file system. There are many working solutions out there (a Job with kubectl, a sidecar with nodetool and access to the same volume, etc.). With Skbn, it is almost an out-of-the-box solution. We are currently working on an operator for Cassandra, but one can easily use the Skbn pkg to create a working solution in no time.
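
As a rough sketch of that idea (every namespace, pod, and path below is a placeholder, and you would normally take a snapshot with nodetool first), backing up a keyspace directory boils down to a single Copy call:

package main

import (
	"log"

	"github.com/maorfr/skbn/pkg/skbn"
)

func main() {
	// Placeholder coordinates: a Cassandra pod in the "databases" namespace
	// and the data directory of the keyspace we want to back up.
	src := "k8s://databases/cassandra-0/cassandra/var/lib/cassandra/data/my_keyspace"
	dst := "s3://my-backups/cassandra/cassandra-0/my_keyspace"
	parallel := 10 // copy up to 10 files concurrently

	if err := skbn.Copy(src, dst, parallel); err != nil {
		log.Fatal(err)
	}
}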

Support for additional cloud providers

Yes, please! If anyone is interested in contributing to Skbn and adding support for other storage services, hit us with a pull request. Here is the link to the repository in case you missed it. Keep in mind that adding support for Azure storage (as an example) immediately adds the ability to copy files not just Kubernetes ↔ Azure, but also S3 ↔ Azure. This can be useful when working with multiple cloud providers. It is also cool!

Conclusion

Let us know what you think of this tool. We think it can be really helpful for many tasks. If you have any comments, feel free to open issues in the repository. Contributions are more than welcome.

Thanks for reading!
