S5cmd for High Performance Object Storage

Joshua Robinson
Aug 5 · 8 min read

Why Use S5cmd

Installation and Usage

[default]
aws_access_key_id = XXXXXXX
aws_secret_access_key = YYYYYYYY

go get -u github.com/peakgames/s5cmd

> s5cmd --endpoint-url http://10.62.64.200 ls s3://joshuarobinson/
+ DIR backup/
+ 2017/10/13 10:31:29 73 people.json
+ 2019/07/10 12:39:43 53687091200 two.txt
2019/08/02 03:08:13 +OK "ls s3://joshuarobinson" (13)

> s5cmd --endpoint-url http://10.62.64.200:80 -uw 64 cp /source s3://joshuarobinson/dest

alias s5cmd='s5cmd --endpoint-url http://10.62.64.200 -dw 32 -uw 32'
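Bash does not expand aliases in non-interactive scripts by default, so the same defaults can be wrapped in a shell function when they are needed inside a script. A minimal sketch, assuming the endpoint and worker counts from the alias; the names `s5`, `S5CMD_BIN`, and `S5_ENDPOINT` are hypothetical and are not part of s5cmd:

```shell
# Hypothetical wrapper: S5CMD_BIN and S5_ENDPOINT are illustrative names,
# not settings that s5cmd itself reads.
s5() {
    "${S5CMD_BIN:-s5cmd}" --endpoint-url "${S5_ENDPOINT:-http://10.62.64.200}" -dw 32 -uw 32 "$@"
}
```

With this in place, `s5 ls s3://joshuarobinson/` behaves like the aliased command, and the endpoint can be overridden per call by setting `S5_ENDPOINT`.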

Test Environment

FROM golang:alpine
RUN apk add git
RUN go get -u github.com/peakgames/s5cmd

docker run -it --rm \
--entrypoint=time \
-v /home/ir/.aws/credentials:/root/.aws/credentials \
-v /tmp/one.txt:/tmp/one.txt \
$IMGNAME s5cmd --endpoint-url http://10.62.64.200:80 -uw 32 cp /tmp/one.txt s3://joshuarobinson/one.txt
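Because the container's entrypoint is `time`, each run reports elapsed seconds rather than a transfer rate. A small helper can turn object size and elapsed time into throughput. This is only an illustrative sketch: the function name `throughput_mib_s` is hypothetical, and the 100-second figure in the usage line is a made-up input, not a measured result.

```shell
# Hypothetical helper: convert bytes transferred and elapsed seconds to MiB/s.
throughput_mib_s() {
    # $1 = size in bytes, $2 = elapsed wall-clock seconds
    LC_ALL=C awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bytes / 1048576 / secs }'
}

# Example with made-up timing: a 53687091200-byte object (the size listed
# for two.txt earlier) moved in 100 seconds.
throughput_mib_s 53687091200 100
```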

Alternative Tools

S3cmd

FROM ubuntu:18.04
RUN apt-get update && apt-get install -y s3cmd --no-install-recommends \
&& rm -rf /var/lib/apt/lists/*

access_key = XXXXXXXX
proxy_host = $FB_DATAVIP
proxy_port = 80
secret_key = YYYYYYYYY

S4cmd

FROM ubuntu:18.04
RUN apt-get update && apt-get install -y git python3 python3-pip python3-setuptools --no-install-recommends \
&& rm -rf /var/lib/apt/lists/*
ARG S4RELEASE=2.1.0
RUN git clone https://github.com/bloomreach/s4cmd.git \
&& cd s4cmd && git checkout tags/$S4RELEASE -b release \
&& pip3 install pytz boto3 && python3 setup.py install

docker run -it --rm -v /tmp:/tmp \
$IMGNAME s4cmd --endpoint-url http://10.62.64.200:80 --num-threads=128 put /tmp/one.txt s3://joshuarobinson/one.txt

Goofys

> export GOPATH=$HOME/work
> go get github.com/kahing/goofys
> go install github.com/kahing/goofys
> goofys --endpoint http://10.62.64.200 joshuarobinson /mountpoint/
> time cp /tmp/one.txt /mountpoint/one.txt
> rm /mountpoint/one.txt

Aws-cli

[default]
s3 =
max_concurrent_requests = 1000
max_queue_size = 10000
multipart_threshold = 64MB
multipart_chunksize = 16MB

FROM ubuntu:18.04
RUN apt-get update && apt-get install -y awscli --no-install-recommends \
&& rm -rf /var/lib/apt/lists/*
COPY config /root/.aws/config

Large Object Performance

upload failed: tmp/one.txt to s3://joshuarobinson/one.txt filedescriptor out of range in select()

Small Object Upload

> split -C 1M /mnt/joshua/one.txt prefix-
> s5cmd cp /tmp/prefix-* s3://joshuarobinson/tempdata/
ls /src/data-* | xargs -I {} -P 64 s3cmd -q put {} s3://bucketname/some/
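The xargs pattern above starts one client process per file and keeps up to 64 running at once. The sketch below reproduces that fan-out on throwaway local files, substituting `echo` for the upload client so it runs without an object store. The temporary directory, the `part-` prefix, the line-based split, and the `UPLOAD_CMD` variable are all illustrative assumptions rather than part of the original test setup.

```shell
# Stand-in for the real client (e.g. "s3cmd -q put"); echo only prints
# the command line that would have been run.
UPLOAD_CMD="${UPLOAD_CMD:-echo put}"

workdir=$(mktemp -d)
# Create a 10-line sample file and split it into 2-line pieces (5 small files).
seq 1 10 > "$workdir/source.txt"
split -l 2 "$workdir/source.txt" "$workdir/part-"

# Run up to 4 commands concurrently, one per file; {} is replaced by the file name.
# $UPLOAD_CMD is intentionally unquoted so it splits into command and arguments.
uploaded=$(ls "$workdir"/part-* | xargs -I {} -P 4 $UPLOAD_CMD {} s3://bucketname/some/)
echo "$uploaded"

rm -rf "$workdir"
```

Because the workers run concurrently, the order of the printed lines is not guaranteed. Raising `-P` increases concurrency, at the cost of one process start per object.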

Performance Comparison in AWS

Summary
