S5cmd for High Performance Object Storage

Joshua Robinson
Aug 5

Why Use S5cmd

Installation and Usage

aws_access_key_id = XXXXXXX
aws_secret_access_key = YYYYYYYY

go get -u github.com/peakgames/s5cmd

> s5cmd --endpoint-url ls s3://joshuarobinson/
+ DIR backup/
+ 2017/10/13 10:31:29 73 people.json
+ 2019/07/10 12:39:43 53687091200 two.txt
2019/08/02 03:08:13 +OK "ls s3://joshuarobinson" (13)

> s5cmd --endpoint-url -uw 64 cp /source s3://joshuarobinson/dest

alias s5cmd='s5cmd --endpoint-url -dw 32 -uw 32'
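
One caveat with the alias: bash expands aliases only in interactive shells by default, so a script will not pick it up. A shell function is one possible alternative. The sketch below is an illustration only. The function name `s5` and the `S3_ENDPOINT` variable are assumptions, not part of s5cmd, and the worker counts are simply copied from the alias.

```shell
# Hypothetical wrapper around s5cmd; "s5" and S3_ENDPOINT are illustrative names.
s5() {
    # Stop with a message if the endpoint variable is unset or empty.
    : "${S3_ENDPOINT:?set S3_ENDPOINT to the object store URL}"
    # Same worker counts as the alias; all remaining arguments pass through unchanged.
    s5cmd --endpoint-url "$S3_ENDPOINT" -dw 32 -uw 32 "$@"
}
```

With `S3_ENDPOINT` exported, `s5 ls s3://joshuarobinson/` should behave like the aliased command, and it also works inside scripts.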

Test Environment

FROM golang:alpine
RUN apk add git
RUN go get -u github.com/peakgames/s5cmd

docker run -it --rm \
--entrypoint=time \
-v /home/ir/.aws/credentials:/root/.aws/credentials \
-v /tmp/one.txt:/tmp/one.txt \
$IMGNAME s5cmd --endpoint-url -uw 32 cp /tmp/one.txt s3://joshuarobinson/one.txt
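
The `time` entrypoint reports only elapsed seconds. To compare tools it can help to convert that into throughput. A minimal sketch, assuming binary megabytes (MiB) and a hypothetical helper name `throughput_mib_s`. The byte count is the size of two.txt from the listing earlier; the 50-second duration is made up for illustration and is not a measured result.

```shell
# Hypothetical helper: convert an object size in bytes and elapsed seconds to MiB/s.
throughput_mib_s() {
    # awk performs the floating-point division that plain shell arithmetic lacks.
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.2f\n", bytes / secs / 1048576 }'
}

# Illustrative numbers only: a 53687091200-byte object and an assumed 50 s transfer.
throughput_mib_s 53687091200 50   # prints 1024.00
```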

Alternative Tools


FROM ubuntu:18.04
RUN apt-get update && apt-get install -y s3cmd --no-install-recommends \
&& rm -rf /var/lib/apt/lists/*

access_key = XXXXXXXX
proxy_host = $FB_DATAVIP
proxy_port = 80
secret_key = YYYYYYYYY


FROM ubuntu:18.04
RUN apt-get update && apt-get install -y git python3 python3-pip python3-setuptools --no-install-recommends \
&& rm -rf /var/lib/apt/lists/*
RUN git clone https://github.com/bloomreach/s4cmd.git \
&& cd s4cmd && git checkout tags/$S4RELEASE -b release \
&& pip3 install pytz boto3 && python3 setup.py install

docker run -it --rm -v /tmp:/tmp \
$IMGNAME s4cmd --endpoint-url --num-threads=128 put /tmp/one.txt s3://joshuarobinson/one.txt


> export GOPATH=$HOME/work
> go get github.com/kahing/goofys
> go install github.com/kahing/goofys
> goofys --endpoint joshuarobinson /mountpoint/
> time cp /tmp/one.txt /mountpoint/one.txt
> rm /mountpoint/one.txt


s3 =
    max_concurrent_requests = 1000
    max_queue_size = 10000
    multipart_threshold = 64MB
    multipart_chunksize = 16MB

FROM ubuntu:18.04
RUN apt-get update && apt-get install -y awscli --no-install-recommends \
&& rm -rf /var/lib/apt/lists/*
COPY config /root/.aws/config

Large Object Performance

upload failed: tmp/one.txt to s3://joshuarobinson/one.txt filedescriptor out of range in select()

Small Object Upload

> split -C 1M /mnt/joshua/one.txt prefix-
> s5cmd cp /tmp/prefix-* s3://joshuarobinson/tempdata/

ls /src/data-* | xargs -n1 -i -P 64 s3cmd -q put {} s3://bucketname/some/
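
For anyone reproducing the small-object test without a large source file, a stand-in data set can be generated locally. This is a sketch, not the author's procedure: the 3 MiB size, the filler line, and the temporary directory are arbitrary assumptions, and `split -C` requires GNU coreutils.

```shell
# Build a small stand-in data set; all sizes and paths here are illustrative.
workdir=$(mktemp -d)

# 196608 lines of 16 bytes each (15 characters plus newline) is exactly 3 MiB.
awk 'BEGIN { for (i = 0; i < 196608; i++) print "0123456789abcde" }' > "$workdir/source.txt"

# Same split options as the example above: at most 1 MiB of whole lines per piece.
(cd "$workdir" && split -C 1M source.txt prefix-)

# Count the pieces that would be handed to the upload command.
pieces=$(ls "$workdir"/prefix-* | wc -l)
echo "pieces: $pieces"
```

The resulting `prefix-*` files can then be passed to either upload command shown above in place of the `/tmp/prefix-*` or `/src/data-*` paths.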

Performance Comparison in AWS


