StorX Integration Update: Backup using the S3cmd
--
What is S3cmd?
S3cmd is a free command-line tool and client for uploading, retrieving, and managing data in Amazon S3 and other cloud storage providers that use the S3 protocol, such as Google Cloud Storage or DreamHost DreamObjects. It is best suited for power users who are comfortable with command-line programs, and it works well for batch scripts and automated backups to S3 triggered from cron.
1. Installation of s3cmd
- Download S3cmd from SourceForge. The latest released version is 2.4.0.
- S3cmd requires Python 2.6 (or newer). S3cmd version 2.x is also compatible with Python 3.x.
- See the INSTALL file contained in the download for installation instructions.
- On Debian/Ubuntu, copy and paste the following into the terminal to install S3cmd: sudo apt install s3cmd
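If you are on a non-Debian system, or simply want to confirm the installation, the following sketch may help (it assumes a Debian/Ubuntu box for apt, or any system with Python and pip available):

sudo apt update && sudo apt install -y s3cmd    # Debian/Ubuntu
pip install s3cmd                               # alternative: install from PyPI with pip
s3cmd --version                                 # should report the installed version, e.g. 2.4.0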
2. Check the prerequisites for the setup by running the following command in the terminal and reviewing the output.
- Enter the following in the terminal: s3cmd -h
Usage: s3cmd [options] COMMAND [parameters]
S3cmd is a tool for managing objects in Amazon S3 storage. It allows for making and removing "buckets" and uploading, downloading and removing "objects" from these buckets.
Options:
-h, --help show this help message and exit
--configure Invoke interactive (re)configuration tool. Optionally use as '--configure s3://some-bucket' to test access to a specific bucket instead of attempting to list them all.
-c FILE, --config=FILE
Config file name. Defaults to $HOME/.s3cfg
--dump-config Dump current configuration after parsing config files and command line options, and exit.
--access_key=ACCESS_KEY
AWS Access Key
--secret_key=SECRET_KEY
AWS Secret Key
--access_token=ACCESS_TOKEN
AWS Access Token
-n, --dry-run Only show what should be uploaded or downloaded, but don't do it. May still perform S3 requests to get bucket listings and other information, though (only for file transfer commands)
-s, --ssl Use HTTPS connection when communicating with S3.
(default)
--no-ssl Don't use HTTPS.
-e, --encrypt Encrypt files before uploading to S3.
--no-encrypt Don't encrypt files.
-f, --force Force overwrite and other dangerous operations.
--continue Continue getting a partially downloaded file (only for [get] command).
--continue-put Continue uploading partially uploaded files or multipart upload parts. Restarts parts/files that don't have matching size and md5. Skips files/parts that do. Note: md5sum checks are not always sufficient to check (part) file equality. Enable this at your own risk.
--upload-id=UPLOAD_ID
UploadId for Multipart Upload, in case you want to continue an existing upload (equivalent to --continue-put), and there are multiple partial uploads. Use s3cmd multipart [URI] to see what UploadIds are associated with the given URI.
--skip-existing Skip over files that exist at the destination (only for [get] and [sync] commands).
-r, --recursive Recursive upload, download or removal.
--check-md5 Check MD5 sums when comparing files for [sync].
(default)
--no-check-md5 Do not check MD5 sums when comparing files for [sync].
Only size will be compared. It may significantly speed up transfer but may also miss some changed files.
-P, --acl-public Store objects with an ACL allowing anyone to read them.
--acl-private Store objects with the default ACL allowing access only to you.
--acl-grant=PERMISSION:EMAIL or USER_CANONICAL_ID
Grant stated permission to a given Amazon user. Permission is one of: read, write, read_acp, write_acp, full_control, all
--acl-revoke=PERMISSION:USER_CANONICAL_ID
Revoke stated permission for a given Amazon user. Permission is one of: read, write, read_acp, write_acp, full_control, all
-D NUM, --restore-days=NUM
Number of days to keep restored file available (only for 'restore' command). Default is 1 day.
--restore-priority=RESTORE_PRIORITY
Priority for restoring files from S3 Glacier (only for 'restore' command). Choices available: bulk, standard, expedited.
--delete-removed Delete destination objects with no corresponding source file [sync]
--no-delete-removed Don't delete destination objects [sync]
--delete-after Perform deletes AFTER new uploads when delete-removed is enabled [sync]
--delay-updates *OBSOLETE* Put all updated files into place at end [sync]
--max-delete=NUM Do not delete more than NUM files. [del] and [sync]
--limit=NUM Limit number of objects returned in the response body (only for [ls] and [la] commands)
--add-destination=ADDITIONAL_DESTINATIONS
Additional destination for parallel uploads, in addition to the last arg. May be repeated.
--delete-after-fetch Delete remote objects after fetching to a local file (only for [get] and [sync] commands).
-p, --preserve Preserve filesystem attributes (mode, ownership, timestamps). Default for [sync] command.
--no-preserve Don't store FS attributes
--keep-dirs Preserve all local directories as remote objects including empty directories. Experimental feature.
--exclude=GLOB Filenames and paths matching GLOB will be excluded from sync
--exclude-from=FILE Read --exclude GLOBs from FILE
--rexclude=REGEXP Filenames and paths matching REGEXP (regular expression) will be excluded from sync
--rexclude-from=FILE Read --rexclude REGEXPs from FILE
--include=GLOB Filenames and paths matching GLOB will be included
even if previously excluded by one of
--(r)exclude(-from) patterns
--include-from=FILE Read --include GLOBs from FILE
--rinclude=REGEXP Same as --include but uses REGEXP (regular expression) instead of GLOB
--rinclude-from=FILE Read --rinclude REGEXPs from FILE
--files-from=FILE Read list of source-file names from FILE. Use - to read from stdin.
--region=REGION, --bucket-location=REGION
Region to create bucket in. As of now, the regions are:
us-east-1, us-west-1, us-west-2, eu-west-1, eu-central-1, ap-northeast-1, ap-southeast-1, ap-southeast-2, sa-east-1
--host=HOSTNAME HOSTNAME:PORT for S3 endpoint (default: s3.amazonaws.com, alternatives such as s3-eu-west-1.amazonaws.com). You should also set --host-bucket.
--host-bucket=HOST_BUCKET
DNS-style bucket+hostname:port template for accessing a bucket (default: %(bucket)s.s3.amazonaws.com)
--reduced-redundancy, --rr
Store object with 'Reduced redundancy'. Lower per-GB price. [put, cp, mv]
--no-reduced-redundancy, --no-rr
Store object without 'Reduced redundancy'. Higher per-GB price. [put, cp, mv]
--storage-class=CLASS
Store object with specified CLASS (STANDARD, STANDARD_IA, ONEZONE_IA, INTELLIGENT_TIERING, GLACIER, or DEEP_ARCHIVE). [put, cp, mv]
--access-logging-target-prefix=LOG_TARGET_PREFIX
Target prefix for access logs (S3 URI) (for [cfmodify] and [accesslog] commands)
--no-access-logging Disable access logging (for [cfmodify] and [accesslog] commands)
--default-mime-type=DEFAULT_MIME_TYPE
Default MIME-type for stored objects. The application default is binary/octet-stream.
-M, --guess-mime-type
Guess MIME-type of files by their extension or mime magic. Fall back to default MIME-Type as specified by --default-mime-type option
--no-guess-mime-type Don't guess MIME-type and use the default type instead.
--no-mime-magic Don't use mime magic when guessing MIME-type.
-m MIME/TYPE, --mime-type=MIME/TYPE
Force MIME-type. Override both --default-mime-type and --guess-mime-type.
--add-header=NAME:VALUE
Add a given HTTP header to the upload request. It can be used multiple times. For instance, use this option to set 'Expires' or 'Cache-Control' headers (or both).
--remove-header=NAME Remove a given HTTP header. It can be used multiple times. For instance, remove 'Expires' or 'Cache-Control' headers (or both) using this option. [modify]
--server-side-encryption
Specifies that server-side encryption will be used when putting objects. [put, sync, cp, modify]
--server-side-encryption-kms-id=KMS_KEY
Specifies the key ID used for server-side encryption with AWS KMS-Managed Keys (SSE-KMS) when putting objects. [put, sync, cp, modify]
--encoding=ENCODING Override autodetected terminal and filesystem encoding (character set). Autodetected: UTF-8
--add-encoding-exts=EXTENSIONs
Add encoding to these comma-delimited extensions, i.e. (css,js,html), when uploading to S3
--verbatim Use the S3 name as given on the command line. No pre-processing, encoding, etc. Use with caution!
--disable-multipart Disable multipart upload on files bigger than --multipart-chunk-size-mb
--multipart-chunk-size-mb=SIZE
Size of each chunk of a multipart upload. Files bigger than SIZE are automatically uploaded as multithreaded-multipart, smaller files are uploaded using the traditional method. SIZE is in Mega-Bytes, default chunk size is 15MB, minimum allowed chunk size is 5MB, maximum is 5GB.
--list-md5 Include MD5 sums in bucket listings (only for 'ls' command).
--list-allow-unordered
Not an AWS standard. Allow the listing results to be returned in unsorted order. This may be faster when listing very large buckets.
-H, --human-readable-sizes
Print sizes in human readable form (eg 1kB instead of 1234).
--ws-index=WEBSITE_INDEX
Name of index-document (only for [ws-create] command)
--ws-error=WEBSITE_ERROR
Name of error-document (only for [ws-create] command)
--expiry-date=EXPIRY_DATE
Indicates when the expiration rule takes effect. (only for [expire] command)
--expiry-days=EXPIRY_DAYS
Indicates the number of days after object creation the expiration rule takes effect. (only for [expire] command)
--expiry-prefix=EXPIRY_PREFIX
Identifying one or more objects with the prefix to which the expiration rule applies. (only for [expire] command)
--skip-destination-validation
Skips validation of Amazon SQS, Amazon SNS, and AWS Lambda destinations when applying notification configuration. (only for [setnotification] command)
--progress Display progress meter (default on TTY).
--no-progress Don't display progress meter (default on non-TTY).
--stats Give some file-transfer stats.
--enable Enable given CloudFront distribution (only for [cfmodify] command)
--disable Disable given CloudFront distribution (only for [cfmodify] command)
--cf-invalidate Invalidate the uploaded files in CloudFront. Also see [cfinval] command.
--cf-invalidate-default-index
When using Custom Origin and S3 static website, invalidate the default index file.
--cf-no-invalidate-default-index-root
When using Custom Origin and S3 static website, don't invalidate the path to the default index file.
--cf-add-cname=CNAME Add given CNAME to a CloudFront distribution (only for [cfcreate] and [cfmodify] commands)
--cf-remove-cname=CNAME
Remove given CNAME from a CloudFront distribution (only for [cfmodify] command)
--cf-comment=COMMENT Set COMMENT for a given CloudFront distribution (only for [cfcreate] and [cfmodify] commands)
--cf-default-root-object=DEFAULT_ROOT_OBJECT
Set the default root object to return when no object is specified in the URL. Use a relative path, i.e. default/index.html instead of /default/index.html or s3://bucket/default/index.html (only for [cfcreate] and [cfmodify] commands)
-v, --verbose Enable verbose output.
-d, --debug Enable debug output.
--version Show s3cmd version (2.4.0) and exit.
-F, --follow-symlinks
Follow symbolic links as if they are regular files
--cache-file=FILE Cache FILE containing local source MD5 values
-q, --quiet Silence output on stdout
--ca-certs=CA_CERTS_FILE
Path to SSL CA certificate FILE (instead of system default)
--ssl-cert=SSL_CLIENT_CERT_FILE
Path to the client's own SSL certificate CRT_FILE
--ssl-key=SSL_CLIENT_KEY_FILE
Path to the client's own SSL certificate private key KEY_FILE
--check-certificate Check SSL certificate validity
--no-check-certificate Do not check SSL certificate validity
--check-hostname Check SSL certificate hostname validity
--no-check-hostname Do not check SSL certificate hostname validity
--signature-v2 Use AWS Signature version 2 instead of newer signature methods. Helpful for S3-like systems that don't support AWS Signature v4 yet.
--limit-rate=LIMITRATE Limit the upload or download speed to amount bytes per second. Amount may be expressed in bytes, kilobytes with the k suffix, or megabytes with the m suffix.
--no-connection-pooling Disable connection reuse
--requester-pays Set the REQUESTER PAYS flag for operations.
-l, --long-listing Produce long listing [ls]
--stop-on-error Stop if error in transfer
--max-retries=NUM Maximum number of times to retry a failed request before giving up. Default is 5
--content-disposition=CONTENT_DISPOSITION
Provide a Content-Disposition for signed URLs, e.g., "inline; filename=myvideo.mp4"
--content-type=CONTENT_TYPE
Provide a Content-Type for signed URLs, e.g., "video/mp4"
Commands:
Make bucket: s3cmd mb s3://BUCKET
Remove bucket: s3cmd rb s3://BUCKET
List objects or buckets: s3cmd ls [s3://BUCKET[/PREFIX]]
List all objects in all buckets: s3cmd la
Put the file into the bucket: s3cmd put FILE [FILE...] s3://BUCKET[/PREFIX]
Get file from bucket: s3cmd get s3://BUCKET/OBJECT LOCAL_FILE
Delete file from bucket: s3cmd del s3://BUCKET/OBJECT
Delete file from bucket (alias for del): s3cmd rm s3://BUCKET/OBJECT
Restore file from Glacier storage: s3cmd restore s3://BUCKET/OBJECT
Synchronize a directory tree to S3 (checks file freshness using size and md5 checksum, unless overridden by options; see below)
s3cmd sync LOCAL_DIR s3://BUCKET[/PREFIX] or s3://BUCKET[/PREFIX] LOCAL_DIR or s3://BUCKET[/PREFIX] s3://BUCKET[/PREFIX]
Disk usage by buckets:
s3cmd du [s3://BUCKET[/PREFIX]]
Get various information about Buckets or Files:
s3cmd info s3://BUCKET[/OBJECT]
Copy object:
s3cmd cp s3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]
Modify object metadata:
s3cmd modify s3://BUCKET1/OBJECT
Move object:
s3cmd mv s3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]
Modify Access control list for Bucket or Files:
s3cmd setacl s3://BUCKET[/OBJECT]
Modify Bucket Versioning:
s3cmd setversioning s3://BUCKET enable|disable
Modify Bucket Object Ownership:
s3cmd setownership s3://BUCKET BucketOwnerPreferred|BucketOwnerEnforced|ObjectWriter
Modify Block Public Access rules:
s3cmd setblockpublicaccess s3://BUCKET BlockPublicAcls,IgnorePublicAcls,BlockPublicPolicy,RestrictPublicBuckets
Modify Object Legal Hold:
s3cmd setobjectlegalhold STATUS s3://BUCKET/OBJECT
Modify Object Retention:
s3cmd setobjectretention MODE RETAIN_UNTIL_DATE s3://BUCKET/OBJECT
Modify Bucket Policy:
s3cmd setpolicy FILE s3://BUCKET
Delete Bucket Policy:
s3cmd delpolicy s3://BUCKET
Modify Bucket CORS:
s3cmd setcors FILE s3://BUCKET
Delete Bucket CORS:
s3cmd delcors s3://BUCKET
Modify Bucket Requester Pays policy:
s3cmd payer s3://BUCKET
Show multipart uploads:
s3cmd multipart s3://BUCKET [Id]
Abort a multipart upload:
s3cmd abortmp s3://BUCKET/OBJECT Id
List parts of a multipart upload:
s3cmd listmp s3://BUCKET/OBJECT Id
Enable/disable bucket access logging:
s3cmd accesslog s3://BUCKET
Sign arbitrary string using the secret key:
s3cmd sign STRING-TO-SIGN
Sign an S3 URL to provide limited public access with an expiry:
s3cmd signurl s3://BUCKET/OBJECT <expiry_epoch|+expiry_offset>
Fix invalid file names in a bucket:
s3cmd fixbucket s3://BUCKET[/PREFIX]
Modify tagging for Bucket or Files:
s3cmd settagging s3://BUCKET[/OBJECT] "KEY=VALUE[&KEY=VALUE ...]"
Get tagging for Bucket or Files:
s3cmd gettagging s3://BUCKET[/OBJECT]
Delete tagging for Bucket or Files:
s3cmd deltagging s3://BUCKET[/OBJECT]
Create a Website from bucket:
s3cmd ws-create s3://BUCKET
Delete Website:
s3cmd ws-delete s3://BUCKET
Info about Website:
s3cmd ws-info s3://BUCKET
Set or delete expiration rule for the bucket:
s3cmd expire s3://BUCKET
Upload a lifecycle policy for the bucket:
s3cmd setlifecycle FILE s3://BUCKET
Get a lifecycle policy for the bucket:
s3cmd getlifecycle s3://BUCKET
Remove a lifecycle policy for the bucket:
s3cmd dellifecycle s3://BUCKET
Upload a notification policy for the bucket:
s3cmd setnotification FILE s3://BUCKET
Get a notification policy for the bucket:
s3cmd getnotification s3://BUCKET
Remove a notification policy for the bucket:
s3cmd delnotification s3://BUCKET
List CloudFront distribution points:
s3cmd cflist
Display CloudFront distribution point parameters:
s3cmd cfinfo [cf://DIST_ID]
Create CloudFront distribution point:
s3cmd cfcreate s3://BUCKET
Delete CloudFront distribution point:
s3cmd cfdelete cf://DIST_ID
Change CloudFront distribution point parameters:
s3cmd cfmodify cf://DIST_ID
Invalidate CloudFront objects:
s3cmd cfinval s3://BUCKET/OBJECT [s3://BUCKET/OBJECT ...]
Display CloudFront invalidation request(s) status:
s3cmd cfinvalinfo cf://DIST_ID[/INVAL_ID]
For more information, updates, and news, visit the s3cmd website:
http://s3tools.org
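Note that most of these options can also be passed per command instead of being stored in a config file. As a rough sketch (the bucket name my-bucket is a placeholder), pointing s3cmd at the StorX gateway directly from the command line looks like this:

s3cmd --access_key=YOUR_ACCESS_KEY \
  --secret_key=YOUR_SECRET_KEY \
  --host=gateway.storx.io \
  --host-bucket="%(bucket)s.gateway.storx.io" \
  ls s3://my-bucket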
3. Generate the access key ID, secret key, and endpoint from StorX.
How to obtain an S3-compatible access key pair from StorX:
- Log in to your StorX account using your username and password, or any of the available social sign-in options.
- Once you log in, you will land on the dashboard. Click the user icon at the top.
- Clicking the user icon opens a dropdown list; from that list, click Access.
- The Access screen opens, where you can create an access token.
- Click the Create button to open the creation popup.
- Add a name for your access key and click the "Continue" button.
- Once you confirm all the details and click Confirm, a popup appears with all the required tokens.
- To copy all keys, click the "Copy" button.
- To download all keys, click the "Download" button.
- If you click "Show Key", the key for that block is revealed.
- The access key, secret key, and endpoint will look similar to the example values below.
4. How to perform the s3cmd configuration
- First, open the downloaded document containing the following:
Access Key: Jugsodt6mo4qavp7trm415iogk5q # Replace this with your generated access key.
Secret Key: J2wocdaioy4umzgpfqnl5sz4i4nkqr4lx3gbogm3zyxkplefaq4i4 # Replace this with your generated secret key.
Endpoint: https://gateway.storx.io # Replace this with your generated endpoint.
- Then run the interactive configuration tool in the terminal:
s3cmd --configure
Enter new values or accept defaults in brackets with Enter.
Refer to the user manual for a detailed description of all options.
The access key and Secret key are your identifiers for Amazon S3. Leave them empty to use the env variables.
Access Key [jugsodt6mo4qavp7trm425iogk5q]:
Secret Key [j2wocdaioy4umzgpfqnl5sz4i4nkqr4lx3gbogm2zyxkplefaq4i4]:
Default Region [US]:
Use "s3.amazonaws.com" for the S3 Endpoint and do not modify it if targeting Amazon S3.
S3 Endpoint [gateway.storx.io]:
Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports DNS based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.gateway.storx.io]: %(rest)s.gateway.storx.io
Encryption password is used to protect your files from reading by unauthorized persons while in transfer to S3
Encryption password [1]:
Path to GPG program [/usr/bin/gpg]:
When using secure HTTPS protocol all communication with Amazon S3 servers is protected from 3rd party eavesdropping. This method is slower than plain HTTP, and can only be proxied with Python 2.7 or newer.
Use HTTPS protocol [Yes]:
On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly.
HTTP Proxy server name:
New settings:
Access Key: jugsodt6mo4qavp7trm425iogk5q
Secret Key: j2wocdaioy4umzgpfqnl5sz4i4nkqr4lx3gbogm2zyxkplefaq4i4
Default Region: US
S3 Endpoint: gateway.storx.io
DNS-style bucket+hostname:port template for accessing a bucket: %(rest)s.gateway.storx.io
Encryption password: 1
Path to GPG program: /usr/bin/gpg
Use HTTPS protocol: True
HTTP Proxy server name:
HTTP Proxy server port: 0
Test access with supplied credentials? [Y/n] y
Please wait, attempting to list all buckets…
Success. Your access key and secret key worked fine :-)
Now verifying that encryption works…
Success. Encryption and decryption worked fine :-)
Save settings? [y/N] y
Configuration saved to '/home/crystal-harmony/.s3cfg'
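If the bucket listing test fails because your access grant is limited to a single bucket, the --configure option described in the help output above can be pointed at that bucket instead, for example (demo-vault is just an illustrative name):

s3cmd --configure s3://demo-vault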
- Provide the access key, secret key, and S3 endpoint:
- The access key and secret key are your identifiers for Amazon S3. Leave them empty to use the environment variables.
- Access Key: jugsodt6mo4qavp7trm425iogk5q
- Secret Key: j2wocdaioy4umzgpfqnl5sz4i4nkqr4lx3gbogm2zyxkplefaq4i4
- S3 Endpoint: gateway.storx.io
- Leave the remaining fields at their defaults, or provide values as follows (a sketch of the resulting configuration file follows this list):
- Default Region: US
- DNS-style bucket+hostname:port template for accessing a bucket: %(rest)s.gateway.storx.io
- Encryption password: 1 (optional; used to encrypt files with GPG so they cannot be read by unauthorized parties in transit)
- Path to GPG program: /usr/bin/gpg
- Use HTTPS protocol: True
- HTTP Proxy server name: (leave empty unless your network requires a proxy)
- HTTP Proxy server port: 0
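For reference, a sketch of what the saved ~/.s3cfg ends up containing (the option names are standard s3cmd configuration keys; the values mirror the ones entered above, so adjust them to your own credentials):

[default]
access_key = jugsodt6mo4qavp7trm425iogk5q
secret_key = j2wocdaioy4umzgpfqnl5sz4i4nkqr4lx3gbogm2zyxkplefaq4i4
host_base = gateway.storx.io
host_bucket = %(bucket)s.gateway.storx.io    # or the template you entered during --configure
use_https = True
gpg_command = /usr/bin/gpg
gpg_passphrase = 1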
Once the setup is complete, you can perform the following actions to try it out.
-> List the buckets (vaults): s3cmd ls
- 2024-07-11 04:49 s3://demo-vault
- 2024-07-11 11:10 s3://rest
-> Make a bucket: s3cmd mb s3://my-new-bucket-name
- s3cmd mb s3://test
- Output: Bucket 's3://test/' created
-> Upload a file into the bucket: s3cmd put path/to/file s3://bucket-name/file-name
- s3cmd put /home/crystal-harmony/Downloads/images.jpeg s3://test/images.jpeg
- Output: upload: '/home/crystal-harmony/Downloads/images.jpeg' -> 's3://test/images.jpeg' [1 of 1] 9834 of 9834 100% in 5s 1674.00 B/s done
All the changes are reflected in the StorX dashboard, as shown below.
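A single put is fine for a quick test; for an actual backup of a folder, the sync command covered in the help output above is usually more convenient. A minimal sketch (the local path and destination prefix are placeholders):

s3cmd sync /home/crystal-harmony/Documents/ s3://test/documents-backup/
s3cmd get s3://test/images.jpeg /tmp/images.jpeg    # download a file back to verify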
-> Clean up: delete the objects with s3cmd del, then remove the bucket: s3cmd rb s3://bucket-name
- s3cmd rb s3://test1
- Output: Bucket 's3://test1/' removed
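As mentioned in the introduction, s3cmd also lends itself to automated backups from cron. A minimal sketch of such a crontab entry (the schedule, local path, bucket, and log file are placeholders to adapt):

# run "crontab -e" and add a line like the following to sync a folder to StorX every night at 02:00
0 2 * * * /usr/bin/s3cmd sync --no-progress /home/crystal-harmony/Documents/ s3://demo-vault/documents/ >> /home/crystal-harmony/s3cmd-backup.log 2>&1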
Integration with S3cmd is complete here.