The gsutil command

suhaib ahmed
Jan 26, 2019


You cannot get the most out of Google Cloud Storage without the gsutil command. Here is a tutorial on what you can do with it.

Google Cloud Storage (GCS)

Before we see what the command can do, let's talk about GCS. It is an object storage service built on the same planet-wide Google infrastructure that powers internet search. Much like AWS S3, it suits a variety of use cases, such as media content storage and delivery, a repository for analytics and machine learning, and data backup.

  • Integrate storage into your apps with a single unified API
  • Optimize price/performance across four storage classes with Object Lifecycle Management
  • Access data instantly from any storage class
  • Designed for secure and durable storage
  • Reduce data storage carbon emissions to zero

Set up gsutil

You need to do a little bit of setup in order to run gsutil from your command line/terminal.

1. Download and install the SDK

curl https://sdk.cloud.google.com | bash

2. Restart your shell

exec -l $SHELL

3. Run gcloud init to initialize the gcloud environment

gcloud init
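After these steps, it can be worth confirming that gsutil is actually reachable from your shell. The following is a minimal sketch that only checks the PATH; the status variable and the messages are illustrative, not part of the SDK.

```shell
# Check whether the gsutil executable can be found on the PATH.
if command -v gsutil >/dev/null 2>&1; then
  status="gsutil found at $(command -v gsutil)"
else
  status="gsutil not found on PATH; re-run the setup steps above"
fi

echo "$status"
```

If gsutil is found, running gsutil version prints the installed version.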

Getting started with gsutil

The gsutil tool is used to manage buckets and objects on GCS. You can check out the code for gsutil on its GitHub page. You can use gsutil for a wide range of tasks, such as:

  • Creating and deleting buckets.
  • Uploading, downloading, and deleting objects.
  • Listing buckets and objects.
  • Moving, copying, and renaming objects.
  • Editing object and bucket ACLs.

Before you start, note that you can get help for gsutil or any of its subcommands using:

gsutil help
gsutil help <command>

Bucket Management

Start off by creating a bucket. Buckets are the basic containers that hold your data. Everything that you store in Cloud Storage must be contained in a bucket. Bucket names share a single global namespace and must be unique.

gsutil mb gs://<bucketname>

Set the following flags on mb for fine-grained control:

  • -p: specify the project with which your bucket will be associated.
  • -c: specify the default storage class of your bucket.
  • -l: specify the location of your bucket.
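As a rough illustration of how the three flags combine, the sketch below assembles a full mb command and prints it for review instead of running it. The project ID, storage class, location and bucket name are placeholder assumptions; substitute your own.

```shell
# Hypothetical values; replace them with your own.
PROJECT="my-project"
STORAGE_CLASS="nearline"
LOCATION="us-east1"
BUCKET="my-example-bucket"

# Print the complete command so it can be checked before it is run.
mb_command() {
  printf 'gsutil mb -p %s -c %s -l %s gs://%s\n' \
    "$PROJECT" "$STORAGE_CLASS" "$LOCATION" "$BUCKET"
}

mb_command
```

Once the printed command looks right, paste it into your terminal to create the bucket.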

You can list all buckets in your project using:

gsutil ls

Get bucket information, such as location and storage class, using:

gsutil ls -L -b gs://<bucketname>/

Change the default storage class using the defstorageclass subcommand:

gsutil defstorageclass set <storageclass> gs://<bucketname>

Moving or Renaming a bucket

A bucket's name, location, and project are fixed once it is created. However, you can effectively rename a bucket, or change its location or project, by creating a new bucket with the desired settings, copying all data from the old bucket to the new one, and finally removing the old bucket.
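The three steps can be sketched as a small script. This is only an outline under assumptions: OLD_BUCKET and NEW_BUCKET are hypothetical names, and the run helper is an illustrative dry-run wrapper (not part of gsutil) that prints each command unless DRY_RUN is set to 0.

```shell
# Hypothetical bucket names; replace them with your own.
OLD_BUCKET="old-example-bucket"
NEW_BUCKET="new-example-bucket"

# With DRY_RUN=1 (the default here) commands are printed, not executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Each step runs only if the previous one succeeded.
migrate_bucket() {
  run gsutil mb "gs://$NEW_BUCKET" &&
  run gsutil -m cp -r "gs://$OLD_BUCKET/*" "gs://$NEW_BUCKET/" &&
  run gsutil rm -r "gs://$OLD_BUCKET"
}

migrate_bucket
```

Review the printed commands first. Setting DRY_RUN=0 executes them, and the last step permanently deletes the old bucket and everything in it.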

File/Object Management

Try uploading a file to the bucket with the cp subcommand. Use the -r flag to upload a whole folder.

gsutil cp <localobjectlocation> gs://<bucketname>/
gsutil cp -r <folderlocation> gs://<bucketname>/

List the contents of the bucket using the ls subcommand. You can also add a prefix to the URL to limit the results.

gsutil ls -r gs://<bucketname>/**
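To make the prefix idea concrete, here is a minimal sketch that builds a URL matching everything under a folder-style prefix and prints the resulting ls command. The bucket name, the prefix and the prefix_url helper are hypothetical.

```shell
# Hypothetical values; replace them with your own.
BUCKET="my-example-bucket"
PREFIX="logs/2019"

# Build a gs:// URL that matches every object under the given prefix.
prefix_url() {
  printf 'gs://%s/%s/**\n' "$1" "$2"
}

# Print the listing command for review. The URL is shown in single quotes
# so the local shell does not try to expand the wildcard itself.
echo "gsutil ls '$(prefix_url "$BUCKET" "$PREFIX")'"
```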

Download objects from your bucket. Use -r for folders.

gsutil cp gs://<bucketname>/<objectname> <objectdestination>

Pro tip: Use the -m flag to run an operation in parallel, which can greatly improve performance when there are many files. It is a top-level flag, so it goes before the subcommand, as in gsutil -m cp.

gsutil uses the mv subcommand to rename objects.

gsutil mv gs://<bucketname>/<oldobjectname> gs://<bucketname>/<newobjectname>

Copy objects from one bucket or folder to another bucket or folder using:

gsutil cp gs://<sourcebucketname>/<sourceobjectname> gs://<destinationbucketname>/<destinationobjectname>

You might want to change the storage class of your objects. To learn more about storage classes, see the Cloud Storage documentation.

gsutil rewrite -s <storageclass> gs://<pathtoobject>

You can view and edit object metadata using

gsutil ls -L gs://<bucketname>/<objectname>
gsutil setmeta -h "<metadatakey>:<metadatavalue>" gs://<bucketname>/<objectname>

Delete objects from your buckets

gsutil rm gs://<bucketname>/<objectname>

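The rsync subcommand synchronizes the contents of two buckets or directories. Note that -d deletes files at the destination that are missing from the source, and that -m is a top-level flag, so it goes before rsync.

gsutil -m rsync -d /data gs://<bucketname>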
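Because -d can delete data at the destination, it may help to preview a sync first. gsutil rsync accepts -n for a dry run that reports what would be copied or deleted without changing anything. The sketch below prints both the preview and the real command; SRC_DIR, BUCKET and the rsync_command helper are hypothetical, and -r is added on the assumption that subdirectories should be synced too.

```shell
# Hypothetical values; replace them with your own.
SRC_DIR="/data"
BUCKET="my-example-bucket"

# Print an rsync command. The "preview" mode adds -n, which makes gsutil
# report what it would copy or delete without changing anything.
rsync_command() {
  mode="$1"
  flags="-d -r"
  if [ "$mode" = "preview" ]; then
    flags="-n $flags"
  fi
  echo "gsutil -m rsync $flags $SRC_DIR gs://$BUCKET"
}

rsync_command preview   # prints the dry-run command
rsync_command apply     # prints the command that performs the sync
```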

To get information on the amount of storage used by objects:

gsutil du gs://<bucketname>
gsutil du gs://<bucketname>/prefix/*

Controlling Access

It is important to plan how access to your objects is controlled. gsutil offers an easy and convenient way to manage that access.

To make objects publicly readable

gsutil acl ch -u AllUsers:R gs://<bucketname>/<objectname>
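Once an object is publicly readable, it can be fetched over HTTPS from storage.googleapis.com. The small sketch below prints that URL for a given bucket and object. The public_url helper and the names passed to it are hypothetical, and it assumes the object name needs no URL encoding.

```shell
# Print the HTTPS URL at which a publicly readable object can be fetched.
# Assumes the object name contains no characters that need URL encoding.
public_url() {
  printf 'https://storage.googleapis.com/%s/%s\n' "$1" "$2"
}

# Hypothetical bucket and object names.
public_url "my-example-bucket" "images/logo.png"
```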

To make all objects in a bucket publicly readable

gsutil iam ch allUsers:objectViewer gs://<bucketname>

Summary

The gsutil command is your bread and butter when automating your Google Cloud Storage operations. The commands above will help you get your bucket up and running.

You can use GCS in conjunction with a variety of other services on GCP or another service provider.
