Three strategies for accessing Google Cloud Storage from PHP

Joor Loohuis
Oct 20 · 6 min read

Google Cloud Storage provides flexible object storage that can be accessed from PHP in different ways. Each approach has its benefits and downsides.


Google Cloud Storage (GCS) is a service that provides cheap and massive object storage, where objects are simply chunks of data such as file contents. This makes GCS suitable for applications like backups, web hosting, media file storage, content delivery networks, data lakes, etc. Being able to access GCS from PHP applications and scripts creates a lot of possibilities for companies and organisations that rely on a PHP code base and knowledge.

I’m not going into the specifics of managing GCS itself. There is abundant information on the subject available from various sources, principally from Google itself; https://cloud.google.com/storage/ is a good starting point. Rather, I’ll describe a few strategies for using GCS from PHP, each with its own benefits and downsides that have consequences for your application.

I’ll assume you have a Google Cloud Project at your disposal, with GCS enabled. To configure authentication, generate a key for a service account with sufficient permissions for GCS. This can be done on the ‘Service accounts’ page below the ‘IAM & admin’ menu item of the Google Cloud Console. In the ‘Actions’ menu of each service account you will find the ‘Create key’ option. Save the generated JSON file on the system that you will be accessing GCS from. Finally, store the full path to the key file in the environment variable GOOGLE_APPLICATION_CREDENTIALS. Make sure the file is readable by the user that is executing the code accessing GCS.
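Since a missing or unreadable key file is an easy mistake to make, a small sanity check in your bootstrapping code can help. The following is a minimal sketch, not part of any Google package: the function name findKeyFileProblems is my own, and the check assumes a service account key file, which contains client_email and private_key fields.

```php
<?php
declare(strict_types=1);

/**
 * Return a list of problems with the key file; an empty list means it looks usable.
 * Hypothetical helper: this only inspects the file, it does not contact Google.
 */
function findKeyFileProblems(?string $path): array
{
    if ($path === null || $path === '') {
        return ['GOOGLE_APPLICATION_CREDENTIALS is not set'];
    }
    if (!is_file($path)) {
        return ["Key file not found: $path"];
    }
    if (!is_readable($path)) {
        return ["Key file is not readable by the current user: $path"];
    }
    // Service account keys are JSON documents with these two fields.
    $key = json_decode((string) file_get_contents($path), true);
    if (!is_array($key) || !isset($key['client_email'], $key['private_key'])) {
        return ["Key file does not look like a service account key: $path"];
    }
    return [];
}

// getenv() returns false when the variable is not defined.
$envPath = getenv('GOOGLE_APPLICATION_CREDENTIALS');
foreach (findKeyFileProblems($envPath === false ? null : $envPath) as $problem) {
    error_log($problem);
}
```

Run this as the same user that runs your application, otherwise the readability check tells you little.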

Strategy 1: mount a GCS bucket in the file system

The easiest and least invasive way of accessing GCS is to mount a bucket in the file system. This approach is not specific to PHP; it is language independent. On Linux and macOS systems, Google offers a utility called gcsfuse, a FUSE adapter that exposes GCS as a file system. FUSE (Filesystem in Userspace) allows you to mount a file system as a regular user without elevated permissions. In its simplest form, execute the following from the command line:

$ gcsfuse mybucket ~/bucket

This mounts the GCS bucket mybucket using the directory bucket in the home directory of the user executing the command, from where it can be accessed as a regular directory, including from PHP code. Of course both the bucket and the mount point need to exist beforehand. The bucket can also be mounted system-wide. To achieve that, add a line to /etc/fstab:

mybucket /mnt/bucket gcsfuse rw,allow_other,uid=1001,gid=1001

This will mount the bucket with ownership set to the specified user and group, while still being accessible to other users. There is also a key_file mount option to specify the location of the authentication key file, which obviates the need for the GOOGLE_APPLICATION_CREDENTIALS environment variable.
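Once the bucket is mounted, PHP needs nothing beyond its regular file functions. Here is a minimal sketch of writing a file below the mount point. The function name storeInMountedBucket and the paths in the usage comment are illustrative assumptions, and nothing in the code is specific to gcsfuse, which is precisely the point of this strategy.

```php
<?php
declare(strict_types=1);

/**
 * Write a string to a file below the mount point and return the number of bytes written.
 * Hypothetical helper; works on any directory, mounted or not.
 */
function storeInMountedBucket(string $mountPoint, string $relativePath, string $contents): int
{
    $target = rtrim($mountPoint, '/') . '/' . ltrim($relativePath, '/');
    $directory = dirname($target);

    // Create intermediate directories if they are missing.
    if (!is_dir($directory) && !mkdir($directory, 0775, true) && !is_dir($directory)) {
        throw new RuntimeException("Cannot create directory $directory");
    }

    // file_put_contents() returns false when the underlying I/O fails.
    $bytes = file_put_contents($target, $contents);
    if ($bytes === false) {
        throw new RuntimeException("Cannot write $target");
    }
    return $bytes;
}

// Usage, assuming the bucket is mounted on /mnt/bucket as in the fstab line:
//   storeInMountedBucket('/mnt/bucket', 'backups/2019/db.sql', $dump);
```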

There are some obvious benefits of using this strategy when accessing GCS:

  • This approach not only works for PHP code, but makes the bucket generally available to any application.
  • Existing code does not have to be modified, but merely reconfigured to start using GCS.

Of course there are some limitations that need to be considered:

  • The entire mounted file system is owned by the user and group that mounted it, or by those specified in the mount options. This may limit which applications can use it.
  • The only way of switching buckets is by mounting all buckets that need to be accessed. In general, functionality is restricted to adding and deleting objects in a single bucket.
  • File systems mounted using gcsfuse are a lot slower than local file systems. This is particularly the case for writing small files. This may become problematic for applications that do many concurrent writes or reads, as web applications tend to do.
  • Your application is shielded from the GCS API, and error handling is dependent upon what FUSE reports in terms of I/O errors.
  • Since gcsfuse uses caching, buckets mounted using gcsfuse should in general not be modified by other applications, including gcsfuse mounts on other systems. Otherwise the file system may become inconsistent.
  • Although gcsfuse is maintained by Google itself, it is considered beta, and incompatibilities may be introduced at any time.
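One practical consequence of being shielded from the GCS API: if the bucket is not mounted, the mount point is just an empty local directory, and your code will happily write to local disk instead. A hedged sketch of a guard against that follows. It parses a mount table in the format of /proc/mounts, so it is Linux only, and the function name isMountPoint is my own invention.

```php
<?php
declare(strict_types=1);

/**
 * Check whether a directory appears as a mount point in a mount table.
 * $mountTable is text in /proc/mounts format: device, mount point, type, options, ...
 * Hypothetical helper; paths containing spaces are escaped in /proc/mounts
 * and are not handled here.
 */
function isMountPoint(string $directory, string $mountTable): bool
{
    $wanted = rtrim($directory, '/');
    foreach (explode("\n", $mountTable) as $line) {
        $fields = preg_split('/\s+/', trim($line));
        // The second field of each line is the mount point.
        if (count($fields) >= 2 && rtrim($fields[1], '/') === $wanted) {
            return true;
        }
    }
    return false;
}

// Usage on a Linux system, before writing anything:
//   if (!isMountPoint('/mnt/bucket', (string) file_get_contents('/proc/mounts'))) {
//       throw new RuntimeException('Bucket is not mounted');
//   }
```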

Full documentation for gcsfuse can be found in the gcsfuse GitHub repository. For the sake of completeness, there are some third-party tools around that provide the same functionality for Microsoft Windows systems, but I haven’t evaluated these.

Strategy 2: use PHP stream wrappers

Streams have been part of PHP for a long time now. They are a generalization of file, network and other data stream operations, and they use so-called wrappers to implement specific protocols and encodings. If you ever wondered why you could do a file_get_contents on a URL, wonder no more. HTTP is just another protocol implemented in a stream wrapper. The PHP package that Google provides for GCS access includes a stream wrapper that can be registered, allowing GCS access through regular PHP stream functions like fopen or file_get_contents.
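To make this a bit more tangible, here is a small sketch using only built-in PHP functions. It lists the wrappers known to your PHP installation, and shows that a plain path and an explicit file:// URL end up in the same place. The variable names are mine; the gs protocol will only appear in the list after the GCS wrapper has been registered.

```php
<?php
declare(strict_types=1);

// List the protocols this PHP installation can currently handle.
$wrappers = stream_get_wrappers();
echo implode(', ', $wrappers), PHP_EOL;

// A plain path and an explicit file:// URL go through the same wrapper.
$path = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($path, 'hello streams');
$viaPlainPath = file_get_contents($path);
$viaFileUrl = file_get_contents('file://' . $path);
unlink($path);

// 'gs' only shows up once the GCS stream wrapper has been registered.
$gsAvailable = in_array('gs', $wrappers, true);
```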

To add the GCS PHP package to your application, use Composer:

$ composer require google/cloud-storage

If your application will also be accessing other Google Cloud services, you might consider installing all PHP packages for it into your application:

$ composer require google/cloud

Now register the stream wrapper in your bootstrapping code:

require 'vendor/autoload.php';

use Google\Cloud\Storage\StorageClient;

$storage = new StorageClient([
    // optionally specify the key file path here
    'keyFilePath' => '/etc/auth/gcp/gcs-key.json',
]);
$storage->registerStreamWrapper();

This adds support for the gs protocol, which means that file I/O can be done using URLs of the form gs://mybucket/path/to/file. The bucket name is the first element of the path. I’ve configured the location of the GCS key file using a configuration option, which makes it unnecessary to declare the GOOGLE_APPLICATION_CREDENTIALS environment variable. The location in /etc is a personal preference; the key file may be placed anywhere in the file system.
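With the wrapper registered, existing file code mostly needs a different base path. The sketch below assumes your code already builds paths from a configurable base; the function names saveDocument and loadDocument and the documents path are hypothetical. It runs against a local directory as written, and should work the same way with a gs:// base path once the wrapper is registered.

```php
<?php
declare(strict_types=1);

/** Write a document below the base path and return the full path or URL used. */
function saveDocument(string $basePath, string $name, string $contents): string
{
    $url = rtrim($basePath, '/') . '/' . $name;
    if (file_put_contents($url, $contents) === false) {
        throw new RuntimeException("Could not write $url");
    }
    return $url;
}

/** Read a document from below the base path. */
function loadDocument(string $basePath, string $name): string
{
    $url = rtrim($basePath, '/') . '/' . $name;
    $contents = file_get_contents($url);
    if ($contents === false) {
        throw new RuntimeException("Could not read $url");
    }
    return $contents;
}

// With the stream wrapper registered, the base path could be a bucket URL:
//   saveDocument('gs://mybucket/documents', 'hello.txt', 'Hello GCS');
// The same code runs unchanged against a local directory:
$basePath = sys_get_temp_dir();
$url = saveDocument($basePath, 'hello.txt', 'Hello GCS');
echo loadDocument($basePath, 'hello.txt'), PHP_EOL;
```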

The advantages of using stream wrappers are:

  • In a similar way to using gcsfuse, the changes to an existing code base are minimal. Besides registering the wrapper, only base paths need to be altered.
  • All buckets that the GCS service account has access to are available through the wrapper by means of the path.
  • The PHP packages for Google Cloud are provided by Google itself, and are considered stable and generally available. This means no incompatibilities should be introduced in minor or patch releases, and issues will be addressed with priority.

Of course there also are drawbacks:

  • I/O using stream wrappers that implement network protocols is generally significantly slower than I/O to local file systems.
  • As with gcsfuse, error handling is limited to I/O errors, and does not give insight into what actually occurs in the GCS API.
  • Using stream wrappers is limited to PHP, so if you require access to GCS from outside PHP, you need to implement that too.
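To illustrate what error handling looks like at this level: stream functions report failure by returning false and raising a PHP warning. A common way to make that manageable is to convert the warning into an exception. The sketch below does this for any stream URL. The function name readStream is my own, and what the warning text contains for a gs:// URL is something I haven’t verified, so treat the message as opaque.

```php
<?php
declare(strict_types=1);

/**
 * Read a stream URL, turning PHP warnings into an ErrorException.
 * Hypothetical helper; works for local paths and any registered wrapper.
 */
function readStream(string $url): string
{
    // Temporarily convert warnings raised by the stream layer into exceptions.
    set_error_handler(static function (int $severity, string $message): bool {
        throw new ErrorException($message, 0, $severity);
    });
    try {
        $contents = file_get_contents($url);
    } finally {
        // Always restore the previous handler, even when an exception is thrown.
        restore_error_handler();
    }
    if ($contents === false) {
        throw new RuntimeException("Could not read $url");
    }
    return $contents;
}
```

All you get is a message string, which is the limitation described in the list above.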

Strategy 3: use the GCS API

For optimal control when interacting with GCS, use the full GCS API through the client. Install the google/cloud-storage package or the entire google/cloud package using Composer as described earlier, and make sure the classes are loaded in the bootstrapping procedure of your application. Instantiate a client as is done in the example above, and you’re set. Since the client implements the complete API, there are no limits to what you can achieve. Useful examples can be found in the Cloud Storage section of the Google Cloud Platform PHP docs samples GitHub repository. The Google Cloud Storage PHP API documentation is also very useful.
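To give an impression of what working with the client looks like, here is a minimal sketch of three basic operations: uploading a string, listing object names, and downloading an object. The function names are hypothetical helpers of mine; the calls on the client (bucket, upload, objects, object, downloadAsString) are from the google/cloud-storage package. The functions expect a StorageClient instance created as in the earlier example.

```php
<?php
declare(strict_types=1);

// Assumes Composer's autoloader was loaded during bootstrapping:
// require 'vendor/autoload.php';

use Google\Cloud\Storage\StorageClient;

/** Store a string as an object in the given bucket. */
function uploadText(StorageClient $storage, string $bucketName, string $objectName, string $text): void
{
    // When uploading a string, the object name has to be given explicitly.
    $storage->bucket($bucketName)->upload($text, ['name' => $objectName]);
}

/** Return the names of all objects in the bucket that start with the prefix. */
function listObjectNames(StorageClient $storage, string $bucketName, string $prefix = ''): array
{
    $names = [];
    foreach ($storage->bucket($bucketName)->objects(['prefix' => $prefix]) as $object) {
        $names[] = $object->name();
    }
    return $names;
}

/** Fetch the contents of an object as a string. */
function downloadText(StorageClient $storage, string $bucketName, string $objectName): string
{
    return $storage->bucket($bucketName)->object($objectName)->downloadAsString();
}

// Usage, with $storage being a configured StorageClient:
//   uploadText($storage, 'mybucket', 'notes/hello.txt', 'Hello GCS');
//   print_r(listObjectNames($storage, 'mybucket', 'notes/'));
```

Note that there are no directories here, only object names that happen to contain slashes, which is one of the ways object storage differs from file I/O.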

The benefits of implementing a GCS API client are:

  • The client provides access to the full API, giving you the best control over buckets and objects, including access control. It goes well beyond treating GCS as file storage, to a generalized object storage.
  • Often overlooked but also very important are the detailed responses that API calls get, which makes advanced exception handling possible.
  • As said before, the code for the package providing the API client is provided by Google itself, and is considered stable and generally available.
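As an example of the exception handling this makes possible, the sketch below distinguishes a missing object from other API failures. The function name fetchObject is hypothetical; NotFoundException and ServiceException are exception classes from the google/cloud-core package that the storage package depends on. As far as I know the exception code carries the HTTP status of the API response, but treat that as an assumption.

```php
<?php
declare(strict_types=1);

// Assumes Composer's autoloader was loaded during bootstrapping:
// require 'vendor/autoload.php';

use Google\Cloud\Core\Exception\NotFoundException;
use Google\Cloud\Core\Exception\ServiceException;
use Google\Cloud\Storage\StorageClient;

/**
 * Return the contents of an object, or null if the bucket or object does not exist.
 * Other API errors are logged and rethrown.
 */
function fetchObject(StorageClient $storage, string $bucketName, string $objectName): ?string
{
    try {
        return $storage->bucket($bucketName)->object($objectName)->downloadAsString();
    } catch (NotFoundException $e) {
        // A missing object is an expected situation here, not an error.
        return null;
    } catch (ServiceException $e) {
        // Anything else (permissions, quota, server errors) is reported in detail.
        // Assumption: getCode() holds the HTTP status of the API response.
        error_log(sprintf('GCS error %d: %s', $e->getCode(), $e->getMessage()));
        throw $e;
    }
}
```

Compare this with the stream wrapper approach, where the same failures all surface as a warning and a return value of false.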

The obvious downside is the cost of implementing GCS support into an application. Object management is different from file I/O, so retrofitting an existing code base may be invasive. How much depends on the case at hand. Implementing the API in a new project may prove more involved than doing file I/O, simply because the whole point is that you typically won’t be implementing just file I/O.

In a nutshell

I’ve described three strategies for storing data in and retrieving data from Google Cloud Storage (GCS) using PHP:

  1. Use gcsfuse if you need quick and easy access to a storage bucket, but be aware that your storage may become inconsistent if other processes modify the same bucket. Also keep a close eye on system I/O.
  2. Use GCS stream wrappers for easy and noninvasive file storage in all accessible buckets. The code for this is stable and well-supported, but it still restricts you to doing file I/O.
  3. Use the GCS API if you need refined control over buckets and objects, including exception handling, at the cost of more time spent on implementation and testing.

Written by Joor Loohuis, lead software architect at fonQ, a fairly large e-commerce enterprise.
