Getting Started w/ Java on GCP

The “Missing Tutorials” series

Daz Wilkin
Google Cloud - Community
9 min read · Sep 21, 2017


I’m writing a short series of ‘getting started’ posts for those of you who, like me, get to the point of wanting to write code against a Google service and have a language chosen but, having not written code for a week or two, are stalled by “How exactly do I get started?”

Setup

I’m running Linux, Java (openjdk-8-jdk) and Apache Maven (3.5.0).

I’m going to use Maven as it’s the Java build tool that I’m most familiar with. Google provides tool integration with Maven and Gradle for, for example, App Engine. You may, of course, manage the packages manually too.

Let’s set up our environment:

PROJECT_ID=[[YOUR-PROJECT-ID]]
LANGUAGE_DIR=java  # not LANG, which the shell uses for its locale
mkdir -p ${HOME}/${PROJECT_ID}/${LANGUAGE_DIR}
cd ${HOME}/${PROJECT_ID}/${LANGUAGE_DIR}

Google Java Libraries

In the foundational post “Starting w/ Google Cloud Platform APIs”, I summarized the two kinds of Google-provided libraries:

API Client Libraries

The API Client Libraries are decomposed into a core package (google-api-client) and then service-specific packages. We’ll use Google Cloud Storage (GCS) in this post and so we’ll need to reference the GCS JSON API package (google-api-services-storage) too.

Google

https://developers.google.com/api-client-library/java/
https://developers.google.com/api-client-library/java/apis/

Maven

https://mvnrepository.com/artifact/com.google.api-client/google-api-client
https://mvnrepository.com/artifact/com.google.apis/google-api-services-storage

The current library is 1.22.0 and mvnrepository generates the XML for the pom.xml file:

<dependency>
<groupId>com.google.api-client</groupId>
<artifactId>google-api-client</artifactId>
<version>1.22.0</version>
</dependency>
<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-storage</artifactId>
<version>v1-rev111-1.18.0-rc</version>
</dependency>

GitHub

https://github.com/google/google-api-java-client

Cloud Client Libraries

NB Cloud Client Libraries are released on a cycle that lags the underlying service. A service may be GA but the Cloud Client Library *may* be Alpha.

Google

https://cloud.google.com/apis/docs/cloud-client-libraries

Maven

https://mvnrepository.com/artifact/com.google.cloud/google-cloud-core

The current core library is 1.6.0 and the Google Cloud Storage library is 1.6.0 too:

<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-core</artifactId>
<version>1.6.0</version>
</dependency>
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-storage</artifactId>
<version>1.6.0</version>
</dependency>

GitHub

https://googlecloudplatform.github.io/google-cloud-java/0.24.0/index.html
https://github.com/GoogleCloudPlatform/google-cloud-java

API documentation:

https://googlecloudplatform.github.io/google-cloud-java/0.24.0/apidocs/index.html

Alright, let’s write some code.

Google Cloud Storage

We’ll use Google Cloud Storage for this example.

You may use any of Google’s services with the API Client Libraries because all of them are supported by the Libraries.

Many of Google’s services are supported by Cloud Client Libraries, although these are released on a different cadence to the underlying services. The Cloud Client Libraries page linked above lists them and shows whether each Library is Alpha, Beta or GA.

Pick an arbitrary (small) file that you don’t mind replicating several times on a GCS bucket and:

BUCKET=$(whoami)-$(date +%y%m%d%H%M)
FILE=[[/Path/To/Your/File]]
gsutil mb -p ${PROJECT_ID} gs://${BUCKET}
Creating gs://${BUCKET}/...
for i in $(seq -f "%02g" 1 10)
do
gsutil cp $FILE gs://${BUCKET}/${i}
done
gsutil ls gs://${BUCKET}
gs://${BUCKET}/01
gs://${BUCKET}/02
gs://${BUCKET}/03
gs://${BUCKET}/04
gs://${BUCKET}/05
gs://${BUCKET}/06
gs://${BUCKET}/07
gs://${BUCKET}/08
gs://${BUCKET}/09
gs://${BUCKET}/10

Application Default Credentials (ADCs)

ADCs are a simple way to get auth credentials when using Google APIs, with the added benefit that code is portable from your local workstation to App Engine and Compute Engine. If you can, always use ADCs.

https://developers.google.com/identity/protocols/application-default-credentials
https://developers.google.com/identity/protocols/application-default-credentials#callingjava
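To make the portability claim a little more concrete, here is a rough sketch of the order in which Application Default Credentials are usually looked up: first the GOOGLE_APPLICATION_CREDENTIALS environment variable, then the file written by `gcloud auth application-default login`, and finally the platform's built-in service account. It uses only the JDK and calls no Google library. The class `AdcLookupSketch` and its method are hypothetical names for illustration, and the real search (performed inside the Google libraries) handles more cases than this.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical helper: illustrates the ADC search order only; it loads no credentials.
class AdcLookupSketch {

    static String describeSource(Map<String, String> env, Path home) {
        // 1. An explicit service account key file named by the environment variable.
        String keyFile = env.get("GOOGLE_APPLICATION_CREDENTIALS");
        if (keyFile != null && !keyFile.isEmpty()) {
            return "key file: " + keyFile;
        }
        // 2. The file written by `gcloud auth application-default login` (Linux/macOS path).
        Path wellKnown = home.resolve(".config").resolve("gcloud")
                .resolve("application_default_credentials.json");
        if (Files.exists(wellKnown)) {
            return "gcloud file: " + wellKnown;
        }
        // 3. Otherwise the libraries fall back to the platform's built-in service account.
        return "platform default (e.g. Compute Engine metadata server)";
    }

    public static void main(String[] args) {
        Path home = Paths.get(System.getProperty("user.home"));
        System.out.println(describeSource(System.getenv(), home));
    }
}
```

Running this on your workstation and then on a Compute Engine instance should report different sources, which is the reason the same application code can run unchanged in both places.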

As with Google’s other Java assets, ADCs are available from mvnrepository:

https://mvnrepository.com/artifact/com.google.apis/google-api-services-oauth2

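You’ll need to reference ADCs as a dependency in your Maven pom.xml (NB the GoogleCredential class used in solution #1 itself ships in the core google-api-client package, so this extra package may not be strictly required):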

<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-oauth2</artifactId>
<version>v2-rev129-1.22.0</version>
</dependency>

Solution #1: Using API Client Libraries

In truth, customarily, I manually create the directories for Java projects but I Googled the correct way to do this using Maven:

mvn archetype:generate \
--batch-mode \
--define archetypeGroupId=org.apache.maven.archetypes \
--define groupId=com.google.dazwilkin \
--define artifactId=api

This should generate a directory called “api” containing a default pom.xml and a “src” directory tree containing template Java files.

Using your preferred Editor (or IDE), open pom.xml:

<project ...>
<modelVersion>4.0.0</modelVersion>
<groupId>com.google.dazwilkin</groupId>
<artifactId>api</artifactId>
<packaging>jar</packaging>
<version>1.0-SNAPSHOT</version>
<name>api</name>
<url>http://maven.apache.org</url>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
</dependencies>
</project>

Add the API Client Library dependency and the dependency so that you may use Application Default Credentials. Save the file. The result should resemble:

<project>
<modelVersion>4.0.0</modelVersion>
<groupId>com.google.dazwilkin</groupId>
<artifactId>api</artifactId>
<packaging>jar</packaging>
<version>1.0-SNAPSHOT</version>
<name>api</name>
<url>http://maven.apache.org</url>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<!-- Google API Client Library -->
<dependency>
<groupId>com.google.api-client</groupId>
<artifactId>google-api-client</artifactId>
<version>1.22.0</version>
</dependency>
<!-- API Client Library: Storage -->
<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-storage</artifactId>
<version>v1-rev111-1.18.0-rc</version>
</dependency>
<!-- Application Default Credentials -->
<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-oauth2</artifactId>
<version>v2-rev129-1.22.0</version>
</dependency>
</dependencies>
</project>

Then open the generated App.java:

package com.google.dazwilkin;

public class App
{
    public static void main( String[] args )
    {
        System.out.println( "Hello World!" );
    }
}

Optional: I recommend renaming App.java to CloudStorage.java and renaming the class App to CloudStorage too (and likewise AppTest to CloudStorageTest, which is the name the test output below shows).

Let’s do this piece by piece.

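There’s some boilerplate code that we’ll copy from the Application Default Credentials Java sample and, while Google doesn’t explicitly document this, you can see that the samples consistently instantiate a Storage service like this:

import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.storage.Storage;

// Fields of the class (APPLICATION_NAME is a String constant that you define):
private static HttpTransport httpTransport;
private static final JsonFactory JSON_FACTORY = JacksonFactory.getDefaultInstance();

// Inside main, which must declare or handle the checked exceptions thrown here:
httpTransport = GoogleNetHttpTransport.newTrustedTransport();
GoogleCredential credential = GoogleCredential.getApplicationDefault();
Storage storage = new Storage.Builder(
    httpTransport,
    JSON_FACTORY,
    credential
).setApplicationName(
    APPLICATION_NAME
).build();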

The good news is that this boilerplate applies to all Google’s services, so you’ll see and can use this pattern for any of the other services.

The API Client Libraries employ a consistent pattern to reference service resources and make method calls against these:

service.resources().verb().execute()

You’ll see in the code below that we employ this pattern twice: once to get the project’s buckets and once to get a bucket’s objects:

storage.buckets().list().execute()
storage.objects().list().execute()
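To illustrate the shape of this pattern without any Google dependency, here is a toy, in-memory sketch. `ToyStorage` and everything inside it are hypothetical names and not the Google library; the point is only that resources() returns a handle on a collection, verb() merely builds a request object, and nothing is "called" until execute().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a generated service class; not the Google library.
class ToyStorage {
    private final Map<String, List<String>> bucketsByProject;
    int calls = 0; // counts simulated API calls

    ToyStorage(Map<String, List<String>> bucketsByProject) {
        this.bucketsByProject = bucketsByProject;
    }

    // resources(): returns a handle for one resource collection.
    Buckets buckets() {
        return new Buckets();
    }

    class Buckets {
        // verb(): only builds a request object; no call is made yet.
        ListRequest list(String projectId) {
            return new ListRequest(projectId);
        }
    }

    class ListRequest {
        private final String projectId;

        ListRequest(String projectId) {
            this.projectId = projectId;
        }

        // execute(): the point at which the (simulated) call happens.
        List<String> execute() {
            calls++;
            List<String> names = bucketsByProject.get(projectId);
            return names != null ? names : Collections.<String>emptyList();
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> data = new HashMap<>();
        data.put("my-project", Arrays.asList("bucket-a", "bucket-b"));
        ToyStorage storage = new ToyStorage(data);

        ToyStorage.ListRequest request = storage.buckets().list("my-project");
        System.out.println("calls before execute(): " + storage.calls);
        System.out.println(request.execute());
        System.out.println("calls after execute(): " + storage.calls);
    }
}
```

Separating the request object from its execution is what lets the real code further down hold on to a request, adjust it (for example, with a page token) and execute it again.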

As with the other examples in this series, while it’s possible to understand the API by reviewing the API documentation, I find it easiest to replicate my intent with API Explorer and then encode that.

So, for example, to enumerate the buckets in a project, it’s possible to eyeball the API Explorer entries and see that storage.buckets.list “Retrieves a list of buckets for a given project”

Cloud Storage JSON API v1

Then, creating a call with storage.buckets.list returns the project’s buckets as expected. NB While I don’t show the bucket’s details in this screenshot, the 200 indicates the success of the call:

As you become familiar with the API Client Libraries, you’ll find it easier to stamp out code against different services because, while the code is not necessarily idiomatic, it’s consistent. Learn how to code against one service and you’ll be able to code against any of them.

So, following on and, in truth, taking a quick gander at the Java documentation:

https://developers.google.com/resources/api-libraries/documentation/storage/v1/java/latest/

It’s relatively straightforward to get to the following code:

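import com.google.api.services.storage.model.Bucket;
import com.google.api.services.storage.model.Buckets;
import java.util.List;

Storage.Buckets.List bucketsList = storage.buckets().list(PROJECT_ID);
Buckets buckets;
do {
    // Fetch one page of buckets
    buckets = bucketsList.execute();
    List<Bucket> items = buckets.getItems();
    if (items != null) {
        for (Bucket bucket : items) {
            System.out.println(bucket.getName());
        }
    }
    // Ask for the next page on the following iteration
    bucketsList.setPageToken(buckets.getNextPageToken());
} while (buckets.getNextPageToken() != null);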

The most complicated aspect of this code is the paging ;-) For performance reasons and because the enumeration could be huge, the best practice implemented by the API is to provide the bucket list in pages. To enumerate the entire list, it’s necessary to retrieve each page, then get the token for the next page, pass this back to the API and re-execute the API call until no token is returned.
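The page-token loop can be factored into a small reusable helper. The following is a minimal sketch that runs against a fake, in-memory paged source rather than the real API; `PagingSketch`, `Page`, `listAll` and `fakeApi` are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class PagingSketch {

    // One page of results plus the token for the next page (null on the last page).
    static final class Page<T> {
        final List<T> items;
        final String nextPageToken;

        Page(List<T> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    // Calls fetchPage repeatedly, passing each page's token back, until no token remains.
    static <T> List<T> listAll(Function<String, Page<T>> fetchPage) {
        List<T> all = new ArrayList<>();
        String token = null; // null requests the first page
        do {
            Page<T> page = fetchPage.apply(token);
            if (page.items != null) { // tolerate a page without items
                all.addAll(page.items);
            }
            token = page.nextPageToken;
        } while (token != null);
        return all;
    }

    // Fake API: serves names in pages of pageSize, using the start index as the token.
    static Function<String, Page<String>> fakeApi(List<String> names, int pageSize) {
        return token -> {
            int start = (token == null) ? 0 : Integer.parseInt(token);
            int end = Math.min(start + pageSize, names.size());
            String next = (end < names.size()) ? String.valueOf(end) : null;
            return new Page<>(new ArrayList<>(names.subList(start, end)), next);
        };
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("01", "02", "03", "04", "05");
        // Three calls are made behind the scenes: pages of 2, 2 and 1 items.
        System.out.println(listAll(fakeApi(names, 2)));
    }
}
```

With the real library, the function passed to `listAll` would set the page token on the request and call execute(), converting the response into a `Page`.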

You’ll be reassured to see that the code to enumerate the objects within a bucket is very similar in structure to the bucket enumeration code:

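import com.google.api.services.storage.model.Objects;
import com.google.api.services.storage.model.StorageObject;
import java.util.List;

Storage.Objects.List objectsList = storage.objects().list(BUCKET_NAME);
Objects objects;
do {
    // Fetch one page of objects
    objects = objectsList.execute();
    List<StorageObject> items = objects.getItems();
    if (items != null) {
        for (StorageObject object : items) {
            System.out.println(object.getName());
        }
    }
    // Ask for the next page on the following iteration
    objectsList.setPageToken(objects.getNextPageToken());
} while (objects.getNextPageToken() != null);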

Once again, the complexity is mostly in the loop that repeatedly executes the API call until there are no more pages of objects to retrieve. Returning to the API Explorer, this method implements storage.objects.list which “Retrieves a list of objects matching the criteria”:

Putting these together and adding in the many import statements that you’ll need:

The pom.xml:

The CloudStorage.java:

There are likely better ways to build the solution and run it but my approach is:

mvn clean install --quiet
-------------------------------------------------------
T E S T S
-------------------------------------------------------
Running com.google.dazwilkin.CloudStorageTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.028 sec
Results :
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

and then:

mvn exec:java \
--define exec.mainClass="com.google.dazwilkin.CloudStorage" \
--quiet
[[BUCKET]]
01
02
03
04
05
06
07
08
09
10

Solution #2: Using Cloud Client Libraries

As with solution #1, let’s use Maven to generate the project structure for us:

mvn archetype:generate \
--batch-mode \
--define archetypeGroupId=org.apache.maven.archetypes \
--define groupId=com.google.dazwilkin \
--define artifactId=cld

Open pom.xml:

<project>
<modelVersion>4.0.0</modelVersion>
<groupId>com.google.dazwilkin</groupId>
<artifactId>cld</artifactId>
<packaging>jar</packaging>
<version>1.0-SNAPSHOT</version>
<name>cld</name>
<url>http://maven.apache.org</url>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
</dependencies>
</project>

This time we’ll add the Cloud Client Library packages for the core and for Cloud Storage as well as the Application Default Credentials package. Your pom.xml should resemble this:

<project>
<modelVersion>4.0.0</modelVersion>
<groupId>com.google.dazwilkin</groupId>
<artifactId>cld</artifactId>
<packaging>jar</packaging>
<version>1.0-SNAPSHOT</version>
<name>cld</name>
<url>http://maven.apache.org</url>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<!-- Google Cloud Core -->
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-core</artifactId>
<version>1.6.0</version>
</dependency>
<!-- Google Cloud Storage -->
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-storage</artifactId>
<version>1.6.0</version>
</dependency>
<!-- Application Default Credentials -->
<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-oauth2</artifactId>
<version>v2-rev129-1.22.0</version>
</dependency>
</dependencies>
</project>

Then open the generated App.java:

package com.google.dazwilkin;

public class App
{
    public static void main( String[] args )
    {
        System.out.println( "Hello World!" );
    }
}

Optional: I recommend renaming App.java to CloudStorage.java and renaming the class App to CloudStorage too (and likewise AppTest to CloudStorageTest).

OK…. Let the fun begin.

It happens that the Google Cloud documentation now generally uses Cloud Client Libraries samples (rather than the earlier API Client Libraries) and so there are a few samples that help us get started with Cloud Storage:

https://github.com/GoogleCloudPlatform/google-cloud-java/tree/master/google-cloud-storage
https://cloud.google.com/storage/docs/reference/libraries#client-libraries-usage-java

But, honestly, as a Java dilettante, being left with these lightweight samples and the API documentation is far from sufficient for me to get going. So, I found this harder than the API Client Library solution (where the consistency with other languages and the ability to use Google API Explorer are very helpful).

Here’s the API documentation:

https://googlecloudplatform.github.io/google-cloud-java/0.24.0/apidocs/index.html
https://googlecloudplatform.github.io/google-cloud-java/0.24.0/apidocs/com/google/cloud/storage/package-summary.html

Absent samples, it’s unclear to me from these API docs how to instantiate the service, how I use Bucket|BucketInfo and Blob|BlobInfo, what methods these support etc. :-( How do other folks do this?

So… plagiarizing…

The GitHub documentation does a decent job explaining the auth flow:

https://github.com/GoogleCloudPlatform/google-cloud-java#authentication

I found that the following code is sufficient when using ADCs and running either locally (off GCP) or on a Compute Engine instance:

Storage storage = StorageOptions.getDefaultInstance().getService();

The GitHub documentation provides an example of enumerating a project’s buckets and then the objects within each bucket:

https://github.com/GoogleCloudPlatform/google-cloud-java/tree/master/google-cloud-storage#listing-buckets-and-contents-of-buckets

I tweaked this slightly to mirror the code in solution #1 which enumerates all the buckets and then enumerates the objects in a specific bucket:

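import com.google.cloud.storage.Blob;
import com.google.cloud.storage.Bucket;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

// Enumerate all the buckets in the project
for (Bucket bucket : storage.list().iterateAll()) {
    System.out.println(bucket.getName());
}
// Enumerate the objects in one specific bucket
Bucket bucket = storage.get(BUCKET_NAME);
for (Blob blob : bucket.list().iterateAll()) {
    System.out.println(blob.getName());
}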

Honestly, I found the example of applying iterateAll to the list method only as I’m writing this sample up now:

https://googlecloudplatform.github.io/google-cloud-java/0.24.0/apidocs/com/google/cloud/storage/Storage.html#list-java.lang.String-com.google.cloud.storage.Storage.BlobListOption...-

but I’m using:

for (Bucket bucket : storage.list().iterateAll()) {
    System.out.println(bucket.getName());
}

And then, for the objects in a specific bucket (BUCKET_NAME):

Bucket bucket = storage.get(BUCKET_NAME);
for (Blob blob : bucket.list().iterateAll()) {
    System.out.println(blob.getName());
}
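Notice that there is no page-token loop here: iterateAll takes care of the paging for us. As a conceptual sketch only (this is not the library's implementation, and `IterateAllSketch`, its `Page` interface and the `pages` helper are hypothetical names), an Iterable can hide paging by fetching the next page only when the current one is exhausted:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class IterateAllSketch {

    // Hypothetical page abstraction: some values and, possibly, a following page.
    interface Page<T> {
        List<T> values();
        Page<T> nextPage(); // null when this is the last page
    }

    // Returns an Iterable that walks every page, fetching the next one only when needed.
    static <T> Iterable<T> iterateAll(Page<T> first) {
        return () -> new Iterator<T>() {
            private Page<T> page = first;
            private Iterator<T> current = first.values().iterator();

            @Override
            public boolean hasNext() {
                // Move past exhausted (or empty) pages until an item or the end is found.
                while (!current.hasNext() && page != null) {
                    page = page.nextPage();
                    if (page != null) {
                        current = page.values().iterator();
                    }
                }
                return current.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    // Builds a chain of in-memory pages, one per chunk, starting at the given index.
    static <T> Page<T> pages(List<List<T>> chunks, int index) {
        return new Page<T>() {
            @Override
            public List<T> values() {
                return chunks.get(index);
            }

            @Override
            public Page<T> nextPage() {
                return (index + 1 < chunks.size()) ? pages(chunks, index + 1) : null;
            }
        };
    }

    public static void main(String[] args) {
        List<List<String>> chunks = Arrays.asList(
                Arrays.asList("01", "02"), Arrays.asList("03"));
        for (String name : iterateAll(pages(chunks, 0))) {
            System.out.println(name);
        }
    }
}
```

The calling code is a plain for-each loop, which is why the Cloud Client Library version above is so much shorter than the explicit loop in solution #1.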

Putting this together with the pom.xml and the imports you’ll need:

and:

And, as before:

mvn clean install --quiet
-------------------------------------------------------
T E S T S
-------------------------------------------------------
Running com.google.dazwilkin.CloudStorageTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.007 sec
Results :
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

and then:

mvn exec:java \
--define exec.mainClass="com.google.dazwilkin.CloudStorage" \
--quiet
dazwilkin-1708301815
01
02
03
04
05
06
07
08
09
10

Tidy-up

You may delete the bucket (and its objects) when you’re done. Be *very* careful that you specify the correct bucket when you perform this delete. It deletes all the objects and then the bucket:

gsutil rm -r gs://${BUCKET}

Conclusion

The API Client Libraries for Java are straightforward. Once you’re familiar with coding against one service, you’re familiar with how to code against any Google service. The ability to use API Explorer both to introspect services and their methods and to test the APIs makes it a boon when developing with the API Client Libraries.

As for the Cloud Client Libraries for Java, I feel almost as unfamiliar with them as when I started this post :-( The samples provided on cloud.google.com for each service and the GitHub examples are good but insufficient. I hope that this presents a challenge only to me but I worry that other developers may struggle to build against these Libraries too.
