The Bookshelf Analytics App

Abirami Sukumaran
Google Cloud - Community
9 min read · Jul 2, 2024

A Google Cloud and MongoDB Atlas (Generative AI) story…

Co-authored by Stanimira Vlaeva, Developer Advocate, MongoDB.

Introduction

Do you love digging into books but get overwhelmed by the sheer volume of choices? Imagine having an AI-powered app that not only recommends the perfect read but also offers a concise summary based on your genre of choice, giving you a glimpse into the book’s essence. In this lab, I’ll walk you through building such an app with BigQuery, Gemini, Cloud Functions and MongoDB Atlas.

Project Overview

Our use case centers around these key components:

  1. Book Database: The vast BigQuery public dataset of Internet Archive books will serve as our comprehensive book catalog.
  2. AI Summarization Engine: Google Cloud Functions, equipped with the Gemini-Pro language model, will generate insightful summaries tailored to user requests.
  3. BigQuery Integration: A remote function within BigQuery that calls our Cloud Function to deliver on-demand book summaries and themes.
  4. User Interface: A web app hosted on Cloud Run that lets users view the results.

We will divide the implementation into four parts:

Part 1: Build a Java Cloud Function for a Gemini application.

Part 2: Build SQL-only Generative AI applications with BigQuery.

Part 3: Integrate the BigQuery data with MongoDB.

Part 4: Build a Cloud Run application that takes the analytics results from MongoDB to the web.

Part 1: Build a Java Cloud Function for a Gemini application

Create a Java Cloud Functions application that uses the Gemini Pro model: it takes the prompt as input in the form of a JSON array and returns a JSON value labeled “replies”, the request/response format that BigQuery remote functions expect.
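As a reference, here is a minimal sketch of such a function, assuming the Vertex AI Java SDK and Gson are on the classpath (the package paths of the Gemini classes vary slightly across SDK versions); the class name, project ID and region are illustrative:

import com.google.cloud.functions.HttpFunction;
import com.google.cloud.functions.HttpRequest;
import com.google.cloud.functions.HttpResponse;
import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.ResponseHandler;
import com.google.gson.JsonArray;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import java.io.BufferedWriter;

// Illustrative class name; register it as the Cloud Function entry point.
public class BookSummarizer implements HttpFunction {
  @Override
  public void service(HttpRequest request, HttpResponse response) throws Exception {
    // BigQuery remote functions POST a JSON body like {"calls": [["<prompt>"], ...]}.
    JsonObject requestBody = JsonParser.parseReader(request.getReader()).getAsJsonObject();
    JsonArray calls = requestBody.getAsJsonArray("calls");

    JsonArray replies = new JsonArray();
    // <YOUR_PROJECT_ID> and the region are placeholders.
    try (VertexAI vertexAi = new VertexAI("<YOUR_PROJECT_ID>", "us-central1")) {
      GenerativeModel model = new GenerativeModel("gemini-pro", vertexAi);
      for (int i = 0; i < calls.size(); i++) {
        // Each call is a JSON array holding the function's arguments; here, one prompt string.
        String prompt = calls.get(i).getAsJsonArray().get(0).getAsString();
        GenerateContentResponse result = model.generateContent(prompt);
        replies.add(ResponseHandler.getText(result));
      }
    }

    // BigQuery expects the results back as {"replies": [...]}.
    JsonObject reply = new JsonObject();
    reply.add("replies", replies);
    response.setContentType("application/json");
    BufferedWriter writer = response.getWriter();
    writer.write(reply.toString());
  }
}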

Refer to the codelab to complete this step.

Part 2: Build SQL-only analytics with BigQuery and Generative AI

The overview of this part is as follows:

  1. Use the BigQuery public dataset of Internet Archive books to get a snapshot of the book database.
  2. Create a BigQuery dataset named “bookshelf” to hold the new table, model and function that we will create in this project.
  3. Create a remote model in BigQuery that invokes the Vertex AI text-bison-32k endpoint to identify the genre (or theme) of a book from the list of semicolon-separated keywords in the table.
  4. Create a remote function in BigQuery that invokes the generative AI Cloud Function deployed in Part 1. This function takes the prompt as input and outputs a string that summarizes the book in five lines. (A sketch of the DDL behind steps 3 and 4 follows this list.)
  5. Use the remote model and function to summarize the theme and text of a book with SQL queries and write the results to a new table in the bookshelf dataset.
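For reference, the DDL behind steps 3 and 4 looks roughly like the sketch below. The two statements can be pasted directly into the BigQuery console; here they are submitted through the same BigQuery Java client used later in this article. The connection names, the model and function names, and the Cloud Function URL are all placeholders:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.QueryJobConfiguration;

public class BookshelfSetup {
  public static void main(String[] args) throws Exception {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

    // Step 3: remote model over the Vertex AI text-bison-32k endpoint
    // (the BigQuery connection name is a placeholder).
    String createModel =
        "CREATE OR REPLACE MODEL bookshelf.llm_model "
            + "REMOTE WITH CONNECTION `us.llm-connection` "
            + "OPTIONS (ENDPOINT = 'text-bison-32k')";

    // Step 4: remote function that invokes the Cloud Function deployed in Part 1
    // (the connection name and function URL are placeholders).
    String createFunction =
        "CREATE OR REPLACE FUNCTION bookshelf.GEMINI_REMOTE_CALL(prompt STRING) RETURNS STRING "
            + "REMOTE WITH CONNECTION `us.gemini-connection` "
            + "OPTIONS (endpoint = 'https://<REGION>-<PROJECT_ID>.cloudfunctions.net/<FUNCTION_NAME>')";

    for (String ddl : new String[] {createModel, createFunction}) {
      bigquery.query(QueryJobConfiguration.newBuilder(ddl).build());
    }
  }
}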

Refer to the codelab to complete this step.

Part 3: Integrate BigQuery data with MongoDB

Set up MongoDB Database

  1. Register for a free MongoDB Atlas account.
  2. Log into your Atlas account and deploy a free M0 cluster.
  3. Complete the security quickstart by creating a database user and adding access from anywhere (0.0.0.0/0).
  4. Navigate back to your Database deployments from the left sidebar.
  5. Your database deployment is ready to use! Let’s get the connection string and use Google Cloud Dataflow to replicate the dataset from BigQuery into our operational Atlas database. Click Connect, then Drivers, and finally copy the connection string.
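The connection string has this general shape (the username, password and cluster host are placeholders specific to your deployment):

mongodb+srv://<username>:<password>@cluster0.<hash>.mongodb.net/?retryWrites=true&w=majority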

Integrate BigQuery Data with MongoDB

Data integration between BigQuery and MongoDB can be done using Dataflow jobs.

  1. Go to the Google Cloud Console, type Dataflow in the search bar on top and hit enter.
  2. Select Dataflow (Streaming Analytics Service) from the list.
  3. Click CREATE JOB FROM TEMPLATE on the Dataflow dashboard.
  4. Enter a job name of your choice, select the region (us-central1), and choose the Dataflow template “BigQuery to MongoDB”.
  5. Enter the required parameters to set the destination (MongoDB) and source (BigQuery) details, and click the RUN JOB button. The job completes in a few minutes.
  6. Navigate back to your Atlas instance and click Browse Collections. You should see the replicated book data in the bookshelf.books collection.

We have successfully integrated BigQuery data in MongoDB. Remember, this feature is bidirectional, which means we have a Dataflow job template for sending MongoDB updates to BigQuery as well.
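If you prefer the command line, the same template can be launched with gcloud. Treat this as a sketch: the template location and parameter names below reflect the template's documentation at the time of writing, and every value is a placeholder, so verify them against the console form before running:

gcloud dataflow flex-template run bookshelf-bq-to-mongo \
  --region us-central1 \
  --template-file-gcs-location gs://dataflow-templates-us-central1/latest/flex/BigQuery_to_MongoDB \
  --parameters inputTableSpec=<PROJECT_ID>:bookshelf.<RESULTS_TABLE>,mongoDbUri=<YOUR_CONNECTION_STRING>,database=bookshelf,collection=books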

Build a no-code REST API

MongoDB Atlas offers a fully managed API service integrated with your cloud database — the Data API. It allows you to easily bootstrap a no-code REST API and connect from any platform supporting HTTPS.

Open the Data API from the Services section on the left sidebar. Enable the API and generate an API key.

Make sure to copy the generated API key and store it in a safe place: once you close the dialog box, you will not be able to view the key again. After closing the dialog, also copy and save the URL endpoint for later use. This endpoint is what we will use to access our Atlas data.
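When we later call the find action against this endpoint, the matching documents come back wrapped in a JSON object along these lines (the field names follow our bookshelf table):

{ "documents": [ { "_id": "…", "Title": "…", "Theme": "…", "Summary": "…" } ] }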

Part 4: Build a Cloud Run application that takes the MongoDB data to the web

Create a Cloud Run web application that interacts with the MongoDB data and lists the books with their themes and summaries.

So far we have set up the database and analytics components for our web application. Let’s move on to the web application development part.

  1. Create the basic Java Cloud Run template

Once you have the Cloud Shell Editor open, make sure that the Google Cloud Project indicator in the bottom left corner is pointing to the active project you want to work with. If the indicators are inactive, click them, authorize, and select the Google Cloud project you want them to point to.

Once both are active, click the project name in the bottom left corner and, in the pop-up list titled “Cloud Code” that opens, scroll down to “New Application”.

In that list, select Cloud Run application. From the list that pops up, select Java.

When prompted, type the project name “bookshelf-web” instead of “helloworld” and click OK.

This is roughly the project structure you should see (from the standard Cloud Code Java template; the application entry-point file name and exact layout may vary by template version):
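bookshelf-web/
├── pom.xml
└── src/
    ├── main/
    │   ├── java/cloudcode/helloworld/
    │   │   ├── HelloWorldApplication.java
    │   │   └── web/HelloWorldController.java
    │   └── resources/templates/index.html
    └── test/
        └── java/cloudcode/helloworld/web/HelloWorldControllerTests.java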

Right now, you could already deploy the application. But that’s not why we started this: we still need to add the web application’s functionality, which is to fetch the MongoDB data, or the analytics results from the BigQuery database, and display them on the web.

2. Add dependencies to use BigQuery in the web app

In the Cloud Code editor, open the pom.xml file and, right above the </dependencies> tag, type the following prompt comment:

<!-- What maven dependency should I include to access BigQuery in the app -->

I got this result in response:

<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-bigquery</artifactId>
</dependency>

If your project does not import the Google Cloud libraries BOM (which manages dependency versions for you), add an explicit <version> tag as well. Also remember to include the following dependency to invoke the MongoDB Data API:

<dependency>
  <groupId>com.squareup.okhttp3</groupId>
  <artifactId>okhttp</artifactId>
  <version>4.9.1</version>
</dependency>

3. Update the source to bring the bookshelf data to the web

Replace the HelloWorldController.java code with the following:

package cloudcode.helloworld.web;

import java.util.UUID;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.JobId;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableResult;
import com.google.cloud.bigquery.FieldValueList;
import okhttp3.*;

@RestController
public final class HelloWorldController {

  /**
   * Endpoint that reads the books from MongoDB Atlas via the Data API.
   *
   * @return the MongoDB data results string to the web
   */
  @GetMapping("/mongo")
  public String callMongo() throws Exception {
    // Call the Atlas Data API "find" action over HTTPS.
    OkHttpClient client = new OkHttpClient().newBuilder().build();
    MediaType mediaType = MediaType.parse("application/json");
    RequestBody body = RequestBody.create(mediaType,
        "{\n \"collection\":\"books\",\n \"database\":\"bookshelf\",\n \"dataSource\":\"Cluster0\"\n}");
    Request request = new Request.Builder()
        .url("https://<<YOUR_APP_ID>>/endpoint/data/v1/action/find")
        .method("POST", body)
        .addHeader("Content-Type", "application/json")
        .addHeader("Access-Control-Request-Headers", "*")
        .addHeader("api-key", "<YOUR_API_KEY>")
        .build();
    Response response = client.newCall(request).execute();
    String responseString = response.body().string();
    System.out.println(responseString);
    return responseString;
  }

  /**
   * Endpoint for the landing page.
   *
   * @return the BigQuery analytics results string to the web
   */
  @GetMapping("/")
  public String helloWorld() throws Exception {
    /* Connect to BigQuery and write a select SQL to fetch Title, Theme and Summary fields
       from the table `bookshelf.bookshelf_theme` */
    String query =
        "SELECT Title || ' (' || IFNULL(Context, 'No more context.')"
            + " || ')' AS summary FROM bookshelf.books";

    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    QueryJobConfiguration queryConfig =
        QueryJobConfiguration.newBuilder(query)
            .setUseLegacySql(false)
            .build();
    // Create a job ID so that we can safely retry.
    JobId jobId = JobId.of(UUID.randomUUID().toString());
    Job queryJob = bigquery.create(JobInfo.newBuilder(queryConfig).setJobId(jobId).build());
    // Wait for the query to complete.
    queryJob = queryJob.waitFor();
    // Check for errors.
    if (queryJob == null) {
      throw new RuntimeException("Job no longer exists");
    } else if (queryJob.getStatus().getError() != null) {
      throw new RuntimeException(queryJob.getStatus().getError().toString());
    }
    // Get the results.
    TableResult result = queryJob.getQueryResults();
    String responseString = "";
    // Print all pages of the results.
    for (FieldValueList row : result.iterateAll()) {
      responseString += row.get("summary").getStringValue() + ". \n";
      System.out.printf("%s\n", row.get("summary").getStringValue());
    }
    return responseString;
  }
}

Ensure that you have made the following changes to the source files:

  1. In HelloWorldController.java, update @Controller to @RestController.
  2. Replace the content of the helloWorld() method to call BigQuery and execute the query that fetches the data and performs the LLM analytics (using text-bison-32k internally) to consolidate the theme for each book.
  3. Instead of returning the index view template on load, return the response as a string to the web.
  4. Update the HelloWorldControllerTests.java file to comment out the current mvc.perform(…) invocation.

4. Build and deploy the web app to the cloud

Copy the following gcloud run deploy command and run it in the Cloud Shell Terminal:

gcloud run deploy bookshelf-web \
  --source . \
  --allow-unauthenticated \
  --region us-central1

The deployment takes a few minutes, after which the app is running serverlessly on Google Cloud.

Click the URL of the deployed Cloud Run service to view the result on the web.

As you can see, the book titles and their themes (consolidated using text-bison-32k) are listed on the web, which completes our application.

Congratulations! We have successfully built, deployed and tested a Java Cloud Run web application to perform bookshelf analytics. Along the way, we built a Java Cloud Function that implements Gemini Pro and is compatible with BigQuery remote-function invocations, implemented SQL-only analytics with BigQuery and Generative AI, and took the results to the web with a simple Cloud Run web application backed by MongoDB data!
