Export Functionality via Google Cloud Functions and Java

Harsh Ghuriani
Globant
Published in
6 min readSep 30, 2022

Tech Used : Java, Gradle, GCP

Google Cloud Function (GCF) is a serverless architecture where you can build small cloud services. Services can be written down in different languages including Python, Node.js, Java, PHP, and a few others. GCF is aimed to serve one purpose, hence the service we create should not be quite large functionality. What I feel is that we should break down large functionalities into simpler & smaller services which can be written down into their own GCFs.

GCFs are triggered by different events. These events can either be HTTP triggers, event based triggers or some additional triggers. I am going to focus on the event based triggers, pub/sub trigger in particular.

Note : As of now, the maximum function execution time is 3600 seconds(60 mins) for HTTP-driven functions and 540 seconds(9 mins) for event-driven functions, after which the function times out.

Other triggers include cloud storage triggers (event based), cloud logging, cloud scheduler, cloud task and gmail.

Here I take an example of export functionality which we create as a service in our GCF. This GCF will be triggered by subscribing to an event emitted to a GCP topic.

Export Functionality

The export functionality that I create involves exporting data into an excel file. I then store this excel file to cloud storage.

This excel file can then be downloaded from storage using a pre-signed url. (I am not going to cover this part here)

We use Java in our GCF and for the excel generation I use Apache POI dependency.

Design :

End to End Export Functionality

Once the event is published from the source to the Pub/Sub topic, say export-topic, it will be subscribed by our subscription, say export-to-excel-sub, provided it passes the filters required (if any). This subscription will be a push subscription. Hence, on subscribing to the event, it will trigger our GCF.

To deploy the GCF, below command can be used :

gcloud functions deploy export-function --entry-point functions.ExportToExcel --runtime java17 --trigger-http --memory 512MB --allow-unauthenticated

where ExportToExcel is the name of the GCF’s qualified class name just similar to a SpringBoot Application class name.

The endpoint for our GCF can be fetched either from the GCF deployment logs or through this command, after deployment, on google cloud console :

gcloud functions describe export-function

Here, export-function is the name for our GCF which is registered with google cloud.

This is how an endpoint would like :

https://GCP_REGION-PROJECT_ID.cloudfunctions.net/export-function

Below is a sample of how my GCP subscription looks like with the push endpoint for GCF:

GCP subscription — export-to-excel-sub

To test the GCF locally, below command can be used to generate the endpoint:

gradle runFunction -Prun.functionTarget=functions.ExportToExcel

This provides us with a local endpoint, http://localhost:8080/ , which can be used to test the functionality.

Defining Dependencies

I have been using Gradle, so this is how the dependency part of my build.gradle looks like :

dependencies {
// Necessary to have Functions Framework API.
compileOnly 'com.google.cloud.functions:functions-framework-api:1.0.4'

// Necessary to run function locally
invoker 'com.google.cloud.functions.invoker:java-function-invoker:1.1.0'
// Needed to access Microsoft format files (Excel in our case)
implementation 'org.apache.poi:poi-ooxml:5.2.2'
// Only needed by the tests.
testImplementation 'com.google.cloud.functions:functions-framework-api:1.0.4'
testImplementation 'junit:junit:4.13.2'
testImplementation 'com.google.truth:truth:1.1.3'
testImplementation 'org.mockito:mockito-core:4.5.1'

}

To run the GCF locally, the function needs to be registered like below in build.gradle :

// Register a "runFunction" task
tasks.register("runFunction", JavaExec) {
main = 'com.google.cloud.functions.invoker.runner.Invoker'
classpath(configurations.invoker)
inputs.files(configurations.runtimeClasspath, sourceSets.main.output)
args(
'--target', project.findProperty('run.functionTarget') ?: 'functions.ExportToExcel',
'--port', project.findProperty('run.port') ?: 8080
)
doFirst {
args('--classpath', files(configurations.runtimeClasspath, sourceSets.main.output).asPath)
}
}

Creating the Source File

Inside the source directory of our project, I create our source file : ExportToExcel.java .

src/main/java/functions/ExportToExcel.java

This is how the initial file looks like :

package functions;

import com.google.cloud.functions.HttpFunction;
import com.google.cloud.functions.HttpRequest;
import com.google.cloud.functions.HttpResponse;
import java.io.BufferedWriter;
import java.io.IOException;

public class ExportToExcel implements HttpFunction {
// Cloud function to create an Excel containing some data and upload it onto cloud storage
@Override
public void service(HttpRequest request, HttpResponse response)
throws IOException {
BufferedWriter writer = response.getWriter();
writer.write("Excel Exporter");
}
}

Every Http triggered GCF implements the HttpFunction interface which is a functional interface. Hence an action needs to be provided to the service method. The service method is called to handle the Http request.

Data is read from the request and after our action is complete, the response should be written into response.

On reading the input request, I try to make a connection to the database and fetch the required data. The resultset is then processed and stored in the excel file.

Either HSSFWorkbook or XSSFWorkbook or SXSSFWorkbook class of apache poi library can be used to create an excel file.

XSSFWorkbook is slower as it keeps the all its cells in-memory till the time the excel is saved. Hence, after a point of time it runs out of the heap memory. On the other hand, SXSSFWorkbook is the streaming extension of XSSFWorkbook.

Note : SXSSFWorkbook implements a strategy that allows to write very large files without running out of memory as only a configurable portion of the rows are kept in memory at any one time.

The number of rows to be kept in-memory for SXSSFWorkbook, until they are flushed out, can be done during its object creation through an integer parameter named ‘rowAccessWindowSize’.

Assuming that I have a large resultSet fetched from the DB , below is my sample code for how to export the data into excel :

createExportFileForData method is used to populate the Excel file and upload it to GCS bucket. It returns either “EXPORTED” or “FAILED” string message to be passed on to the http response. It is the entry point to our functionality and takes the data to be exported as input.


public String createExportFileForData(ResultSet data) {
//Default file status
String file_status = "EXPORTED";
//Default_Window_Size is 100
try (SXSSFWorkbook workbook = new SXSSFWorkbook(SXSSFWorkbook.DEFAULT_WINDOW_SIZE)) {
//Sheet name as parameter
Sheet sheet = workbook.createSheet("SHEET_NAME");
createSheetHeader(workbook, sheet);
createSheetBody(workbook, sheet, data);
//To upload the file to GCS, we need byte array, hence we write the workbook to ByteArrayOutputStream.
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
workbook.write(outputStream);
uploadToBucket(outputStream.toByteArray(), "FileName.xlsx");
}
catch (Exception e) {
LOGGER.info("Failed to generate and upload file to storage");
LOGGER.warning("Error : " + e);
file_status = "FAILED";
}
return file_status;
}

createSheetBody method is used to write data into the Excel Sheet. It is called from createExportFileForData method.

    private void createSheetBody(SXSSFWorkbook workbook, Sheet sheet, ResultSet data) throws SQLException {

//Row 0 would be the header row
int rowNumber = 1;
while(data.next()) {
Row dataRow = sheet.createRow(rowNumber++);
createSheetCell(dataRow, 0, data.getObject("name", String.class));
createSheetCell(dataRow, 1, data.getObject("address", String.class));
// .
// .
// .
createSheetCell(dataRow, N, data.getObject("phone_no", String.class));
}
}

createSheetHeader method creates the header row for the data sheet. It is called from createExportFileForData method.

private void createSheetHeader(SXSSFWorkbook workbook, Sheet sheet) {
//Creating an SXSSFRow for row number 0 (index starts from 0)
Row headerRow = sheet.createRow(0);
Font headerFont = workbook.createFont();
headerFont.setBold(true);

fillHeaderData(headerRow);
}

fillHeaderData method is used to fill the header row (Column Names). It is called from createSheetHeader method.

private void fillHeaderData(Row headerRow) {
//Passing columnNo and value to the createSheetCell method
createSheetCell(headerRow, 0, "Name");
createSheetCell(headerRow, 1, "Address");
//Similar n cells
// .
// .
// .
createSheetCell(headerRow, N, "Phone Number");
}

createSheetCell method is used to create each cell of the row. It is called from createSheetHeader method.

private void createSheetCell(Row row, int columnNo, Object value) {
Cell cell = row.createCell(columnNo);
if (value == null) {
cell.setCellValue("");
} else if (value instanceof String) {
cell.setCellValue((String) value);
} else if (value instanceof Timestamp) {
cell.setCellValue(((Timestamp) value).toLocalDateTime().format(DateTimeFormatter.ISO_LOCAL_DATE_TIME));
} else if (value instanceof BigInteger) {
cell.setCellValue(((BigInteger) value).doubleValue());
} else if (value instanceof UUID) {
cell.setCellValue(value.toString());
} else if (value instanceof BigDecimal) {
cell.setCellValue(String.valueOf(value));
}
}

uploadToBucket method is called from createExportFileForData method and is used to upload my export file to the GCP bucket.

private void uploadToBucket(byte[] bytes, String fileName) {
String bucketName = "BUCKET_NAME";
//Using the google cloud lib classes to insert the byte data of the file into the blob and upload it to GCS bucket
BlobId blobId = BlobId.of(bucketName, fileName);
//BlobInfo will contain the info about in GCS
BlobInfo blobInfo = BlobInfo.newBuilder(blobId).build();
Storage storage = StorageOptions.newBuilder().setProjectId("PROJECT_NAME").build().getService();
storage.create(blobInfo, bytes);
}

Conclusion

GCFs, serving Function as a service(FaaS), are helpful when a functionality is to be performed in response to event triggers. GCFs are extremely useful in case of small requests aiming to fulfil single action/ purpose.

More languages and triggers are being continuously added to Google Cloud Functions.

--

--