Automate PDF Data Extraction using Gemini 1.5 Flash and Google Apps Script
Originally published on Premier Cloud Blog
Welcome!
Small to mid-size businesses receive PDF invoices from vendors and need to store details like invoice number, date of invoice, total amount, and vendor name. Doing this manually for all your vendors is a tedious and time-consuming process.
In this blog we are going to be automating the process of extracting details from PDF’s and populating our Google Sheet using the Gemini 1.5 Flash model and Google Apps Script.
So let’s get started.
What can this Automation do?
This function is designed to automate the process of manual data entry inside of Google Sheets. This function uses the Gemini 1.5 Flash model to extract required/ specified details from the Google Sheet and populate the Google Sheet.
Now that we have understood the problem and the possible solution, lets get coding.
Sample Google Sheet
The Google Sheet that I will be using for this blog contains the following details:
- Invoice Number
- Date
- Vendor Name
- Item Description
- Tax
- Total Amount
- Due Date
- Notes
- Last Updated
If you want to work with this sheet, click here.
Step1: Write the Automation Script
While you are in the Google Sheet, let’s open up the Script Editor to write some Google Apps Script. To open the Script Editor, follow these steps:
- Click on Extensions and open the Script Editor.
2. This brings up the Script Editor as shown below.
We have reached the script editor, let’s code.
function processPdfToSheet() {
var archiveFolderId = "1zfGNBzbPr6jqKGb2fBRIl5exDy9OxG-t";
const folderId1 = "1iWRHfRWKItsU3oXJyucw5y4Ubp2lU8zw";
var folder = DriveApp.getFolderById(folderId1);
var files = folder.getFiles();
while (files.hasNext()) {
var file = files.next();
if (file.getMimeType() === MimeType.PDF) { // Filter PDF files
var fileId = file.getId();
var pdfContent = convertPdfToGoogleDoc(fileId, folder);
var responseData = sendToGemini(pdfContent);
var details = extractFields(responseData);
// Update Google Sheet with extracted details
updateSheet(details);
// Move the original PDF and the converted Google Doc to the archive folder
var archiveFolder = DriveApp.getFolderById(archiveFolderId);
moveFileToArchive(file, archiveFolder);
}
}
}
The processPdfToSheet()
is the main function that initiates the process of extracting data from PDF’s and updating the spreadsheet.
We start of by declaring two Folder IDs. The archiveFolderId()
stores the ID for the folder where the processed PDFs and converted Google Docs are archived. The folderId1()
stores the ID of the folder containing the PDF’s that need to be processed.
Once we have the folder ID, we use the getFolderByID()
function, to fetch the folder. On accessing the folder, we iterate through all the files in the folder. For each file, it checks if the file is a PDF.
If so, it converts the PDF to a Google Doc, extracts the text content, processes it using the Gemini AI model, extracts relevant details from the model’s response, updates a spreadsheet with these details, and finally moves the original PDF and converted Google Doc to an archive folder.
function convertPdfToGoogleDoc(fileId, folder) {
var file = DriveApp.getFileById(fileId);
var blob = file.getBlob();
var newFileName = file.getName().replace(/\.pdf$/, '') + ' converted';
var resource = {
title: newFileName,
mimeType: MimeType.GOOGLE_DOCS
};
var options = {
ocr: true,
ocrLanguage: 'en'
};
var convertedFile = Drive.Files.create(resource, blob, options);
var doc = DocumentApp.openById(convertedFile.id);
var pdfContent = doc.getBody().getText();
var convertedFileObj = DriveApp.getFileById(convertedFile.id);
convertedFileObj.setTrashed(true); // Move to trash
return pdfContent;
}
The convertPdfToGoogleDoc()
function processes a PDF file by retrieving its content, creating a new Google Doc with extracted text, and then deleting the converted Google Doc to avoid clutter. We use the Drive API and DocumentApp to perform these operations, ensuring accurate text extraction.
We then return the extracted text content, which can be further processed or used in applications
function sendToGemini(pdfData) {
const GEMINI_KEY = 'YOUR_GEMINI_KEY';
const GEMINI_ENDPOINT = `https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent?key=${GEMINI_KEY}`;
var headers = {
"Content-Type": "application/json",
"Accept": "application/json"
};
var requestBody = {
"contents": [
{
"parts": [
{
"text": `extract the following details: Vendor Name: Invoice Number: Amount Due: Due Date: Description Tax: \n${pdfData}`
}
]
}
]
};
var options = {
"method": "POST",
"headers": headers,
"payload": JSON.stringify(requestBody)
};
try {
var response = UrlFetchApp.fetch(GEMINI_ENDPOINT, options);
var datanew = JSON.parse(response.getContentText());
return datanew;
} catch (error) {
Logger.log('Error calling Gemini API: ' + error);
return null;
}
}
We then use the sendToGemini()
function to interact with Gemini 1.5 Flash to process and extract data from the text content. We construct a request that contains our prompt that specifies the details we want to extract from the content and then send the request to Gemini 1.5 Flash.
In case of errors, it logs an error message and returns null
. If successful, it returns the extracted details that we received from Gemini 1.5 Flash.
function extractFields(datanew) {
if (!datanew || !datanew.candidates || !datanew.candidates.length) {
Logger.log('No valid data returned from Gemini.');
return {};
}
var textContent = datanew.candidates[0].content.parts[0].text;
textContent = textContent.replace(/- /g, '').trim();
var lines = textContent.split('\n');
var details = {};
lines.forEach(function (line) {
var parts = line.split(':');
if (parts.length === 2) {
var key = parts[0].replace(/\*\*/g, '').trim();
var value = parts[1].replace(/\*\*/g, '').trim();
details[key] = value;
}
});
return details;
}
We use the extractFields()
function to process the response from Gemini 1.5 Flash to extract relevant details. We check the validity of the response, extract the text content, split it into lines and iterate through each line to identify key-value pairs.
These extracted details, such as Vendor Name, Invoice Number, and others, are returned in a structured object for further use.
function updateSheet(details) {
var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Invoices");
var range = sheet.getDataRange();
var values = range.getValues();
var vendorName = details['Vendor Name'];
var nameFound = false;
var currentDate = Utilities.formatDate(new Date(), Session.getScriptTimeZone(), 'MM/dd/yy');
var formattedDateTime = Utilities.formatDate(new Date(), Session.getScriptTimeZone(), "yyyy-MM-dd HH:mm:ss");
for (var i = 1; i < values.length; i++) {
if (values[i][2].toLowerCase() === vendorName.toLowerCase()) { // Compare by Vendor Name
nameFound = true;
sheet.getRange(i + 1, 1).setValue(details['Invoice Number']); // Column A
sheet.getRange(i + 1, 6).setValue(details['Amount Due']); // Column F
sheet.getRange(i + 1, 7).setValue(details['Due Date']); // Column G
sheet.getRange(i + 1, 9).setValue("Last updated at: " + formattedDateTime); // Column I
Logger.log("Updated Row " + (i + 1));
break;
}
}
if (!nameFound) {
Logger.log("Vendor not found: " + vendorName);
var newRow = values.length + 1;
sheet.getRange(newRow, 1).setValue(details['Invoice Number']); // Column A
sheet.getRange(newRow, 3).setValue(vendorName); // Column C
sheet.getRange(newRow, 4).setValue(details['Description']); // Column D
sheet.getRange(newRow, 5).setValue(vendorName); // Column E
sheet.getRange(newRow, 6).setValue(details['Amount Due']); // Column F
sheet.getRange(newRow, 7).setValue(details['Due Date']); // Column G
sheet.getRange(i + 1, 9).setValue("Last updated at: " + formattedDateTime); // Column I
Logger.log("New Row Added");
}
}
The updateSheet()
function is responsible for managing entries within the Google Sheet. It first checks if the client or vendor already exists in the sheet. If a matching entry is found, the function updates the existing record with the new details. However, if no existing record is found, a new entry is created to store the information.
function moveFileToArchive(file, archiveFolder) {
file.moveTo(archiveFolder);
}
The moveFileToArchive()
function is a simple help function that takes the file and moves it to the archive folder after the data extraction and Google Sheet updation is complete.
Our code is complete and good to go.
Step2: Check the Output
Its time to see if the code is able to access the invoices, extract details and update the Google Sheet successfully.
You can either run the code directly from the editor or use custom menus. Check out the tutorial give below to know more about different ways to execute your Google Apps Script code.
On running the code you should get an output like this in the Execution Log.
Here you can see that on successful execution it could identify that there was no vendor named Stellar, and it created a new row for it. You can check the results inside the Google Sheet too.
It has created a new row in the Google Sheet for the new vendor details. This is how we have automated the process of data extraction using the Gemini 1.5 Flash model and Google Apps Script.
Conclusion
In conclusion, the use of Google Apps Script can automate the process of data entry from PDFs or even Documents. The functionality is helpful for small businesses that want to speed up the process of data entry. This approach not only saves time but also enhances your overall systems and processes.
You can check out the code by checking out the link given below.
Feel free to reach out if you have any issues/feedback at aryanirani123@gmail.com.