Analyzing sentiment of text with domain-specific vocabulary and topics
Part 2: Making sentiment predictions with AutoML from Google Sheets
In my last blog post, I ran through a quick demo showing why and how you might create your own custom model for analyzing sentiment of text. Google Cloud’s pre-trained model for sentiment analysis works well in many cases, but it isn’t necessarily trained to analyze sentiment when phrases include domain-specific terminology. Luckily, AutoML Natural Language allows you to customize a sentiment model without needing ML expertise or a development background, which is good news for a data analyst like me!
By the end of that post, the custom model was up and running, and you were able to predict sentiment scores for new, airline-related tweets right in the AutoML web UI. But since a lot of data lands or lives in spreadsheets, I’m going to show you how to access the AutoML custom model to predict the sentiment of text directly from Google Sheets.
We’ll pick up where you left off: with a shiny, new AutoML custom model in your hands! The rest of this post will walk you through how to:
- Gather your data for prediction in a Google Sheet
- Write a function to call the AutoML API
- Configure OAuth and service account for accessing your AutoML model
- Automate your Google Sheet to predict sentiment for new tweets, and
- Set up a trigger to process larger sets of data
Gather your data for prediction in a Google Sheet
Making predictions from a Google Sheet is just one of many options, but I find it a happy medium between “manual copy-and-paste with the AutoML web UI” and “building a fully-fledged application.” So first, you need to prepare a new Google Sheet with the data for which you’d like to make sentiment predictions.
- Create a new Google Sheet: sheet.new
- Rename the tab to “Tweet Data”.
- Add a header in row 1 with column A as “tweet_text” and column B as “automl_sentiment_output”.
- Add some sample tweets. I took mine from the original dataset, but ideally you should source some new tweets!
Write a function to call the AutoML API
1. From your spreadsheet, access the Script Editor from the Tools menu.
2. Copy and paste the following code into the code.gs file.
3. Replace YOUR_PROJECT_ID and YOUR_MODEL_ID on lines 2 and 3 with your unique project ID and model ID. You can find them on the Predict tab of the AutoML Natural Language UI by scrolling down to the Use your custom model section and selecting REST API. The URL listed in the code block contains both items in this format: https://automl.googleapis.com/v1beta1/projects/YOUR_PROJECT_ID/locations/us-central1/models/YOUR_MODEL_ID:predict.
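As a rough sketch of what that code boils down to (illustrative only; the helper names `predictSentiment` and `getOAuthService` are placeholders of my choosing, with `getOAuthService` assumed to come from the auth.gs file you’ll create in the next section), the prediction function calls the v1beta1 predict endpoint with the project and model IDs on lines 2 and 3:

```javascript
// Sketch of code.gs: send one block of text to the AutoML predict endpoint.
var PROJECT_ID = 'YOUR_PROJECT_ID';
var MODEL_ID = 'YOUR_MODEL_ID';

// Build the JSON body the v1beta1 predict method expects.
function buildPayload(text) {
  return {
    payload: {
      textSnippet: {
        content: text,
        mime_type: 'text/plain'
      }
    }
  };
}

// Call the model; getOAuthService() is defined in auth.gs.
function predictSentiment(text) {
  var url = 'https://automl.googleapis.com/v1beta1/projects/' + PROJECT_ID +
      '/locations/us-central1/models/' + MODEL_ID + ':predict';
  var response = UrlFetchApp.fetch(url, {
    method: 'post',
    contentType: 'application/json',
    headers: {Authorization: 'Bearer ' + getOAuthService().getAccessToken()},
    payload: JSON.stringify(buildPayload(text))
  });
  // The sentiment score is an integer on the scale you chose during training.
  return JSON.parse(response.getContentText()).payload[0].textSentiment.sentiment;
}
```

Note that the request is a plain `UrlFetchApp.fetch` POST with a Bearer token; all the AutoML-specific work is in the payload shape and the response field `payload[0].textSentiment.sentiment`.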
Configure OAuth and service account for accessing your AutoML model
Accessing your custom model from Apps Script requires a service account. This is a set of credentials that you create in your Google Cloud Platform project so that AutoML can ensure that only authorized users are accessing the model. These credentials are included in your Apps Script code.
1. In the Cloud Console, navigate to APIs & Services, and then Credentials. Under Create Credentials, choose Service account key.
2. Create a new service account, provide a name (such as ‘automl-sheets’), and select a role of AutoML Predictor. Click Create. A key file will automatically download to your computer.
3. Open your script editor in Google Sheets and create a new file by going to the menu File → New → Script file. Name the new file auth.gs.
4. Copy and paste the following code into the auth.gs file.
5. Open the file downloaded during your service account creation. Locate “client_email”. Copy its value and paste it in to replace YOUR_SVCACCT_EMAIL in line 8 of auth.gs, maintaining the single quotation marks.
6. Now locate and copy the text next to “private_key” that starts with “-----BEGIN PRIVATE KEY”. This is an extremely long key, so make sure you copy it in its entirety. Paste it in to replace YOUR_PRIVATE_KEY in line 6 of auth.gs, maintaining the single quotation marks.
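A sketch of what auth.gs can look like, using the service account flow from the Apps Script OAuth2 library (the function name `getOAuthService` is my own placeholder; your copy may differ):

```javascript
// Sketch of auth.gs: mint access tokens for the service account
// using the Apps Script OAuth2 library.
//
// Both values come from the JSON key file downloaded during account creation.
// Paste the full "private_key" value, keeping the single quotes:
var PRIVATE_KEY = 'YOUR_PRIVATE_KEY';
// Paste the "client_email" value, keeping the single quotes:
var CLIENT_EMAIL = 'YOUR_SVCACCT_EMAIL';

// Returns an OAuth2 service that exchanges a signed JWT for an access
// token scoped to Google Cloud, which AutoML accepts.
function getOAuthService() {
  return OAuth2.createService('AutoML')
      .setTokenUrl('https://accounts.google.com/o/oauth2/token')
      .setPrivateKey(PRIVATE_KEY)
      .setIssuer(CLIENT_EMAIL)
      .setPropertyStore(PropertiesService.getScriptProperties())
      .setScope('https://www.googleapis.com/auth/cloud-platform');
}
```

The key design point is that no user credentials are involved: the script signs a JWT with the service account’s private key, and the `cloud-platform` scope grants it the AutoML Predictor role you assigned.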
The code requires the OAuth2 library, which is a project with functions that can be used in your own script. To add the library, follow these steps:
- Click on the menu item “Resources > Libraries…”
- In the “Find a Library” text box, enter the script ID 1B7FSrk5Zi6L1rSxxTDgDEUsPzlukDsi4KGuTMorsTQHhGBzBkMun4iDF and click the “Select” button.
- Choose a version in the dropdown box (usually best to pick the latest version).
- Click the “Save” button.
Automate your Google Sheet to predict sentiment for new tweets
The code as it stands contains a function that requires a block of text as an input. That text is going to come from your spreadsheet. You can write a second function that will loop through each row of your spreadsheet, send the tweet text to AutoML, and paste the sentiment score response next to the tweet in the spreadsheet.
1. Copy and paste the following code into the code.gs file, inserting it above the code already present.
2. Save the code in the editor and then navigate back to the Google Sheet.
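The automation pasted in step 1 can be sketched roughly as follows; it assumes the prediction function in code.gs is named `predictSentiment` (an illustrative name), and it wires up the custom menu you’ll see after reloading:

```javascript
// Sketch of the spreadsheet automation: a custom menu plus a loop that
// sends unscored tweets to the model and writes the score back.
// BATCH_SIZE keeps each run within the per-minute prediction quota.
var BATCH_SIZE = 400;

// Runs automatically when the spreadsheet opens and adds the menu.
function onOpen() {
  SpreadsheetApp.getUi()
      .createMenu('AutoML Tools')
      .addItem('Mark Custom Sentiment', 'markSentiment')
      .addToUi();
}

// Loop through the Tweet Data tab, scoring up to BATCH_SIZE rows
// whose automl_sentiment_output column (B) is still empty.
function markSentiment() {
  var sheet = SpreadsheetApp.getActive().getSheetByName('Tweet Data');
  var rows = sheet.getDataRange().getValues();
  var processed = 0;
  // Start at 1 to skip the header row.
  for (var i = 1; i < rows.length && processed < BATCH_SIZE; i++) {
    if (rows[i][0] !== '' && rows[i][1] === '') {
      sheet.getRange(i + 1, 2).setValue(predictSentiment(rows[i][0]));
      processed++;
    }
  }
}
```

Skipping rows that already have a value in column B is what makes the function safe to run repeatedly: each run picks up where the last one left off.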
Now you are ready to test making predictions in your Google Sheet. At the time of writing this post, you can predict on up to 30k text records a month for free, so this part will be free as long as you remain under that threshold.
3. Reload the Google Sheet in the browser. You will see a new menu called AutoML Tools appear next to the Help menu.
4. Under the AutoML Tools menu, click on Mark Custom Sentiment.
5. Authorize the code to run by clicking Continue on the pop-up dialog, signing into your Google account, clicking on Advanced at the bottom of the “This app isn’t verified” screen, and finally reviewing the permissions and clicking Allow. You can see a very similar flow on this gif from Ben Collins’ blog.
6. After authorizing, the code will start running and you will see sentiment scores appear in the automl_sentiment_output column.
Set up a trigger to process larger sets of data
To stay safely within the usage quota of 600 prediction requests per minute per project, the code is set to process up to 400 rows of tweets per run. If you’d like to process more than 400 tweets, you can set up a trigger to run the code every minute, and the script will automatically process up to 400 additional tweets with each run. Here’s how to set up that trigger:
1. Open the script editor and click on Edit → Current project’s triggers. This will open the trigger configuration page for this specific script in the G Suite Developer Hub.
2. Click on + Add Trigger in the bottom right-hand corner.
3. Complete the choices as follows to create a time-driven trigger for your markSentiment function:
4. Once you hit save, your script will start running once-per-minute, processing up to 400 tweets each execution.
5. IMPORTANT: Once predictions have completed for your entire dataset, you need to delete the trigger. You can do this by clicking the three dots next to the trigger in the G Suite Developer Hub and choosing Delete.
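If you prefer, the same trigger can be created and cleaned up from code rather than the Developer Hub UI, which makes the deletion step harder to forget. A minimal sketch (the function names here are my own):

```javascript
// Sketch: create the once-per-minute trigger for markSentiment from code.
function createMinuteTrigger() {
  ScriptApp.newTrigger('markSentiment')
      .timeBased()
      .everyMinutes(1)
      .create();
}

// Delete every trigger attached to this script once predictions finish.
function deleteAllTriggers() {
  var triggers = ScriptApp.getProjectTriggers();
  for (var i = 0; i < triggers.length; i++) {
    ScriptApp.deleteTrigger(triggers[i]);
  }
}
```

You could even call `deleteAllTriggers()` from inside `markSentiment` when it finds no unscored rows left, so the script shuts itself off automatically.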
If you do not delete the trigger, the script will continue to run every minute. While the code is written to skip rows that have already been processed, there is still a risk that an issue could cause the code to call the API and incur charges beyond the free prediction tier. Additionally, the continued activity, such as trigger runtime and execution time, will count against your Apps Script quotas.
I’m so excited about the possibilities that Google Cloud AI building blocks (including the pre-trained ML APIs and AutoML products) bring to Sheets users! If you haven’t taken a stab at following this tutorial, I recommend that you try it. All of the code you need is linked in this post.
Share your experience, and any ideas you have for other ways to use ML from Sheets, by reaching out on Twitter @presactlyalicia or in the comments section below.