Using Spreadsheet with Python and a service account

Matheo Daly
5 min readFeb 8, 2023

--

Photo by James Harrison on Unsplash

One day in your company, your boss or yourself would want you to do some job available on a spreadsheet. If you’re a Data -whatever-, it could sound like a pretty boring job to do. But the fact is that it’s easily shareable and if you have to work on a project fastly, you may need to do it first on this kind of tool before integrating it to a Data Engineering Pipeline and store it in a more relevant Database.

Nevertheless, let’s make it a bit more interesting by automating this job with Python, and you could tell your boss that you’re update it manually every week and play Mario Kart every Friday afternoon. -just don’t snitch me okay ?-

1. Create a Service Account with Google Cloud Platform

You may not have yet created a service account and even don’t know what it means. I know it’s in the title of the article and some of the readers will find it pretty obvious, but I want this to cover the subject as much as possible.
You could find a tutorial I made on another article on how to create a service account on the Google Cloud Platform and create a JSON key associated to it on the link just below.

2. Enable Google Sheet API on your Google Cloud Platform Project

Now that you have a Google Cloud project, and a service account created, it’s time for you to enable the Google Sheet API.
Indeed, it’s not done automatically, but it’s pretty straightforward.
Just click on the Enabled APIs and services on the left navigation menu as shown on Figure 1.

Figure 1: Enable APIs and services navigation button

And then click on the button at the top named Enable APIs and services as shown on Figure 2.

Figure 2: Enable APIs and services

On the search bar at the center search for Google sheets and you will likely have only one result, click on this one and then the Enable button when you arrive on its page.

3. Install and import needed Librairies

You’ll need only two librairies here
pandas
google-api-python-client

As imports just type the following

import pandas as pd
from google.oauth2 import service_account
from googleapiclient.discovery import build

4. Create the connection to the Spreadsheet API

Just change of course the service_account_json_key for it to match with your JSON key local path.

5. Accessing Data on a Specific Spreadsheet

First of all, you’ll need to give an editor access for your service account to your Spreadsheet. To do so, just retrieve your service account email which you can find for example on Figure 5 of my previous article on how to create a service account. You could also find it opening your service account JSON key and finding the value associated to the client_email key.
Then, just click on the Share button at the top right of the Spreadsheet and share it with the email of your service account, giving it Editor access to the Spreadsheet.

This done, you can start coding.

For the spreadsheet_id, make sure to enter the id you can find on any spreadsheet url after the /d/ as shown on Figure 3.

Figure 3: id in a Spreadsheet url

The range variable contains the range of the data you want to access. In this example, I want to access A and B columns from the sheet named Sheet1. If I wanted A to J from the Sheet named example it would have been example!A:J.
Now you have the data contained on your Spreadsheet in a pandas DataFrame.

Warning

You must for this API call to work not have any empty cells for the selected range. As the API collects lists of lists, it does not understand empty cells as null values and you would have lists of different sizes, which results in a pandas ValueError Arrays Must be All Same Length when you convert them into a pandas DataFrame.

6. Write Data to a Spreadsheet

And in the end, you have done all your preprocessing, and you may want to write the preprocessed data into the Spreadsheet again from your DataFrame. Nothing complicated though.

In this example we have taken the same spreadsheet_id and range as before, in order to overwrite the data. But let’s decompose and clarify the code.

data_to_write: list = df.T.reset_index().values.T.tolist()

This line converts the DataFrame into a 2 dimensional array, which is necessary for writing with this API, we can’t send directly pandas DataFrame into the Spreadsheet.

service.values().clear(spreadsheetId=spreadsheet_id,
range=range).execute()

This line delete the data already present in the spreadsheet for the given range. As overwriting is not currently supported by the API, you have to delete yourself the data first and then writing into the empty cells of your Spreadsheet.

service.values().update(spreadsheetId=spreadsheet_id,
range=range,
valueInputOption="USER_ENTERED",
body={"values": data_to_write}).execute()

And finally, writing the data into the given range. You can see that I’ve set the parameter valueInputOption as “USER_ENTERED”. You can find documentation of this parameter just right here. This parameter makes the values we entered be interpreted by the Spreadsheet as if a user has entered them. It means that integers will be treated as is, but Strings could be converted as Date, or Numbers per example.

Conclusion

Hope you enjoyed this tutorial, and will enjoy doing some preprocessing with Python directly into your Spreadsheets.
The Google Workspace suite allows you to do a lot of things with their APIs, and you can use most of them in Python, so do not hesitate to explore them and find new automation ideas !

And if you want more tutorials, check out my other ones !

--

--