Using Google Drive API with Python and a service account

Matheo Daly
8 min read · Mar 27, 2023


By following this tutorial, you will be able to automate processes that involve files on Google Drive. This is useful when the Drive serves as a kind of database for sharing files with colleagues who do not have access to tools like AWS S3 or GCP GCS.
This article covers the main features of the Google Drive API: retrieving files, uploading files, deleting files, and sharing permissions. We will also see how to convert Drive-specific file types such as Spreadsheets into formats that your computer can use.

Let’s get started!

Summary

  1. Create a Service Account with the Google Cloud Platform
  2. Enable Google Drive API on your Google Cloud Platform Project
  3. Install and import needed Libraries
  4. Create the connection to the Drive API
  5. Retrieve all file information stored on your Drive
  6. Retrieve file metadata from a file ID
  7. Share permissions from a file ID
  8. Store locally a file from your Drive from a file ID
  9. Delete a file from your Drive from a file ID
  10. Upload a file to your Drive

1. Create a Service Account with the Google Cloud Platform

You may not have created a service account yet, or may not even know what one is. It is in the title of the article and will be obvious to some readers, but I want this tutorial to be as complete as possible.
I wrote another article explaining how to create a service account on the Google Cloud Platform and generate a JSON key for it; you can find it at the link just below.

2. Enable Google Drive API on your Google Cloud Platform Project

Now that you have a Google Cloud project and a service account, it’s time to enable the Google Drive API.
This is not done automatically, but it is straightforward.
Click Enabled APIs and services in the left navigation menu, as shown in Figure 1.

Figure 1: Enable APIs and services navigation button

Then click the Enable APIs and services button at the top, as shown in Figure 2.

Figure 2: Enable APIs and services

In the search bar at the center, search for Google Drive. You will likely get only one result: click it, then click the Enable button on its page.

3. Install and import needed Libraries

You’ll need only two libraries here, both installable with pip:
pandas
google-api-python-client

Then add the following imports:

import pandas as pd
from google.oauth2 import service_account
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload
from googleapiclient.http import MediaFileUpload
import io
from googleapiclient.errors import HttpError

4. Create the connection to the Drive API

Set service_account_json_key to the local path of your own JSON key.
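A minimal sketch of this step, assuming the full-access Drive scope https://www.googleapis.com/auth/drive is what you need. The function name build_drive_service and the key path in the usage comment are illustrative and not part of the library; the imports from section 3 are repeated inside the function only so that the block stands on its own.

```python
# Full read/write access to Drive; a narrower scope may be enough for you.
SCOPES = ["https://www.googleapis.com/auth/drive"]


def build_drive_service(service_account_json_key, scopes=SCOPES):
    """Return a Drive v3 client authorized with a service account JSON key."""
    # Same imports as in section 3, kept local so the sketch is self-contained.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    credentials = service_account.Credentials.from_service_account_file(
        service_account_json_key, scopes=scopes
    )
    return build("drive", "v3", credentials=credentials)


# Usage (hypothetical key path):
# service = build_drive_service("my_key.json")
```

The resulting service object is the one used in all the following sections.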

5. Retrieve all file information stored on your Drive

First of all, you’ll need to share the desired folders and files with your service account. To do so, retrieve your service account email, which you can find for example in Figure 5 of my previous article on how to create a service account. You can also find it by opening your service account JSON key and reading the value associated with the client_email key.
Once this is done, share files and folders with that email exactly as you would with any other user.
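If you prefer to read the email programmatically, the following sketch loads the JSON key and returns its client_email value. The function name read_service_account_email is an illustrative choice, and the path you pass is your own key path.

```python
import json


def read_service_account_email(service_account_json_key):
    """Return the client_email stored in a service account JSON key file."""
    with open(service_account_json_key, "r", encoding="utf-8") as key_file:
        key_data = json.load(key_file)
    return key_data["client_email"]


# Usage (hypothetical key path):
# print(read_service_account_email("my_key.json"))
```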

# Call the Drive v3 API
results = service.files().list(
    pageSize=1000,
    fields="nextPageToken, files(id, name, mimeType, size, modifiedTime)",
    q='name contains "de"',
).execute()
# Get the list of files from the response
items = results.get('files', [])

You can see here that we specified the information that we want for the files. Indeed, we want to retrieve here only id, name, mimeType, size, and modifiedTime. Nevertheless, you may want to have more or other information. You can find more information about fields in the documentation.

Also, we added the parameter q, which specifies the filter we want for our query. More information about query filters can be found in the official documentation.
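The query above asks for nextPageToken but reads only the first page of results. If more files match than fit in one page, the token can be passed back as pageToken to fetch the rest. A sketch under that assumption, where list_all_files is an illustrative helper name and service is the Drive client from section 4:

```python
def list_all_files(service, query=None, page_size=1000):
    """Collect every file matching `query`, following nextPageToken."""
    fields = "nextPageToken, files(id, name, mimeType, size, modifiedTime)"
    items = []
    page_token = None
    while True:
        response = service.files().list(
            pageSize=page_size,
            fields=fields,
            q=query,
            pageToken=page_token,  # None on the first call
        ).execute()
        items.extend(response.get("files", []))
        page_token = response.get("nextPageToken")
        if page_token is None:
            # No token means this was the last page.
            break
    return items


# Usage:
# items = list_all_files(service, query='name contains "de"')
```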

That is technically all you need to retrieve information on the files stored on your Drive. Nevertheless, the items variable is a list of dicts, and if we want to process the information or make it more readable for humans, we need to transform it a bit.

data = []
for row in items:
    if row["mimeType"] != "application/vnd.google-apps.folder":
        row_data = []
        try:
            row_data.append(round(int(row["size"])/1000000, 2))
        except KeyError:
            row_data.append(0.00)
        row_data.append(row["id"])
        row_data.append(row["name"])
        row_data.append(row["modifiedTime"])
        row_data.append(row["mimeType"])
        data.append(row_data)
cleared_df = pd.DataFrame(data, columns = ['size_in_MB', 'id', 'name', 'last_modification', 'type_of_file'])

In this code, we iterate through the list items, preprocess the data of every row, and store the result in a list so that we can load it into a pandas DataFrame at the end.
Note that we add a condition on the mimeType to exclude Google Drive folders from the process. MIME types describe the type of a file, and Google Drive has its own MIME types. You can find an exhaustive list of all MIME types made by Mozilla here. You can also find a list of all MIME types relative to Google file types in their documentation. We then convert the size from bytes into megabytes (files that report no size get 0.00), and that’s all the preprocessing we do on the data.

Once this is done, cleared_df should look something like Figure 3 below.

Figure 3: Drive files information cleaned

6. Retrieve file metadata from a file ID

The metadata of a file is the information associated with it. You can query it with this command:

file_metadata = service.files().get(fileId="your_file_id").execute()

Replace your_file_id with the ID of your file and you should again get a dict, containing the main information about the file such as the kind, id, name, and mimeType.
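If you need more than these default keys, the get call accepts the same kind of fields selector that we used when listing files. A small sketch, where get_file_metadata is an illustrative helper name and the default field list simply mirrors the one from section 5:

```python
def get_file_metadata(service, file_id,
                      fields="id, name, mimeType, size, modifiedTime"):
    """Return the requested metadata fields of one file as a dict."""
    return service.files().get(fileId=file_id, fields=fields).execute()


# Usage:
# file_metadata = get_file_metadata(service, "your_file_id")
```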

7. Share permissions from a file ID

new_permission = {
    'type': 'user',
    'role': 'writer',
    'emailAddress': 'youremail@gmail.com',
}
try:
    service.permissions().create(fileId='file_id', body=new_permission, transferOwnership=False).execute()
except (AttributeError, HttpError) as error:
    print(f'An error occurred: {error}')

With the code above, you can grant specific permissions on a file to a user, group, or service account. The writer role allows the user to edit the document, but not to delete it permanently. If you want to grant that right as well, you have to make the user the owner: set role to owner and pass transferOwnership=True.
However, you may not be able to transfer ownership of a file that was created by a service account; in that case you can delete it with code run as the service account, as shown in section 9. You can find more information about this in the documentation. Note that its Parameters section describes the parameters of the create function, while its Request body section describes the body parameter, which we defined above as a dict with type, role, and emailAddress.
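To avoid typos in the permission dict, you can build it with a small helper that validates its inputs first. This is only a sketch: make_permission is an illustrative name, and the role list reflects my understanding of the values the v3 permissions resource accepts, so check it against the documentation before relying on it.

```python
# Roles as I understand the Drive v3 permissions resource to define them.
VALID_ROLES = {"owner", "organizer", "fileOrganizer",
               "writer", "commenter", "reader"}
# Only the types that are identified by an email address.
EMAIL_TYPES = {"user", "group"}


def make_permission(email_address, role="reader", permission_type="user"):
    """Build the request body for permissions().create()."""
    if role not in VALID_ROLES:
        raise ValueError(f"Unknown role: {role!r}")
    if permission_type not in EMAIL_TYPES:
        raise ValueError(f"Unsupported type for an email: {permission_type!r}")
    return {
        "type": permission_type,
        "role": role,
        "emailAddress": email_address,
    }


# Usage:
# body = make_permission("youremail@gmail.com", role="writer")
# service.permissions().create(fileId="file_id", body=body).execute()
```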

8. Store locally a file from your Drive from a file ID

8.1 Store locally a file that is not a Google Type (e.g., Spreadsheet, Docs, Slides, etc…)

try:
    request_file = service.files().get_media(fileId="file_id")
    file = io.BytesIO()
    downloader = MediaIoBaseDownload(file, request_file)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print(f'Download {int(status.progress() * 100)}%.')
except HttpError as error:
    print(f'An error occurred: {error}')

When the file we want to retrieve from the Drive is not a Google type, get_media only builds the download request; the content itself is fetched chunk by chunk by the MediaIoBaseDownload object, which writes it into the in-memory buffer file.
If the code runs without error, the file content is available as bytes through the variable file. You can write it to disk with the code below.

file_retrieved: bytes = file.getvalue()
with open("downloaded_file.csv", 'wb') as f:
    f.write(file_retrieved)

Here you can see that I specified the extension of the file in the name. The extension is .csv because the file I downloaded was a CSV file; use the extension that matches your own file.

8.2 Store locally a file that is a Google Type (e.g., Spreadsheet, Docs, Slides, etc…)

request_file = service.files().export_media(fileId="file_id", mimeType='text/csv').execute()
with open("downloaded_file.csv", 'wb') as f:
    f.write(request_file)

You may have noticed that we used the method export_media rather than get_media as in section 8.1. Because the file is stored in a Google MIME type, it has no direct equivalent on our computers, so we have to convert it to a standard format (here text/csv) before downloading it. Again, you can find an exhaustive list of all MIME types made by Mozilla here. You can also find a list of all MIME types relative to Google file types in their documentation.
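Since the choice between get_media and export_media depends only on the file's mimeType, it can be wrapped in one helper. The sketch below assumes that every Google type starts with the prefix application/vnd.google-apps. and that folders cannot be downloaded; build_download_request is an illustrative name.

```python
GOOGLE_MIME_PREFIX = "application/vnd.google-apps."
FOLDER_MIME_TYPE = "application/vnd.google-apps.folder"


def build_download_request(service, file_id, mime_type,
                           export_mime_type="text/csv"):
    """Pick export_media for Google types and get_media for everything else."""
    if mime_type == FOLDER_MIME_TYPE:
        raise ValueError("Folders have no content to download.")
    if mime_type.startswith(GOOGLE_MIME_PREFIX):
        # Google type: must be converted to a standard format.
        return service.files().export_media(fileId=file_id,
                                            mimeType=export_mime_type)
    # Regular file: download the stored bytes as they are.
    return service.files().get_media(fileId=file_id)


# Usage:
# meta = service.files().get(fileId="file_id").execute()
# request_file = build_download_request(service, meta["id"], meta["mimeType"])
```

The returned request can then be passed to MediaIoBaseDownload as in section 8.1.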

9. Delete a file from your Drive from a file ID

service.files().delete(fileId='file_id').execute()

This one is pretty straightforward. Just be sure that you have the right to delete the file (owner access) before executing this command, or it will not work.
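One way to check that right beforehand is to ask the API for the file's capabilities. The sketch below assumes the capabilities.canDelete field of the v3 files resource reflects whether the current account may delete the file; delete_if_allowed is an illustrative helper name.

```python
def delete_if_allowed(service, file_id):
    """Delete the file only if the API reports we may; return True if deleted."""
    metadata = service.files().get(
        fileId=file_id, fields="capabilities(canDelete)"
    ).execute()
    if not metadata.get("capabilities", {}).get("canDelete", False):
        return False
    service.files().delete(fileId=file_id).execute()
    return True


# Usage:
# if not delete_if_allowed(service, "file_id"):
#     print("This account is not allowed to delete the file.")
```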

10. Upload a file to your Drive

file_metadata = {'name': 'filename_on_the_drive.csv'}
media = MediaFileUpload('local_filepath/local_file_name.csv',
                        mimetype='text/csv')

file = service.files().create(body=file_metadata, media_body=media,
                              fields='id').execute()

Uploading a file to the Drive is pretty straightforward too. You can add to the file_metadata dict any of the metadata we saw in Section 6. I chose a text/csv MIME type here, but you could have chosen a Google-specific MIME type such as a spreadsheet (application/vnd.google-apps.spreadsheet). Once more, you can find an exhaustive list of all MIME types made by Mozilla here. You can also find a list of all MIME types relative to Google file types in their documentation. If you want to use a Google-specific MIME type, you have to include it in the metadata, as you can see in the code below.

file_metadata = {'name': 'filename_on_the_drive.csv',
                 'mimeType': 'application/vnd.google-apps.spreadsheet'}
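The metadata dict can also carry a parents key to place the file in a given folder, which is handy because a file uploaded by a service account otherwise lands in the service account's own Drive. A sketch of a builder for this dict, where make_upload_metadata and its parameter names are illustrative, and the folder ID is one you have shared with the service account:

```python
def make_upload_metadata(name, parent_folder_id=None, target_mime_type=None):
    """Build the body passed to files().create() for an upload."""
    metadata = {"name": name}
    if parent_folder_id is not None:
        # The Drive API expects a list of parent folder IDs.
        metadata["parents"] = [parent_folder_id]
    if target_mime_type is not None:
        # Ask Drive to convert the upload, e.g. a CSV into a spreadsheet.
        metadata["mimeType"] = target_mime_type
    return metadata


# Usage (hypothetical folder ID):
# file_metadata = make_upload_metadata(
#     "filename_on_the_drive.csv",
#     parent_folder_id="your_folder_id",
#     target_mime_type="application/vnd.google-apps.spreadsheet",
# )
```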

If you upload a medium or large file (more than 5 MB), you may want to perform a resumable upload, which makes the upload less sensitive to network interruptions. Here is the documentation about all upload types. To do this, just add resumable=True to the parameters of MediaFileUpload, as shown below.

media = MediaFileUpload('local_filepath/local_file_name.csv',
                        mimetype='text/csv',
                        resumable=True)
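With a resumable media object you can still call execute() as before, but you can also drive the upload chunk by chunk to report progress, in the same spirit as the download loop of section 8.1. A sketch, assuming request is the object returned by service.files().create(...) before execute() is called; run_resumable_upload is an illustrative name.

```python
def run_resumable_upload(request):
    """Send a resumable upload chunk by chunk and return the final response."""
    response = None
    while response is None:
        status, response = request.next_chunk()
        if status is not None:
            print(f"Uploaded {int(status.progress() * 100)}%.")
    return response


# Usage:
# request = service.files().create(body=file_metadata, media_body=media,
#                                  fields='id')
# file = run_resumable_upload(request)
```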

Conclusion

I hope you enjoyed this tutorial and will enjoy automating file processing on your Google Drive with Python.
The Google Workspace suite allows you to do a lot of things with their APIs, and you can use most of them in Python, so do not hesitate to explore them and find new automation ideas!

And if you want more tutorials, check out my other ones!
