
Importing data into Firestore using Python

A simple guide to using the Firebase Admin SDK with Python to import data from a csv file.


I’m a big fan of Firebase and have been using it to develop projects for a while. However, I was struggling to find a simple way to import a large quantity of data into Firestore. The Realtime Database has a nice import and export feature, but currently no equivalent exists for Firestore, so I thought I’d use a Python script to achieve this.

Before we start, I’m assuming you already have a project set up in Firebase and are using Firestore. If not, the Firebase documentation has guides on how to do this.

Generating Service Account Key

To use the SDK we’ll need a Service Account Key. You can generate this in the Firebase Console by going to Project Settings, selecting Service accounts and clicking Generate new private key.


This will produce a .json file with the credentials needed to access the Firebase project.

IMPORTANT! Keep this file confidential: don’t add it to version control or to a client-side application.

Setting up Python project

Now that we have the Service Account Key we can set up the Python project. If you are used to setting up Python projects, do this in your preferred way; the steps below are just in case.

Assuming you’ve got a version of Python installed (I would recommend at least 3.4), I’d also suggest using a virtual environment. Create a folder for the project and initialise the virtual environment inside it using, for example:

python -m venv env

Once that completes, activate the environment by running its activate script, which on Windows is located in the env\Scripts folder, for example: .\env\Scripts\activate.bat

This starts the environment for use in the terminal; you’ll see its name in front of the prompt, i.e. (env)

Then install the packages with pip, for example:

pip install firebase-admin google-cloud-firestore

This provides the firebase-admin package, which handles the connection, as well as the google-cloud-firestore package.

Move your copy of the Service Account Key to the working folder; for ease I renamed mine to ServiceAccountKey.json. This is a good time to add it to your .gitignore file, if you are using git, to avoid accidentally committing it.

Connecting to Firestore

You can now connect to Firestore for your project. I’ll first demonstrate how to connect and read some data, then go on to adding data imported from a csv file.

Add a main.py file to the folder with the below:
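A minimal sketch of such a main.py, assuming the key file is named ServiceAccountKey.json and sits next to the script. The helper names get_client and docs_to_dicts are illustrative and not part of the SDK, and the check for the key file is only there so the script exits cleanly when the key is missing.

```python
import os

KEY_PATH = "ServiceAccountKey.json"  # assumed file name for the key


def get_client(key_path=KEY_PATH):
    """Initialise the Admin SDK and return a Firestore client."""
    # Imported here so the helper below can be used without the SDK installed.
    import firebase_admin
    from firebase_admin import credentials, firestore

    cred = credentials.Certificate(key_path)
    firebase_admin.initialize_app(cred)
    return firestore.client()


def docs_to_dicts(docs):
    """Convert document snapshots into plain dictionaries."""
    return [doc.to_dict() for doc in docs]


def main():
    db = get_client()
    # Ask Firestore for at most two documents from the users collection.
    docs = db.collection("users").limit(2).stream()
    for data in docs_to_dicts(docs):
        print(data)


if __name__ == "__main__":
    if os.path.exists(KEY_PATH):
        main()
    else:
        print(f"{KEY_PATH} not found, so nothing was read")
```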

This reads from the collection users, limiting the returned results to just 2 as I didn’t want all of the users collection returned. The script then prints the data as a dictionary.

Now that we can connect and get data, let’s add some to Firestore. By changing main.py to the below, we can add a collection called test and a new document with the attributes name and added and their values.
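One way this change might look, again assuming the key file is called ServiceAccountKey.json. The sample name and the use of a UTC timestamp for added are placeholders chosen for illustration, and build_test_document is a hypothetical helper.

```python
import os
from datetime import datetime, timezone

KEY_PATH = "ServiceAccountKey.json"  # assumed file name for the key


def get_client(key_path=KEY_PATH):
    """Initialise the Admin SDK and return a Firestore client."""
    # Imported here so the helper below can be used without the SDK installed.
    import firebase_admin
    from firebase_admin import credentials, firestore

    cred = credentials.Certificate(key_path)
    firebase_admin.initialize_app(cred)
    return firestore.client()


def build_test_document(name):
    """Build the data for the new document (field values are assumptions)."""
    return {"name": name, "added": datetime.now(timezone.utc)}


def main():
    db = get_client()
    # document() with no argument creates a reference with an auto-generated ID.
    doc_ref = db.collection("test").document()
    doc_ref.set(build_test_document("Example name"))
    print(f"Added document {doc_ref.id} to the test collection")


if __name__ == "__main__":
    if os.path.exists(KEY_PATH):
        main()
    else:
        print(f"{KEY_PATH} not found, so nothing was written")
```

Firestore creates the test collection automatically when the first document is written to it.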

Adding Data from csv to Firestore

Adding data from a csv file requires us to read the csv in and then group the rows into batches of up to 500 writes, as that is the limit for a Firestore batched write.

For this example I’m just using a flat data structure but you could expand upon this to allow for more complex structures to be added.

I’ve got a list of UK-based breweries and want to add them to a new Firestore collection.


The file contains Name and County fields for each brewery and I’ve cleaned the data beforehand, so there are no empty values.

We can now update the main.py file to the below, changing the CSV_FILE_PATH and COLLECTION_TO_ADD_TO variables to appropriate values, in my case Breweries.csv and breweries respectively.
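A sketch of how such an import script could be put together, assuming csv.DictReader for parsing and the same ServiceAccountKey.json key file. CSV_FILE_PATH, COLLECTION_TO_ADD_TO and batch_data are the names used in this guide; get_client, read_rows and import_rows are illustrative helpers.

```python
import csv
import os

KEY_PATH = "ServiceAccountKey.json"  # assumed file name for the key
CSV_FILE_PATH = "Breweries.csv"
COLLECTION_TO_ADD_TO = "breweries"
BATCH_SIZE = 499  # Firestore allows at most 500 writes per batch


def get_client(key_path=KEY_PATH):
    """Initialise the Admin SDK and return a Firestore client."""
    # Imported here so the helpers below can be used without the SDK installed.
    import firebase_admin
    from firebase_admin import credentials, firestore

    cred = credentials.Certificate(key_path)
    firebase_admin.initialize_app(cred)
    return firestore.client()


def read_rows(csv_path):
    """Read the csv, using the header row as the property names."""
    with open(csv_path, newline="", encoding="utf-8") as csv_file:
        return [dict(row) for row in csv.DictReader(csv_file)]


def batch_data(rows, size=BATCH_SIZE):
    """Yield successive chunks of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def import_rows(db, rows, collection_name):
    """Write every row as a new document, one batch commit per chunk."""
    collection = db.collection(collection_name)
    written = 0
    for chunk in batch_data(rows):
        batch = db.batch()
        for row in chunk:
            doc_ref = collection.document()  # new reference, auto-generated ID
            batch.set(doc_ref, row)
        batch.commit()
        written += len(chunk)
    return written


def main():
    db = get_client()
    rows = read_rows(CSV_FILE_PATH)
    written = import_rows(db, rows, COLLECTION_TO_ADD_TO)
    print(f"Imported {written} rows into {COLLECTION_TO_ADD_TO}")


if __name__ == "__main__":
    if os.path.exists(KEY_PATH) and os.path.exists(CSV_FILE_PATH):
        main()
    else:
        print("Key file or csv file not found, so nothing was imported")
```

Every value read this way is stored as a string, which is fine for the flat name and county data used here.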

Note: This script makes an assumption to be aware of: the csv file’s first row contains headers, which are used as the property names for the following rows, i.e.

name,county
item1,item2

becomes { name: item1, county: item2}
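The header mapping can be tried on its own without touching Firestore. This small example assumes the rows are parsed with csv.DictReader from the standard library and reads the two lines above from an in-memory string.

```python
import csv
import io

# The same two lines as above, held in memory instead of a file.
sample = "name,county\nitem1,item2\n"

# DictReader uses the first row as keys for every following row.
rows = [dict(row) for row in csv.DictReader(io.StringIO(sample))]
print(rows)  # [{'name': 'item1', 'county': 'item2'}]
```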

We’ve also used the batch function from the Firestore client, which allows set(), update() and delete() operations. Therefore, to add a new document we must first create a new document reference and then set the data for it in the batch.

The batch_data function restricts the operations per batch to the Firestore limit of 500; I used 499 to be safe.

Running the script now populates the Firestore collection with the values in the csv.


You can also retrieve the data from Firestore and compare it with the original data to check the process, using a script similar to:
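A minimal sketch of such a check, assuming the same ServiceAccountKey.json key file. COLLECTION_TO_CHECK is the variable named in this guide, while get_client and count_documents are illustrative helpers.

```python
import os

KEY_PATH = "ServiceAccountKey.json"  # assumed file name for the key
COLLECTION_TO_CHECK = "breweries"


def get_client(key_path=KEY_PATH):
    """Initialise the Admin SDK and return a Firestore client."""
    # Imported here so the helper below can be used without the SDK installed.
    import firebase_admin
    from firebase_admin import credentials, firestore

    cred = credentials.Certificate(key_path)
    firebase_admin.initialize_app(cred)
    return firestore.client()


def count_documents(docs):
    """Count the items in a stream without holding them all in memory."""
    return sum(1 for _ in docs)


def main():
    db = get_client()
    total = count_documents(db.collection(COLLECTION_TO_CHECK).stream())
    print(f"{COLLECTION_TO_CHECK} contains {total} documents")


if __name__ == "__main__":
    if os.path.exists(KEY_PATH):
        main()
    else:
        print(f"{KEY_PATH} not found, so nothing was counted")
```

The printed total can then be compared with the number of data rows in the csv file. Note that streaming a collection reads every document in it.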

This script gets a count of the documents in a collection; change the COLLECTION_TO_CHECK variable to the collection you want to check.

Conclusion

This is a simple guide to using the Firebase Admin SDK with Python to add and retrieve data; you can adapt these scripts to do more complex actions. You could also change the data format being added to one with subcollections or different data types.

Hopefully this has been helpful, thanks for reading.

A big thank you to @ThatJenPerson and the excellent video guide which was a huge help.
