This is the first part of our little post series where we explore how to build a Flask JSON API for uploading files to Google Cloud Storage. Today we are going to deal with some requirements, decisions, and setting up our basic app.
This is going to be a rather extensive tutorial. If you want to dive into the code while reading: it’s available on GitHub.
Recently at work, I wanted to add a simple file upload to one of our Flask APIs. It took me a while to figure out how to do it. It seems that, in addition to some special requirements we had, the available packages lack some easy-to-understand documentation. That’s why I want to share some insights on how you can integrate file uploading into your Python app.
So, this was our list of requirements to be covered:
- Upload files to Google Cloud Storage
- Customize the directory structure of the uploaded files and their names
- Upload an image, store the original version and two resized versions
- The solution should be easy to extend to allow uploading file types other than images (e.g. PDF or audio files)
- Integration in a Flask JSON API
- It should seamlessly work with SQLAlchemy as ORM
- File upload should be testable with automated tests (using pytest)
Choosing a package
First of all, I had to decide which package I wanted to use for uploading images. A web search quickly made clear that there is not much information out there on how to upload files to Google Cloud Storage. The most fitting packages were sqlalchemy_media and filedepot.
Neither of them (at the time of writing) provides a dedicated storage for uploading to Google Cloud Storage, nor any documentation on how to use it with Google Cloud Storage. Yet, Google claims that migrating from Amazon S3 should be pretty easy and straightforward by doing the following:
- Changing the request endpoint of the lib you use to request the Cloud Storage endpoints
- Replacing the AWS access key and secret key with the corresponding Google developer keys
Sounds easy enough, right? Well, it turned out that it wasn’t that easy to replace the AWS authentication in sqlalchemy_media. It isn’t too hard to sub-class its nicely abstracted Store class and implement your own. Nevertheless, I had some issues with overwriting uploaded images, so I decided to give filedepot a try. Luckily, I got a basic prototype working with filedepot within a short time by using the Google developer keys instead of the AWS keys together with its S3Storage backend.
So, I went all in on filedepot and tried to figure out how to implement our requirements.
Ready? Then let’s deep dive into filedepot-based image uploading!
Setting up filedepot
First, you have to add filedepot and boto3 to your app’s requirements.txt. Boto3 provides us with the interface to AWS or, more importantly for us, Google Cloud Storage later on. If we didn’t install it, filedepot would complain and we wouldn’t be able to start our Flask app in production mode (with the TESTING config set to True you wouldn’t need it at this stage, though).
Then install the newly added dependencies by running
pip install -r requirements.txt
Or in case you use pipenv it’s as easy as
pipenv install filedepot boto3
One crucial requirement was to be able to test file uploads. So, next, we will look into how you can set up and configure filedepot for both your test and production environments.
I added a
config module to the Flask app and put a
depot.py file into it. That’s where your filedepot configuration and initialization will happen.
Let’s assume we have a
users blueprint with a
User model in the
models.py module that should get an avatar image. Besides, there is a
views.py where we will add our upload endpoint. We will look into our User class and upload route in detail later in this series.
Right now, this is what our basic app structure looks like:
├── config
│   ├── __init__.py
│   └── depot.py
├── users
│   ├── __init__.py
│   ├── models.py
│   └── views.py
├── __init__.py
├── app.py
└── …
Our Flask app is implemented in an
App class in
app.py. So, let’s set up our depots in its constructor:
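The constructor code itself isn’t included in this excerpt. A minimal sketch of the idea could look like this; note that the class shape and the inlined init_depots stand-in are my assumptions, not the original implementation (in the real project, init_depots lives in config/depot.py and would be imported from there):

```python
# app.py -- a sketch; the class shape is an assumption, not the
# original implementation.
from flask import Flask


def init_depots(app):
    # Stand-in for the function that lives in config/depot.py in the
    # real project and configures the filedepot storages.
    pass


class App:
    def __init__(self, import_name="myapp"):
        self.flask = Flask(import_name)
        # Bootstrap the file depots while the app is constructed.
        init_depots(self.flask)
```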
Let’s quickly add an
init_depots(app) function to our
depot.py to bootstrap our setup:
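The original snippet isn’t reproduced in this excerpt; as a placeholder, the bootstrap function can start out as a bare skeleton that the test and production configs from the next sections then fill in:

```python
# config/depot.py -- a bare skeleton (a sketch, not the original code)

def init_depots(app):
    """Configure the app's filedepot storages on startup."""
    # The actual DepotManager configuration depends on the environment
    # (testing vs. production) and is added in the following sections.
    pass
```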
Adding a test config
Filedepot comes with multiple storages. Among others, there is an S3Storage and a MemoryFileStorage. Since the MemoryFileStorage keeps the “uploaded” files in memory, it’s just perfect for our testing purposes.
In order to set up our test storage we initialize a
DepotManager with the respective config:
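The configuration snippet isn’t part of this excerpt. Based on filedepot’s DepotManager API, the test setup could look roughly like this; the depot name "avatars" is my assumption, not taken from the post:

```python
# config/depot.py -- test configuration sketch; the depot name
# "avatars" is an assumption.
from depot.manager import DepotManager


def init_depots(app):
    # MemoryFileStorage keeps the "uploaded" files in memory, so
    # nothing touches the network or the disk during tests.
    DepotManager.configure("avatars", {
        "depot.backend": "depot.io.memory.MemoryFileStorage",
    })
```

As far as I know, the first depot configured this way also becomes filedepot’s default depot.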
That’s it. With this we have an in-memory file depot running when we start our app.
However, this is not very helpful if we want to upload our images to Google Cloud Storage. Thus, next, we will look into how to set up a production depot, while keeping the in-memory storage for our tests and development environment.
Adding a production config
As I briefly mentioned above, we can use
depot.io.boto3.S3Storage to connect to our Google Cloud Storage.
By default, the S3Storage class points to the AWS API endpoint, so we need to set its endpoint_url to the Google Cloud Storage API URL, which is https://storage.googleapis.com. Additionally, we have to provide the authentication credentials and set the bucket where we want to store our uploaded images.
Let’s add the following config variables to our app’s config:
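The config snippet isn’t included in this excerpt; it could look like the following, where the GCS_* variable names are my assumption:

```python
# config.py -- sketch; the GCS_* variable names are assumptions.
GCS_ENDPOINT_URL = "https://storage.googleapis.com"

# The Google developer keys mentioned above; never commit real
# credentials, load them from the environment instead.
GCS_ACCESS_KEY_ID = "<your-google-access-key>"
GCS_SECRET_ACCESS_KEY = "<your-google-secret-key>"

# The bucket that the uploaded files should end up in.
GCS_BUCKET = "<your-bucket-name>"
```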
Next, we can add a production config to our
depot.py. Let’s also use either the test config or the production config based on whether the
TESTING config is set (the Flask app gives us a
.testing property that reads the
TESTING config value):
Now we use an in-memory storage if the
TESTING config is
True, else we will push our files to our configured bucket in Google Cloud Storage.