Properly pickle out to a path in Python when using Google Colab

Published in CodeX · 4 min read · Apr 2, 2021

On your journey learning machine/deep learning with Google Colab, at some point you will need to save your trained model so it can be used at a later date. In the world of Python, serializing an object to a file like this is known as “pickling” (see the official Python documentation for a detailed description: https://docs.python.org/3/library/pickle.html).

If you’re using Google Colab and you want the pickled objects saved in your Google Drive, you need to specify the exact path where you’d like them to be saved.

Allow me to demonstrate…

Let’s look at some example code by YouTuber sentdex: Loading in your own data — Deep Learning basics with Python, TensorFlow and Keras (https://pythonprogramming.net/loading-custom-data-deep-learning-python-tensorflow-keras/?completed=/introduction-deep-learning-python-tensorflow-keras/)

At the very bottom of the tutorial the following code is shown:
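
The tutorial's code is not reproduced in this text. As a rough sketch of that kind of pickling step, assuming two small placeholder lists stand in for the tutorial's X and y, and using only relative file names:

```python
import pickle

# Placeholder data: stand-ins for the tutorial's features (X) and labels (y).
X = [[0.1, 0.2], [0.3, 0.4]]
y = [0, 1]

# Relative file names: the files are written to the current working directory.
with open("X.pickle", "wb") as pickle_out:
    pickle.dump(X, pickle_out)

with open("y.pickle", "wb") as pickle_out:
    pickle.dump(y, pickle_out)
```

Because no directory is given, Python decides the location from the working directory of the running notebook.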

If we were to follow this verbatim, Python would create two pickled objects, X.pickle and y.pickle, but where would they be saved in Colab?

First and foremost, mount your Google Drive by clicking the bottom icon on the far left, then clicking the icon on the right:

Click on ‘CONNECT TO GOOGLE DRIVE’:
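
Drive can also be mounted from a code cell instead of the file browser. A minimal sketch, assuming the notebook runs in Colab, where the google.colab package is available; the mount point /content/drive is the conventional choice and is an assumption here, not something taken from this article:

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; return True if mounted."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        print("Not running in Colab; nothing to mount.")
        return False
    drive.mount(mount_point)  # asks for authorisation the first time
    return True


mounted = mount_drive()
```

Outside Colab the function simply reports that there is nothing to mount, so the cell is safe to keep in a notebook that is also run locally.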

Side note:

I created a new folder in my Google Drive called ‘ugly images’ which is in the same location as the Dog and Cat folders:

Then I created this small function to move any images caught by the exception clause, that is, any ugly images present in the Dog and Cat folders, into the new folder. This helped clean out the Dog and Cat folders.

I also created a counter variable to see how many times this loop executed for the Dog and Cat folders:

Don’t forget to include import shutil at the top in the import statements:
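
The original function is not reproduced in this text. A minimal sketch of the idea, assuming a hypothetical loader (demo_loader) that raises an exception for unreadable files, and a temporary directory standing in for the Drive folders. The counter here tracks moved files, which may differ from what the original counter tracked:

```python
import os
import shutil
import tempfile


def move_ugly_image(src_path, ugly_dir):
    """Move one unreadable image into ugly_dir, creating the folder if needed."""
    os.makedirs(ugly_dir, exist_ok=True)
    shutil.move(src_path, os.path.join(ugly_dir, os.path.basename(src_path)))


def clean_folder(image_dir, ugly_dir, load_image):
    """Try to load every file in image_dir; move failures and count them."""
    moved = 0
    for name in os.listdir(image_dir):
        src_path = os.path.join(image_dir, name)
        try:
            load_image(src_path)
        except Exception:
            move_ugly_image(src_path, ugly_dir)
            moved += 1
    return moved


def demo_loader(path):
    """Hypothetical stand-in for a real image loader: rejects empty files."""
    if os.path.getsize(path) == 0:
        raise ValueError(f"cannot read {path}")


# Demo: a temporary directory stands in for the Drive folders.
base = tempfile.mkdtemp()
dog_dir = os.path.join(base, "Dog")
ugly_dir = os.path.join(base, "ugly images")
os.makedirs(dog_dir)
with open(os.path.join(dog_dir, "good.jpg"), "wb") as f:
    f.write(b"non-empty placeholder bytes")
open(os.path.join(dog_dir, "bad.jpg"), "wb").close()  # empty file

moved_count = clean_folder(dog_dir, ugly_dir, demo_loader)
print("Images moved:", moved_count)
```

In a real notebook, load_image would be whatever call reads and resizes the image inside the try block.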

Be warned: this script took over 1 hour to execute initially, so be patient. Subsequent executions took less than 2 minutes.

Back to the main article:

Let’s run the code to see where it places the pickled files:

Executing this code cell results in X.pickle and y.pickle appearing in the root directory. Note: they are NOT placed in the Google Drive that has been mounted.
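
One way to confirm this from code is to ask Python where a relative file name resolves. A small sketch, on the assumption that the pickle files were written with bare file names such as "X.pickle":

```python
import os

# Relative file names are resolved against the current working directory.
print("Working directory:", os.getcwd())

# The absolute location that the bare file name "X.pickle" maps to.
resolved = os.path.abspath("X.pickle")
print("X.pickle resolves to:", resolved)
```

If the printed location is not inside the mounted Drive folder, the files will not show up in Google Drive.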

So what can we do?

Option 1: Manually move the pickle files into a dataset directory in Google Drive by dragging and dropping:

This method does work:

Option 2: Best method: Specify the path to the directory in our code when we pickle:

Right-click the folder where you would like to place the pickle files and click ‘Copy path’:

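Scroll down to the bottom of the Python script where it states import pickle. Create a new variable called ‘path’ and paste in the value copied to the clipboard. Ensure there is a forward slash at the end of the path. This is very important: ‘path +’ simply joins two strings, so without the trailing slash the file name is glued onto the folder name and the file is not written inside the folder. In the two pickle_out statements, include ‘path +’ before the file name.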
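
A minimal sketch of this change, assuming a temporary directory stands in for the folder path copied from Colab and placeholder lists stand in for X and y:

```python
import os
import pickle
import tempfile

# Assumption: in Colab, 'path' would hold the string pasted from 'Copy path'.
# A temporary directory is used here so the example runs anywhere.
path = tempfile.mkdtemp() + "/"  # note the trailing forward slash

# Placeholder data standing in for the real X and y.
X = [[0.1, 0.2], [0.3, 0.4]]
y = [0, 1]

# 'path +' puts the directory in front of each file name.
with open(path + "X.pickle", "wb") as pickle_out:
    pickle.dump(X, pickle_out)

with open(path + "y.pickle", "wb") as pickle_out:
    pickle.dump(y, pickle_out)

print(sorted(os.listdir(path)))
```

An alternative that does not depend on the trailing slash is os.path.join(path, "X.pickle"), which inserts the separator when it is missing.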

Runtime -> Run all:

Lo and behold, the pickled files appear exactly where I want them:

In the final code cell, declare the path variable and assign it the datasets location as shown. Include ‘path +’ in the two pickle_in statements (as above). When this code cell is executed on its own, it doesn’t show any error messages, meaning the files were found and loaded:
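
A sketch of the loading side under the same assumptions: a temporary directory stands in for the datasets folder, and two small placeholder objects are written first so the example runs on its own:

```python
import pickle
import tempfile

# Assumption: in Colab, 'path' would be the datasets folder in Google Drive.
path = tempfile.mkdtemp() + "/"

# Write placeholder files first so there is something to load.
for name, obj in (("X.pickle", [[0.1, 0.2]]), ("y.pickle", [0])):
    with open(path + name, "wb") as pickle_out:
        pickle.dump(obj, pickle_out)

# Load the objects back, again with 'path +' in front of each file name.
with open(path + "X.pickle", "rb") as pickle_in:
    X = pickle.load(pickle_in)

with open(path + "y.pickle", "rb") as pickle_in:
    y = pickle.load(pickle_in)

print(X, y)
```

If the path is wrong, open raises FileNotFoundError, so a silent run is a reasonable sign that the files were read from the intended folder.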

Happy pickling of your trained objects in Colab…

If you have any questions or comments, feel free to leave them in the responses section…

Written by Mitesh Parmar