TensorFlow 2.0 comes with a set of predefined, ready-to-use datasets. They are quite easy to use and are often handy when you are just playing around with new models.
In this short post, I will show you how you can use a predefined TensorFlow dataset.
Make sure that you have the required packages installed:
pip install -q tensorflow-datasets tensorflow
Using a TensorFlow dataset
In this example, we will use a small dataset called imagenette, from the TensorFlow Datasets catalog:
Imagenette is a subset of 10 easily classified classes from the Imagenet dataset. It was originally prepared by Jeremy…
You can visit this link to get a complete list of available datasets.
Load the dataset
We will use the tfds.builder function to load the dataset.
- we are setting a loading option to true so that we can perform some manipulations on the data.
- we are creating an imagenette_info object that contains the information about the dataset. Printing it shows metadata such as the dataset name, version, description, features, and splits.
Get split size
We can get the size of the train and validation sets using the imagenette_info object.
This is useful when defining the validation_steps for training the model.
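As a rough sketch of how the split sizes feed into the step counts: the helpers split_sizes and steps_for are hypothetical names, and the sizes and batch size used below are made-up numbers for illustration, not the real imagenette counts.

```python
import math


def split_sizes(info):
    """Return (train, validation) example counts from a tfds DatasetInfo."""
    return (
        info.splits["train"].num_examples,
        info.splits["validation"].num_examples,
    )


def steps_for(num_examples, batch_size):
    """Batches needed to see every example once (the last one may be partial)."""
    return math.ceil(num_examples / batch_size)


# Hypothetical sizes for illustration; with the real dataset you would call
# split_sizes(imagenette_info) instead.
train_size, valid_size = 10_000, 2_500
batch_size = 32

steps_per_epoch = steps_for(train_size, batch_size)
validation_steps = steps_for(valid_size, batch_size)
print(steps_per_epoch, validation_steps)  # 313 79
```

Rounding up means the final, smaller batch is still counted; use floor division instead if you prefer to drop it.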
Next, we will create batches so that the data can be fed to the model in manageable chunks. On low-RAM devices or for large datasets, it is usually not possible to load the whole dataset into memory at once.
Note: We are taking the train and validation splits and resizing all images to 448 x 448. You can perform any other manipulation too using the map function. It is useful for resizing or normalizing the images, or for any other preprocessing step.
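Here is a minimal sketch of the resize-and-batch step. To keep it runnable without a download, it uses a handful of random stand-in images; with the real data you would apply the same pipeline to the train and validation splits. The preprocess function, the batch size of 4, and the division by 255 for normalization are assumptions chosen for the demo.

```python
import tensorflow as tf

IMG_SIZE = 448  # target height and width, as in the note above
BATCH_SIZE = 4  # hypothetical value, kept small for the demo


def preprocess(image, label):
    """Resize one image to IMG_SIZE x IMG_SIZE and scale pixels to [0, 1]."""
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))  # returns float32
    image = image / 255.0
    return image, label


# Stand-in for an (image, label) dataset: 8 random 500 x 375 RGB "images".
images = tf.cast(
    tf.random.uniform((8, 500, 375, 3), maxval=256, dtype=tf.int32), tf.uint8
)
labels = tf.range(8, dtype=tf.int64)
raw_ds = tf.data.Dataset.from_tensor_slices((images, labels))

# Resize before batching: real images differ in size and cannot be stacked
# into one tensor until they share a shape. tf.data.AUTOTUNE needs TF 2.4+.
batches = (
    raw_ds
    .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(BATCH_SIZE)
    .prefetch(tf.data.AUTOTUNE)
)

image_batch, label_batch = next(iter(batches))
print(image_batch.shape)  # (4, 448, 448, 3)
```

Because the pipeline is lazy, only a few batches are held in memory at a time, which is what makes this approach workable on low-RAM machines.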
That’s it. You can now use this data for your model. Here’s the link to the Google Colab with the complete code.