How to use TFRecord with Datasets and Iterators in Tensorflow with code samples

Prasad Pai
Published in YML Innovation Lab
7 min read · Aug 7, 2018

In the previous article, I demonstrated how to use Tensorflow’s Datasets and Iterators. There we created Datasets directly from Numpy arrays (or Tensors). Another way of creating a Dataset is from TFRecords. In this post, we will explore what a TFRecord is, how to use it with Datasets, and how to extract data with Iterators. We will also venture into an important but poorly documented topic: how to save images in a TFRecord. Finally, we will look into the common problem of TFRecord files becoming bloated.

What is a TFRecord?

A TFRecord is a single compact file that aggregates all the data (in whatever format) required for training or testing a model. The file can be moved across systems and is independent of the model that will be trained on it. A TFRecord file may also contain overhead data needed to reconstruct the original data, which would not have been needed had we trained without TFRecord. If the dataset is extremely large, we may have to split it across multiple TFRecord files of the same kind.

How to build a TFRecord? (In brief)

Any data in a TFRecord has to be stored as a list of bytes, a list of floats or a list of int64 values. Each such list has to be wrapped in a Feature class. Next, each feature is stored in a key-value pair, with the key being the title allotted to that feature. These titles are used later when extracting the data from the TFRecord. The resulting dictionary is passed as input to the Features class. Lastly, the Features object is passed as input to the Example class, and the serialized Example object is appended to the TFRecord. This procedure is repeated for every data record that has to be stored in the TFRecord. The code to create a TFRecord using simple data is given next.

Code to build TFRecord file using data structures of list and dictionary
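A minimal sketch of the steps described above, assuming Tensorflow 2.x. The record contents, the feature titles (name, scores, label) and the output file name are hypothetical placeholders chosen only for illustration.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical toy data: each dictionary is one data record.
records = [
    {"name": "cat_1", "scores": [0.1, 0.9], "label": 0},
    {"name": "dog_1", "scores": [0.8, 0.2, 0.5], "label": 1},
]


def make_example(record):
    """Wrap one record: typed lists -> Feature -> Features -> Example."""
    feature = {
        "name": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[record["name"].encode("utf-8")])),
        "scores": tf.train.Feature(
            float_list=tf.train.FloatList(value=record["scores"])),
        "label": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[record["label"]])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def write_tfrecord(path, records):
    """Serialize every record and append it to the TFRecord file."""
    with tf.io.TFRecordWriter(path) as writer:
        for record in records:
            writer.write(make_example(record).SerializeToString())


tfrecord_path = os.path.join(tempfile.mkdtemp(), "simple.tfrecord")
write_tfrecord(tfrecord_path, records)
```

Note that strings must be encoded to bytes before they go into a BytesList, and that the scores list may have a different length in every record.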

Note: There is another type (manner of creation) of TFRecord. Details given later.

Create TFRecord for Images

Now that we have a basic understanding of how to create a TFRecord for text-type data made up of dictionaries and lists, let us proceed to adding images. Our toy dataset consists of 10 images in two classes, i.e. cats and dogs, and is a mixture of PNG and JPEG images (available at https://github.com/Prasad9/TFRecord_Images/tree/master/Images). The images used are shown below:

Images in dataset

A very common approach to this problem is to convert the Numpy representation of each image into a string and store it in the TFRecord. Because the data representation changes, we also have to store the image shape as overhead data. Let’s look at the code.

Create TFRecord of Images stored as string data
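A minimal sketch of this approach, assuming Tensorflow 2.x and Numpy. A random array stands in for a decoded image, and the feature titles (image, rows, cols, channels, label) are illustrative assumptions. The sketch uses tobytes(), of which tostring() is a deprecated alias in recent Numpy releases.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical stand-in for a decoded image: random uint8 pixels (rows, cols, channels).
image = np.random.default_rng(0).integers(0, 256, size=(120, 80, 3), dtype=np.uint8)


def _int64(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def raw_image_example(image, label):
    """Store the raw pixel buffer plus the shape needed to rebuild the array."""
    rows, cols, channels = image.shape
    feature = {
        # tobytes() emits one byte per uint8 value, with no compression.
        "image": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image.tobytes()])),
        "rows": _int64(rows),
        "cols": _int64(cols),
        "channels": _int64(channels),
        "label": _int64(label),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


tfrecord_path = os.path.join(tempfile.mkdtemp(), "images_raw.tfrecord")
with tf.io.TFRecordWriter(tfrecord_path) as writer:
    writer.write(raw_image_example(image, label=0).SerializeToString())

# The file can never be smaller than the pixel buffer it contains.
print(image.size, os.path.getsize(tfrecord_path))
```

Without the stored rows, cols and channels, the flat byte string could not be reshaped back into an image when it is read.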

We have generated a file named images.tfrecord. Its size is a stunning 20.3 MB, whereas the individual image files in the dataset add up to a paltry 1.15 MB. This is perhaps one of the major reasons why many people dislike TFRecord and many developers stop using it.

Why did the TFRecord become so huge?

To analyse this problem, we start by looking at the shape of each image. Consider the dog_2.jpg image, whose shape is (1414, 943, 3). Multiplying these dimensions, i.e. 1414 x 943 x 3, gives 4000206. So in the Numpy representation (assuming the datatype is uint8), the image is represented by a total of 4000206 integers. These 4000206 numbers are then stored sequentially in a binary string when we call the Numpy method tostring(). That string holds the raw, uncompressed pixel values at one byte per uint8 number, which already comes to about 3.81 MB for this single image.

To give a small illustration of how long such a string would be if the numbers were written out as text (which is not technically what tostring() does), assume each uint8 number is 100 or more, so that each number takes three characters. Every number then has to be separated by a delimiter, let’s say the comma character (,). With 4000206 numbers in our Numpy example, the expected length of the string is:

4000206 x (3 char + 1 delimiter char) = 16000824 chars.

If one char is one byte, that translates to about 15.26 MB. Now isn’t that memory consuming?
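The arithmetic can be checked with a few lines of Numpy. This is only a sketch: a zero-filled array stands in for dog_2.jpg, since only its shape matters for the size, and tobytes() is used in place of its deprecated alias tostring().

```python
import numpy as np

shape = (1414, 943, 3)                   # shape of dog_2.jpg quoted in the text
image = np.zeros(shape, dtype=np.uint8)  # pixel values do not affect the size

num_values = image.size                  # number of uint8 values
raw_bytes = len(image.tobytes())         # raw buffer: one byte per value
text_chars = num_values * (3 + 1)        # illustration: 3 digits + 1 delimiter

print(num_values)                        # 4000206
print(raw_bytes)                         # 4000206
print(text_chars)                        # 16000824
print(round(raw_bytes / 2**20, 2))       # 3.81 (MB)
print(round(text_chars / 2**20, 2))      # 15.26 (MB)
```

Either way, the stored size depends only on the pixel count and not on how well the original JPEG or PNG file was compressed.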

How to overcome TFRecord size problem?

Let’s look at another property of an image: its storage size. In my experience, most images used in training are small on disk, ranging from a few KB to a few hundred KB regardless of image shape. Hence, let us store the bytes of the image files directly in the TFRecord. Tensorflow provides the tf.gfile.FastGFile class, which can read images in bytes format (in Tensorflow 2.x, the equivalent is tf.io.gfile.GFile). Let us look at the revised code:

Create TFRecord of Images stored as bytes
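A minimal sketch of storing the encoded file bytes, assuming Tensorflow 2.x, where tf.io.gfile.GFile takes the place of tf.gfile.FastGFile. The gradient image, the file name cat_1.png and the feature titles are hypothetical stand-ins for the real dataset.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

work_dir = tempfile.mkdtemp()

# Hypothetical stand-in for a dataset image: a smooth gradient saved as a PNG file.
pixels = np.zeros((120, 80, 3), dtype=np.uint8)
pixels[:, :, 0] = np.arange(120, dtype=np.uint8)[:, None]
image_path = os.path.join(work_dir, "cat_1.png")
tf.io.write_file(image_path, tf.io.encode_png(pixels))


def _int64(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def encoded_image_example(image_path, label):
    """Store the file's encoded bytes unchanged, plus shape overhead data."""
    with tf.io.gfile.GFile(image_path, "rb") as f:
        image_bytes = f.read()
    # Decode once, only to record rows, cols and channels.
    rows, cols, channels = tf.io.decode_image(image_bytes).numpy().shape
    feature = {
        "image": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image_bytes])),
        "rows": _int64(rows),
        "cols": _int64(cols),
        "channels": _int64(channels),
        "label": _int64(label),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


tfrecord_path = os.path.join(work_dir, "images_bytes.tfrecord")
with tf.io.TFRecordWriter(tfrecord_path) as writer:
    writer.write(encoded_image_example(image_path, label=0).SerializeToString())

# Compare the TFRecord size with the size of the raw pixel buffer.
print(os.path.getsize(tfrecord_path), pixels.size)
```

Because the compressed bytes are stored as they are, the TFRecord grows by roughly the size of each image file plus a small per-record overhead.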

Now the size of our images.tfrecord file is 1.2 MB, which is almost the same as the combined size of the individual images.

Reduce TFRecord size further

Now, let us try to bring the TFRecord size down further. PNG is a lossless format, so PNG images preserve more information, including sharper edge details, at the cost of a larger storage size. Converting them to JPEG will blur the images very slightly, but it rewards you with a measurable reduction in storage size. Tensorflow also lets you choose how much quality to retain during the conversion. Let us look at the relevant parts of the code.

Create TFRecord of Images stored as bytes and PNG images encoded as JPEG
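A minimal sketch of the conversion step, assuming Tensorflow 2.x. The noisy RGBA array is a hypothetical stand-in for the contents of a PNG file, and the helper names png_to_jpeg_bytes and image_feature are illustrative, not taken from the original code.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for a PNG file's contents: noisy RGBA pixels encoded in memory.
rgba = np.random.default_rng(0).integers(0, 256, size=(64, 64, 4), dtype=np.uint8)
png_bytes = tf.io.encode_png(rgba).numpy()


def png_to_jpeg_bytes(png_bytes, quality=95):
    """Re-encode PNG bytes as JPEG bytes at the given quality (0 to 100)."""
    # channels=3 requests RGB output, so an alpha channel is not kept.
    image = tf.io.decode_png(png_bytes, channels=3)
    return tf.io.encode_jpeg(image, quality=quality).numpy()


def image_feature(image_bytes):
    """Wrap encoded image bytes as a Feature ready to go into an Example."""
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes]))


for quality in (100, 95, 90):
    jpeg_bytes = png_to_jpeg_bytes(png_bytes, quality=quality)
    print(quality, len(jpeg_bytes), "bytes; PNG was", len(png_bytes), "bytes")
```

The returned JPEG bytes can be stored in the TFRecord exactly like the file bytes read from disk; only images that arrive as PNG need to pass through this helper.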

Do not follow this step if the alpha channel (i.e. the A of RGBA) is part of the input to your model, or if your model really needs extra-sharp details in the image.

Now, maintaining 100 percent encoding quality, we have reduced the earlier TFRecord file from 1.2 MB to 579.5 KB. If we reduce the quality marginally to 95 percent, the TFRecord file size drops dramatically to 402.4 KB. The impact of the quality change on the dataset images is shown in the Results section; the TFRecord file sizes for different quality settings are tabulated below.

TFRecord file size as encoding quality is changed

Can we reduce further?

Yes, we can improve further. This time, however, we will not change our code but use an external image optimization tool. A good image optimization tool reduces the size of your image with no degradation in image quality. I always use Trimage. The results obtained with Trimage are tabulated below:

TFRecord file size as encoding quality is changed, after optimizing the images with Trimage

This step should perhaps be carried out first, before we start the conversion into TFRecord.

Extracting data from TFRecord (In brief)

Now that our TFRecords are ready, it is time to send them into the training pipeline. The first step is to initialize a TFRecordDataset with all the TFRecord file paths. After that, we have to extract the various features present in the TFRecords, specifying the same keys that were used when the TFRecord was created. If we know beforehand the number of items in the list of bytes, floats or int64 values for each data record, we can use FixedLenFeature; otherwise, we use the VarLenFeature class. Next, the parse_single_example API extracts a dictionary object from each data record. Let us look at the extraction procedure for the TFRecord created earlier with simple text dictionary data.

Code to extract TFRecord data made with list and dictionary
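A minimal sketch of the extraction, assuming Tensorflow 2.x with eager execution, where the Dataset can be iterated directly instead of through an explicit Iterator. To stay self-contained, the sketch first writes a tiny TFRecord; the feature titles and values are hypothetical.

```python
import os
import tempfile

import tensorflow as tf

# Write a tiny TFRecord first so that the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "simple.tfrecord")
toy_records = [("cat_1", [0.1, 0.9], 0), ("dog_1", [0.8, 0.2, 0.5], 1)]
with tf.io.TFRecordWriter(path) as writer:
    for name, scores, label in toy_records:
        example = tf.train.Example(features=tf.train.Features(feature={
            "name": tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[name.encode("utf-8")])),
            "scores": tf.train.Feature(
                float_list=tf.train.FloatList(value=scores)),
            "label": tf.train.Feature(
                int64_list=tf.train.Int64List(value=[label])),
        }))
        writer.write(example.SerializeToString())

# The keys must match the titles used when the TFRecord was created.
feature_spec = {
    "name": tf.io.FixedLenFeature([], tf.string),  # exactly one item
    "scores": tf.io.VarLenFeature(tf.float32),     # length varies per record
    "label": tf.io.FixedLenFeature([], tf.int64),  # exactly one item
}


def parse(serialized):
    parsed = tf.io.parse_single_example(serialized, feature_spec)
    # VarLenFeature yields a SparseTensor; convert it to a dense tensor.
    parsed["scores"] = tf.sparse.to_dense(parsed["scores"])
    return parsed


dataset = tf.data.TFRecordDataset([path]).map(parse)

rows = [
    (r["name"].numpy().decode("utf-8"),
     r["scores"].numpy().tolist(),
     int(r["label"].numpy()))
    for r in dataset
]
print(rows)
```

Here scores is declared with VarLenFeature because its length differs between the two records, while name and label always hold a single item.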

Extract Images from TFRecord

We extend the same concept used for simple TFRecord files to extract images as well. With the help of the tf.image.decode_image API, we can decode images stored in any supported format (BMP, GIF, JPEG or PNG). As a precaution, we verify that the shape of the decoded image matches the overhead data of rows, cols and channels stored in the TFRecord. Let us dive into the code for extracting images from a TFRecord.

Code to extract images from TFRecord stored as bytes
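A minimal sketch of image extraction, assuming Tensorflow 2.x with eager execution. The sketch writes one PNG-encoded image into a TFRecord and reads it back; the gradient image and the feature titles are hypothetical, and the shape check is done in plain Python after decoding.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical stand-in image, PNG-encoded in memory and written to a TFRecord.
pixels = np.zeros((120, 80, 3), dtype=np.uint8)
pixels[:, :, 1] = np.arange(80, dtype=np.uint8)[None, :]
png_bytes = tf.io.encode_png(pixels).numpy()


def _int64(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


path = os.path.join(tempfile.mkdtemp(), "images.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    example = tf.train.Example(features=tf.train.Features(feature={
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[png_bytes])),
        "rows": _int64(120), "cols": _int64(80), "channels": _int64(3),
        "label": _int64(0),
    }))
    writer.write(example.SerializeToString())

feature_spec = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "rows": tf.io.FixedLenFeature([], tf.int64),
    "cols": tf.io.FixedLenFeature([], tf.int64),
    "channels": tf.io.FixedLenFeature([], tf.int64),
    "label": tf.io.FixedLenFeature([], tf.int64),
}


def parse(serialized):
    parsed = tf.io.parse_single_example(serialized, feature_spec)
    image = tf.io.decode_image(parsed["image"])  # detects the format from the bytes
    stored_shape = tf.stack([parsed["rows"], parsed["cols"], parsed["channels"]])
    return image, stored_shape, parsed["label"]


dataset = tf.data.TFRecordDataset([path]).map(parse)

decoded = []
for image, stored_shape, label in dataset:
    image = image.numpy()
    # Precaution: the decoded shape must match the stored overhead data.
    if image.shape != tuple(stored_shape.numpy().tolist()):
        raise ValueError("decoded shape does not match rows, cols, channels")
    decoded.append((image, int(label.numpy())))
```

Since PNG is lossless, the decoded array here is identical to the original pixels; for JPEG-encoded records the values would only be approximately equal.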

Results

Lastly, let us display how the extracted images look for each image format and conversion.

These are the outputs for JPEG and PNG images when the PNG images are not converted to JPEG.

Input and output JPEG, PNG images

These are the outputs for PNG images when they are converted to JPEG with the quality maintained at 100%, 95% and 90% respectively. Notice, however, that the transparent sections of the input PNG image have been converted to black.

PNG image converted to JPEG at 100%, 95% and 90% image quality

Do take a look at my detailed post on how to use Tensorflow’s Datasets and Iterators, which serves as part 1 of this article.

You can also check the code used in this article in my Github repository.

I have explained how to build a TFRecord only briefly, as Thomas Gamauf has already explained how to create them step by step in a better manner. Do take a look at his article as well.

Do leave your valuable comments on what you think of this article. Also, let me know if I can improve any of the aspects I have mentioned.
