How does DataLoader work in PyTorch?
Why use DataLoader?
Because you don’t want to implement your own mini-batch code every single time. And since you’re going to write some wrapper for it anyway, the folks at FAIR figured they’d just do it for you and save you the trouble. It’s also standardized, so anyone reading your code can easily figure out how you prepare your data. And I think the wrapper they’ve come up with is pretty good.
How it works
Basically, the DataLoader works with a Dataset object, so to use the DataLoader you need to get your data into this Dataset wrapper. To do that, you only need to implement two magic methods: __getitem__ and __len__. __getitem__ takes an index and returns an (x, y) pair as a tuple. __len__ is just your usual __len__ that returns the size of the dataset. And that’s that.
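Before the real thing, here’s a tiny toy version of that protocol, just to show how little is actually required. (This is a plain-Python sketch with made-up numbers; in actual PyTorch code you’d subclass torch.utils.data.Dataset, as in the sample below.)

```python
# A minimal toy dataset implementing the two magic methods.
class ToyDataset:
    def __init__(self):
        # Five fake (x, y) pairs: x is a feature, y is a label.
        self.x = [10, 20, 30, 40, 50]
        self.y = [0, 1, 0, 1, 0]

    def __getitem__(self, index):
        # Return one (x, y) pair for the given index.
        return self.x[index], self.y[index]

    def __len__(self):
        # Return the size of the dataset.
        return len(self.x)

dataset = ToyDataset()
print(len(dataset))  # -> 5
print(dataset[2])    # -> (30, 0)
```

That’s the whole contract: if indexing and len() work on your object, the DataLoader can batch it.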
Super easy sample code
Here’s a snippet of some super easy sample code, just for you to get a rough idea of how this works.
from glob import glob

from scipy import misc
import torch
from torch.utils.data import Dataset, DataLoader


class SomeImageDataset(Dataset):
    """The training image dataset."""

    def __init__(self, x_path):
        x_filenames = glob(x_path + '*.png')  # Get the filenames of all training images
        self.x_data = [torch.from_numpy(misc.imread(filename))
                       for filename in x_filenames]  # Load the images into torch tensors
        self.y_data = target_label_list  # Class labels, assumed to be defined elsewhere
        self.len = len(self.x_data)  # Size of the data

    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return self.len
To load this data, just use the DataLoader:
dataset = SomeImageDataset(x_path)
train_loader = DataLoader(dataset=dataset,
                          batch_size=32,
                          shuffle=True,
                          num_workers=2)

for epoch in range(num_epochs):
    for i, data in enumerate(train_loader):
        x_imgs, labels = data
        # Do whatever you want with the data...
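If you’re curious what the DataLoader is roughly doing for you in that loop, here’s a plain-Python sketch of the idea (shuffle the indices, slice them into fixed-size batches, collate the samples). The function name and the fake data are mine, and the real DataLoader does a lot more (workers, pinned memory, custom collate functions), but this is the core loop you no longer have to write.

```python
import random

# A rough sketch of what DataLoader does under the hood:
# shuffle the indices, then yield fixed-size (xs, ys) batches.
def simple_batches(dataset, batch_size, shuffle=True, seed=None):
    indices = list(range(len(dataset)))          # uses __len__
    if shuffle:
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        chunk = indices[start:start + batch_size]
        samples = [dataset[i] for i in chunk]    # uses __getitem__
        xs, ys = zip(*samples)                   # collate pairs into a batch
        yield list(xs), list(ys)

# Ten fake (x, y) pairs, just for demonstration.
data = list(zip(range(10), [i % 2 for i in range(10)]))
batches = list(simple_batches(data, batch_size=4, shuffle=False))
print(len(batches))  # -> 3 (two full batches of 4, one leftover of 2)
print(batches[0])    # -> ([0, 1, 2, 3], [0, 1, 0, 1])
```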
I have to confess that for a long time I didn’t want to learn how to use DataLoader because I was super lazy, and every time I needed mini batches I just copied and pasted the code from my old projects.
So honestly, I’m just writing this for my old self and those who are as lazy as him.
Peace!