How to speed up PyTorch training

Max Ng🔥
3 min read · Jan 8, 2020

Training deep learning models can be time-consuming.

Training a common ResNet-50 model on the ImageNet dataset with a single GPU can take more than a week to complete. To save money and time, it is important to use the right configuration and parameters.

The num_workers and pin_memory arguments of DataLoader can greatly affect how long it takes to load the data (https://zhuanlan.zhihu.com/p/39752167). The question is: how do you easily find the optimal num_workers and pin_memory? I modified a script to help you test out the optimal parameters automatically.

Take the official ImageNet training in PyTorch as an example.

This is the main_worker function in main.py:

def main_worker(gpu, ngpus_per_node, args):
    global best_acc1
    args.gpu = gpu
    if args.gpu is not None:
        print("Use GPU: {} for training".format(args.gpu))
    ...
    train_loader = torch.utils.data.DataLoader(
        train_dataset, batch_size=args.batch_size, shuffle=(train_sampler is None),
        num_workers=args.workers, pin_memory=True, sampler=train_sampler)
    val_loader = torch.utils.data.DataLoader(
        datasets.ImageFolder(valdir, transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            normalize,
        ])),
        batch_size=args.batch_size, ...
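
To find good values without guesswork, you can time a fixed number of batches for each num_workers / pin_memory combination and keep the fastest one. Below is a minimal sketch of such a benchmark, assuming train_dataset has already been built as in the snippet above; the function name find_best_loader_params, the worker range, and the 50-batch window are illustrative choices, not the exact script from this post.

import time

import torch

def find_best_loader_params(train_dataset, batch_size=256, max_workers=8, num_batches=50):
    # Grid-search num_workers / pin_memory and time how long it takes
    # to pull a fixed number of batches from the DataLoader.
    best = None
    for pin_memory in (False, True):
        for num_workers in range(0, max_workers + 1, 2):  # 0 = load in the main process
            loader = torch.utils.data.DataLoader(
                train_dataset, batch_size=batch_size, shuffle=True,
                num_workers=num_workers, pin_memory=pin_memory)
            start = time.time()
            for i, _batch in enumerate(loader):
                if i >= num_batches:
                    break
            elapsed = time.time() - start
            print("num_workers={:2d} pin_memory={!s:5} : {:.2f}s / {} batches".format(
                num_workers, pin_memory, elapsed, num_batches))
            if best is None or elapsed < best[0]:
                best = (elapsed, num_workers, pin_memory)
    return best

Run it once before a long training job, then plug the winning values into the real DataLoader. The best setting depends on the machine (CPU cores, disk speed) and on the transforms, so it is worth re-running the benchmark when either changes.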
