When training a neural net model, time is of the essence. This is why different machine configurations including GPUs, TPUs and multiple servers are utilized. I’ve been exploring the YouTube-8M project for the last couple months and there are previous posts about the project, the video dataset, the algorithms and…