Norman Heckscher
1 min read · May 29, 2017


Great summary. Thanks for sharing. It’s similar to the path I’ve been down recently; I’m currently in the middle of benchmarking.

Instead of buying new gear, I sourced and reused old data centre hardware and purchased two new 1080 8 GB GPUs. My spend has been higher, at approximately US$2,600; however, I’ve been able to get Xeon 2600s that allow both GPUs to fully utilise their x16 PCIe slots. Pipelining the data into the GPUs is certainly a challenge. For datasets that don’t fit in memory (or that require dynamic augmentation), my first lot of numbers indicates that my GPUs are spending 20% of their time idle. As I’ve only just reached this point in the last few days and have only had the chance to test a couple of models, this idle time will undoubtedly change with code optimisation and with different datasets and models.
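That idle fraction comes from the GPU waiting on the CPU-side input pipeline, and the usual fix is to overlap batch preparation with compute via prefetching. The sketch below simulates this with a background loader thread and hypothetical timings (20 ms load, 80 ms compute — numbers chosen only to illustrate a roughly 20% idle loop, not measured from my rig):

```python
import queue
import threading
import time

LOAD_S = 0.02     # hypothetical CPU-side batch preparation time
COMPUTE_S = 0.08  # hypothetical GPU step time

def train_sequential(steps):
    """Load then compute in one thread: the 'GPU' idles during each load."""
    busy = 0.0
    start = time.perf_counter()
    for _ in range(steps):
        time.sleep(LOAD_S)     # GPU idle while the batch is prepared
        time.sleep(COMPUTE_S)  # GPU busy
        busy += COMPUTE_S
    total = time.perf_counter() - start
    return 1.0 - busy / total  # idle fraction

def train_prefetched(steps):
    """A background thread keeps a small queue of batches ready,
    so loading overlaps the previous compute step."""
    q = queue.Queue(maxsize=2)

    def producer():
        for i in range(steps):
            time.sleep(LOAD_S)  # batch prep happens off the compute thread
            q.put(i)

    threading.Thread(target=producer, daemon=True).start()
    busy = 0.0
    start = time.perf_counter()
    for _ in range(steps):
        q.get()                # usually ready: load overlapped compute
        time.sleep(COMPUTE_S)
        busy += COMPUTE_S
    total = time.perf_counter() - start
    return 1.0 - busy / total

if __name__ == "__main__":
    print(f"sequential idle fraction: {train_sequential(20):.0%}")
    print(f"prefetched idle fraction: {train_prefetched(20):.0%}")
```

With these timings the sequential loop idles roughly 20% of the time, while the prefetched loop pays the load cost only once up front. Real pipelines (e.g. TensorFlow queue runners in 2017) work on the same principle, just with GPU streams instead of `time.sleep`.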

What has become very clear to me is that deep learning is not just about tensors flowing through the model; it’s also about tensors flowing through the hardware.

