Deep Learning on Supercomputers
A hands-on guide to scaling a Deep Learning application on the BSC’s CTE-POWER cluster
This post will be used as documentation for the PATC course Introduction to Big Data Analytics at BSC.
In a previous post, we showed that supercomputers are a key enabler of progress in Artificial Intelligence, and that the growth in effective compute over recent years has been driven largely by increased parallelization and distribution of the algorithms.
This post demonstrates how these supercomputers can be used, specifically the BSC’s CTE-POWER cluster, in which each server has two IBM Power9 CPUs and four NVIDIA V100 GPUs.
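As a quick sanity check once you are logged into one of these nodes, TensorFlow can report the accelerators it sees. The snippet below is a minimal sketch, assuming TensorFlow is already available in your environment; on a CTE-POWER node it should report the node’s four V100 GPUs:

```python
import tensorflow as tf

# List the accelerators visible to TensorFlow on the current node.
gpus = tf.config.list_physical_devices('GPU')
print(f"TensorFlow {tf.__version__} sees {len(gpus)} GPU(s):")
for gpu in gpus:
    print(" ", gpu.name)
```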
In this series of posts, we will use the TensorFlow framework; however, the equivalent PyTorch code does not differ much. We will use the Keras API because, since the release of TensorFlow 2.0, the tf.keras.Model API has become the primary way of building neural networks, particularly those not requiring custom training loops.
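As a minimal illustration of this API (a sketch with an arbitrary toy architecture, not specific to the cluster setup), a small classifier can be defined, compiled, and inspected in just a few lines:

```python
import tensorflow as tf

# A small fully connected classifier built with the Keras Sequential API
# (tf.keras.Sequential is a subclass of tf.keras.Model).
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),    # e.g. MNIST-sized images
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),  # 10 output classes
])

# Standard compile/fit workflow; no custom training loop is needed.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.summary()
```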