Edit: As a few people have pointed out: “probably the biggest gotcha that is unique to DL/multi-GPU is to pay attention to the PCIe lanes supported by the CPU/motherboard” (by Andrej Karpathy). We want to have each GPU have 16 PCIe lanes so it eats data as fast as possible (16 GB/s for PCIe 3.0). This means that for two cards we need 32 PCIe lanes. However, the CPU I have picked has only 16 lanes. So 2 GPUs would run in 2x8 mode (instead of 2x16). This might be a bottleneck, leading to less than ideal utilization of the graphics cards. Thus a CPU with 40 lines is recommended.