Benchmarking Tensorflow Performance and Cost Across Different GPU Options

Vincent Chu
Apr 20, 2017 · 4 min read

Machine learning practitioners— from students to professionals — understand the value of moving their work to GPUs . Without one, certain tasks simply become unfeasible for lack of computing power. However, the available options are confusing. Do I build my own? Rent one from a vendor? How much real-world performance and cost should I expect?

With that in mind, I benchmarked several GPU options against a realistic GPU-heavy workload, implemented in Tensorflow. I chose Amazon’s base GPU instance and GPU instances from Paperspace, an Initialized Capital portfolio company. I also added my own personal Nvidia GPU and my laptop CPU:

The benchmark itself was relatively simple. I installed Tensorflow v1.0 on each machine, and fine-tuned the Inception v4 model from an existing checkpoint. Performance, expressed in minibatches per second, was monitored after each run was started (using this script).

Results

Unsurprisingly, the results demonstrate how poorly suited CPUs are to compute-heavy machine learning tasks, even a relatively new Macbook Pro. Choosing any GPU, regardless of performance, improves performance by over an order of magnitude (43x for Amazon’s p2.xlarge, 167x for Nvidia’s 1080 Ti).

Not surprising: Nvidia’s recently released 1080 Ti performs extremely well, especially at its $699 price point. Surprising: How bad Amazon’s p2.xlarge offering is against its competitors.

Interestingly, Amazon’s Tesla K80-based p2 instances are clearly showing their age in performance and cost. Paperspace’s base GPU+ is 15% more performant at less than half the cost while its P5000 clocks in at 3x the performance at two-thirds the cost.

Of course, the clear performance winner is Nvidia’s recently-released 1080 Ti GPU. However, you’d have to build your own machine and bear the fixed cost (roughly $2,500) of buying the components. It’s not for everybody, especially those who anticipate only occasional GPU usage.

Cost

Unlike non-GPU instances, users pay a steep premium for GPU compute time; it’s important, therefore, to get a good sense for what your costs will be for a typical task.

Building your own machine around a GPU is clearly the most economical choice for heavy users; costs are essentially fixed and can be amortized over the life of the machine. Adjusting for performance, the build cost of a DIY machine can be recouped in just under 30 days of continuous usage (assuming a build cost of ~$2,500).

Paperspace is an economical choice for those who do not want or need to invest in their own hardware.

For those who are reluctant to commit to building their own machine, Amazon is clearly no longer an economical option for single-GPU workloads. Not only are their GPUs old and underpowered, their costs are relatively high. Adjusting for cost, Paperspace’s GPU+ instance type is about 2.7x cheaper than Amazon, while its P5000 is a clear winner, both in performance and cost.

What Should I Get?

Here is my take on what people ought to get, depending on their use case:

Update (4/21/2017): Folks on r/MachineLearning have pointed out that Amazon spot instances have been running around $0.20/hour, which changes the economics of running on Amazon. While this price seems to be a relatively recent development, it may be worth exploring for your application.

Building something interesting? Initialized Capital would love to chat with you.

Initialized Capital

Stories and thoughts on early-stage entrepreneurship from a venture capital firm. We fund and help startups with their first check. https://initialized.com/our-startups

Thanks to Kim-Mai Cutler.

Vincent Chu

Written by

Current: Partner at @initializedcap. Former: @ClaraLending, @twitter, @posterous. Studied Physics and Mathematics at Stanford and Harvard.

Initialized Capital

Stories and thoughts on early-stage entrepreneurship from a venture capital firm. We fund and help startups with their first check. https://initialized.com/our-startups