How you can optimize your CPU and GPU utilization

At GumGum, we use Computer Vision (CV) to leverage page visuals for our contextual targeting and brand suitability product called Verity. We process millions of images every hour, and at this rate, our long-term inference costs dwarf the upfront training costs, so we set out to reduce them. In this post, I’ll benchmark multi-threading for I/O operations and batch processing for inference, and show why both matter. Note that implementing these strategies may be overkill if your application only processes on the order of a few thousand images an hour.
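To make the two ideas concrete, here is a minimal sketch in Python using only the standard library. It is not GumGum's implementation: the helper names (`fetch_all`, `batched`, `infer_in_batches`) and the `fetch` and `predict_batch` callables are hypothetical placeholders for whatever download function and model your application uses. The first helper overlaps blocking I/O calls on a thread pool, and the second groups inputs so the model is called once per batch rather than once per image.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def fetch_all(urls: Sequence[str], fetch: Callable[[str], T],
              max_workers: int = 16) -> List[T]:
    """Run the blocking `fetch` call for every URL on a thread pool.

    `fetch` is a hypothetical stand-in for an image download. Threads help
    here because the work is I/O-bound: while one thread waits on the
    network, others can make progress. Results keep the input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def infer_in_batches(images: Sequence[T],
                     predict_batch: Callable[[Sequence[T]], List[R]],
                     batch_size: int = 32) -> List[R]:
    """Call `predict_batch` once per batch instead of once per image.

    `predict_batch` is a hypothetical stand-in for a model's forward pass
    that accepts a batch and returns one result per input.
    """
    results: List[R] = []
    for batch in batched(images, batch_size):
        results.extend(predict_batch(batch))
    return results
```

A call such as `infer_in_batches(fetch_all(urls, download_image), model_predict)` would chain the two steps, where `download_image` and `model_predict` are again assumed names. Suitable values for `max_workers` and `batch_size` depend on your network, hardware, and model, so they are best chosen by benchmarking.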

Let’s look at our application components:


Rashad Moarref

Software Engineer with entrepreneurial spirit. Passionate about building Machine Learning applications at scale. PhD in ECE, Univ. Minnesota. Caltech Alumnus.
