At GumGum, we use Computer Vision (CV) to leverage page visuals for Verity, our contextual-targeting and brand-suitability product. We process millions of images every hour, and at this scale, our long-term inference costs dwarf the upfront training costs. So we tackled this issue head-on. In this post, I’ll benchmark and highlight the importance of multi-threading for I/O operations and batch processing for inference. Note that these strategies may be overkill if your application handles only a few thousand images an hour.
Let’s look at our application components:
API. The API provides an interface between the client and the CV module. Minimally, a client request contains an image URL and a task, e.g. “check whether the image depicts violence”. Here, we assume that the API performs at the desired scale and is not a bottleneck in our application. …
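To make the API contract concrete, here is a minimal sketch of what such a client request might look like. The field names (`image_url`, `task`) and the task identifier are illustrative assumptions, not GumGum's actual API schema:

```python
import json

# Hypothetical request payload: an image URL plus a task name.
# Field names here are assumptions for illustration only.
request = {
    "image_url": "https://example.com/page-photo.jpg",
    "task": "violence-detection",
}

# The client would serialize this to JSON and POST it to the API;
# the API then hands the work off to the CV module.
payload = json.dumps(request)
print(payload)
```

In practice the API would validate the payload, enqueue or forward it to the CV module, and return the task result (or a job ID for asynchronous processing) to the client.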