With the 2020 U.S. Presidential election approaching, talk of who Democratic nominee Joe Biden will tap as his VP candidate has intensified. Given the large volume of publisher inventory that we have access to for brand safety and contextual analysis, we decided to dig into the “Veepstakes” and analyze the frequency of mentions and dominant document sentiment among the most commonly listed potential running mate candidates.

For this analysis, we mined data from Verity, our Contextual Intelligence platform that leverages Computer Vision and Natural Language Processing, to track the mentions of Joe Biden + each of the potential running mates…


How you can optimize your CPU and GPU utilization

At GumGum, we use Computer Vision (CV) to leverage page visuals for our contextual targeting and brand suitability product, Verity. We process millions of images every hour, and at this rate, our long-term inference costs dwarf the upfront training costs. So, we tackled this issue head-on. In this post, I’ll benchmark and highlight the importance of multi-threading for I/O operations and batch processing for inference. Note that implementing these strategies may be overkill if your application’s scale is on the order of a few thousand images an hour.
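As a rough illustration of the two ideas named above, here is a minimal Python sketch that loads inputs with a thread pool (I/O-bound work) and then hands them to the model in fixed-size batches. It is not GumGum's implementation: `run_inference`, `batched`, `fake_load_image`, and `fake_predict_batch` are hypothetical names, and the loader and model are stand-ins so the example runs without a network or a GPU.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def run_inference(
    urls: Sequence[str],
    load_image: Callable[[str], object],
    predict_batch: Callable[[List[object]], List[object]],
    batch_size: int = 32,
    num_threads: int = 16,
) -> List[object]:
    """Load each batch of inputs concurrently, then run one model call per batch."""
    results: List[object] = []
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        for chunk in batched(urls, batch_size):
            # pool.map returns results in input order, so predictions
            # stay aligned with the original URLs.
            images = list(pool.map(load_image, chunk))
            results.extend(predict_batch(images))
    return results


# Hypothetical stand-ins for a real downloader and a real model.
def fake_load_image(url: str) -> int:
    return len(url)  # pretend the "image" is just the URL length


def fake_predict_batch(images: List[int]) -> List[int]:
    return [value * 2 for value in images]  # pretend "prediction"


if __name__ == "__main__":
    print(run_inference(["a", "bb", "ccc"], fake_load_image, fake_predict_batch,
                        batch_size=2, num_threads=2))
```

This sketch keeps loading and inference sequential per batch for simplicity; a production pipeline would more likely prefetch the next batch while the current one is being processed.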

Bottlenecks in a Typical Inference Workflow

Let’s look at our application components:


With the advent of cloud and serverless technologies, the number of architectural options available for building systems has increased manyfold. I see many teams struggling to decide whether to use serverless or container-based systems in their organization. My objective is to present one real-life case we encountered at GumGum and the thought process behind choosing containers over serverless.

Verity - Headless Content Extractor

GumGum takes pride in its contextual intelligence capabilities. Our proprietary contextual intelligence API Verity uses Natural Language Processing in combination with Computer Vision to derive context and brand safety information for a webpage. …


In the previous post, we talked about using a multilabel classifier for threat classification to address cases where the threat in the image is not the salient object. In this post, we are going to discuss using Exploratory Data Analysis (EDA) for our multilabel dataset and its results. Since our model needs to analyze millions of images, keeping its inference time low is of utmost priority for better scalability. In that regard, we also present results on using mixed precision training for our EfficientNet-based threat classifier. Since this is a proprietary dataset, the specifics…


At GumGum, providing a brand-safe environment for our advertisers is of utmost priority. To achieve this, the publisher’s inventory is scanned to avoid ad misplacement. As CV scientists, we build systems that can detect and classify threats present in the publisher’s inventory, which could be images and/or videos. To detect and classify these threats, we employ image classification algorithms based on convolutional neural networks. A conventional multiclass image classifier can often work well when the object under consideration is the only one in the image or occupies a large enough area of the…


Example frame with detections and confidences drawn

On May 28, 2019, Corey Gale and I (Greg Chu) presented a talk titled “How GumGum Serves its CV at Scale” at the LA Computer Vision Meetup in GumGum’s Santa Monica office.

Abstract:

Given the rapidly growing utility of computer vision applications, how do we deploy these services in high-traffic production environments to generate business value? Here we present GumGum’s approach to building infrastructure for serving computer vision models in the cloud. We’ll also demo code for building a car make-model detection server.

Topics:

  • Multivitamin: an open-sourced Python framework for serving library-agnostic machine learning models
  • Containerization: packaging everything you need into…


A Presentation and Meetup discussion by Divyaa Ravichandran and Sanja Stegerer

Brand “unsafe” content in all shapes and sizes

About the authors:

Sanja Stegerer is an NLP Scientist and has been with GumGum for 3 years now.

Divyaa Ravichandran has been a Computer Vision Scientist at GumGum for the past 2 years, and has been in the field for almost 3 years now.

As machine learning engineers, the CV and NLP teams at GumGum work on improving GumGum’s existing CV and NLP capabilities, developing solutions for new advertising campaigns, and maintaining code in a production environment.

We recently (15th May 2019) presented a Meetup talk regarding a product…


Written by Nishita Sant on August 7, 2018

I recently presented a talk on how GumGum implements a scalable computer vision solution in a cloud-only environment, at the annual Embedded Vision Summit, organized by the Embedded Vision Alliance in Santa Clara this May. Here is the abstract of my talk followed by the video of the presentation.

A growing number of applications utilize cloud computing for execution of computer vision algorithms. In this presentation, we introduce the basics of creating a cloud-based vision service, based on GumGum’s experience developing and deploying a computer vision-based service for enterprises. …


Written by Cambron Carter on June 5, 2018

AI on the Edge

GumGum recently hosted Dr. Genquan Stone Duan and Andrew Pierno of WiZR at our LA Computer Vision Meetup. This presentation is an interactive exploration of WiZR’s machine learning and computer vision infrastructure, which is used to provide real-time analytics for the purposes of security and surveillance. Anyone interested in embedded machine learning and computer vision should check out this technical deep-dive.



Written by Iris Fu and Divyaa Ravichandran, Computer Vision Scientists, on April 19, 2018

GumGum Sports processes enormous amounts of media each day. They come from a variety of sources and forms, including social media posts and broadcast/streaming videos. Our goal is to identify media that is relevant to our clients to estimate the value of their sponsorships and placements.

The challenge is in processing massive volumes of posts in a short period of time. For example, to estimate the value of each brand’s exposure in a basketball game, we would need to consider all of the available data and…

GumGum Tech Blog

Thoughts from the GumGum tech team
