Contextual Brand Safety Cover picture

Contextual Brand Safety is an ongoing series, and this is the second post in it. The series covers the steps involved in building a multi-label text classifier in an industry setting. This post covers model training and evaluation.

1. Introduction

Brand safety is one of GumGum's key offerings. Contextual Brand Safety-I covers the problem and the data preprocessing techniques in depth. In this post, we discuss model training, evaluation, and the steps to production.

2. Experimental setup

We set up a multi-step MLflow project to track and store artifacts at each step, i.e.,

  1. Data loading and preprocessing (1…


Contextual Brand Safety is an ongoing series, and this is the first post in it. The series covers the steps involved in building a multi-label text classifier in an industry setting. This post sets the stage by describing the problem and the data collection.

Introduction

GumGum is dedicated to ensuring a brand-safe environment for all our clients, advertisers and publishers alike. To implement brand safety, we use a variety of methods that help ensure we deliver ads on safe, relevant, and high-quality content.

One of the most important brand safety measures is publisher vetting. We validate each new piece of content from every publisher against our brand safety guidelines, where the publisher content is filtered in such a way that it does not consist of the…



Be it customer reviews, news articles, or conversations between people, when we need to figure out what a large corpus is about, reading and summarizing it manually is impractical. Topic modeling is a natural language processing technique that extracts latent topics from a corpus of documents. Unlike a classification problem, there are no labels directing this process, so it is unsupervised. Many algorithms perform topic modeling. The most important ones are:

  1. Latent Dirichlet Allocation (LDA)
  2. Non-negative Matrix Factorization (NMF)

In this blog, we will restrict our discussion to topic modeling using LDA. …
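
To make the idea of extracting latent topics concrete, here is a minimal sketch of LDA fitted with collapsed Gibbs sampling in plain Python. It is an illustration under stated assumptions, not the method used in the post: the function name `lda_gibbs`, the toy documents, and the hyperparameter values `alpha` and `beta` are all hypothetical, and a real project would more likely rely on an established library.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy LDA via collapsed Gibbs sampling (illustrative sketch only).

    docs: list of tokenized documents (lists of strings).
    Returns (vocab, theta, phi), where theta[d][k] is the estimated share of
    topic k in document d and phi[k][v] is the estimated probability of
    word vocab[v] under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token,
                # up to a constant factor.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed point estimates from the final counts.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + vocab_size * beta)
         for v in range(vocab_size)]
        for k in range(num_topics)
    ]
    return vocab, theta, phi


if __name__ == "__main__":
    # Hypothetical toy corpus, already tokenized.
    toy_docs = [
        ["ad", "brand", "publisher", "ad"],
        ["goal", "match", "team", "goal"],
        ["brand", "ad", "publisher"],
        ["team", "match", "goal"],
    ]
    vocab, theta, phi = lda_gibbs(toy_docs, num_topics=2)
    for k, row in enumerate(phi):
        top = sorted(range(len(vocab)), key=lambda v: row[v], reverse=True)[:3]
        print(f"topic {k}:", [vocab[v] for v in top])
```

No labels are passed in anywhere: the topics emerge only from word co-occurrence across documents, which is what makes the approach unsupervised. Topic indices are arbitrary, so the same topics can come out in a different order with a different seed.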

About

Sooraj Subrahmannian

NLP Data Scientist at GumGum, Masters in Data Science, IIT Madras alumnus. LinkedIn: linkedin.com/in/soorajms/
