Published in GumGum Tech Blog

Optimized Real-time Analytics using Spark Streaming and Apache Druid

Our advertising data engineering team at GumGum uses Spark Streaming and Apache Druid to give business stakeholders near real-time analytics for analyzing and measuring advertising business performance.

Our biggest dataset is the RTB (real-time bidding) auction logs, which amount to ~350,000 msg/sec during peak hours every day. At this scale, it is crucial for the data team to leverage distributed computing systems like Apache Kafka, Spark Streaming and Apache Druid to process huge volumes of data…
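To put the peak rate in perspective, here is a rough back-of-the-envelope sketch in Python. Only the ~350,000 msg/sec figure comes from the post; the batch interval and Kafka partition count are hypothetical values chosen for illustration, not GumGum's actual settings.

```python
PEAK_MSGS_PER_SEC = 350_000  # peak RTB auction log rate quoted in the post


def records_per_batch(msgs_per_sec: int, batch_interval_sec: int) -> int:
    """Records one streaming micro-batch must process at a steady input rate."""
    return msgs_per_sec * batch_interval_sec


def per_partition_rate(msgs_per_sec: int, partitions: int) -> float:
    """Average msg/sec per Kafka partition, assuming load is spread evenly."""
    return msgs_per_sec / partitions


if __name__ == "__main__":
    # Hypothetical values, not taken from the post.
    batch_interval_sec = 10
    partitions = 100

    print(records_per_batch(PEAK_MSGS_PER_SEC, batch_interval_sec))
    print(per_partition_rate(PEAK_MSGS_PER_SEC, partitions))
```

Under these assumed settings, each micro-batch would have to get through a few million records before the next one arrives, which is why a single-node pipeline is unlikely to keep up.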




Jatinder Assi
