GumGum speaks at Spark + AI Summit 2020

Rashmina Menon
Jul 27 · 2 min read

GumGum receives around 30 billion programmatic inventory impressions each day, amounting to 25 TB of data. An inventory impression is the real estate available for showing ads on a publisher page. By generating near-real-time inventory forecasts based on campaign-specific targeting rules, we enable account managers to set up successful future campaigns. Our talk, Real-Time Forecasting at Scale using Delta Lake and Delta Caching, which Jatinder Assi and I presented at Spark + AI Summit 2020, highlights the data pipelines and architecture that help us achieve a forecast response time of less than 30 seconds at this scale. Spark jobs efficiently sample the inventory impressions using AMIND sampling and write the samples to Delta Lake. We cover AMIND sampling, Delta Lake, Databricks Delta caching, and time series forecasting, and explain how we provide forecasts with zero downtime for end users by using auto ARIMA and sinusoids that capture the trends in the inventory data.
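To make the idea of using sinusoids to capture recurring patterns more concrete, here is a minimal sketch in plain Python. It is not our production pipeline and does not use auto ARIMA: it swaps in an ordinary least-squares fit of a level, a linear trend, and a single sinusoid with a known period. The function names (`fit_sinusoid`, `forecast`), the weekly period of 7, and the sample numbers are all assumptions made for illustration.

```python
import math


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def features(t, period):
    """Regressors for time step t: level, trend, sine, cosine."""
    angle = 2.0 * math.pi * t / period
    return [1.0, float(t), math.sin(angle), math.cos(angle)]


def fit_sinusoid(series, period):
    """Least-squares fit of level + trend + one sinusoid (hypothetical helper)."""
    rows = [features(t, period) for t in range(len(series))]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, series)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def forecast(coeffs, start, horizon, period):
    """Evaluate the fitted curve for time steps start .. start + horizon - 1."""
    return [
        sum(c * f for c, f in zip(coeffs, features(t, period)))
        for t in range(start, start + horizon)
    ]


# Illustrative data only: four weeks of made-up daily impression counts
# (in millions) with a weekly cycle and a slight upward trend.
history = [100 + 0.5 * t + 20 * math.sin(2 * math.pi * t / 7) for t in range(28)]
coeffs = fit_sinusoid(history, period=7)
next_week = forecast(coeffs, start=len(history), horizon=7, period=7)
```

A single sinusoid with a fixed period is the simplest case. Real inventory data would likely need several periods (daily and weekly, for example) plus a model for the residuals, which is where something like auto ARIMA comes in.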

Spark + AI Summit 2020 — Real-Time Forecasting at Scale using Delta Lake and Delta Caching

We also discuss the details around this solution in two tech blogs here:
