Data Science at Seismic

Colin Jemmott
Seismic Innovation Labs
Jun 1, 2018

About Seismic

Welcome to the Seismic data science blog!

Seismic is the leading sales enablement platform. There is a pretty good description of what that means on our website, but the bottom line is this: at many large companies, the majority of marketing content never gets used, yet salespeople constantly say they can’t find what they need! We fix that.

What We Build

Essentially all of Seismic’s data science effort goes into developing customer-facing features, primarily helping salespeople find the best marketing content more quickly. To do this we use supervised and unsupervised machine learning, reinforcement learning, recommendations, and natural language processing. We also automatically label images, summarize text, predict content effectiveness, and much more.

One of the unique aspects of data science at Seismic is that we are a fully multi-tenant environment, meaning each customer’s data is isolated. This also means that when we want to deploy a new machine learning model, for example, we are actually deploying huge numbers of them in parallel, one for each customer. The underlying infrastructure is also globally distributed for compliance reasons.
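To make the "one model per customer" idea concrete, here is a minimal sketch in Python. It is an illustration only, not Seismic's actual code: the tenant names, the event data, and the functions `train_tenant_model`, `train_all_tenants`, and `recommend` are all hypothetical, and the "model" is just a usage counter standing in for a real one. The point is that each tenant's model is fit only on that tenant's data, and the fits run in parallel.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def train_tenant_model(events):
    """Fit a trivial stand-in 'model' for one tenant: usage counts per content item."""
    return Counter(events)


def train_all_tenants(events_by_tenant, max_workers=4):
    """Train one model per tenant in parallel; each model sees only its own tenant's data."""
    tenant_ids = list(events_by_tenant)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        models = pool.map(train_tenant_model,
                          (events_by_tenant[t] for t in tenant_ids))
        return dict(zip(tenant_ids, models))


def recommend(models, tenant_id, k=1):
    """Return the k most-used content items for a single tenant."""
    return [item for item, _ in models[tenant_id].most_common(k)]


# Hypothetical usage logs, keyed by tenant (assumed data, for illustration only).
events_by_tenant = {
    "tenant_a": ["pricing_deck", "case_study", "pricing_deck"],
    "tenant_b": ["datasheet", "datasheet", "whitepaper"],
}
models = train_all_tenants(events_by_tenant)
```

Calling `recommend(models, "tenant_a")` only ever consults the model trained on `tenant_a`'s events, so nothing learned from one customer leaks into another customer's results.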

How We Build It

The data science and data engineering groups at Seismic are tightly coupled; in fact, they are really one group called “data analytics”. Seismic’s data engineers develop and maintain our real-time data streams as well as SQL and NoSQL data warehouses. We could go on and on about how our data engineering team builds a reliable, scalable, and maintainable data ingestion pipeline, but this is a data science blog, so we will stay focused on that side of the team.

Seismic’s data science research usually starts as Python in Jupyter notebooks. Production data science code is typically deployed as Docker containers to Kubernetes clusters. Obviously we are big fans of open source tools, but we also leverage proprietary tools — for example, our interactive customer-facing dashboard is built with Microsoft’s Power BI.

This Blog

While there are a ton of really good data science blogs, we have found a few things at Seismic worth sharing, and we will cover them in upcoming posts, including:

  • Benefits and challenges of B2B SaaS data science (small data, multi-tenancy, etc.)
  • Power BI tips, tricks and tutorials
  • The surprising difficulty of shipping machine learning at scale
  • The unreasonable effectiveness of simplistic approaches

Follow us to get updates!

Colin Jemmott
Seismic Innovation Labs

I am a data scientist at Seismic Software and lecturer in the Halıcıoğlu Data Science Institute at UC San Diego. http://www.cjemmott.com/