Data Science and the Pursuit of the 6 Ideal States
By Yao H. Morin, Chief Data Officer
I like to joke that I am one of the original data nerds — I was into it before data was cool. In grad school, I studied operations research and convex optimization, which back then was basically a more clinical way of saying “data science.” My studies and the professional experience that followed — jobs with a law firm, the Department of Defense and a handful of top tech companies — proved to be the foundation of how I analyzed and valued data. This is why I jumped at the chance to work at StubHub. Here, data is a product.
To me, data is as close to the pure truth as we can get. Data is neutral, unbiased and unmotivated by hidden agendas or ill intentions. Data contains an infinite number of fascinating insights just waiting to be discovered.
As a company, we have data from 48 countries for over 10 million live sports, music and theater events. To capture, analyze, and leverage all of this data is no small feat.
We have the talent, the infrastructure and the tech to continue to tap into this fortune and extract its maximum potential. Doing so allows StubHub to deliver a seamlessly personalized experience to every customer.
Personalization means that every person who engages with the site is treated as distinct from everyone else. One uniform strategy won’t work for all customers, because every customer is unique. For instance, I am a fan of Dungeons & Dragons, Magic: The Gathering, alt rock and YouTube food shows. When it comes to events, I love the Minnesota Vikings and musicals.
With this kind of data, StubHub will eventually be able to connect me with the experiences I’d like to see with more surgical precision. This is a far more complicated (and, frankly, more thrilling) challenge than applying one uniform strategy to everyone.
As StubHub’s Chief Data Officer, I lead a team of engineers, coders, data scientists, thinkers, daredevils, techies, brainiacs and obsessives who are working in unison to achieve the “ideal state.”
In the ideal state, data is available and instantly accessible across all applications, services and data consumers. Specifically, we are targeting six ideal states:
1. A single source of truth
Processed and enriched data is captured in this single source of truth. It serves as the foundation of all data activities, keeps them consistent with one another, and minimizes redundant data cleansing and extraction overhead.
2. Growing data sources
As our business and products continue to grow and become increasingly data-driven, the variety and volume of data we collect will also continue to grow. Our system can scale seamlessly to support this growth.
3. Instant accessibility
Reports and dashboards are on time and ready. They are updated in near-real time as new information comes in. Self-serve tools are in place for basic reporting and analysis. Analysts focus on ad-hoc, in-depth analytics to provide insightful business intelligence and product analysis.
4. Infinite capacity and scalability
The data system — including data storage, pipeline, analytic platform and machine learning platform — is cloud-based and scalable based on needs and usage.
5. A work environment that fosters independence
Data experts can accomplish most of their tasks without depending on data engineers or application engineers.
6. Transform every StubHub team member into a data consumer
Data is the foundation of how StubHub makes business decisions, creates new product features, and optimizes business operations. Every team and employee is educated on how to analyze and apply data effectively and accurately.
We’ve already begun to achieve these ideal states. We are transitioning our tech stack to the cloud, which will modernize our data infrastructure and help us unlock our full potential in using our data. And we’re working on at least ten data products that range from personalization and recommendation engines to self-serve reporting tools.
By positioning data as its own product, we can create a world where a single recommendation engine is leveraged by all parts of a user’s experience. Doing so ensures a single source of truth, creates a recommendation platform that is more readily scalable, and gives StubHub the opportunity to grow our data and data sources (not to mention products and services) in a centralized way.
StubHub will be at the forefront of data technology. Our people believe this, are capable of making it happen, and know that with our data we can deliver better experiences and bigger results to our customers.