Intro — Sports Intelligence @ DraftKings

Robin Mohseni
DraftKings Engineering
7 min read · Jul 14, 2023

Co-written by Ian Dorward, Prashant Singh and Robin Mohseni

At DraftKings, we have a proven track record of providing our customers with a world-class sports betting experience. We are proud to present our newly established Sports Intelligence team, aimed at redefining the sports betting landscape through innovation, expertise and cutting-edge technology. With a focus on delivering the best sports betting content in the industry, our dynamic team comprises two major units: Sports Data Engineering and Sports Data Science.

Sports Data Engineering: Laying the Foundation for Success

The Sports Data Engineering team plays a pivotal role within DraftKings’ Sports Intelligence division. This highly skilled team is responsible for creating and curating best-in-class data assets that serve as the building blocks for the advanced modelling capabilities of the Sports Data Science team.

Additionally, the Sports Data Engineering team develops robust model replay features. This capability enables the evaluation and optimization of sports models by simulating past events and assessing algorithm efficiency under varying conditions. By continuously improving DraftKings’ sports betting content, these endeavours enhance the overall customer experience.

Sports Data Science: Driving Innovation in Sports Modelling

The creation of the Sports Data Science team at DraftKings can be traced back to 2019, when it was originally formed by SBTech before the acquisition by DraftKings Inc. From the beginning, the team recognized the immense opportunities to revolutionise sports modelling capabilities across various sports, starting with the fundamentals. Initially focused on the US betting market, the team quickly embarked on rebuilding the models for major sports, particularly football and basketball, and then expanded to cover all major sports.

To provide a behind-the-scenes look at how sports data science is conducted at DraftKings, our team has been working on a series of articles. We will start with a short introduction discussing common techniques used and what we believe sets us apart from the competition.

Understanding “Sports Data Science”

Traditionally, teams like ours in the industry were referred to as ‘Quantitative Analysis’, ‘Quantitative Development’ or simply ‘Quants’, while ‘Data Science’ was focused on customer intelligence (recommender systems, customer segmentation and experimentation). The majority of Quants come from a very strong maths background, working with Trading desks on pricing models and various trading tools. In our experience, these teams used to build models in Excel, VBA and R and hand them over to other engineering teams to run the models at scale. Often some of these legacy models had an Excel front-end (commonly referred to as RoboTraders). More recently, Quant teams have adopted more common object-oriented programming languages such as C# or Java to deliver more end-to-end, but this Modelling-to-Engineering handoff for model productionisation still exists to a high degree.

At DraftKings, we consider ourselves to be a Data Science function. While we possess strong mathematical skills and collaborate closely with trading stakeholders, we have made a conscious effort to prioritise being a data-centric engineering team leveraging the latest machine learning techniques for sports modelling, capable of shipping machine learning products to Production.

The UK and Ireland are hotbeds for talent given the mature nature of the sports betting industry in Europe compared to the US, so the Sports Data Science team is based in London and Dublin. We have hired team members from a wide variety of backgrounds whilst also understanding the importance of domain expertise in this area, amassing decades of sports modelling experience within the group.

Photo from a team Golf Day in 2022

Engineering Technology Stack: Powering Innovation through C#, SQL, Kafka, AWS and Snowflake

Our Engineering team works with a wide-ranging technology stack. We leverage programming languages such as C# and SQL to ingest and transform raw data, and we use the open-source data streaming tool Kafka for ingesting real-time data feeds. Our cloud infrastructure is deployed on AWS with robust security measures in place, and we persist the data in the Snowflake Data Warehouse.

Data Architecture

Data Science Technology Stack: Powering Innovation through Python and Kubernetes

In terms of our technology, the data science team works almost exclusively in Python. While we conduct extensive research and development in notebooks, both locally and on Databricks, our models are deployed as containerized microservices on Kubernetes. This approach ensures stability and resilience, which are crucial for high-frequency, real-time solutions like ours. Despite Python’s reputation for slower execution, we have worked on solutions over the years to meet SLAs for calculation speed.

Building Sports Models at DraftKings: The Monte-Carlo Approach

At DraftKings, the majority of our core models rely on Monte-Carlo simulation engines. These engines simulate entire games as stochastic processes, where each state transition is predicted using various machine learning models. The chosen unit for each state transition varies based on the sport, aiming to provide the most accurate and extensible products for our customers, whilst minimising computational latency and complexity. Our stateless calculations enable easy debugging and historical game replays.
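
A minimal sketch of the idea, not the production engine: the example below simulates the remainder of a basketball-style game possession by possession. The `GameState` fields, the `scoring_probs` function (a hard-coded stand-in for a trained machine learning model) and all probabilities are hypothetical and chosen for illustration only. The calculation is stateless in the sense that its output depends only on the arguments passed in, so the same state and seed always reproduce the same simulated games.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GameState:
    """Hypothetical snapshot of a game at one point in time."""
    home_score: int
    away_score: int
    possessions_left: int
    home_has_ball: bool


def scoring_probs(state):
    """Stand-in for an ML model: P(points scored) on the next possession.

    The numbers are made up; a real model would use many more features.
    """
    if state.home_has_ball:
        return {0: 0.50, 2: 0.36, 3: 0.14}
    return {0: 0.52, 2: 0.35, 3: 0.13}


def simulate_once(state, rng):
    """Play out the remaining possessions once and return the final score."""
    home, away = state.home_score, state.away_score
    has_ball = state.home_has_ball
    for remaining in range(state.possessions_left, 0, -1):
        current = GameState(home, away, remaining, has_ball)
        probs = scoring_probs(current)
        # Sample the outcome of this state transition.
        points = rng.choices(list(probs), weights=list(probs.values()))[0]
        if has_ball:
            home += points
        else:
            away += points
        has_ball = not has_ball  # possession alternates in this toy model
    return home, away


def simulate(state, n_sims, seed):
    """Run n_sims independent simulations from the same starting state."""
    rng = random.Random(seed)
    return [simulate_once(state, rng) for _ in range(n_sims)]


start = GameState(home_score=50, away_score=48, possessions_left=40, home_has_ball=True)
results = simulate(start, n_sims=1000, seed=7)
```

Because nothing is stored between calls, replaying a historical game amounts to calling `simulate` again with the recorded state and seed.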

Embracing MLOps: Achieving Best Practices

We follow MLOps best practices and leverage our ML Platform to facilitate seamless model deployment. Our ML models reside in a cloud-based ML model registry, exposed to calculation engines via APIs. During the inference stage, real-time data feeds and trader inputs are integrated into the calculation engines to generate the odds that our customers see. We constantly iterate on the ML models and simulation engines, leveraging new data sources and new modelling techniques.

Sports Data Science SDLC: Constant Improvement

Sports Data Science Software Development Lifecycle

The high-level calculation engine lifecycle is as follows, and includes iterative loops at many levels. It combines many of the traditional aspects of a data science SDLC, whilst also building in domain-specific steps.

  1. Work closely with Product and other stakeholders to clearly define the problem statement and requirements
  2. Carry out exploratory data analysis and ideation using data curated by the Sports Data Engineering team in our best-in-class Sports Data Warehouse
  3. Perform feature engineering, set up training pipelines and validate results
  4. Encode sport logic into play-by-play or time-based simulations
  5. Integrate into our real-time streaming architecture
  6. Use simulation results to calculate probability distributions and markets
  7. Add margin to markets based upon configuration and margination algorithms
  8. Propagate markets to trading tools for manual inputs or odds adjustments and risk management
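
Steps 6 and 7 above can be sketched roughly as follows. This is an illustrative example only: the function names are hypothetical, and proportional scaling is just one simple, widely known way to add margin. It is not a description of the margination algorithms used in production.

```python
from collections import Counter


def outcome_probabilities(results):
    """Estimate match-winner probabilities from simulated final scores.

    `results` is a list of (home_score, away_score) tuples.
    """
    counts = Counter(
        "home" if h > a else "away" if a > h else "draw" for h, a in results
    )
    n = len(results)
    return {outcome: count / n for outcome, count in counts.items()}


def add_proportional_margin(probs, margin):
    """Scale every probability so that the total equals 1 + margin."""
    return {outcome: p * (1 + margin) for outcome, p in probs.items()}


def decimal_odds(probs):
    """Convert (margined) probabilities into decimal odds."""
    return {outcome: round(1 / p, 2) for outcome, p in probs.items() if p > 0}


# Four made-up simulated final scores, purely for demonstration.
sim_results = [(100, 90), (95, 99), (101, 100), (88, 92)]
fair = outcome_probabilities(sim_results)
priced = add_proportional_margin(fair, margin=0.05)
market = decimal_odds(priced)
```

The same simulated results can be reused to derive other markets, such as totals or spreads, by changing only the function that maps a final score to an outcome.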

The diagram below broadly illustrates the general flow for calculation inference for our models. In reality, the feedback loops between our Traders and the models are more complex.

Simulation Flow

Integration with the Sportsbook: Real-Time Streaming Architecture

Our sports models, deployed on Kubernetes, offer HTTP endpoints for calculation execution. Simulation results are produced to Kafka topics, supporting calculations for both single markets and Same Game Parlays (SGPs). In future articles, we will delve into the intricacies of this architecture, including challenges such as odds jumping as a function of simulation variance.
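
To illustrate why simulation variance can make odds move between otherwise identical calculations, the sketch below uses the standard error of a Monte-Carlo probability estimate, sqrt(p(1 - p) / n). This is a general statistical result rather than a description of how the team handles the problem, and the function names and the 95% band (`z = 1.96`) are assumptions made for the example.

```python
import math


def mc_standard_error(p, n_sims):
    """Standard error of a probability estimated from n_sims simulations."""
    return math.sqrt(p * (1 - p) / n_sims)


def odds_band(p, n_sims, z=1.96):
    """Approximate range of fair decimal odds implied by simulation noise."""
    se = mc_standard_error(p, n_sims)
    low_p = max(p - z * se, 1e-9)  # guard against division by zero
    high_p = min(p + z * se, 1.0)
    # A higher probability means lower odds, so the bounds swap over.
    return 1 / high_p, 1 / low_p


low_odds, high_odds = odds_band(p=0.5, n_sims=10_000)
```

Since the standard error shrinks with the square root of the number of simulations, halving the width of the band requires roughly four times as many simulations, which is where the trade-off with calculation latency comes from.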

Collaboration and Innovation: Delivering Exceptional Products

To bring our complex products to Production, our team collaborates with other talented engineering teams at DraftKings, including Sports Data, Markets and Sports Platform. Recently, we introduced live SGPs for NFL and NBA, showing our commitment to innovation and customer satisfaction.

Furthermore, our team supports algorithms for features such as margination and cashout. As these cases have a limited need for training data and machine learning, and the algorithms are called millions of times, they are provided to other engineering teams in the form of NuGet packages that were converted from Python to C#. These have recently been updated and we are now using C bindings via Python across the business.

Sports Intelligence Series: Sharing Deeper Insights

In the forthcoming Sports Intelligence series, our representatives will provide in-depth insights into the investments that we have made over the past year and what we are hoping to deliver in the future.

Sports Data Science will delve into the inner workings of our simulation engines, showcasing the fusion of traditional modelling techniques with modern machine learning and explaining how we leverage data assets to drive continuous improvements.

Sports Engineering will share insights into the creation of the best-in-class data assets that enhance the modelling capabilities of the Sports Data Science team and will explore the challenges of building a model replay system to enable robust testing of the sports engines.

Sports Intelligence: Revolutionising the Future of Sports Betting

DraftKings’ Sports Intelligence team is revolutionising the sports betting industry by blending expertise, cutting-edge technology and data-centric engineering. Through our Sports Data Science and Sports Data Engineering units, we deliver world-class sports betting content. By employing innovative techniques, such as Monte-Carlo simulations and MLOps best practices, we ensure accurate and extensible models. Our collaboration with other engineering teams and continuous drive for improvement enable us to deliver exceptional products and experiences to our valued customers.

We look forward to sharing deeper insights and exploring various aspects of our sports modelling journey, shining a light on the innovation happening behind the scenes at DraftKings.

Stay tuned for future articles that will uncover the magic of our Monte-Carlo simulation engines, the intricacies of our machine learning platform, deep dives into the creation of our data assets, leveraging historical data for model verification and much more.

Want to learn more about DraftKings’ global Engineering team and culture? Check out our Engineer Spotlights and current openings!
