How accurate is the DLS method? A Data Scientist’s take

ANGAD ARORA
6 min read · Mar 12, 2023


Any cricket fan knows exactly the pain we are talking about here. The fate of many international matches has been decided by this method. Thankfully, today we have data sources that can help us verify its accuracy. Let’s dive straight in:

Introduction to DLS Method

Game of Cricket in action

The Duckworth-Lewis-Stern (DLS) method is a mathematical system used to calculate target scores in cricket matches that are affected by rain, bad weather, or other interruptions. The DLS method was developed by two statisticians, Frank Duckworth and Tony Lewis, in the 1990s, and was later modified by Steven Stern in 2014.

The basic idea behind the DLS method is to calculate a revised target score for the team batting second in a limited-overs match, based on the number of overs they have left to bat and the number of wickets they have lost at the point of the interruption. Overs remaining and wickets in hand are treated as a combined batting resource, and the target score is adjusted in proportion to the resources available to each team.

Based on this, a par score sheet is created (referred to here).
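To make the idea concrete, here is a minimal sketch of a par-score calculation using the simple resource-ratio form of the method. The helper name `par_score` is hypothetical, rounding down is an assumption of this sketch, and the resource percentage in the example is a made-up placeholder rather than a value from the official tables. The version used in international matches adjusts this further, which is not modelled here.

```python
import math

def par_score(first_innings_score, resources_used_pct, first_innings_resources_pct=100.0):
    """Par score for the chasing team under a simple resource-ratio rule.

    resources_used_pct is the share of batting resources (overs and wickets
    combined) the chasing team has consumed so far, read from a resource table.
    """
    if not 0 <= resources_used_pct <= first_innings_resources_pct:
        raise ValueError("resources used must lie between 0 and the first-innings resources")
    # Scale the first-innings score by the fraction of resources consumed.
    scaled = first_innings_score * resources_used_pct / first_innings_resources_pct
    # Assumption of this sketch: par is rounded down to a whole number.
    return math.floor(scaled)

# Placeholder input: 45.0 is NOT from an official table, it only shows the arithmetic.
print(par_score(first_innings_score=280, resources_used_pct=45.0))  # 126
```

If the chasing team is ahead of this number when play stops, it is ahead of par.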

Motivation for this exercise

Being a huge cricket fan like many others, I was curious to find out: How accurate is the method in practice? In what ways can we improve it? In which situations can DLS be relied on more, and where should a DLS decision be questioned? Which team (batting first or second) is favored most by this method?

Data Collection

The data used for this study covers approximately 2,500 international 50-over men’s ODIs (One Day Internationals) played between 1995 and 2022, available at https://cricsheet.org/. It includes the teams involved, the venue where the match was played, and the ball-by-ball history of outcomes. Player statistics are taken from ESPNcricinfo.

Analysis

For the purpose of this study, I restricted my analysis to the team batting second. In a typical ODI, the DLS method only comes into effect once the team batting second has batted for at least 20 overs.

I took the team batting second and started applying the DLS method once over 20 was complete. At the end of every over from the 20th onward, I calculated the DLS par score, compared it with the current score, and derived the winner according to the DLS method. Since each match already had an actual winner, this gives a direct measure of how often the DLS-generated winner matches the actual winner. For example, here is a sneak peek at the data:

The table above shows the data for a particular match. Here is what each variable means:

Snapshot of data collected for DLS accuracy study

1: Target: The score the team batting second is chasing to win the match.

2: overs_completed: Overs completed at that match instance.

3: Current_Score: Score of the team batting second at that match instance.

4: Wickets_out: Number of wickets the team batting second has lost.

5: DLS_Par_Score: The DLS par score if the match were stopped at that instance.

6: Output: The DLS verdict for the team batting second: win if the current score > DLS par score, otherwise lose. This is compared with the actual result to measure accuracy.

7: end_result: Actual match result for that ODI.

8: accuracy: 1 if Output = end_result, otherwise 0.
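The labelling described by these variables can be sketched as follows. The field names mirror the list above, but the `Snapshot` class, the helper functions, and all the numbers are illustrative assumptions, not the original pipeline or real match data.

```python
from dataclasses import dataclass

@dataclass
class Snapshot:
    target: int
    overs_completed: int
    current_score: int
    wickets_out: int
    dls_par_score: int
    end_result: str  # "win" or "lose", from the view of the team batting second

def dls_output(s):
    # Ahead of par means DLS would award the match to the team batting second.
    # A score exactly on par (a tie) is counted as "lose" here for simplicity.
    return "win" if s.current_score > s.dls_par_score else "lose"

def accuracy_flag(s):
    # 1 when the DLS verdict matches what actually happened, else 0.
    return 1 if dls_output(s) == s.end_result else 0

# Made-up snapshots from one hypothetical chase that was eventually lost.
snapshots = [
    Snapshot(251, 20, 95, 2, 88, "lose"),
    Snapshot(251, 30, 130, 4, 141, "lose"),
    Snapshot(251, 40, 180, 6, 196, "lose"),
]

for s in snapshots:
    print(s.overs_completed, dls_output(s), accuracy_flag(s))
```

In this made-up chase the team is ahead of par at over 20 but goes on to lose, so that snapshot is scored as a DLS miss.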

Likewise, I collected ball-by-ball data for the 2500+ completed ODI matches and compared the DLS verdict against the actual result, which brings me to the results.

Results

First, let’s see the accuracy breakdown by overs completed for the team batting second.

Accuracy of DLS method against the overs completed in second innings for a 50 over ODI

At the end of over 20, the DLS prediction starts at an accuracy close to 72%.

To get a more holistic idea, here is the accuracy breakdown by overs bowled and wickets out in the second innings.

At overs 20–24 with 0–2 wickets out, the accuracy is only around 50–60%!

Accuracy improves as more overs are bowled and the match progresses, which is along expected lines.
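Breakdowns like these amount to averaging the 0/1 accuracy flag within each group. Below is a minimal sketch, assuming each match instance is a dict with `overs_completed`, `wickets_out`, and `accuracy` keys. The 5-over bands and the wicket bands beyond 0–2 are assumptions of mine, and the rows are made up.

```python
from collections import defaultdict

def accuracy_by(rows, key):
    """Mean of the 0/1 accuracy flag for each group returned by key(row)."""
    totals = defaultdict(lambda: [0, 0])  # group -> [correct, count]
    for row in rows:
        cell = totals[key(row)]
        cell[0] += row["accuracy"]
        cell[1] += 1
    return {group: correct / count for group, (correct, count) in totals.items()}

def overs_band(row, width=5):
    # Bands start at over 20, e.g. 20-24, 25-29, ...
    start = 20 + ((row["overs_completed"] - 20) // width) * width
    return f"{start}-{start + width - 1}"

def wickets_band(row):
    w = row["wickets_out"]
    return "0-2" if w <= 2 else "3-5" if w <= 5 else "6+"

# Made-up rows, only to exercise the grouping.
rows = [
    {"overs_completed": 20, "wickets_out": 1, "accuracy": 1},
    {"overs_completed": 22, "wickets_out": 2, "accuracy": 0},
    {"overs_completed": 24, "wickets_out": 4, "accuracy": 1},
    {"overs_completed": 31, "wickets_out": 4, "accuracy": 1},
]

by_overs = accuracy_by(rows, overs_band)
by_both = accuracy_by(rows, lambda r: (overs_band(r), wickets_band(r)))
print(by_overs)
print(by_both)
```

Passing a different key function is enough to switch between the overs-only view and the overs-by-wickets view.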

Does DLS favor the batting or the bowling team? Here is the breakdown:

ODI match instances summarized and split by DLS output and actual result

The cases marked in yellow are misclassifications, and a whopping number of them (>20%) are cases where DLS predicts a win for the team batting second, which then ends up losing the actual match! It’s safe to say that DLS is heavily skewed in favor of the team batting second!
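A split like this is a 2x2 tally of DLS output against actual result. Here is a minimal sketch with made-up pairs. The `share` helper is hypothetical and reports each cell as a fraction of all match instances, which is one possible reading of the percentages above.

```python
from collections import Counter

# Made-up (DLS output, actual result) pairs for the team batting second.
pairs = [
    ("win", "win"),
    ("win", "lose"),
    ("win", "lose"),
    ("lose", "lose"),
    ("lose", "win"),
]

def share(counts, predicted, actual):
    """Fraction of all match instances that fall in one cell of the 2x2 table."""
    total = sum(counts.values())
    return counts[(predicted, actual)] / total if total else 0.0

counts = Counter(pairs)
false_wins = share(counts, "win", "lose")    # DLS says win, team actually loses
false_losses = share(counts, "lose", "win")  # DLS says lose, team actually wins
print(false_wins, false_losses)
```

Comparing the two off-diagonal shares shows which side the errors lean toward.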

Team strength impact on DLS accuracy:

DLS is a purely statistical approach, and it doesn’t consider the relative strengths of the teams playing. To test this hypothesis, I also collected player-level match stats and combined the data for the playing 11 to get overall team statistics. For example, the Indian team’s batting average is the batting average across all players in the playing 11. I collected team attributes such as overall batting average, bowling average, and overall average match experience.
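One way to build such team-level attributes is to average the player-level numbers across the playing 11. The sketch below assumes a simple unweighted mean and skips players with no bowling average. The field names, the `team_profile` helper, and the three made-up players (standing in for a full 11) are all illustrative.

```python
from statistics import mean

def team_profile(players):
    """Collapse player-level stats into team-level attributes (unweighted means)."""
    bowlers = [p["bowling_average"] for p in players if p["bowling_average"] is not None]
    return {
        "batting_average": mean(p["batting_average"] for p in players),
        # Assumption: players who have never bowled are left out of this mean.
        "bowling_average": mean(bowlers) if bowlers else None,
        "matches_played": mean(p["matches_played"] for p in players),
    }

# Made-up players; a real playing 11 would have eleven entries.
players = [
    {"batting_average": 40.0, "bowling_average": None, "matches_played": 100},
    {"batting_average": 30.0, "bowling_average": 25.0, "matches_played": 50},
    {"batting_average": 20.0, "bowling_average": 35.0, "matches_played": 30},
]

profile = team_profile(players)
print(profile)
```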

This data led to some interesting findings as well.

The graph below shows the statistical significance of the difference in team batting average between two cases:

a. DLS predicts loss and team wins

b. DLS predicts loss and team loses

The batting average of teams misclassified by DLS was significantly higher, in the statistical sense, than that of correctly classified teams.

Statistical comparison of batting averages of team misclassified by DLS vs teams correctly classified by DLS method
Statistical t test showing misclassified matches by DLS method have better team strength
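For readers who want to reproduce this kind of comparison, here is a minimal sketch of a two-sample t statistic. It uses Welch's unequal-variance form, which is my assumption, since the exact test variant is not stated above. The samples are made-up team batting averages, not the study data.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and its approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances (n - 1)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Made-up team batting averages for the two groups.
misclassified = [34.0, 36.0, 38.0, 40.0]  # DLS predicted loss, team won
correct = [30.0, 31.0, 32.0, 33.0]        # DLS predicted loss, team lost

t_stat, dof = welch_t(misclassified, correct)
print(round(t_stat, 2), round(dof, 2))
```

A positive t statistic means the misclassified group has the higher mean. Turning it into a p-value requires a t-distribution table or a statistics library.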

Team strength is an important variable overlooked by DLS!

To Summarize:

The DLS method has its flaws, and we tried to look at them through data:

  1. DLS has a higher chance of error when the match stops after the team batting second has batted for only 20–30 overs.
  2. The DLS method ends up favoring the team batting second.
  3. Major misclassifications happen when one team is disproportionately stronger in player statistics (e.g. the Australian team in the 2000s or the Indian team in 2011–2015).

Caveats:

Although the results look interesting, I would point out a few things that may alter the course of a match:

  • When there is an interruption, teams generally know about it in advance and plan their game accordingly.
  • There are flip cases of revised targets by the DLS method when interruptions happen in the first half of the match, and those cases were not considered as part of this study.

Future Work:

DLS is a statistical approach. Future methods may need to be more robust and consider situations like:

  • A case where the team batting second has lost 4 or 5 wickets, but its batting depth is shallow or deep, so there may still be capable batsmen down the order. The same applies to bowling averages.
  • The overall ground conditions, e.g. in the subcontinent there is more dew on the grass, making batting second easier.
  • The match’s importance, e.g. a World Cup final will see players trying to do their absolute best.
