Sports Analytics — Generating Actionable Insights using Cricket Commentary

Aravind Pai
Published in Analytics Vidhya
11 min read · Feb 24, 2020

Overview

1. What is sports analytics? What are the different use cases of sports analytics? We answer these questions here.

2. Understand how sports analytics can impact sports like cricket, football, and tennis

3. We’ll work on a fascinating sports analytics use case — analyzing India’s performance using cricket commentary data!

Introduction

The scope of professional sports has changed over the years. I remember watching every minute of the 2003 Cricket World Cup and spending every waking minute tracking statistics, like the total runs scored, highest run-scorer, highest run rate, and so on.

It was fairly rudimentary stuff but enough to keep me glued to the screen. How times have changed since then!

Sports analytics is quickly becoming mainstream. Media outlets and leading sports websites regularly curate statistics, produce deep technical insights, and add a whole new level of analysis we haven’t seen before.

We can now answer questions like these with a high degree of confidence:

  • Sports analytics in cricket: What is the probability of the team winning a match while batting first and second?
  • Sports analytics in football: What is the expected outcome of a shot taken from the left side of the penalty area?
  • Sports analytics in tennis: Where should you place your serves based on the return statistics of your opponent?

And so on. Honestly, the sky is the limit when it comes to sports analytics use cases. I’m a sports lover and I’m always looking out for applications where I can apply my analytics and machine learning knowledge to improve the team strategy as well as fan experience.

I’ll introduce you to the awe-inspiring world of sports analytics in this article. We will look at the different types of sports analytics, why this field is important, and we’ll also work on a use case of sports analytics — analyzing cricket commentary to generate insights.

Table of Contents

1. What is Sports Analytics?

2. Importance of Sports Analytics

3. Sports Analytics Case Study: Analyzing Cricket Commentary

  • Team Performance
  • Batting Performance
  • Bowling Performance
  • Boundary Analysis

What is Sports Analytics?

Sports Analytics is all about analyzing and extracting useful insights from sports data.

I would broadly divide Sports Analytics into 2 categories:

  1. Descriptive Sports Analytics
  2. Predictive Sports Analytics

Let us discuss each category here.

Descriptive Sports Analytics

Descriptive Sports Analytics is about summarizing the sports data in the form of numbers. In other words, to come up with important statistics. This might sound like a simple concept but it’s a very powerful one.

The thought behind descriptive sports analytics plays a crucial role in team tactics.

Let’s take cricket for example. Here, we can analyze how frequently a batsman gets out to a specific bowler. This number can shape the bowling strategy of a team.

Here is an awesome video that analyzes the dismissals of Virat Kohli against Adam Zampa:

This is the reason why Adam Zampa was brought back into the attack whenever Virat Kohli was at the crease during Australia’s tour of India in 2020. In this series, Virat Kohli lost his wicket to Zampa in two out of three matches!
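To make the "how often does a batsman get out to a specific bowler" idea concrete, here is a minimal sketch. It assumes we already have ball-by-ball records as dictionaries with `batsman`, `bowler`, and `wicket` keys. That layout, the function name, and the placeholder players are my own assumptions for illustration and are not real data.

```python
from collections import Counter


def dismissals_by_bowler(deliveries, batsman):
    """Count how often each bowler has dismissed the given batsman."""
    return Counter(
        d["bowler"]
        for d in deliveries
        if d["batsman"] == batsman and d["wicket"]
    )


# Made-up deliveries with placeholder names, purely for illustration.
deliveries = [
    {"batsman": "Batsman A", "bowler": "Bowler X", "wicket": True},
    {"batsman": "Batsman A", "bowler": "Bowler Y", "wicket": False},
    {"batsman": "Batsman A", "bowler": "Bowler X", "wicket": True},
    {"batsman": "Batsman B", "bowler": "Bowler X", "wicket": True},
    {"batsman": "Batsman A", "bowler": "Bowler Y", "wicket": True},
]

counts = dismissals_by_bowler(deliveries, "Batsman A")
print(counts.most_common(1))  # [('Bowler X', 2)]
```

A captain could read the top entry as the bowler to bring on when that batsman walks in, although with so few dismissals per pairing the counts are best treated as a hint and not as proof.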

Another interesting use case in cricket is to analyze the team’s probability of winning a match while batting first as well as second. This influences the captain who wins the toss and has to make a decision — bat or bowl first.

Predictive Sports Analytics

Predictive Sports Analytics is about making predictions using sports data. One such use case in cricket is to predict the number of runs a batsman scores against an opponent in a particular match. This would help the team management and captain select the best team for every match.

In a sport like football, predictive sports analytics helps to understand the chances of scoring a goal from any location on the pitch.

You can think of similar use cases for your favorite sport and let me know in the comments section below the article.

Importance of Sports Analytics

Sports Analytics is a game-changer — there’s no other way to put it. Using analytics in sports directly impacts the decision making of a team and can alter the future of the franchise or club (or country). It can easily change the outcome of the match.


There is a lot of scope for analytics in sports. In this section, I am going to discuss a few use cases of analytics in different sports, like cricket, football, and tennis.

Sports Analytics in Cricket

In cricket, we can analyze the strong and weak zones of a player. This helps both the player and his opponents understand the strengths and weaknesses of his game.

  • Opponents can develop a strategy to bowl against a player (like Adam Zampa against Virat Kohli)
  • The player can invest more time in his weaknesses to improve his game

Here is an awesome video that showcases the weak zone of Virat Kohli:

Sports Analytics in Football

The footballing world has been slow to adopt analytics but it’s quickly gathering pace now. We’re seeing the mainstream media use metrics such as expected goals and expected assists to analyze players and matches.

You should definitely keep an eye on the Expected Goals (xG) metric. xG basically tells us the probability of a shot converting into a goal. This varies from player to player and with the position the shot is taken from. It’s quite a fascinating concept and you can read more about it here.

Another example of analytics in football is analyzing team formation while the match is going on. This helps the opponent understand the team’s strategy and respond to it.

Sports Analytics in Tennis

In tennis, we can identify the combination of shots a player usually plays to win a point. This can be of great use to prepare a strategy against the opponent as well.

I’m sure you must have seen the statistics that come up on screen after the end of each set at a tennis Grand Slam. The number of first serves returned, the placement of the serve, the bounce of the serve, and where the opponent picked it up are all examples of sports analytics in tennis.

Sports Analytics Case Study: Analyzing Cricket Commentary

Let’s take up a real-world case study now to understand how sports analytics works. I am going to delve into my personal passion, cricket, for this case study.

I’m an avid follower of text commentary in cricket. An insightful commentator describes the events happening on the ground in good detail, right? There is a lot of online cricket commentary available on many sports websites like CricBuzz, ESPN Cricinfo, etc. This is a gold mine that can reveal many interesting and valuable insights into a team and player.

About the Dataset for Sports Analytics:

I have collected the commentary for the T20 matches India played over the last 4 years. Download a sample dataset from here. It’s time to analyze the commentary and find some appealing insights. Let’s do it!
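Before any analysis, each commentary line has to be turned into structured fields. The exact layout of the dataset is not shown in this article, so the sketch below assumes a line shape similar to what commentary sites display: `over.ball bowler to batsman, outcome, free text`. The regular expression, the outcome vocabulary in `RUNS`, and the `parse_ball` name are all assumptions of mine, and extras such as wides and no-balls are deliberately ignored.

```python
import re

# Assumed line shape: "<over>.<ball> <bowler> to <batsman>, <outcome>, <free text>"
BALL_RE = re.compile(
    r"^(?P<over>\d+)\.(?P<ball>\d)\s+(?P<bowler>.+?) to (?P<batsman>.+?),"
    r"\s*(?P<outcome>[^,]+)"
)

# Hypothetical mapping from outcome phrases to runs off the bat.
RUNS = {"no run": 0, "1 run": 1, "2 runs": 2, "3 runs": 3, "four": 4, "six": 6}


def parse_ball(line):
    """Return a dict describing one delivery, or None if the line does not match."""
    match = BALL_RE.match(line.strip())
    if match is None:
        return None
    outcome = match["outcome"].strip().lower()
    return {
        # Commentary counts completed overs, so "0.3" is the 3rd ball of over 1.
        "over": int(match["over"]) + 1,
        "ball": int(match["ball"]),
        "bowler": match["bowler"],
        "batsman": match["batsman"],
        "runs": RUNS.get(outcome, 0),  # unknown outcomes (extras etc.) count as 0
        "wicket": outcome.startswith("out"),
    }


print(parse_ball("0.3 Bowler X to Batsman A, FOUR, driven through the covers"))
```

Lines that are not deliveries, such as end-of-over summaries, return `None` and can simply be filtered out. A real dataset would almost certainly need a richer outcome vocabulary than this one.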

Team performance

After this section, you will be able to answer the below questions:

  1. What is the team average when batting first and second?
  2. How frequently does the Indian team win a match?
  3. What is the probability of India winning a match against a particular team?
  4. What is the target to be set by the Indian team to win the match?
  5. How many times has the Indian team defended a low scoring target?
  6. Which was the most successful year for Team India?

Ready? Let’s get our hands dirty now!

The total number of T20s India played in the last 4 years: 54

No. of T20s India played each year:

Inferences:

  • India played the most T20s in 2018 and the fewest in 2017 and 2019. This is because of the ICC Champions Trophy 2017 and ICC World Cup 2019, both one-day tournaments

Team Average (Batting First & Second):

Batting First Team Average: 180.0
Batting Second Team Average: 156.0
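As a rough sketch of how averages like these can be computed, assume one record per match with the innings in which India batted (1 for batting first, 2 for batting second) and the runs scored. The record layout, the function name, and the sample scores are illustrative assumptions and do not come from the dataset.

```python
from collections import defaultdict
from statistics import mean


def average_score_by_innings(matches):
    """Mean team score per innings (1 = batting first, 2 = batting second)."""
    scores = defaultdict(list)
    for match in matches:
        scores[match["innings"]].append(match["runs"])
    return {innings: mean(runs) for innings, runs in scores.items()}


# Made-up match records, purely for illustration.
matches = [
    {"innings": 1, "runs": 180},
    {"innings": 1, "runs": 160},
    {"innings": 2, "runs": 150},
]

print(average_score_by_innings(matches))  # {1: 170, 2: 150}
```

Adding a `year` field to each record and grouping on `(year, innings)` instead would give the year-wise view.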

Team Average Innings wise over the years:

Inferences:

  1. The highest batting first average, around 180, came in 2018. So, we can infer that India had its best batting line-up during 2018
  2. The batting second average is lower than the batting first average in every year. From this, we can infer that the targets set by the Indian team are higher than the ones set by its opponents

Overall Winning % (Batting First & Second):

Overall Winning %       : 66.66
Batting First Winning % : 59.0
Batting Second Winning %: 76.0
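A winning percentage is just wins divided by matches played, optionally split by some attribute of the match. Here is a minimal sketch that handles the overall figure as well as any grouping (innings, opponent, year). The record fields `won`, `innings`, and `opponent`, and the sample matches, are assumptions made for the example.

```python
from collections import defaultdict


def win_percentage(matches, key=None):
    """Win % overall (key=None) or for each distinct value of matches[i][key]."""
    wins = defaultdict(int)
    played = defaultdict(int)
    for match in matches:
        group = match[key] if key else "overall"
        played[group] += 1
        wins[group] += 1 if match["won"] else 0
    return {group: round(100 * wins[group] / played[group], 2) for group in played}


# Made-up results with placeholder opponents, purely for illustration.
matches = [
    {"innings": 1, "opponent": "Team P", "won": True},
    {"innings": 1, "opponent": "Team Q", "won": False},
    {"innings": 2, "opponent": "Team P", "won": True},
]

print(win_percentage(matches))                  # {'overall': 66.67}
print(win_percentage(matches, key="innings"))   # {1: 50.0, 2: 100.0}
print(win_percentage(matches, key="opponent"))  # {'Team P': 100.0, 'Team Q': 0.0}
```

Keeping the grouping key as a parameter means the same function serves the innings-wise, opponent-wise, and year-wise breakdowns.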

Winning % against different teams:

Inferences:

  1. India wins almost every match when playing against Bangladesh
  2. The team records its lowest winning percentage against New Zealand, losing 60% of those matches. One possible reason is that New Zealand are well known for playing spin well

Batting First Winning Score:

Inferences:

  1. Probability of India winning a match after scoring:
  • Less than 120 runs (<120) is around 0.33
  • Between 120 and 180 runs (120–180) is around 0.4
  • Greater than 180 runs (>180) is 0.75
  2. Glad to see that India has defended low scoring games too
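The score-band probabilities above are empirical frequencies: wins divided by matches within each band of first-innings totals. A small sketch of that calculation follows. The 120 and 180 cut-offs come from the bands listed above, while the function names, the input shape of `(runs, won)` pairs, and the sample results are assumptions for illustration.

```python
def score_bucket(runs, low=120, high=180):
    """Label a first-innings total as below, within, or above the band."""
    if runs < low:
        return f"<{low}"
    if runs <= high:
        return f"{low}-{high}"
    return f">{high}"


def win_probability_by_bucket(results, low=120, high=180):
    """results: iterable of (runs_scored_batting_first, won) pairs."""
    counts = {}
    for runs, won in results:
        bucket = score_bucket(runs, low, high)
        wins, played = counts.get(bucket, (0, 0))
        counts[bucket] = (wins + int(won), played + 1)
    return {bucket: wins / played for bucket, (wins, played) in counts.items()}


# Made-up results, purely for illustration.
results = [(110, False), (150, True), (150, False), (190, True)]
print(win_probability_by_bucket(results))
# {'<120': 0.0, '120-180': 0.5, '>180': 1.0}
```

With only a handful of matches in a band, these frequencies are noisy, so it is worth reporting the match count next to each probability.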

Winning % over the years:

Inferences:

  1. Team performance improved over the years and then dropped in 2019
  2. India records the highest winning percentage (82%) in 2018. The team won most of the matches played during 2018
  3. India lost half of the T20s played in 2019. Hence, the lowest winning percentage (50%) in 2019. One possible reason is the absence of senior players, as the team opened its doors to youngsters after the ICC World Cup 2019

Batting performance

In this section, I will focus on the batting performance of team India in terms of the strike rate. We’ll also discuss how India’s performance has evolved over a period of time.

Strike rate can be defined as the average number of runs scored per 100 balls. The higher the strike rate, the better the batsman is.
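In code, the definition is a one-liner. The function name and the sample numbers are my own, chosen just to show the arithmetic.

```python
def batting_strike_rate(runs, balls_faced):
    """Runs scored per 100 balls faced."""
    if balls_faced == 0:
        raise ValueError("strike rate is undefined for zero balls faced")
    return 100 * runs / balls_faced


# 52 runs off 40 balls: 100 * 52 / 40
print(batting_strike_rate(52, 40))  # 130.0
```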

Let’s find out the phases where team India can improve its batting.

The strike rate of the Indian team is 138.66

Team batting strike rate over the years:

Inferences:

  • India had the highest batting strike rate in 2018. The batsmen were in great touch!

Team batting strike rate across different phases of a match:

Inferences:

  • The strike rate of the Indian team reaches 150+ in the last 5 overs and stays around 125+ in the powerplay and middle overs
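A phase-wise strike rate needs each ball tagged with its phase. In a T20 innings the powerplay is overs 1 to 6 and the last 5 overs are 16 to 20, which leaves overs 7 to 15 as the middle overs. The sketch below assumes ball-by-ball records with a 1-based `over` and the `runs` scored off the bat on legal deliveries. Those field names, the function names, and the sample balls are illustrative assumptions.

```python
from collections import defaultdict


def phase_of(over):
    """Map a 1-based over number in a T20 innings to its phase."""
    if over <= 6:
        return "powerplay"
    if over <= 15:
        return "middle"
    return "last 5"


def strike_rate_by_phase(balls):
    """Batting strike rate (runs per 100 balls) for each phase of the innings."""
    runs = defaultdict(int)
    faced = defaultdict(int)
    for ball in balls:
        phase = phase_of(ball["over"])
        runs[phase] += ball["runs"]
        faced[phase] += 1
    return {phase: 100 * runs[phase] / faced[phase] for phase in faced}


# Made-up deliveries, purely for illustration.
balls = [
    {"over": 1, "runs": 4},
    {"over": 1, "runs": 0},
    {"over": 10, "runs": 1},
    {"over": 20, "runs": 6},
]
print(strike_rate_by_phase(balls))
# {'powerplay': 200.0, 'middle': 100.0, 'last 5': 600.0}
```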

Team batting strike rate across different phases of a match over the years

Inferences:

  1. In 2018, India recorded the highest batting strike rate across all the 3 phases (Powerplay, middle overs, and the last 5 overs)
  2. The highest batting strike rate, close to 175, was recorded in 2018 during the last 5 overs. Reminds me of Dhoni & Hardik Pandya’s powerful hitting

Bowling performance

In this section, let’s examine the bowling performance of team India in terms of Economy rate, Bowling Strike rate, and Bowling Average, and see how the performance has evolved over time.

  • Economy rate is defined as the average number of runs conceded per over
  • Bowling strike rate can be defined as the average number of balls bowled for a wicket
  • Bowling Average is the average number of runs conceded for a wicket
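The three definitions above translate directly into code. This is a minimal sketch with function names of my own choosing. Returning `None` when no wicket has fallen is one possible convention and is an assumption on my part.

```python
def economy_rate(runs_conceded, balls_bowled):
    """Runs conceded per over, where an over is 6 legal balls."""
    return 6 * runs_conceded / balls_bowled


def bowling_strike_rate(balls_bowled, wickets):
    """Balls bowled per wicket taken; None when no wicket has fallen."""
    return balls_bowled / wickets if wickets else None


def bowling_average(runs_conceded, wickets):
    """Runs conceded per wicket taken; None when no wicket has fallen."""
    return runs_conceded / wickets if wickets else None


# A spell of 4 overs (24 balls) for 36 runs and 2 wickets:
print(economy_rate(36, 24))        # 9.0
print(bowling_strike_rate(24, 2))  # 12.0
print(bowling_average(36, 2))      # 18.0
```

For all three metrics, lower is better from the bowling side's point of view.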

It’s time to analyze the bowling performance of the Indian team.

Team India Economy rate in different phases of a match:

Inferences:

  • Indian bowlers concede around 7–8 runs per over. All credit to the Indian bowlers for such a healthy number!

Team India bowling performance across different phases of a match:

Inferences:

  1. On average, Indian bowlers pick up about 2 wickets in the last 5 overs, as they need around 13 balls for each wicket
  2. Indian bowlers concede 27+ runs per wicket in the middle overs, which can be improved

Team bowling strike rate across different phases of a match over the years:

Inferences:

  1. Team India’s bowling performance was very poor in 2019, as the bowlers needed 33+ balls on average to take a wicket in the middle overs
  2. India had the best death bowling attack in 2016

Boundary Analysis

In this section, we will analyze the average number of balls team India takes to score a boundary and how this has evolved over the years.

Avg no. of balls to hit 4: 9 
Avg no. of balls to hit 6: 19
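"Balls per boundary" is the number of balls faced divided by the number of fours (or sixes) hit. Here is a small sketch, assuming a list of runs scored off the bat on each legal ball and treating every 4 or 6 as a boundary. Both the input shape and that simplification are assumptions, since an all-run four would be miscounted.

```python
def balls_per_boundary(ball_runs, boundary):
    """Average balls faced per four (boundary=4) or six (boundary=6)."""
    hits = sum(1 for runs in ball_runs if runs == boundary)
    return len(ball_runs) / hits if hits else None


# Made-up sequence of 12 balls, purely for illustration.
ball_runs = [4, 0, 1, 6, 0, 4, 1, 1, 0, 2, 0, 0]
print(balls_per_boundary(ball_runs, 4))  # 6.0  (2 fours in 12 balls)
print(balls_per_boundary(ball_runs, 6))  # 12.0 (1 six in 12 balls)
```

A lower number means boundaries come more often.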

Avg number of balls to score boundary over the years:

Inferences:

  • Team India improved its power hitting over the past few years. In 2019, the team hit a six every 16 balls

Avg number of balls to score boundary across different phases of the match:

Inferences:

  1. India hits a four roughly once every over in the powerplay and the last 5 overs
  2. Team India smashes only about 1 six in the powerplay, as the batsmen take 24+ balls for a six

Avg number of balls to score 4 across different phases over the years:

Inferences:

  • In 2019, India took the most balls (around 14) to hit a four during the middle overs

Avg number of balls to score 6 across different phases over the years:

Inferences:

  • Indian openers, middle order, and finishers improved their ability to hit sixes over the past years. That’s amazing!

End Notes

Unquestionably, Descriptive Sports Analytics has a more far-reaching role to play in a team’s winning strategy than Predictive Sports Analytics. In this article, you have learned the importance of Sports Analytics and how analytics can impact different sports. We also analyzed team India’s performance over the past 4 years in T20 cricket.

Have fun implementing these ideas for your favorite sport!
