Who should Manchester United sign, if not Jadon Sancho?

Kalpesh Ghadigaonkar
The Sports Scientist
4 min read · Sep 4, 2020

Manchester United are one of the hottest football teams in the world and among the top three in popularity. Coming out of last season, the club needs reinforcements in probably every department.

Manchester United are pursuing Jadon Sancho. But considering the pace of the talks, it is highly unlikely that the club will sign him this summer.

Perhaps it is time for Manchester United to move on to other targets. The new player must be fast, a reliable shooter, and able to find a teammate in front of goal.

I have looked at players who have been linked with Manchester United in the past. The question is, which one is the best alternative to Jadon Sancho? I have done a comparative analysis of the following players to try to answer that question.

Players ~
Adama Traore
Dybala
Kingsley Coman
Jadon Sancho

My goal with this case study is to understand the importance of descriptive statistics.

We need to compare the players based on their performance last season.
Here are quick stats on the goals they scored and the assists they provided; I have considered only league and European cup matches.

Adama Traore — G4, A9, M37, PS% 75.60
Dybala — G14, A8, M41, PS% 88.09
Kingsley Coman — G7, A4, M33, PS% 84
Jadon Sancho — G19, A18, M40, PS% 83.96

*G-Goals, A-Assists, M-Matches played, PS%-Pass Success percentage
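Because the four players did not play the same number of matches, it can help to put goals and assists on a per-match footing. The snippet below is a rough sketch that uses only the figures listed above; the `contributions_per_match` helper and the dictionary layout are my own naming, not anything from WhoScored.

```python
# Figures copied from the quick-stats list above (league and European matches).
stats = {
    "Adama Traore": {"goals": 4, "assists": 9, "matches": 37},
    "Dybala": {"goals": 14, "assists": 8, "matches": 41},
    "Kingsley Coman": {"goals": 7, "assists": 4, "matches": 33},
    "Jadon Sancho": {"goals": 19, "assists": 18, "matches": 40},
}


def contributions_per_match(row):
    """Goals plus assists, divided by matches played (hypothetical helper)."""
    return (row["goals"] + row["assists"]) / row["matches"]


per_match = {name: contributions_per_match(row) for name, row in stats.items()}

# Print the players from most to least productive per match.
for name, value in sorted(per_match.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.2f}")
```

This is only one way to normalise the numbers; minutes played would be a fairer denominator, but the list above gives matches only.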

Each of the three players (Adama Traore, Dybala, Kingsley Coman) has a mean, median, and mode rating of 7. Averages offer you only a one-dimensional view of your data. They tell you what the center of your data is, but that’s it. While this can be useful, it’s often not enough.

Each player has the same average rating, but there are clear differences between each data set. We need some other way of summarizing the data in addition to the average.
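To see how two very different sets of ratings can share the same averages, here is a small sketch. The ratings are invented for illustration and are not the players' real WhoScored ratings; `steady_player`, `streaky_player`, and `averages` are hypothetical names.

```python
from statistics import mean, median, mode

# Hypothetical match ratings, invented for illustration (not WhoScored data).
steady_player = [7, 7, 7, 7, 7, 7, 7]
streaky_player = [5, 6, 7, 7, 7, 8, 9]


def averages(ratings):
    """Return the three common averages as a (mean, median, mode) tuple."""
    return mean(ratings), median(ratings), mode(ratings)


print("steady :", averages(steady_player))
print("streaky:", averages(streaky_player))
```

Both lines print the same three averages, even though one player never moves off a 7 and the other swings between 5 and 9.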

Here, frequency tells us the number of matches where the player got each rating.

Each player’s ratings are distributed differently, and if we can measure how the ratings are dispersed, we will be able to make a more informed decision. We can easily do this by calculating the range, which tells us over how many numbers the data extends. To find the range, we take the largest number in the data set and subtract the smallest.

Range~
Adama Traore — 4
Dybala — 3
Kingsley Coman — 3
Jadon Sancho — 4
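The range calculation is simple enough to sketch in a few lines. The ratings below are made up for illustration (the per-match ratings behind the figures above are not reproduced here), and `rating_range` is a hypothetical helper name.

```python
def rating_range(ratings):
    """Largest value minus smallest value."""
    return max(ratings) - min(ratings)


# Hypothetical ratings, invented for illustration (not WhoScored data).
ratings = [6, 7, 7, 7, 8, 7, 10]  # includes one unusually good match (10)
without_best_match = [r for r in ratings if r != 10]

print(rating_range(ratings))             # range with the standout match
print(rating_range(without_best_match))  # range once it is left out
```

Dropping a single match changes the result from 4 to 2, which shows how much one extreme value can move the range.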

The range only uses the smallest and the largest number in a set; the rest of the values are ignored, which can lead to a misleading result. The range can measure how far the values are spread out, but it’s difficult to get a real picture of how the data is distributed.

The main problem with the range is that, by definition, it includes outliers, even when there are only one or two extreme values. The interquartile range (the upper quartile minus the lower quartile) covers only the middle 50% of the data, so it gives us a way of comparing different data sets without the results being distorted by outliers.

IQR ~
Adama Traore — 1
Dybala — 1
Kingsley Coman — 1
Jadon Sancho — 1.25
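Here is a minimal sketch of an interquartile range calculation using Python's standard library. Quartile conventions differ between tools, and which one produced the figures above is not stated, so the `method="inclusive"` choice (linear interpolation between data points) is an assumption. The ratings are invented for illustration.

```python
from statistics import quantiles


def interquartile_range(ratings):
    """Upper quartile minus lower quartile (linear interpolation assumed)."""
    q1, _median, q3 = quantiles(ratings, n=4, method="inclusive")
    return q3 - q1


# Hypothetical ratings, invented for illustration (not WhoScored data).
ratings = [6, 7, 7, 7, 8, 7, 10]  # includes one unusually good match (10)

full_range = max(ratings) - min(ratings)
iqr = interquartile_range(ratings)

print("range:", full_range)
print("IQR  :", iqr)
```

For this sample the range is 4 while the interquartile range is only 0.5, because the single 10 sits outside the middle half of the data.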

Our original problem with the range was that it’s extremely sensitive to outliers. To get around this, we divided the data into quarters, and we used the interquartile range to provide us with a cut-down range of the data.

We need to find the player whose ratings vary the least.

How can we measure variability more accurately? One way is to look at how far each value is from the mean. The variance does this: it is the average of the squared distances of the values from the mean.

Statisticians use the variance a lot to measure the spread of data. The problem with the variance is that it can be quite difficult to think about spread in terms of squared distances. The standard deviation, which is the square root of the variance, is expressed in the same units as the mean, whereas the variance is expressed in squared units.

S.D ~
Adama Traore — 0.93
Dybala — 0.96
Kingsley Coman — 0.82
Jadon Sancho — 1.15
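The variance and standard deviation described above can be sketched as follows. I am assuming the population form (dividing by the number of matches); whether the figures above use that or the sample form (dividing by one fewer) is not stated. The ratings and the helper names are invented for illustration.

```python
from math import sqrt


def population_variance(values):
    """Average of the squared distances from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def population_std(values):
    """Square root of the variance, in the same units as the ratings."""
    return sqrt(population_variance(values))


# Hypothetical ratings, invented for illustration (not WhoScored data).
steady = [6, 7, 7, 8]   # mean 7, values close to the mean
erratic = [5, 6, 8, 9]  # mean 7, values further from the mean

for name, ratings in (("steady", steady), ("erratic", erratic)):
    print(name, population_variance(ratings), round(population_std(ratings), 2))
```

Both sets average 7, but the steady one has a variance of 0.5 against 2.5 for the erratic one, so its standard deviation is the smaller of the two.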

Interesting? 🤔

If the standard deviation is high, this means that values are typically a long way from the mean. If the standard deviation is low, values tend to be close to the mean.

No one comes close to Jadon Sancho when we compare the number of goals he has scored and assists he has provided.

All four players had an average rating of seven in the 2019–20 season.

When we look at our analysis, the other three players all have a smaller standard deviation than Jadon Sancho in terms of ratings per match. Kingsley Coman stood out as the most reliable, with the lowest standard deviation of all four. His ratings sit closer to the mean and vary less. If we decide to pick Kingsley Coman, we have a good idea of how well he is likely to perform in each game.

Kingsley Coman has an average pass success percentage of 84%, which is close to Jadon Sancho’s 83.96%.

Perhaps Manchester United should sign Kingsley Coman, based on this analysis and his performance in the Champions League final.

Yes, there are many other aspects to a football transfer, but it was fun working on this case study. I am going to tweet this out to the club 🤣

If you find this debatable, do let me know your thoughts in the comments.

Thank you!

source: whoscored
