How Do Men's and Women's Serves Differ in Pro Tennis?
The serve has been called the most important shot in tennis. All points, after all, begin with the serve. While speed is the first thing that comes to mind when discussing the serve, many other measurable aspects of a serve can make the difference between winning and losing a match.
After spending half of my childhood playing and watching tennis, I decided to take a look at certain data points in relation to the serve. My goal was to discover how the serve differs between Men and Women.
Using the tennis_slam_pointbypoint dataset, scraped and compiled by Jeff Sackmann, I set out to observe and analyze serving trends for both Men and Women in professional tennis matches at all four Grand Slam tournaments from 2011–2020. All code and analysis can be found in my repo on GitHub here.
After cleaning the dataset and splitting it into one dataset for Men's matches and one for Women's matches, I started with the first thing on everyone's mind when thinking about serves: speed, measured in km/h.
As we can see from the images above, serve speed for both Men and Women appears to be roughly normally distributed. The red vertical line in both plots indicates the mean serve speed, and the green dashed line indicates the median serve speed. To investigate the differences in serve speed further, I decided to break down the distributions into first serve speed and second serve speed.
To compare these two distributions for first serve speed against one another I plotted a CDF, or a cumulative distribution function, which plots the serve speed against percentiles in the distribution. The code to create this CDF is listed below:
import numpy as np
import matplotlib.pyplot as plt

# First serves that landed in and have a recorded speed, sorted for the CDF
men_f_serve = np.array(men_matches[(men_matches['first_serve_in'] == 1) & (men_matches['serve_speed_kmh'] > 0)]['serve_speed_kmh'])
men_f_serve = np.sort(men_f_serve)
men_f_serve_y = np.arange(1, len(men_f_serve) + 1) / len(men_f_serve)

women_f_serve = np.array(women_matches[(women_matches['first_serve_in'] == 1) &
                                       (women_matches['serve_speed_kmh'] > 0)]['serve_speed_kmh'])
women_f_serve = np.sort(women_f_serve)
women_f_serve_y = np.arange(1, len(women_f_serve) + 1) / len(women_f_serve)

plt.figure(figsize=(12, 5))
plt.plot(men_f_serve, men_f_serve_y, color='blue', label='Men', alpha=0.4)
plt.plot(women_f_serve, women_f_serve_y, color='orange', label='Women')
plt.xlabel('Serve Speed (km/h)')
plt.ylabel('CDF')
plt.legend()
plt.title('CDF plot of First Serve Speed for Men vs. Women')
plt.show()
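The sort-and-rank steps in the code above repeat for each group, so they could be wrapped in a small helper. The following is only a sketch: `ecdf` is a hypothetical name that does not appear in my repo, and the speeds are made-up values for illustration, not rows from the dataset.

```python
import numpy as np


def ecdf(values):
    """Return the sorted values and their cumulative proportions (empirical CDF)."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


# Made-up serve speeds in km/h, for illustration only
speeds = [150, 182, 171, 198, 165]
x, y = ecdf(speeds)

# First speed at which the cumulative share reaches 50%
median_speed = x[np.searchsorted(y, 0.5)]
print(median_speed)
```

Reading a percentile off the curve this way is the same thing the eye does on the CDF plot: pick a height on the y axis and look across to the speed.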
We can tell from these plots that Men’s first serve is faster on average than Women’s first serve. Looking at second serves:
The second serve speed histogram and CDF plot show much of the same as first serves. On average Men are serving 27 km/h faster than Women on first serves, and 17 km/h faster on second serves.
To see whether this difference was statistically significant, I performed a two-sample t-test on serve speed for Men and Women, with the null hypothesis that Men and Women have the same average serve speed.
from scipy.stats import ttest_ind

print('p-value for first serve speed: %0.3f' % (ttest_ind(men_f_serve_speed, women_f_serve_speed)[1]))
print('p-value for second serve speed: %0.3f' % (ttest_ind(men_ss_serve_speed, women_ss_serve_speed)[1]))
Both tests give a p-value that rounds to 0.000, so we can reject the null hypothesis: the difference in serve speed between Men and Women is statistically significant.
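With samples this large, almost any gap will produce a tiny p-value, so it can help to report an effect size next to it. The sketch below runs on synthetic data: the means and standard deviations are invented, and both `equal_var=False` (Welch's variant of the t-test) and Cohen's d are choices I am adding here for illustration, not something the analysis above used.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

# Synthetic serve speeds in km/h; the parameters are assumptions, not dataset values
men = rng.normal(185, 15, size=5000)
women = rng.normal(158, 14, size=5000)

# Welch's t-test does not assume the two groups share a variance
t_stat, p_value = ttest_ind(men, women, equal_var=False)

# Cohen's d: mean difference scaled by the pooled standard deviation
pooled_sd = np.sqrt((men.var(ddof=1) + women.var(ddof=1)) / 2)
cohens_d = (men.mean() - women.mean()) / pooled_sd

print(f"mean difference: {men.mean() - women.mean():.1f} km/h")
print(f"p-value: {p_value:.3g}, Cohen's d: {cohens_d:.2f}")
```

The p-value says the gap is unlikely to be noise; the effect size says how big the gap is relative to the spread of serve speeds.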
Next I looked at First Serve %, Second Serve %, First Serve % Won, and Second Serve % Won. The plot below shows the results:
For first serves:
Women have a slightly higher first serve percentage, meaning they are hitting more first serves in play than Men, albeit by a very small margin. However, Men are winning between 7–8% more first serve points than Women.
For second serves:
Men have a higher second serve percentage, and a higher second serve winning percentage than Women.
What I found very interesting is that on average both Men and Women are winning less than 50% of their second serve points. This shows me that it’s extremely important to get your first serve in play, and it also suggests that if you’re able to win more than 50% of your second serve points, you might have a better chance of winning. That is something I would have to test further in order to confirm. These averages show that there might be some truth behind the saying “you’re only as good as your second serve.”
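To make the four percentages concrete, here is a minimal sketch of how they could be computed from point-level rows. The tiny table is invented, `server_won` is a hypothetical column name, and counting double faults as lost second serve points is one possible definition, not necessarily the one behind the plot.

```python
import pandas as pd

# Invented point-by-point rows: one row per point served
points = pd.DataFrame({
    "first_serve_in": [1, 1, 0, 0, 1, 0, 1, 0],
    "double_fault":   [0, 0, 0, 1, 0, 0, 0, 0],
    "server_won":     [1, 0, 1, 0, 1, 0, 1, 1],  # hypothetical column
})

first = points[points["first_serve_in"] == 1]   # points played on a first serve
second = points[points["first_serve_in"] == 0]  # points that needed a second serve

first_serve_pct = len(first) / len(points)
first_serve_won_pct = first["server_won"].mean()
second_serve_pct = 1 - second["double_fault"].mean()  # second serves that landed in
second_serve_won_pct = second["server_won"].mean()    # double faults count as losses

print(first_serve_pct, first_serve_won_pct, second_serve_pct, second_serve_won_pct)
```

Under this definition a high double fault rate drags Second Serve % Won down directly, which is worth keeping in mind when reading the sub-50% figures.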
The next thing I wanted to look into was serve placement, both serve width and depth. I plotted some bar plots of serve width and serve depth averages for Men and Women serves in general:
The data shows that the most popular serve in both Men's and Women's tennis is the Down the T serve: about 30% of Men's serves and about 25% of Women's serves go down the T. Men also serve out Wide often, with just over 25% of their serves going there. Women serve out Wide on about 20% of their serves, but they have a greater tendency to serve closer to the body when serving out Wide or up the T. About 10% of Men's serves are into the Body, which shows that Men are attempting to pull their opponents out of position more often than not. Women serve into the Body about 14% of the time, and the numbers show that they have a more balanced serving strategy, whereas Men go Down the T or out Wide with more than 50% of their serves.
Men also have a slightly higher rate of serving close to the line, with about 35% of their serves coming close to the line as opposed to around 30% for Women.
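Shares like the ones above come from counting serves per placement within each group and dividing by the group total. A minimal sketch with `pd.crosstab`, assuming a hypothetical `serve_width` column and invented rows:

```python
import pandas as pd

# Invented serves; the column names and labels are assumptions for illustration
serves = pd.DataFrame({
    "tour": ["Men"] * 5 + ["Women"] * 5,
    "serve_width": ["T", "T", "Wide", "Body", "Wide",
                    "T", "Body", "Body/Wide", "Wide", "T"],
})

# normalize="index" turns each row of counts into shares that sum to 1
width_share = pd.crosstab(serves["tour"], serves["serve_width"], normalize="index")
print(width_share.round(2))
```

Normalizing within each row matters here because the two datasets contain different numbers of serves, so raw counts would not be comparable.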
Following up on first and second serve speeds, I was curious to see how serve width affected speed for both Men and Women. Using seaborn I plotted a facet grid with each serve width’s serve speed distribution as a histogram. The results are below:
The plots above give some interesting information. First off, Down the T serves are the fastest serves by far for both Men and Women. A reason for this could be that the ball travels the least distance when it goes down the T, and perhaps loses less velocity because of that. Also, because Down the T is the most served area for both Men and Women, it’s possible that a majority of first serves are hit here, especially when trying to win a point off an unforced error or ace. On the flip side, Body serves are the slowest. This could be due to a majority of second serves being hit into the body to try and jam the returner. For both Men and Women, Wide serves start off with a higher portion of serves being faster than Body/Center serves; however, at about the 65th percentile, Body/Center serves start to overtake Wide serves with a higher share of faster serves. This might be due to players attempting to hit big serves Down the T but mis-aiming, so that the serves land closer to the Body/Center area.
Analyzing the serve wouldn’t be complete without looking at aces and double faults. When looking at aces I found that a little over 8% of Men’s points are aces, while Women have less than half of that at around 4%.
To further investigate aces, I was curious where a majority of aces were being hit for both Men and Women. The plot below shows very similar trends for Men and Women, as both hit around 40% of their aces out Wide, close to the line. They also both hit over 35% of their aces Down the T, close to the line. Together these two serves account for around 3/4 of all aces for both Men and Women. So as a server, when you need an ace, going out Wide as close to the line as possible is your best bet.
When looking at double faults, it turns out that about 9% of Men's second serves and 14% of Women's second serves result in a double fault. In total, Men double fault on 3.5% of the points they serve, compared to 4.81% for Women. I decided to perform a hypothesis test on total double fault percentage to see whether the difference between 3.5% and 4.81% was statistically significant. My null hypothesis was that the total double fault percentage is equal for Men and Women. Here’s the code I used to perform this test:
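import pandas as pd
from scipy.stats import chi2_contingency

# One row per group with counts of points without ('No') and with ('Yes') a double fault
men_dfs_ct = men_matches.groupby('double_fault')['match_id'].count().reset_index().set_index('double_fault').rename(
    index={0: 'No', 1: 'Yes'}, columns={'match_id': 'Men'}).transpose()
women_dfs_ct = women_matches.groupby('double_fault')['match_id'].count().reset_index().\
    set_index('double_fault').rename(index={0: 'No', 1: 'Yes'}, columns={'match_id': 'Women'}).transpose()

# DataFrame.append was removed in pandas 2.0; pd.concat stacks the two rows the same way
dfs_ct = pd.concat([men_dfs_ct, women_dfs_ct])
print('Double Fault p-value: %0.3f' % (chi2_contingency(dfs_ct)[1]))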
The hypothesis test yielded a p-value of 0.000, therefore allowing me to reject the null hypothesis and conclude that there is statistical significance behind my findings that Women serve on average more double faults than Men do. After performing a hypothesis test on the percentage of double faults, I was curious to see which serves were the greatest cause for double faults in terms of serve width.
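For readers who want to try the chi-square test without the dataset, here is a small self-contained sketch. The counts are invented: only the rates (3.5% and 4.81%) echo the figures above, and the 10,000 points per group is an arbitrary assumption, far smaller than the real sample.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Invented 2x2 contingency table: rows are groups, columns are double fault outcomes
table = pd.DataFrame(
    {"No": [9650, 9519], "Yes": [350, 481]},
    index=["Men", "Women"],
)

# Returns the test statistic, p-value, degrees of freedom, and expected counts
chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p-value = {p_value:.3g}")
```

Even at this reduced sample size the difference comes out significant, which is consistent with the result on the full data.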
The plot above once again shows some similarities and some differences. Wide serves account for more than 30% of double faults for both Men and Women; however, Men serve about 22% of their double faults down the T, compared to only about 16–17% for Women. Less than 15% of Men's double faults come on Body serves or on Body/Wide serves, compared to about 16–17% each for Women. These numbers could be due to Women serving the ball long more often than Men. There are a number of factors that could explain this, but one possibility is that because Women's serve speeds are slowest into the Body and Body/Wide, as per the CDF plot above, there could be a lack of racket head speed, which could create a lack of topspin, or “kick”, on the ball. The less spin on the ball, the less likely the ball is to drop down into the court, thus potentially leading to a long serve.
The last thing I wanted to look at was server performance on break points. The plot below shows the returner's break point winning percentage based on serve width and depth:
It seems that Men servers win about 70% of break points when they serve out Wide close to the line, and Down the T close to the line. Women tend to lose over 50% of break points when they serve Body/Wide close to the line. Interestingly, Men have the least chance of winning the point while down break point when serving into the Body close to the line. So if you’re down break point, the general strategy is to hit either out Wide or Down the T, and serve as close to the line as you can to have the best chance of winning the point.
In conclusion, there are some key differences between Men's serves and Women's serves in tennis. The most obvious difference is that Men serve faster on average than Women do. In addition, Men are serving about twice as many aces as Women, and fewer double faults. With more data we can further investigate the differences between serves. I would be interested in looking in more depth at things like ball rotation to measure spin, and racket head speed, to see what the correlation is, if any, between racket head speed and serve rotation.
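A width-by-depth breakdown like this can be produced with a pivot table over break point rows. The sketch below uses invented rows and hypothetical column names (`serve_width`, `serve_depth`, `server_won`); it illustrates the idea and is not the exact code behind the plot.

```python
import pandas as pd

# Invented break point rows: where the serve landed and whether the server won
bp = pd.DataFrame({
    "serve_width": ["Wide", "Wide", "T", "T", "Body", "Body", "Wide", "T"],
    "serve_depth": ["close", "close", "close", "not_close",
                    "close", "not_close", "not_close", "close"],
    "server_won":  [1, 1, 1, 0, 0, 1, 0, 0],
})

# Mean of a 0/1 column is the share of break points the server won in each cell
server_win_rate = bp.pivot_table(
    index="serve_width", columns="serve_depth",
    values="server_won", aggfunc="mean",
)
returner_win_rate = 1 - server_win_rate
print(server_win_rate)
```

With real data it would be worth checking how many break points fall in each cell, since a win rate built on a handful of points is not very reliable.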