Rebounds, Pace, Free Throws — Oh My!

Eric Schmidt
Analyzing NCAA Basketball with GCP
Mar 22, 2018

Get ready for the showdown between Miss Rev and Biff: such cute mascots, but wow are their respective teams tough!

For the past two weeks, we (a few Google nerds) have been hacking on men’s and women’s NCAA basketball data, looking deeper into the madness of the respective tournaments. By using Cloud Dataflow, BigQuery, and Cloud Datalab to build descriptive and predictive insights, we’ve been able to study patterns at work that can inform how we’re watching the various matchups at play. For example, we’ve learned Texas A&M is extremely tough under the basket, while Michigan is superb at slowing down the opposing offense — and we’ll show you how we got there.

In this post, we look at offensive and defensive rebounds from the perspective of player contribution. The idea is that the more players who are involved in producing a stat, the more effective the team will be at that function. We’ve talked before about team balance and player contribution with regard to assists and scoring, and today we’ll see how that thinking applies to rebounds and pace. So instead of just counting the rebounds a team produces, we’ll look at two things: each player’s contribution as a share of the team’s overall output, and how evenly that output is spread across the contributing players.
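As a rough illustration of the player-contribution idea, here is a minimal plain-Python sketch that computes each player’s share of a team’s offensive rebounds in one game and counts how many players contributed. The box score, team names, and the `rebound_shares` helper are all hypothetical; the actual analysis was done with BigQuery and pandas.

```python
from collections import defaultdict

# Hypothetical single-game box score: (team, player, offensive_rebounds).
box_score = [
    ("Team A", "Player 1", 4),
    ("Team A", "Player 2", 3),
    ("Team A", "Player 3", 2),
    ("Team A", "Player 4", 1),
    ("Team B", "Player 5", 6),
    ("Team B", "Player 6", 2),
    ("Team B", "Player 7", 0),
]


def rebound_shares(rows):
    """Return {team: [(player, share), ...]}, ranked by share, contributors only."""
    team_totals = defaultdict(int)
    for team, _, orb in rows:
        team_totals[team] += orb

    shares = defaultdict(list)
    for team, player, orb in rows:
        # Skip players with no offensive rebounds (and teams with none at all).
        if orb > 0 and team_totals[team] > 0:
            shares[team].append((player, round(orb / team_totals[team], 3)))

    for team in shares:
        shares[team].sort(key=lambda item: item[1], reverse=True)
    return dict(shares)


shares = rebound_shares(box_score)
for team, ranked in shares.items():
    print(team, len(ranked), "contributors:", ranked)
```

In this made-up game both teams grab a similar number of offensive rebounds, but Team A spreads them across four players while Team B relies on two, which is the kind of balance difference the post is interested in.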

Today’s tasks use several standard SQL primitives via BigQuery, and a bit of pandas magic.

Let’s dive into Texas A&M vs. Michigan looking at rebounds and much more.

Match Up

Texas A&M is known for their defense and tough play under the basket. They hold opponents to 40.2% shooting on the season (#11 in NCAA), while averaging 41.7 rebounds per game (#2 in NCAA), with 27.7 defensive rebounds per game (#3 in NCAA) along with 10.9 offensive rebounds per game (#20 in NCAA).

A key stat for the Aggies is their offensive rebounding percentage of 32.7% (#22 in NCAA), as they are not a strong offensive team. They average just 46% shooting overall (#94 in NCAA), which includes 33.1% from 3-point range (#265 in NCAA), meaning those rebounds are vital for second chance points.
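Offensive rebounding percentage is commonly defined as the share of available offensive rebounds a team collects: its own offensive rebounds divided by the sum of those and the opponent’s defensive rebounds. The post does not say which formula produced the 32.7% figure, so treat the sketch below as an assumption; the function name and the sample numbers are hypothetical.

```python
def offensive_rebound_pct(team_off_rebounds, opp_def_rebounds):
    """Share of available offensive rebounds that the team collected."""
    available = team_off_rebounds + opp_def_rebounds
    if available == 0:
        # No missed shots were rebounded at this end; avoid dividing by zero.
        return 0.0
    return team_off_rebounds / available


# Hypothetical game: 11 offensive rebounds against 23 opponent defensive rebounds.
pct = offensive_rebound_pct(11, 23)
print(f"{pct:.1%}")
```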

The graph below compares Texas A&M to Michigan in offensive rebounding balance. You can see that Texas A&M is the more balanced offensive rebounding team, with two more players contributing to the team’s offensive rebound production. Each player contribution violin is the ranked % of offensive rebounds per game for players that had at least 1 offensive rebound (Texas A&M in white, Michigan in yellow).

Michigan’s strength is in slowing down the game. While keeping the game slow keeps their own possession count low (ranking #331 with an average of 67.3 possessions per game), it also keeps opposing offensive production down, allowing just 63.1 points per game (#6 in NCAA). Michigan also ranks #7 in the NCAA in both fewest opponent three-point makes (5.5 per game) and opponent three-point attempts (16.2 per game).

Fewer possessions mean fewer opportunities for both sides, which forces Michigan to take care of the ball better than just about anyone: they commit only 9.2 turnovers per game (#2 in NCAA). The Wolverines are also great at creating extra possessions despite subpar rebounding (33 per game, in the lower half of the NCAA). They make up for it with steals, averaging 4.1 per game (#3 in the NCAA).

You can see just how dominant Michigan is at defending the three and dictating pace. The graph below compares each opponent’s incoming seven-game average with what Michigan allowed in that game for three-point attempts, three-point makes, and possessions. That is impressive.

MICHIGAN 2017 — Allowed vs. Opponent’s Incoming 7 Game Average

We built the underlying data view for these graphs from 5-, 7-, and 10-game sliding-window team stats, along with allowed stats and opponents’ opponents stats. It’s a big query. Looking at opponents’ opponents helps gauge how well a given team’s strengths or weaknesses match up against the particular team they’re facing.
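The sliding-window comparison can be sketched in a few lines of plain Python. This assumes the “incoming” average covers only the games played before the matchup in question, which the post implies but does not spell out; the `incoming_average` helper and the three-point-attempt numbers are hypothetical.

```python
def incoming_average(values, window):
    """For each game, average the previous `window` games (current game excluded).

    Returns None for games that have fewer than `window` prior games.
    """
    averages = []
    for i in range(len(values)):
        if i < window:
            averages.append(None)
        else:
            prior = values[i - window:i]
            averages.append(sum(prior) / window)
    return averages


# Hypothetical opponent: three-point attempts in eight consecutive games,
# where the last game is the one played against the team of interest.
opponent_3pa = [20, 22, 18, 24, 21, 19, 23, 15]

incoming = incoming_average(opponent_3pa, window=7)
allowed = opponent_3pa[-1]
delta = allowed - incoming[-1]
print(incoming[-1], allowed, delta)
```

A negative delta means the opponent was held below its recent average, which is the gap the Michigan graph visualizes.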

Synopsis

This will be a battle under the glass, and a battle for tempo. Each possession will be valuable and rebounds will be key for both teams.

But if this game comes down to free throws, look out. Both teams struggle from the line, to put it politely. Michigan shoots at 65.8% (#329 — out of 351! — in the NCAA), while Texas A&M shoots only slightly better at 66.4% (#320).

Slower pace and fewer turnovers will reduce opportunities. We estimate that there will be 70 rebounds combined in this contest (regulation only).

Appendix

Query for Offensive Rebound Balance

WITH
  -- Per-player box score lines, excluding players who did not play.
  players AS (
    SELECT
      season,
      game_id,
      scheduled_date,
      full_name,
      height,
      weight,
      team_market,
      minutes,
      field_goals_made,
      field_goals_att,
      assists,
      offensive_rebounds,
      defensive_rebounds
    FROM
      `bigquery-public-data.ncaa_basketball.mbb_players_games_sr`
    WHERE
      minutes IS NOT NULL
      AND minutes != "00:00" ),
  -- Per-team game totals, including the opponent's rebounds.
  games AS (
    SELECT
      season,
      game_id,
      scheduled_date,
      market,
      conf_name,
      minutes,
      field_goals_made,
      field_goals_att,
      assists,
      offensive_rebounds,
      defensive_rebounds,
      opp_conf_name,
      opp_offensive_rebounds,
      opp_defensive_rebounds
    FROM
      `bigquery-public-data.ncaa_basketball.mbb_teams_games_sr`)
-- Rank each player within a team and game by their share of each team stat.
SELECT
  *,
  ROW_NUMBER() OVER(PARTITION BY game_id, team_market ORDER BY assist_to_team DESC ) AS player_assist_rank,
  ROW_NUMBER() OVER(PARTITION BY game_id, team_market ORDER BY fgm_to_team DESC ) AS player_fgm_rank,
  ROW_NUMBER() OVER(PARTITION BY game_id, team_market ORDER BY rebounds_to_team DESC ) AS player_rebound_rank,
  ROW_NUMBER() OVER(PARTITION BY game_id, team_market ORDER BY def_rebounds_to_team DESC ) AS player_def_rebound_rank,
  ROW_NUMBER() OVER(PARTITION BY game_id, team_market ORDER BY off_rebounds_to_team DESC ) AS player_off_rebound_rank
FROM (
  SELECT
    players.season,
    players.game_id,
    players.scheduled_date,
    players.full_name,
    players.height,
    players.weight,
    players.team_market,
    SPLIT(players.minutes, ":")[SAFE_OFFSET(0)] AS mins_played,
    players.field_goals_made,
    players.field_goals_att,
    players.offensive_rebounds,
    players.defensive_rebounds,
    players.offensive_rebounds + players.defensive_rebounds AS total_rebounds,
    IF(players.field_goals_att>0,
      ROUND(players.field_goals_made/players.field_goals_att, 3),
      0) AS fgp,
    players.assists,
    -- Player stat as a fraction of the team's total for the same game.
    ROUND(players.field_goals_att/games.field_goals_att, 3) AS fga_to_team,
    ROUND(players.field_goals_made/games.field_goals_made, 3) AS fgm_to_team,
    ROUND(players.assists/games.assists, 3) AS assist_to_team,
    ROUND(players.assists/games.field_goals_made, 3) AS assists_to_team_fgm,
    games.defensive_rebounds AS defensive_rebounds_game,
    games.offensive_rebounds AS offensive_rebounds_game,
    ROUND((players.offensive_rebounds + players.defensive_rebounds)/ (games.offensive_rebounds + games.defensive_rebounds), 3) AS rebounds_to_team,
    ROUND((players.offensive_rebounds)/ (games.offensive_rebounds), 3) AS off_rebounds_to_team,
    ROUND((players.defensive_rebounds)/ (games.defensive_rebounds), 3) AS def_rebounds_to_team,
    games.offensive_rebounds + games.defensive_rebounds AS total_rebounds_game,
    -- Team rebounds minus the opponent's rebounds in the same game.
    games.offensive_rebounds - games.opp_offensive_rebounds AS team_off_rebounds_delta,
    games.defensive_rebounds - games.opp_defensive_rebounds AS team_def_rebounds_delta,
    (games.offensive_rebounds + games.defensive_rebounds) - (games.opp_offensive_rebounds + games.opp_defensive_rebounds) AS team_rebounds_delta,
    games.assists AS assists_game,
    games.field_goals_made AS fgm_game,
    games.field_goals_att AS fga_game,
    ROUND(games.field_goals_made /games.field_goals_att, 3) AS fgp_game,
    (CAST(SPLIT(games.minutes, ":")[SAFE_OFFSET(0)] AS INT64) * 60 + CAST(SPLIT(games.minutes, ":")[SAFE_OFFSET(1)] AS INT64)) / 5 AS mins_game
  FROM
    players
  LEFT JOIN
    games
  ON
    players.game_id = games.game_id
    AND players.team_market = games.market
  WHERE
    players.season = 2017
    -- Keeps only player-game rows with more than one assist.
    AND players.assists > 1
    AND players.team_market IN ("Texas A&M",
      "Michigan")
  ORDER BY
    season,
    scheduled_date DESC)
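To make the `mins_game` expression in the query easier to follow, here is a small Python sketch of the same arithmetic. It assumes the team-level `minutes` column is a clock string of total player minutes such as "200:00" (five players for a 40-minute game); that format and the helper names are assumptions, not something the post confirms.

```python
def clock_to_seconds(clock):
    """Convert an "MM:SS" clock string to total seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def mins_game(team_minutes):
    """Mirror the query: total team clock in seconds, divided by five players."""
    return clock_to_seconds(team_minutes) / 5


regulation = mins_game("200:00")    # hypothetical 40-minute game
one_overtime = mins_game("225:00")  # hypothetical game with one 5-minute overtime
print(regulation, one_overtime)
```

Under this assumption the value is the game length in seconds (2400 for a regulation game), even though the column alias suggests minutes; divide by 60 if minutes are needed downstream.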
