Of Playbooks and Notebooks

Elissa Lerner
Analyzing NCAA Basketball with GCP
Mar 20, 2018

While BigQuery never gets tired of crunching datasets, the humans devising the queries sometimes do. So today, we’re going to take a peek behind the screen and show you the nuts and bolts of the insights we built for the Cloud With Google campaign site.

In designing our playbook for the campaign, we knew we wanted to focus on the aspects of college basketball that are crowd favorites and demonstrate how ripe they are for data analysis and visualization. Given the breadth of data available through the NCAA, SportRadar, and other public datasets, we cast our net wide and thought about the different kinds of features we could analyze. We looked at trends across class makeup, geography, conference performance, jersey numbers, team colors, team mascots, and even the lunar calendar. We thought about the distinguishing features of college basketball: dunks, three-pointers, blocks, close games, tournament upsets, and the Final Four®. From there, it was just a matter of asking the right questions and bringing the right datasets together.

For each of these insights, you can check out our step-by-step Cloud Datalab notebooks to see how we did it (best viewed on desktop).

Some gems we’re particularly fond of:

Coding and querying for full moons by emoji

Capturing 2017 block stats by class year in a single query

Capitalizing on the work of the query to create a color wheel of the Final Four with a single line of code
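
To give a flavor of the full-moon-by-emoji idea, here is a minimal sketch of how a date could be mapped to a moon phase emoji. This is not the code from our notebooks: it assumes a commonly cited reference new moon (January 6, 2000, 18:14 UTC) and the mean length of a lunar cycle, so it can be off by several hours, and the function names are made up for illustration.

```python
from datetime import date, datetime, timezone

# Assumptions (not from the notebooks): a commonly cited reference new moon
# and the mean synodic month. Real lunations drift from this mean by hours.
REFERENCE_NEW_MOON = datetime(2000, 1, 6, 18, 14, tzinfo=timezone.utc)
SYNODIC_MONTH_DAYS = 29.530588853

# Eight phase emoji, in order from new moon through waning crescent.
PHASE_EMOJI = ["🌑", "🌒", "🌓", "🌔", "🌕", "🌖", "🌗", "🌘"]


def moon_phase_fraction(day: date) -> float:
    """Return the approximate position in the lunar cycle (0 = new, 0.5 = full)."""
    # Evaluate at noon UTC so a calendar date maps to a single instant.
    noon = datetime(day.year, day.month, day.day, 12, tzinfo=timezone.utc)
    elapsed_days = (noon - REFERENCE_NEW_MOON).total_seconds() / 86400
    return (elapsed_days % SYNODIC_MONTH_DAYS) / SYNODIC_MONTH_DAYS


def moon_phase_emoji(day: date) -> str:
    """Map a date to the nearest of the eight phase emoji."""
    index = round(moon_phase_fraction(day) * 8) % 8
    return PHASE_EMOJI[index]


def is_full_moon(day: date) -> bool:
    """True when the date falls in the full-moon eighth of the cycle.

    Each eighth spans roughly 3.7 days, so a few consecutive dates qualify.
    """
    return moon_phase_emoji(day) == "🌕"
```

Calling moon_phase_emoji(date(2018, 1, 31)) returns 🌕, which matches that night's full moon. Once each game date carries an emoji like this, filtering for full-moon games becomes a simple equality check.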

There’s plenty of other fun to be had in visualizing these queries. For instance, we put the geographic map of three-point shooters into an interactive Data Studio dashboard over here:

And we brought the full moon Final Four upsets data into a dashboard over here:

You can play out each of the queries within the notebooks below, or experiment with your own ideas by drawing on these queries and using the public datasets in BigQuery itself.

Happy querying!
