Building projections is fun, especially when the numbers hit. But when things go wrong? Let’s just call it a different kind of fun — the kind where you take your hands off the keyboard and throw them up in the air because none of the numbers land as expected. Texas Tech vs. Michigan State was that kind of fun.

March Madness occasionally turns out an outlier game of epic proportions (just ask Virginia last year), and the Spartans vs. the Red Raiders offered a leading candidate for this year’s wacky anomaly. …


Note: special thanks to co-author Steve Sandmeyer aka Sandy

Welcome to game 5 as we continue our analysis of the World Series using Google Cloud. After an epic comeback in game 4, the Boston Red Sox have taken a commanding 3–1 series lead and are one win away from their fourth World Series Championship in the last 14 years.

Boston will give the ball to left-hander David Price, who earned the victory in game two by going 6 innings, allowing just 3 hits, 2 earned runs and striking out 5. …


Note: special thanks to co-author Steve Sandmeyer aka Sandy

We’ve reached game four of the Fall Classic as we continue our analysis of the World Series using Google Cloud. The Dodgers are within a game in a 2–1 series after an epic game 3 win in a tidy 7 hours, 20 minutes and 18 innings of play. If it seems like the game just ended a half hour ago, it’s probably because it did. Boston is still in the driver’s seat and, by winning the next two games in L.A., can claim another crown to join the 1903…


Note: special thanks to co-author Steve Sandmeyer aka Sandy

Game 3 is upon us as we continue our analysis of the World Series using Google Cloud. Can the Dodgers make a series out of it? Or will the Red Sox win at least 2 of 3 in L.A. this weekend to clinch their fourth World Series title in the last 14 years? Pitching matchups are always a good place to start.

NOTE: You can hack on similar data in BigQuery by heading over to our public datasets hosted on Google Cloud. …


Note: special thanks to co-author Steve Sandmeyer aka Sandy

Welcome to Game Two of the Fall Classic. Baseball can be a funny game. Prior to the series starting, everyone got into a lather (including us) over the “epic” game one pitching matchup between Clayton Kershaw (LAD) and Chris Sale (BOS). The result? Neither ace got out of the fifth inning and a total of 12 runs were scored as Boston opened the series with an 8–4 victory.

In this post we continue our analysis of the World Series using Google Cloud with a focus on hitting and a little tool…


We’re back! After months of slogging through basketball and the World Cup we are finally able to hack on some baseball. This year we are focusing more on pitch selection, sequencing and effectiveness. For our analysis, we are using data primarily from Sportradar as well as various public sources to drive our overall data science pipeline using Google Cloud.

Some of the tooling has changed since we started in 2016, but the pipeline fundamentals remain the same, leaning heavily on Apache Beam and Cloud Dataflow for ETL, BigQuery for interactive analysis and feature development, and then IPython via Deep…


Authored by: Ramzi BenSaid

In our previous post, we covered the data we collected and the architecture we built for our smart basketball court at Google Cloud NEXT ’18. This post explores the flow of that raw data from Google Cloud Storage into a more manageable one-entry-per-shot state in BigQuery, and some ensuing analysis. Our goal was to gain better insight into human shooting mechanics in order to better understand how they influence shots — specifically, what goes into a good jump shot. …


With the World Cup behind us and baseball meandering its way through late summer, you’d be forgiven for thinking that we had reached a lull in the Google Cloud sports data analysis universe. If you did, you’ll be pleased to know that our recent Google Cloud NEXT conference proved quite the opposite. Our showcase highlighting Google Cloud’s NCAA partnership demonstrated a new way to make the cloud more tangible than ever: by designing a basketball competition using elements of data science and predictive analytics implemented on a half-court in the middle of San Francisco’s Moscone Center. Naturally.

Some context: applied data science

This past year…


Authored by: Steve Sandmeyer

After a 40-year run by the Jules Rimet Trophy from 1930 to 1970, the current World Cup Trophy was created in 1971 and was first awarded in 1974 to West Germany. It was formed from 18-carat gold and is worth more than $10m USD in today’s market. It currently sits under tight security at the World Football Museum in Zurich. It will be handed to the winning team only after their name is engraved on it; then, after the official presentation, it will be returned to FIFA, who will keep it locked up until…


One more match to go in our analysis of the World Cup using Google Cloud. Prior to and throughout the World Cup, we’ve talked through a lot of analysis workflows. We talked about data coverage, how we use GCP to manage all data, turn it into predictive features and finally model it. In the end, our models fall into one of four categories.

  1. Team based features
  2. Player based features
  3. A blend of 1 & 2
  4. Elo

The blended version combining team and player based features performed the best on our test cases, but we thought it’d be interesting to…
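For readers unfamiliar with the fourth category, here is a minimal sketch of a textbook Elo rating update. It is not the model used in these posts: the function names, the K-factor of 20 and the 400-point scale are all illustrative assumptions, and the posts do not state which parameters or variations were applied.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for team A against team B under the standard Elo curve.

    The 400-point scale is the conventional choice, assumed here.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return the updated (rating_a, rating_b) after one match.

    score_a is 1.0 for an A win, 0.5 for a draw and 0.0 for an A loss.
    k (the K-factor) is a hypothetical value chosen for illustration.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two equally rated teams; team A wins, so the ratings move to 1510.0 and 1490.0.
new_a, new_b = update_elo(1500.0, 1500.0, score_a=1.0)
```

A draw between equally rated teams leaves both ratings unchanged, which is one reason Elo is often used as a simple baseline for match prediction in soccer.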

Eric Schmidt

Developer Advocate @ Google Cloud
