Picking March Madness Winners Using Prefect

Robert
Published in The Prefect Blog
5 min read · Mar 15, 2023

I grew up loving college basketball and the magic that is March Madness. The excitement of seeing buzzer-beaters and Cinderella stories makes the NCAA men’s and women’s basketball tournaments two of the year's greatest sporting events. It is absolute chaos, and I love every minute of it.

I’m invited to several March Madness bracket pools every year because of my love of college basketball (and how much I talk about it). To be honest, I’m not very good at them because navigating March Madness is chaos. As an engineer, I needed to add some logic to this situation to help me figure it out and at least give myself a shot at winning these bracket pools. As Ted Mosby from How I Met Your Mother once said: “This is not March Madness…”

This is March Meticulously thought-outness

And what better tool than Prefect to help me orchestrate my first go at adding some logic to my decision process? The custom predictive algorithm is simplistic, but by gosh, it is orchestrated and observable thanks to Prefect.

All of my Prefect flow code and infrastructure configuration can be found at the GitHub link here.

Note: This is just the code for the Men’s tournament, but changing the RapidAPI IDs to those for the Women’s tournament will yield similar predictions.

The Code

Using Prefect flows, subflows, and task decorators, I took the following steps to help me pick the winners of the brackets.

  1. I gathered the statistics for the 68 teams in the tournament. I used RapidAPI’s Basketball API to gather data for each team, and I securely stored the API key using a Prefect secret block.
  2. I then picked the winner of each team's first game. I used each team's statistics to calculate a score from win percentage, points scored, and points allowed. Those scores were compared to determine a winner for each of the first individual matchups.
  3. Then, I ranked the predicted first-game winners. After calculating the scores, I used Python's built-in sort with a lambda key to order the teams by score. I then used these rankings to fill out the rest of my bracket. The rankings were printed to the screen using Prefect’s log_prints option.
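To make steps 2 and 3 concrete, here is a minimal sketch in plain Python. The exact formula lives in the repository and is not reproduced in this post, so the weighting below (win percentage on a 0-100 scale plus average point margin) is an assumption for illustration only. The TeamStats class, the function names, and the sample teams are hypothetical, and the numbers are made up. In the real flow, functions like these would be wrapped in Prefect tasks.

```python
from dataclasses import dataclass


@dataclass
class TeamStats:
    name: str
    wins: int
    losses: int
    points_for: float      # average points scored per game
    points_against: float  # average points allowed per game


def team_score(stats: TeamStats) -> float:
    """Assumed scoring formula (not the one from the repository)."""
    games = stats.wins + stats.losses
    win_pct = stats.wins / games if games else 0.0
    # Win percentage on a 0-100 scale plus the average point margin.
    return win_pct * 100 + (stats.points_for - stats.points_against)


def pick_winner(a: TeamStats, b: TeamStats) -> TeamStats:
    """Return the team with the higher score; the first team wins ties."""
    return a if team_score(a) >= team_score(b) else b


def rank_teams(teams: list[TeamStats]) -> list[TeamStats]:
    """Sort teams from highest to lowest score using a lambda key."""
    return sorted(teams, key=lambda t: team_score(t), reverse=True)


# Made-up sample data, purely for illustration.
sample_teams = [
    TeamStats("Team A", wins=30, losses=3, points_for=75.0, points_against=60.0),
    TeamStats("Team B", wins=20, losses=13, points_for=70.0, points_against=68.0),
    TeamStats("Team C", wins=25, losses=8, points_for=80.0, points_against=70.0),
]

for position, team in enumerate(rank_teams(sample_teams), start=1):
    print(f"{position}. {team.name}")
```

Inside a flow created with log_prints enabled, those print calls would show up as log lines in the same numbered style as the results further down.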

The Infrastructure

The following infrastructure was put together using Prefect deployments and blocks.

  • GitHub for Version Control
  • S3 for Prefect flow code storage, connected with a Prefect S3 storage block.
  • EKS Cluster created using eksctl.
  • ECR to store the image that the job will use to run the Prefect flow.
  • Prefect KubernetesJob to define the job configuration for the flow run.

All of these components were combined through Prefect deployments and blocks. Instructions for configuring all of these components can be found in the GitHub repository above.
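For readers who want a feel for how those pieces fit together, here is a rough sketch of a block-based deployment. It assumes the Prefect 2.x API that was current when this post was written (Deployment.build_from_flow with storage and infrastructure blocks), and it assumes the S3 block is the one from prefect.filesystems. The block names, the deployment name, and the placeholder flow are hypothetical; the real configuration is in the repository.

```python
def build_deployment(flow_fn, storage_block: str, infra_block: str):
    """Combine a flow with previously saved S3 and KubernetesJob blocks.

    Sketch only: assumes Prefect 2.x and blocks that already exist in the
    workspace under the given (hypothetical) names.
    """
    # Imported here so this module can be loaded without Prefect installed.
    from prefect.deployments import Deployment
    from prefect.filesystems import S3
    from prefect.infrastructure import KubernetesJob

    return Deployment.build_from_flow(
        flow=flow_fn,
        name="march-madness-eks",                       # hypothetical name
        storage=S3.load(storage_block),                 # where the flow code is stored
        infrastructure=KubernetesJob.load(infra_block), # job config, including the ECR image
    )


if __name__ == "__main__":
    from prefect import flow

    @flow(log_prints=True)
    def pick_winners():
        # Placeholder body; the real flow gathers stats and ranks teams.
        print("bracket logic goes here")

    # Register the deployment with the Prefect API (block names are hypothetical).
    build_deployment(pick_winners, "march-madness-s3", "march-madness-k8s").apply()
```

An agent polling the deployment's work queue would then launch each flow run as a Kubernetes job on the EKS cluster.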

While these flows can be run locally or on any infrastructure, I chose to run this flow on EKS because of how flexible Kubernetes is with scaling resources and applying updates. Prefect’s integration with Kubernetes is completely customizable, and running these deployments on scalable infrastructure allows me to share this flow with everyone who has access to my workspace, without them having to worry about spinning up new infrastructure. It also provides the flexibility to scale when running more complex models for next year's tournament!

The Results

The flow code produced the following results. The algorithm used to calculate them is SUPER simple, so take these results with a grain of salt. I look forward to adding more data science and machine learning concepts to next year’s tournament.

11:23:13.032 | INFO | Flow run 'lavender-asp' - 1. Houston
11:23:13.033 | INFO | Flow run 'lavender-asp' - 2. Florida Atlantic
11:23:13.034 | INFO | Flow run 'lavender-asp' - 3. Charleston
11:23:13.036 | INFO | Flow run 'lavender-asp' - 4. Oral Roberts
11:23:13.040 | INFO | Flow run 'lavender-asp' - 5. Gonzaga
11:23:13.041 | INFO | Flow run 'lavender-asp' - 6. Alabama
11:23:13.042 | INFO | Flow run 'lavender-asp' - 7. UCLA
11:23:13.043 | INFO | Flow run 'lavender-asp' - 8. Purdue
11:23:13.043 | INFO | Flow run 'lavender-asp' - 9. Arizona
11:23:13.044 | INFO | Flow run 'lavender-asp' - 10. Kent State
11:23:13.044 | INFO | Flow run 'lavender-asp' - 11. Marquette
11:23:13.045 | INFO | Flow run 'lavender-asp' - 12. Iona
11:23:13.045 | INFO | Flow run 'lavender-asp' - 13. Drake
11:23:13.046 | INFO | Flow run 'lavender-asp' - 14. Furman
11:23:13.047 | INFO | Flow run 'lavender-asp' - 15. St. Marys (CA)
11:23:13.047 | INFO | Flow run 'lavender-asp' - 16. Louisiana Lafayette
11:23:13.048 | INFO | Flow run 'lavender-asp' - 17. Texas
11:23:13.049 | INFO | Flow run 'lavender-asp' - 18. Kansas
11:23:13.052 | INFO | Flow run 'lavender-asp' - 19. UC Santa Barbara
11:23:13.052 | INFO | Flow run 'lavender-asp' - 20. Utah State
11:23:13.052 | INFO | Flow run 'lavender-asp' - 21. Kennesaw State
11:23:13.053 | INFO | Flow run 'lavender-asp' - 22. Montana State
11:23:13.053 | INFO | Flow run 'lavender-asp' - 23. Boise State
11:23:13.054 | INFO | Flow run 'lavender-asp' - 24. Texas A&M
11:23:13.054 | INFO | Flow run 'lavender-asp' - 25. Texas A&M-CC
11:23:13.054 | INFO | Flow run 'lavender-asp' - 26. NC State
11:23:13.055 | INFO | Flow run 'lavender-asp' - 27. Nevada
11:23:13.055 | INFO | Flow run 'lavender-asp' - 28. USC
11:23:13.055 | INFO | Flow run 'lavender-asp' - 29. Kentucky
11:23:13.056 | INFO | Flow run 'lavender-asp' - 30. Pittsburgh
11:23:13.056 | INFO | Flow run 'lavender-asp' - 31. TCU
11:23:13.057 | INFO | Flow run 'lavender-asp' - 32. Maryland
11:23:13.058 | INFO | Flow run 'lavender-asp' - 33. Illinois
11:23:13.058 | INFO | Flow run 'lavender-asp' - 34. Auburn
11:23:13.059 | INFO | Flow run 'lavender-asp' - 35. Iowa State
11:23:13.059 | INFO | Flow run 'lavender-asp' - 36. Fairleigh Dickinson

I will submit these results to an ESPN bracket, provide an update on how the bracket does, and predict the next Sweet 16 and Elite 8 matchups using Prefect’s flow deployment structure! Stay tuned.

Conclusion

All in all, college basketball is fun and chaotic. Filling out brackets is all a guessing game, but attempting to predict using simple math can help. One thing is certain: orchestrating and observing this math on your infrastructure using Prefect is the slam dunk your bracket needs.
