📌 To learn more about how to use Dash for AI/ML Applications in Sports Analytics, view our webinar recorded on April 21st, 2021 with Sebastian, Plotly’s Product Marketing Coordinator
Author: Sebastian Leighton Cooper
As a sports fan, can you imagine this moment?
It’s the bottom of the ninth, two outs, 3–2 count, the batter focuses as he wags his bat over the plate…
Countless hours and pure devotion by the athletes, coaches, and trainers lead up to the unfolding of these epic sports dramas.
Here’s a secret: the real heroes at the end of these contests… are often Data Scientists!
That secret is spreading more every year: if you want the trophy at the end of your season, you must leverage Data Science, Machine Learning, and Artificial Intelligence in your organization’s approach and grow beyond the “eye-test.” The Golden State Warriors, powerhouses of the 2010s after 40 years of futility, built a numbers strategy that is being emulated across the league. The NFL even hosts $100k-prize Kaggle competitions! From attaching harnesses to rugby players to analyze positioning and mitigate injury, to the first-ever Sports Analytics academic major, experts, enthusiasts, and educators alike are learning to use cutting-edge tools to help their teams win and to follow their favorite games.
Plotly’s Dash Enterprise data visualization platform makes it easy to build these tools. We’re arriving at the next frontier for both front-offices and practice floors. As we lean into Moneyball, let’s take a look at 7 apps covering Baseball, Soccer, Formula 1, and Basketball — all built with Dash that demonstrate just how innovative your team’s strategy can become:
1. ⚾ MLB History Explorer
Statistics seem integrated into the (inter)national pastime. All eyes will be fixed on Hyun Jin Ryu of the Toronto Blue Jays and Gerrit Cole of the New York Yankees when they take the mound for the first game of the season — and their hits, strikeouts, and pitch counts. Many major league stadiums still hand out scorecards so fans can even keep track themselves!
Baseball features a mature (and at times dizzying) nomenclature, and its analytics are likely the most mature of any sport. In 2015, Major League Baseball (MLB) installed a state-of-the-art sensor and camera array called “Statcast”, collecting and analyzing a massive amount of data for front offices, broadcasters and fans. While vanguard tools like Statcast quantify the raw skills of players today, let’s take a look at baseball’s statistical history, dating back as far as the early 20th century.
Born of curiosity, this MLB history explorer uses historical statistics retrieved from Sean Lahman’s baseball database. Start by selecting between Team, Batter and Pitcher/Fielding Analysis and choosing a baseball era. Use the sliders to further focus your view dating from the first World Series in 1903 to the COVID-shortened 2020.
(Feel free to hum Take Me Out to the Ballgame to yourself.)
For this example, we’ll select, arguably, the most successful sports dynasty in history, the ‘19-’41 New York Yankees who won 9 titles during the span. Each team’s championships won are listed, followed by their win/loss performance within the set time range.
The team’s batting performance is measured using the team’s Batting Average on Balls in Play (BABIP) and Slugging average (SLG). On the same page, the fielding performance of the selected team is illustrated by stacking together the team’s total Errors and Double Plays. We can see that peaks in their team batting stats correspond strongly (but not perfectly) with their World Series (WS) wins.
The Earned Run Average (ERA) has historically been one of the most common measurements of a team’s pitching staff: it averages the number of earned runs the staff allows per nine innings. The Strikeout-to-walk Ratio (K/BB) shows how many strikeouts the staff records for each walk it issues. Examining these Yankees, it’s no surprise they were so often victorious, with a minuscule team ERA of 3.2 in ’27 and a scorching K/BB ratio of 1.39 in ’32.
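To make those two pitching stats concrete, here is a minimal Python sketch of the standard formulas. The function names and the season totals are hypothetical and are not taken from the app or from the Lahman database.

```python
def earned_run_average(earned_runs: int, innings_pitched: float) -> float:
    """ERA: earned runs allowed per nine innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


def strikeout_to_walk_ratio(strikeouts: int, walks: int) -> float:
    """K/BB: strikeouts recorded for each walk issued."""
    if walks <= 0:
        raise ValueError("walks must be positive to form a ratio")
    return strikeouts / walks


# Hypothetical team season totals, for illustration only.
era = earned_run_average(earned_runs=500, innings_pitched=1400.0)
kbb = strikeout_to_walk_ratio(strikeouts=600, walks=450)
print(f"ERA: {era:.2f}, K/BB: {kbb:.2f}")
```

Lower is better for ERA, higher is better for K/BB.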
While Babe Ruth or Ted Williams aren’t slouches, Jackie Roosevelt Robinson might be the most famous and iconic baseball player to have lived. Let’s look at his on-field contributions because not only was he a trailblazer, he was really good at baseball.
Select an era, then a team, then a player, to view their basic personal profile along with statistics and an individual analysis breaking down their On-Base Percentage (OBP), Slugging Percentage (SLG), and On-base Plus Slugging (OPS). OBP measures how frequently a batter reaches base per plate appearance. SLG represents the total number of bases a player records per at-bat. OPS is the sum of a player’s OBP and SLG, illustrating how well a player can both reach base and hit for power.
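For readers who like to see the arithmetic, here is a rough sketch of the commonly used definitions of these three batting stats. The function names and the stat line are made up for illustration, and sacrifice flies default to zero for eras in which they were not officially recorded.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies=0):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denominator = at_bats + walks + hit_by_pitch + sac_flies
    if denominator == 0:
        raise ValueError("no plate appearances to evaluate")
    return (hits + walks + hit_by_pitch) / denominator


def slugging_percentage(hits, doubles, triples, home_runs, at_bats):
    """SLG = total bases / at-bats."""
    if at_bats == 0:
        raise ValueError("at_bats must be positive")
    singles = hits - doubles - triples - home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


# Hypothetical season line, for illustration only.
obp = on_base_percentage(hits=170, walks=70, hit_by_pitch=5, at_bats=550)
slg = slugging_percentage(hits=170, doubles=30, triples=5, home_runs=15, at_bats=550)
ops = obp + slg  # OPS is simply the sum of the two
print(f"OBP {obp:.3f}  SLG {slg:.3f}  OPS {ops:.3f}")
```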
Dizzy yet? Like we said, Analytics in Baseball is mature. To prevent the graphs from becoming too overwhelming, players’ At-bats were omitted as well as stats that weren’t official during the selected era (like the Sacrifice Fly).
While just about every athlete plays a position, except Designated Hitters (DH), it is important to consider a player’s contributions on the field. The third and final page of this application evaluates a player’s Fielding as well as Pitching (if applicable) ability. Like before, select an era, team, and player to see their defensive statistics. We’ll continue with Jackie Robinson’s tremendous impact at 2B.
For many, analyzing the Fielding percentage (FLD/FPCT) is the only way to determine a player’s value at a given position. This calculates a player’s tendency to make an expected out whenever they field, throw, or receive the ball without making an error. Now pitchers, while their FLD is important, are judged by another slate of robust record-keeping.
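As a quick sketch, fielding percentage is conventionally computed as successful chances (putouts plus assists) divided by total chances (putouts, assists, and errors). The function name and the numbers below are hypothetical.

```python
def fielding_percentage(putouts: int, assists: int, errors: int) -> float:
    """FPCT = (PO + A) / (PO + A + E)."""
    total_chances = putouts + assists + errors
    if total_chances == 0:
        raise ValueError("player had no fielding chances")
    return (putouts + assists) / total_chances


# Hypothetical season at second base, for illustration only.
fpct = fielding_percentage(putouts=300, assists=400, errors=10)
print(f"FPCT: {fpct:.3f}")
```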
For pitchers, let’s examine another titan: Bob Gibson.
For this application, a player is evaluated as a pitcher using Walks And Hits Per Inning Pitched (WHIP), Winning Percentage (WPCT), along with their Earned Run Average (ERA) and Strikeout-to-walk Ratio (K/BB). As the name suggests, WHIP is a modern method of tracking a pitcher’s effectiveness in preventing base runners. Similar to the team pitching evaluation, the app illustrates the selected player’s ERA and K/BB. Gibson’s 1968 season, leading to a World Series appearance and Cy Young award, may very well remain one of the best pitching performances ever!
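The two pitcher-specific stats introduced here are simple ratios. The sketch below uses the usual definitions with hypothetical totals; innings are passed as a true decimal value (so 300 and two-thirds innings would be 300.667, not the box-score shorthand 300.2).

```python
def whip(walks: int, hits: int, innings_pitched: float) -> float:
    """WHIP = (BB + H) / IP: base runners allowed per inning."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return (walks + hits) / innings_pitched


def winning_percentage(wins: int, losses: int) -> float:
    """WPCT = W / (W + L)."""
    decisions = wins + losses
    if decisions == 0:
        raise ValueError("pitcher has no decisions")
    return wins / decisions


# Hypothetical season totals, for illustration only.
season_whip = whip(walks=60, hits=200, innings_pitched=300.0)
season_wpct = winning_percentage(wins=20, losses=10)
print(f"WHIP: {season_whip:.3f}, WPCT: {season_wpct:.3f}")
```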
From the first World Series to today, understanding how a player affects a team’s performance is essential. The MLB History Explorer will allow any nostalgic fan or analyst to discover their favorite team’s past and uncover the players that contributed to the team’s success!
2. ⚽ Match Data Analysis
From pitching to the pitch we go. Soccer is notoriously difficult to analyze because it is such a low-“event” sport compared to, say, baseball (goals are few, balls go out of play often, even stoppage time can seem unspecific), but the availability of tracking data and the development of technical language have revolutionized the way that teams are able to focus on actionable insights. Watching the beautiful game on the telly rarely captures the whole story.
Douglas Hagey’s app analyzes the performance of any team’s collective movements and assists with the assessment of individual players. Dash Enterprise’s Snapshot Engine empowers users to capture, archive, and share point-in-time revelations — no matter where in the match they reveal themselves. Using such analysis (without the hindrance or distortion of simple video feedback), coaches and managers can recreate match action in order to evaluate unique activities including:
- Players’ individual on-ball activity and off-ball positions and movements
- Team formations (on attack and defense)
- Movement of the whole team or of selected groups of players
- Animating this tracking data to better understand game-flow, and more!
Knowing where an opposing team takes the majority of their shots could have a significant impact on your preparation for the match. Tendencies related to set plays, crosses, assists to shots, corner kicks, goal kicks, etc. give a team a distinct edge. The below snapshot diagrams set plays and each player’s action (identified by jersey number).
One particular metric of interest is “progressive passes”. A recent series of articles from the Where Goals Come From Project details how progressive passes are key to scoring goals in a sport where goals are a relative rarity in comparison to other sports. As such, calculating and assessing progressive passes is key to understanding threats coming from the opposition as well as advantages inherent in your own team’s ability to produce these kinds of passes. Additionally, graphing progressive passes allows a team to visually assess whether there are distinct, habitual patterns and actions performed by opposition teams.
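Definitions of a “progressive pass” vary between data providers, so treat the following as one possible sketch rather than the app’s method. It assumes pitch coordinates in meters on a 105 x 68 pitch, an attacking direction toward x = 105, and a made-up threshold: a pass counts as progressive if it moves the ball at least 25% closer to the opponent’s goal.

```python
import math

# Assumption: center of the opponent's goal on a 105 m x 68 m pitch.
GOAL = (105.0, 34.0)


def distance_to_goal(x: float, y: float) -> float:
    return math.hypot(GOAL[0] - x, GOAL[1] - y)


def is_progressive(start, end, min_gain: float = 0.25) -> bool:
    """True if the pass cuts the distance to goal by at least min_gain."""
    before = distance_to_goal(*start)
    after = distance_to_goal(*end)
    if before == 0:
        return False  # already at the goal; nothing left to gain
    return (before - after) / before >= min_gain


# Hypothetical passes as (start, end) coordinates.
passes = [((50, 34), (80, 34)), ((50, 34), (40, 34)), ((50, 34), (55, 34))]
progressive = [p for p in passes if is_progressive(*p)]
print(f"{len(progressive)} of {len(passes)} passes were progressive")
```

Plotting only the passes that survive this filter is one way to look for the habitual patterns described above.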
The use of highly targeted terms like “expected goals” (xG: the probability for any shot to turn into a goal or the cumulative probabilities of such events at the end of each match) and “passes per defensive action” (PPDA: the number of passes a team allows its opponent to make per defensive action, a measure of pressing intensity) has proliferated. They aren’t perfect indicators, but they greatly expand how scientists can help teams to study opposing teams, players, and goalies, assess their own team’s performance, fitness, and adherence to the match plan, and even serve as a tool in player recruitment. Smart teams will capitalize on Data Science’s advancements in this realm.
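Both metrics boil down to small calculations once the inputs exist. The sketch below assumes each shot already has a probability from some xG model (building that model is the hard part and is not shown), and it uses the common PPDA definition of opponent passes divided by defensive actions. All names and numbers are hypothetical.

```python
def match_xg(shot_probabilities) -> float:
    """Cumulative xG: the sum of each shot's scoring probability."""
    if any(not 0.0 <= p <= 1.0 for p in shot_probabilities):
        raise ValueError("each shot probability must be between 0 and 1")
    return sum(shot_probabilities)


def ppda(opponent_passes: int, defensive_actions: int) -> float:
    """Passes allowed per defensive action; lower means heavier pressing."""
    if defensive_actions <= 0:
        raise ValueError("defensive_actions must be positive")
    return opponent_passes / defensive_actions


# Hypothetical match data, for illustration only.
team_xg = match_xg([0.10, 0.30, 0.05])
team_ppda = ppda(opponent_passes=300, defensive_actions=30)
print(f"xG: {team_xg:.2f}, PPDA: {team_ppda:.1f}")
```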
The selected data, courtesy of Metrica Sports, only scratches the surface of just how revolutionary this data could become.
For a larger sample of what apps like this can do, check out Doug Hagey’s Twitter for glimpses of how access to larger pools of data can be focused into serious advantages!
3. 🏎️ NASCAR Spoiler Design Optimization
Tech used: AeroBox
Link to app: https://dash-gallery.plotly.host/dash-airfoil-design/
Reach out for the source code!
The history of airfoils is an interesting one. In 1967, Lotus debuted the Lotus 49 in Formula One. In 1968, it became the first F1 car to use aerofoil wings to increase its traction navigating hairpin turns. In planes, airfoils add lift and reduce drag, producing flight! In cars, it’s actually the reverse: at high speeds, the added drag and downforce are traded against pure speed to increase the tires’ grip on the road.
The future of airfoils is bright and getting faster! Peter Sharpe studies aircraft design, multidisciplinary design optimization (MDO), and applied aerodynamics in MIT’s Department of Aeronautics and Astronautics. He made this Dash app to help MIT student racing teams optimize their vehicle performance:
“I’m envisioning an app that lets automotive engineers interactively design spoiler airfoils to maximize downforce during turns. I occasionally advise automotive student teams here at MIT about aerodynamics, and questions about this topic always seem to come up.”
In case you’re not, yourself, a PhD in Automotive Physics and Design, below are some helpful hints to navigate this app:
Toggle the first salmon-colored modifier button on the left side to find a dropdown menu with interactive sliders.
- Angle of Attack rotates the airfoil relative to the oncoming air. A negative value here generates downforce like a car’s spoiler and positive values generate lift like an airplane’s wing.
- Height adjusts the vertical location of the airfoil compared to the ground.
- Ground Effect modifies the aerodynamics engine to treat the bottom boundary of the flow field as a “wall”. As an airfoil approaches this wall, lift can be massively altered — this effect is key in automotive aerodynamics.
- Streamline Density modifies the number of streamlines that are drawn. (As density increases, calculations become more complex, so be patient as the callback renders on the screen.)
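To picture what the Angle of Attack and Height sliders do geometrically, here is a small sketch that rotates and lifts a set of airfoil coordinates. It is not the app’s code: it assumes the air flows in the +x direction, the airfoil is given as (x, y) points with the leading edge at the origin, and the rotation pivots on the leading edge.

```python
import math


def place_airfoil(points, angle_of_attack_deg: float, height: float):
    """Rotate points about the leading edge, then shift them up by height.

    A positive angle tilts the nose up relative to the tail (lift);
    a negative angle tilts the nose down (downforce).
    """
    a = math.radians(angle_of_attack_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    placed = []
    for x, y in points:
        # Clockwise rotation by the angle of attack.
        x_rot = x * cos_a + y * sin_a
        y_rot = -x * sin_a + y * cos_a
        placed.append((x_rot, y_rot + height))
    return placed


# A flat-plate stand-in for an airfoil: leading edge, then trailing edge.
chord_line = [(0.0, 0.0), (1.0, 0.0)]
spoiler = place_airfoil(chord_line, angle_of_attack_deg=-10, height=0.3)
print(spoiler)
```

With a negative angle, the trailing edge ends up higher than the leading edge, which is the spoiler-like orientation described above.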
Next, we can also directly modify the shape of the airfoil using Kulfan (CST) parameters (“class function/shape function transformation”: a popular parameterization method in which a design’s physical features and geometries can be represented by analytically smooth and consistent mathematical functions). An app like this cannot account for the full possibility of all design choices, so it defaults to three degrees of freedom for each side. However, adding back the complexity here actually requires changes to just one line of code.
Approximately speaking, on both the top surface and the bottom, the three Parameters correspond to: nose curvature, middle thickness, and trailing edge angle (in descending order). Adjust these levels and watch the shape reform itself.
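For the curious, here is a minimal sketch of how a CST surface can be evaluated, assuming the usual class-function exponents for a round-nosed, sharp-tailed airfoil (0.5 and 1.0) and Bernstein polynomials as the shape function. The weights are invented for illustration, and this is not the app’s implementation.

```python
from math import comb


def cst_surface(x: float, weights, n1: float = 0.5, n2: float = 1.0) -> float:
    """Height of one airfoil surface at chordwise position x in [0, 1]."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 (nose) and 1 (trailing edge)")
    order = len(weights) - 1
    # Class function: fixes the general family of shapes (round nose, sharp tail).
    class_fn = x**n1 * (1 - x) ** n2
    # Shape function: Bernstein polynomials blended by the design weights.
    shape_fn = sum(
        w * comb(order, i) * x**i * (1 - x) ** (order - i)
        for i, w in enumerate(weights)
    )
    return class_fn * shape_fn


# Three hypothetical weights per side: roughly nose, middle, trailing edge.
upper = [0.2, 0.2, 0.2]
lower = [-0.1, -0.1, -0.1]
xs = [i / 10 for i in range(11)]
thickness = [cst_surface(x, upper) - cst_surface(x, lower) for x in xs]
print([round(t, 4) for t in thickness])
```

Adding degrees of freedom amounts to passing a longer list of weights, which fits the remark that the extra complexity is a small code change.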
Finally, a text output of the airfoil’s Raw Coordinates (using the *.dat file format convention that is universally used in aerospace design) is provided to the user, so that they can take airfoils they’ve designed in the app and use them in other applications.
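As a rough illustration of what such a coordinate file looks like, the sketch below writes points in the widely used Selig-style ordering: a name line, then x y pairs running from the trailing edge along the upper surface to the nose and back along the lower surface. Whether the app emits exactly this ordering is an assumption, and the function name and points are hypothetical.

```python
def to_dat(name: str, upper, lower) -> str:
    """Format surfaces as .dat text.

    upper and lower are lists of (x, y) ordered from leading edge to
    trailing edge; both are expected to start at the same nose point.
    """
    # Upper surface reversed (tail to nose), then lower surface without
    # repeating the shared nose point.
    points = list(reversed(upper)) + list(lower[1:])
    lines = [name] + [f"{x:.6f} {y:.6f}" for x, y in points]
    return "\n".join(lines) + "\n"


# Tiny hypothetical airfoil with three points per surface.
upper_pts = [(0.0, 0.0), (0.5, 0.06), (1.0, 0.0)]
lower_pts = [(0.0, 0.0), (0.5, -0.04), (1.0, 0.0)]
dat_text = to_dat("demo-airfoil", upper_pts, lower_pts)
print(dat_text)
```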
Putting all these tools together, we can create everything from the airplane airfoils shown above to automotive airfoils in the screenshot below. Throughout the entire design process, the engineering figure of merit (lift coefficient, or equivalently, downforce) is given in the bottom left!
Next, after all those numbers have been crunched, calculations made, and design choices executed, let’s examine what happens when the rubber meets the road…
4. 🏎️ Formula 1 Stats Explorer
I have to admit, Lewis Hamilton is my favorite Formula 1 driver. He is the first black driver to race in the sport. He’s cool. He’s suave. He’s even got nice hair. But is he good at racing?
Actually, Hamilton is arguably the greatest! Let’s see his and his all-time competitors’ stats with Chris Jeon’s awesome app using Dash Enterprise’s Design Kit!
Parsing data from the Ergast Developer API, this web application easily reproduces thousands of race results, driver and constructor rankings, up-to-date timetables, circuit layouts, and comparisons of Formula 1 seasons from 1950 to present. Ergast’s vast database makes it easy for motorsport fans to visualize and interact with F1 data.
This app goes full throttle and conveniently assembles information not only spanning decades but also scraping current news feeds as they update today!
Ever wondered how each driver accumulated points for each race in a given year? The Seasons page visualizes the points progression of any driver in any given year using Dash and a line chart produced with Plotly’s graphing library.
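The points progression behind a chart like this is just a running total. Here is a minimal sketch using only the standard library; the driver’s race-by-race points are invented and do not come from the Ergast API.

```python
from itertools import accumulate


def points_progression(points_per_race):
    """Running total of championship points after each race."""
    return list(accumulate(points_per_race))


# Hypothetical points scored in the first four races of a season.
race_points = [25, 18, 0, 25]
progression = points_progression(race_points)
print(progression)
```

The resulting list can be passed as the y-values of a line chart, with the race number or race name on the x-axis.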
With Ergast’s vast database, you’re also able to access the biography and race results of every single Formula 1 driver that has competed in the motorsport since 1950. This page takes advantage of Dash’s DataTable components as well as information from Wikipedia.
While Formula 1 may seem like an individual sport, there is a team/constructor aspect to it as well. This page shows the percentage of points earned by each team in any given year.
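Computing each constructor’s share is a one-liner once season totals are in hand. The sketch below uses made-up team names and point totals.

```python
def points_share(points_by_team: dict) -> dict:
    """Percentage of all points in a season earned by each team."""
    total = sum(points_by_team.values())
    if total == 0:
        raise ValueError("no points were scored")
    return {team: 100 * pts / total for team, pts in points_by_team.items()}


# Hypothetical season totals, for illustration only.
season = {"Team A": 300, "Team B": 150, "Team C": 50}
shares = points_share(season)
for team, pct in shares.items():
    print(f"{team}: {pct:.1f}%")
```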
Formula 1 offers a diverse range of circuits located all across the globe. Until you’re lucky enough to visit each one, we can access a bird’s-eye view. On this page you will be able to see who has won the most, who secured the most pole positions, and who has the fastest lap time in history at each circuit.
Now, Lewis Hamilton may be fast… But I think this crew might even be faster! Honestly, everything about Formula One racing impresses me. And we can conveniently collect that information all in one place! Let’s speed on to our first of three basketball apps!
5. 🏀 Shot Chart Explorer
One of the defining trends in the last decade of the NBA has been its growing emphasis on efficiency. It saw the rise of Moreyball and shot charts becoming fashionable parts of mainstream basketball coverage. The Shot Chart Explorer app maps the tendencies, strengths, and weaknesses of the league and its teams, and adds context utilizing AI.
The example below breaks down Brooklyn’s offense to separate its tactic-based (shot location) and execution-based (shotmaking) prowess. It shows Brooklyn’s outstanding shotmaking talent despite a 10th-best location-based efficiency.
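One simple way to think about that split, sketched here under stated assumptions rather than as the app’s actual model: value each shot at the league-average points for its zone to get a location-based efficiency, then treat whatever the team scores above that as shotmaking. The zones, league averages, and team numbers are all hypothetical.

```python
# Assumption: league-average points per shot in three coarse zones.
LEAGUE_POINTS_PER_SHOT = {"rim": 1.25, "midrange": 0.80, "three": 1.08}


def location_efficiency(attempts_by_zone: dict) -> float:
    """Expected points per shot if the team shot at league-average accuracy."""
    total = sum(attempts_by_zone.values())
    if total == 0:
        raise ValueError("no shot attempts")
    expected = sum(
        n * LEAGUE_POINTS_PER_SHOT[zone] for zone, n in attempts_by_zone.items()
    )
    return expected / total


def shotmaking_edge(points_scored: float, attempts_by_zone: dict) -> float:
    """Actual points per shot minus the location-based expectation."""
    total = sum(attempts_by_zone.values())
    return points_scored / total - location_efficiency(attempts_by_zone)


# Hypothetical team: 100 shots producing 115 points.
team_attempts = {"rim": 30, "midrange": 20, "three": 50}
expected_pps = location_efficiency(team_attempts)
edge = shotmaking_edge(115, team_attempts)
print(f"Location-based: {expected_pps:.3f} pts/shot, shotmaking edge: {edge:+.3f}")
```

A team with average shot selection but a large positive edge would look like the Brooklyn example: good, not great, locations paired with excellent execution.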
The app also identifies teams’ similarities based on location or accuracy profiles. Minnesota may be a dead ringer for Brooklyn for its shot locations, but they have a dramatically different accuracy profile, as confirmed by the shot charts.
Another look at the shot efficiencies chart above confirms that Minnesota lies about “fifty feet of crap” below Brooklyn, despite making similar shot location choices.
For those looking to understand impacts of situational variables such as number of dribbles or defender distance, a sensitivity analysis lays out an overview for each variable, and how it impacts the shot distance as well as accuracy, for each team, as well as for the NBA as a whole. Does the team get much of its points from catch & shoots? Are they, on average, more open than other teams? You can see it here:
And finally, filtered, in-depth shot charts allow the analyst to get into the weeds of any data subset they like.
This data includes all regular season shots from 2016–17. Follow the links to see what you can find about your team, or configure the constraints to test your basketball hypotheses!
6. 🏀 NBA Player Performance and Scouting Explorer
In just a single season, the league’s player list is 500-strong. Looking over its history, that number swells to over 4,500. Given all that, it’s no surprise to see endless debates about who player X reminds us of, and whether player Y is more like a center or a power forward. As for the scouts, evaluating and projecting players is not any easier.
That’s where the NBA Player Performance and Scouting Explorer app comes in. It aims to provide an unbiased, structured, interactive way to trawl through what can be an overwhelming list of players and metrics.
The Player Finder is designed to help place a player in the context of the league and its history. The Player Finder arranges all players in the database according to a combination of physical and statistical similarities. Select your favorites, and adjust the weighting of physical or statistical similarities to see what happens.
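To illustrate how a weighting slider like this could work, here is a hedged sketch (not the app’s algorithm) that blends a physical distance and a statistical distance between players. It assumes every feature has already been scaled to the 0 to 1 range; the players and values are invented.

```python
import math


def blended_distance(phys_a, phys_b, stats_a, stats_b, physical_weight=0.5):
    """Weighted mix of physical and statistical Euclidean distances."""
    if not 0.0 <= physical_weight <= 1.0:
        raise ValueError("physical_weight must be between 0 and 1")
    physical = math.dist(phys_a, phys_b)
    statistical = math.dist(stats_a, stats_b)
    return physical_weight * physical + (1 - physical_weight) * statistical


def most_similar(target, players, physical_weight):
    """Name of the player closest to target under the blended distance."""
    phys_t, stats_t = players[target]
    others = (name for name in players if name != target)
    return min(
        others,
        key=lambda name: blended_distance(
            phys_t, players[name][0], stats_t, players[name][1], physical_weight
        ),
    )


# name -> ((height, weight), (three-point rate, assist rate)), all scaled 0-1.
players = {
    "A": ((0.8, 0.7), (0.2, 0.9)),
    "B": ((0.8, 0.7), (0.9, 0.1)),  # same build as A, different game
    "C": ((0.2, 0.3), (0.2, 0.9)),  # different build, same game as A
}
print(most_similar("A", players, physical_weight=1.0))
print(most_similar("A", players, physical_weight=0.0))
```

Sliding the weight from one extreme to the other flips which player counts as the closest match.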
That’s just the start of what this app can do. Frustrated by the limitations of grouping players by one of five traditional positions? Try the player grouping tool as demonstrated below.
This tool uses machine learning to cluster players by similar attributes and/or statistics, into any number of groups. In the above example, players are grouped by what might be typical metrics for point guards. Once the groups are derived, you can explore each group’s details further by seeing what positions make up the group, and how their statistics compare against the rest of the league.
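The article does not say which clustering algorithm the app uses, so the following is only an illustrative sketch using k-means, one common choice. It is written with the standard library to stay self-contained, seeds the centroids with the first k players for repeatability, and runs on invented (three-point attempt rate, assists per game) pairs.

```python
import math


def kmeans(points, k: int, iterations: int = 20):
    """Tiny k-means: returns (labels, centroids)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]  # deterministic seeding
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centroids


# Hypothetical guards: similar three-point rates, very different assists.
guards = [(0.40, 9.0), (0.40, 2.0), (0.45, 8.0), (0.42, 2.5)]
labels, centroids = kmeans(guards, k=2)
print(labels)
```

In practice the features would be scaled first so that a large-valued stat such as assists does not dominate the distance.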
Here, the AI has identified two clearly different groups (1 & 3). Even though their three point attempt rates are similar, their assist numbers are clearly differentiating factors. Here is group 1:
And now group 3:
And each group’s physical attributes can be put into the context of the league as a whole (see below: left: group 1, right: group 3), and any physical outliers identified. (You may notice a certain 221cm man who likes to hoist up three-pointers from Texas.)
This app can help to take player evaluations, comparisons and projections to the next level through its clever groupings and structured, unbiased outputs.
7. 🏀 Player Video Computer Vision Analysis
Our last app… returns in earnest to the “eye-test”. Fans can see a lot watching a game on TV. On the court, basketball is a fast game. Even for seasoned scouts and evaluators, it’s not easy to spot differences when limbs are flying on the hardwood, let alone to quantify minute changes in kinetic motion.
The Player Video Computer Vision Analysis app solves that problem by providing pose analysis outputs for any video. Take a look at the clip below, where the app analyzes this YouTube video in real-time and adds landmarks to the figure of two-time MVP Steve Nash’s free throw form.
The outputs can be quantified, displayed, and analyzed, to learn how the best do what they do, or to identify your own players’ potential sources for improvement.
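One way pose landmarks can be turned into a number, sketched here with hypothetical 2D coordinates rather than the app’s actual output: compute the angle at a joint (say, the elbow) from three landmark positions (shoulder, elbow, wrist).

```python
import math


def joint_angle(a, b, c) -> float:
    """Angle in degrees at point b, formed by the segments b-a and b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("landmarks must not coincide with the joint")
    cosine = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    # Clamp to guard against tiny floating-point overshoot.
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


# Hypothetical (x, y) landmarks for one video frame.
shoulder, elbow, wrist = (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)
elbow_angle = joint_angle(shoulder, elbow, wrist)
print(f"Elbow angle: {elbow_angle:.1f} degrees")
```

Tracking an angle like this frame by frame is one way to compare a player’s release against a reference shooter.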
If you’re looking to identify the perfect free throw form from another two-time league MVP, see how we can map Elena Delle Donne’s form.
Steph Curry is quantifiably the best three-point shooter in NBA history. Sometimes, the “eye-test” doesn’t lie. Case-in-point: this legendary practice session.
The same app can easily be adapted to utilize videos from a webcam, or to compare two videos. The real-time nature of these outputs makes it ideal for training sessions, or to see if a player’s in-game mechanics start to falter under pressure or fatigue.
Maybe it could even be used to help quantify degrees of difficulty in dunk contests. (Aaron Gordon was robbed, for the record.)
Click the above links to see analysis in action! Or enter a YouTube URL on the app to track your favorite player’s mechanics!
1% of humans can dunk a basketball. 1% of humans can race a car at 200 mph. 1% can throw a 95 mph fastball. 1% can hit that fastball 425 ft for a homerun. You and I may only be able to imagine these feats of athletic prowess. Yet it is the power of our mind’s eye that allows us to create tools like the ones we just examined!
Sports Analytics combined with Data Visualization and powered by Plotly’s Dash Enterprise takes what our coaches, players, trainers, and executives can imagine, and transforms it into science. That science can help us achieve victory!
Maybe one day you’ll be able to visualize yourself lifting your team’s championship trophy high above your head for the world to see.