Create an Animated Bar Chart Race using Python and Matplotlib

Ben Ballard
4 min read · Jul 5, 2024

In this tutorial, we’ll walk through the process of creating an animated bar chart race using Python, Pandas, and Matplotlib. We’ll use NBA All-Star appearance data as an example, but you can apply this technique to any time-series data where you want to show rankings changing over time.

Prerequisites

Before we begin, make sure you have the following libraries installed: pandas, matplotlib, and numpy. You can install them using pip:

pip install pandas matplotlib numpy

Step 1: Import Libraries and Load Data

First, let’s import the necessary libraries and load our data. If you want to use the same NBA All-Star data, the code to pull it is at the bottom of the article.

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import matplotlib.animation as animation
import numpy as np

# Read the data
df = pd.read_csv('all_star_mvp_awards.csv')

Step 2: Prepare the Data

Next, we’ll filter our data for All-Star appearances and prepare it for visualization:

# Filter for All-Star appearances and prepare the data
all_star_df = df[df['DESCRIPTION'] == 'NBA All-Star'].copy()
all_star_df['SEASON'] = all_star_df['SEASON'].str[:4].astype(int) # Extract year from season

# Count appearances for each player by season
player_counts = all_star_df.groupby(['SEASON', 'PLAYER_NAME']).size().unstack(fill_value=0)

# Create a complete range of years and reindex
all_years = range(player_counts.index.min(), player_counts.index.max() + 1)
player_counts = player_counts.reindex(all_years, fill_value=0)
cumulative_counts = player_counts.cumsum()
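To see what each step produces, here is a minimal sketch on a made-up table. The player names and seasons are hypothetical, and the sketch assumes SEASON is stored as a string such as "2000-01", which is what the `.str[:4]` call above implies.

```python
import pandas as pd

# Hypothetical toy rows in the same shape as the filtered All-Star data
toy = pd.DataFrame({
    "SEASON": ["2000-01", "2000-01", "2001-02", "2003-04", "2003-04"],
    "PLAYER_NAME": ["Player A", "Player B", "Player A", "Player A", "Player B"],
})
toy["SEASON"] = toy["SEASON"].str[:4].astype(int)  # "2000-01" -> 2000

# One row per season, one column per player, counting appearances
counts = toy.groupby(["SEASON", "PLAYER_NAME"]).size().unstack(fill_value=0)

# Add the seasons that have no rows (2002 here) so every year gets a frame
years = range(counts.index.min(), counts.index.max() + 1)
counts = counts.reindex(years, fill_value=0)

# Running total of appearances per player
cumulative = counts.cumsum()
print(cumulative)
```

The reindex step matters because a year with no rows would otherwise be missing from the index, and looking it up with `.loc[year]` would raise a KeyError.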

Step 3: Set Up the Color Scheme

We’ll use a color scheme to make our chart visually appealing:

# Create a colormap
colors = plt.cm.viridis(np.linspace(0, 1, 20))
color_dict = {player: colors[i % len(colors)] for i, player in enumerate(cumulative_counts.columns)}
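One thing to be aware of: the palette has 20 entries and the modulo wraps around, so once there are more than 20 players some of them share a color. A small sketch with hypothetical player names shows the effect:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical player names, more than the 20 colors in the palette
names = [f"Player {i}" for i in range(25)]

palette = plt.cm.viridis(np.linspace(0, 1, 20))  # 20 RGBA rows
color_dict = {name: palette[i % len(palette)] for i, name in enumerate(names)}

# Players 0 and 20 land on the same palette entry because 20 % 20 == 0
shared = np.array_equal(color_dict["Player 0"], color_dict["Player 20"])
print(shared)
```

Since only ten bars are visible at a time, repeats are rarely noticeable in practice, but two players on screen together can end up with identical colors.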

Step 4: Create the Update Function

The heart of our animation is the update function, which will be called for each frame:

def update(year):
    # Start each frame from a blank axes
    ax.clear()
    # Top 10 players by cumulative appearances, sorted so the largest bar is drawn at the top
    dff = cumulative_counts.loc[year].nlargest(10).sort_values(ascending=True)
    ax.barh(dff.index, dff.values, color=[color_dict[x] for x in dff.index], edgecolor='black')

    # Small horizontal offset between the end of each bar and its labels
    dx = dff.max() / 200

    for i, (value, name) in enumerate(zip(dff.values, dff.index)):
        ax.text(value-dx, i, name, size=14, weight=600, ha='right', va='bottom')
        ax.text(value+dx, i, f'{value:,.0f}', size=14, ha='left', va='center')

    ax.text(1, 0.4, str(year), transform=ax.transAxes, color='#777777', size=46, ha='right', weight=800)
    ax.text(0, 1.06, 'All-Star Appearances', transform=ax.transAxes, size=12, color='#777777')
    ax.xaxis.set_major_formatter(ticker.StrMethodFormatter('{x:,.0f}'))
    ax.xaxis.set_ticks_position('top')
    ax.tick_params(axis='x', colors='#777777', labelsize=12)
    ax.set_yticks([])
    ax.margins(0, 0.01)
    ax.grid(which='major', axis='x', linestyle='-', alpha=0.2)
    ax.set_axisbelow(True)
    ax.text(0, 1.15, 'NBA Players with Most All-Star Appearances (1951-2024)',
            transform=ax.transAxes, size=24, weight=600, ha='left', va='top')
    ax.text(1, 0, 'By: Your Name', transform=ax.transAxes, ha='right',
            bbox=dict(facecolor='white', alpha=0.8, edgecolor='white'))
    plt.box(False)

    # ax.clear() resets the axis limits, so pin the x-axis to the overall maximum on every frame
    max_value = cumulative_counts.max().max()
    ax.set_xlim(0, max_value * 1.1)
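Before rendering the whole animation, it can help to draw a single frame and check the ranking logic. The sketch below is a stripped-down stand-in for the update function, not a replacement for it: `draw_frame`, `demo_counts`, and the output file name are hypothetical, and the styling calls are left out.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical cumulative totals: rows are years, columns are players
demo_counts = pd.DataFrame(
    {"Player A": [1, 2, 3], "Player B": [1, 1, 2], "Player C": [0, 1, 1]},
    index=[2000, 2001, 2002],
)

def draw_frame(ax, data, year, top_n=2):
    """Draw the top_n players for one year, with the largest bar at the top."""
    ax.clear()
    top = data.loc[year].nlargest(top_n).sort_values(ascending=True)
    ax.barh(top.index, top.values, edgecolor="black")
    ax.set_xlim(0, data.max().max() * 1.1)  # same scale on every frame
    return top

fig, ax = plt.subplots(figsize=(6, 3))
top = draw_frame(ax, demo_counts, 2002)

# Save a still image of the frame for a quick visual check
preview_path = os.path.join(tempfile.gettempdir(), "frame_preview.png")
fig.savefig(preview_path)
plt.close(fig)
```

Sorting in ascending order after `nlargest` is what puts the leader at the top, because `barh` draws the first row at the bottom of the axes.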

Step 5: Create and Save the Animation

Finally, we’ll create the animation and save it as a GIF:

fig, ax = plt.subplots(figsize=(15, 8))

# Get the range of years actually present in the data
year_range = range(cumulative_counts.index.min(), cumulative_counts.index.max() + 1)
animator = animation.FuncAnimation(fig, update, frames=year_range, repeat=False)
animator.save('nba_allstar_appearances_race.gif', writer='pillow', fps=2)
plt.close()
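If you would rather have a video file, Matplotlib can also write MP4 through ffmpeg when it is installed. The following is a minimal sketch on a tiny hypothetical dataset that falls back to the GIF writer used above when ffmpeg is not available; the file names and values are placeholders.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import matplotlib.animation as animation

# Hypothetical data: one bar whose length grows each year
values = {2000: 1, 2001: 2, 2002: 3}

fig, ax = plt.subplots()

def update(year):
    ax.clear()
    ax.barh(["Player A"], [values[year]])
    ax.set_xlim(0, 4)

anim = animation.FuncAnimation(fig, update, frames=list(values), repeat=False)

out_dir = tempfile.mkdtemp()
if animation.writers.is_available("ffmpeg"):
    # MP4 output needs the ffmpeg executable on the PATH
    out_path = os.path.join(out_dir, "race.mp4")
    anim.save(out_path, writer="ffmpeg", fps=2)
else:
    # Fall back to the Pillow-based GIF writer
    out_path = os.path.join(out_dir, "race.gif")
    anim.save(out_path, writer="pillow", fps=2)
plt.close(fig)
```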

Explanation of Key Components

  1. Data Preparation: We group the data by season and player, then create a cumulative sum to show total appearances over time.
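  2. Color Scheme: We sample 20 colors from the ‘viridis’ colormap and assign them to players in turn. With more than 20 players the colors repeat, so they help tell bars apart rather than uniquely identify each player.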
  3. Update Function: This function clears the previous frame, plots the new data, and adds text labels and styling. It’s called for each year in our dataset.
  4. Animation: We use Matplotlib’s FuncAnimation to create the animation, calling our update function for each year.
  5. Saving: We save the animation as a GIF using the ‘pillow’ writer.

Customization

You can customize this chart in many ways:

  • Change the color scheme by using a different colormap
  • Adjust the number of players shown by modifying the nlargest() value
  • Change the animation speed by adjusting the fps parameter when saving
  • Modify text sizes, positions, and content to suit your needs
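For the animation speed, the relationship is simply duration = frames / fps. As a rough worked example, assuming one frame per year for 1951 through 2024 (the range in the chart title), the helper below, which is a hypothetical name, picks an fps for a target length:

```python
import math

def fps_for_duration(n_frames, target_seconds):
    """Whole-number fps so that n_frames play in at most target_seconds."""
    return max(1, math.ceil(n_frames / target_seconds))

n_frames = len(range(1951, 2024 + 1))                 # 74 frames, one per year
seconds_at_2_fps = n_frames / 2                       # 37.0 seconds at fps=2
fps_for_20_seconds = fps_for_duration(n_frames, 20)   # ceil(74 / 20) = 4

print(n_frames, seconds_at_2_fps, fps_for_20_seconds)
```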

Getting the All-Star Data

To recreate the animated plot, or just to explore the NBA All-Star data, you can pull it in Python with the nba_api package (pip install nba_api):

import time
import pandas as pd
from nba_api.stats.endpoints import playerawards
from nba_api.stats.static import players
from requests.exceptions import ReadTimeout

# Fetch all players
all_players = players.get_players()

# Function to get player awards with retry logic
def get_player_awards(player_id, retries=3, delay=5):
    for attempt in range(retries):
        try:
            awards = playerawards.PlayerAwards(player_id=player_id)
            awards_df = awards.get_data_frames()[0]
            return awards_df
        except ReadTimeout:
            if attempt < retries - 1:
                time.sleep(delay)
                delay *= 2
            else:
                raise

# Initialize a DataFrame to store all awards
all_awards_df = pd.DataFrame()

# Loop through all players and get awards
for player in all_players:
    player_id = player['id']
    try:
        player_awards_df = get_player_awards(player_id)
        player_awards_df['PLAYER_NAME'] = player['full_name']
        all_awards_df = pd.concat([all_awards_df, player_awards_df], ignore_index=True)
    except ReadTimeout:
        print(f"Timeout error for player {player['full_name']} (ID: {player_id}). Skipping.")

# Filter for All-Star and MVP awards
all_star_mvp_awards = all_awards_df[all_awards_df['DESCRIPTION'].str.contains('All-Star|MVP', case=False, na=False)]

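# Display the first few rows
all_star_mvp_awards.head()

# Save to the CSV file that Step 1 reads
all_star_mvp_awards.to_csv('all_star_mvp_awards.csv', index=False)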
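The retry pattern above can be tried without touching the network. In this sketch, `fetch_with_retry` and `flaky_fetch` are hypothetical stand-ins: the fake fetcher raises the built-in TimeoutError twice before succeeding, and the delay is shortened so the example runs quickly. It also collects the per-player frames in a list and concatenates once at the end, which is one possible alternative to growing a DataFrame inside the loop.

```python
import time
import pandas as pd

def fetch_with_retry(fetch, retries=3, delay=0.01, retry_on=(TimeoutError,)):
    """Call fetch(), retrying with a doubling delay; re-raise after the last attempt."""
    for attempt in range(retries):
        try:
            return fetch()
        except retry_on:
            if attempt == retries - 1:
                raise
            time.sleep(delay)
            delay *= 2

calls = {"n": 0}

def flaky_fetch():
    """Hypothetical stand-in for an API call: fails twice, then returns one row."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return pd.DataFrame({"DESCRIPTION": ["NBA All-Star"], "SEASON": ["2000-01"]})

frames = []
for name in ["Player A", "Player B"]:
    one_player = fetch_with_retry(flaky_fetch)
    one_player["PLAYER_NAME"] = name
    frames.append(one_player)

# Concatenate once at the end instead of on every iteration
all_awards = pd.concat(frames, ignore_index=True)
print(all_awards)
```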

Conclusion

This tutorial showed you how to create an animated bar chart race using Python and Matplotlib. This technique can be applied to various datasets to create engaging, dynamic visualizations of changing rankings over time.

Remember to always ensure your data is clean and properly formatted, and don’t be afraid to experiment with different visual styles to find what works best for your specific dataset and audience.

Happy coding and animating!

Ben Ballard

Here for Data Science and Machine Learning. MS Data Science @UVA | Boomer | Mavs | Working on NBAanalytics.com.