Create an Animated Bar Chart Race using Python and Matplotlib
In this tutorial, we’ll walk through the process of creating an animated bar chart race using Python, Pandas, and Matplotlib. We’ll use NBA All-Star appearance data as an example, but you can apply this technique to any time-series data where you want to show rankings changing over time.
Prerequisites
Before we begin, make sure you have the following libraries installed: pandas, matplotlib, and numpy. You can install these using pip:
pip install pandas matplotlib numpy
Step 1: Import Libraries and Load Data
First, let’s import the necessary libraries and load our data. If you want to use the same NBA All-Star data, the code to pull it is at the bottom of the article.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import matplotlib.animation as animation
import numpy as np
# Read the data
df = pd.read_csv('all_star_mvp_awards.csv')
Step 2: Prepare the Data
Next, we’ll filter our data for All-Star appearances and prepare it for visualization:
# Filter for All-Star appearances and prepare the data
all_star_df = df[df['DESCRIPTION'] == 'NBA All-Star'].copy()
all_star_df['SEASON'] = all_star_df['SEASON'].str[:4].astype(int) # Extract year from season
# Count appearances for each player by season
player_counts = all_star_df.groupby(['SEASON', 'PLAYER_NAME']).size().unstack(fill_value=0)
# Create a complete range of years and reindex
all_years = range(player_counts.index.min(), player_counts.index.max() + 1)
player_counts = player_counts.reindex(all_years, fill_value=0)
cumulative_counts = player_counts.cumsum()
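To see what these steps produce, here is a minimal sketch that runs the same pipeline on a tiny, made-up dataset. The player names and seasons are hypothetical; only the column names match the ones used above.

```python
import pandas as pd

# Hypothetical toy data with the same columns the tutorial relies on
toy = pd.DataFrame({
    'SEASON': ['2000-01', '2000-01', '2002-03', '2002-03', '2003-04'],
    'PLAYER_NAME': ['Player A', 'Player B', 'Player A', 'Player C', 'Player A'],
    'DESCRIPTION': ['NBA All-Star'] * 5,
})

toy_df = toy[toy['DESCRIPTION'] == 'NBA All-Star'].copy()
toy_df['SEASON'] = toy_df['SEASON'].str[:4].astype(int)  # '2000-01' -> 2000

# One row per season, one column per player, 1 where the player appeared
counts = toy_df.groupby(['SEASON', 'PLAYER_NAME']).size().unstack(fill_value=0)

# Fill in seasons with no rows (2001 here) so the animation has no gaps
years = range(counts.index.min(), counts.index.max() + 1)
counts = counts.reindex(years, fill_value=0)

toy_cumulative = counts.cumsum()
print(toy_cumulative)
```

The 2001 row has no appearances of its own, so after the cumulative sum it simply carries forward the 2000 totals. That is what keeps the bars steady in years with no data.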
Step 3: Set Up the Color Scheme
We’ll use a color scheme to make our chart visually appealing:
# Create a colormap
colors = plt.cm.viridis(np.linspace(0, 1, 20))
color_dict = {player: colors[i % len(colors)] for i, player in enumerate(cumulative_counts.columns)}
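One detail worth knowing: because the index is taken modulo the number of colors, the palette wraps around once there are more than 20 players. The sketch below illustrates this with hypothetical player names.

```python
import numpy as np
import matplotlib.pyplot as plt

# 20 evenly spaced samples from viridis, each an RGBA row
colors = plt.cm.viridis(np.linspace(0, 1, 20))

# Hypothetical names standing in for cumulative_counts.columns
names = [f'Player {i}' for i in range(25)]
color_dict = {name: colors[i % len(colors)] for i, name in enumerate(names)}

# The 21st player wraps around and reuses the first color
same = np.array_equal(color_dict['Player 0'], color_dict['Player 20'])
print(colors.shape, same)
```

If repeated colors become a problem, one option is to sample more colors from the colormap, at the cost of neighboring shades looking more alike.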
Step 4: Create the Update Function
The heart of our animation is the update function, which will be called for each frame:
def update(year):
    ax.clear()
    dff = cumulative_counts.loc[year].nlargest(10).sort_values(ascending=True)
    ax.barh(dff.index, dff.values, color=[color_dict[x] for x in dff.index], edgecolor='black')
    dx = dff.max() / 200
    for i, (value, name) in enumerate(zip(dff.values, dff.index)):
        ax.text(value-dx, i, name, size=14, weight=600, ha='right', va='bottom')
        ax.text(value+dx, i, f'{value:,.0f}', size=14, ha='left', va='center')
    ax.text(1, 0.4, str(year), transform=ax.transAxes, color='#777777', size=46, ha='right', weight=800)
    ax.text(0, 1.06, 'All-Star Appearances', transform=ax.transAxes, size=12, color='#777777')
    ax.xaxis.set_major_formatter(ticker.StrMethodFormatter('{x:,.0f}'))
    ax.xaxis.set_ticks_position('top')
    ax.tick_params(axis='x', colors='#777777', labelsize=12)
    ax.set_yticks([])
    ax.margins(0, 0.01)
    ax.grid(which='major', axis='x', linestyle='-', alpha=0.2)
    ax.set_axisbelow(True)
    ax.text(0, 1.15, 'NBA Players with Most All-Star Appearances (1951-2024)',
            transform=ax.transAxes, size=24, weight=600, ha='left', va='top')
    ax.text(1, 0, 'By: Your Name', transform=ax.transAxes, ha='right',
            bbox=dict(facecolor='white', alpha=0.8, edgecolor='white'))
    plt.box(False)
    max_value = cumulative_counts.max().max()
    ax.set_xlim(0, max_value * 1.1)
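The selection line in update() decides both which players appear and where their bars sit. This small sketch uses made-up totals to show the ordering it produces.

```python
import pandas as pd

# Hypothetical cumulative totals for one year
totals = pd.Series({'Player A': 5, 'Player B': 12, 'Player C': 8, 'Player D': 1})

# Keep the top 3, then sort ascending so the leader is drawn last
top3 = totals.nlargest(3).sort_values(ascending=True)
print(list(top3.index))   # ['Player A', 'Player C', 'Player B']
print(list(top3.values))  # [5, 8, 12]
```

Because barh() places the first entry at the bottom, sorting in ascending order puts the leader at the top of the chart. It also means the loop index i in update() matches each bar's vertical position, which is why the text labels line up with their bars.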
Step 5: Create and Save the Animation
Finally, we’ll create the animation and save it as a GIF:
fig, ax = plt.subplots(figsize=(15, 8))
# Get the range of years actually present in the data
year_range = range(cumulative_counts.index.min(), cumulative_counts.index.max() + 1)
animator = animation.FuncAnimation(fig, update, frames=year_range, repeat=False)
animator.save('nba_allstar_appearances_race.gif', writer='pillow', fps=2)
plt.close()
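Before rendering the full dataset, it can help to check the whole pipeline on a few frames. The sketch below is a stripped-down version of the same pattern. It assumes a hypothetical three-year table, a simplified draw_frame function in place of update(), and a temporary output path.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import pandas as pd

# Hypothetical cumulative totals: rows are years, columns are players
toy_counts = pd.DataFrame(
    {'Player A': [1, 2, 3], 'Player B': [1, 1, 2], 'Player C': [0, 1, 1]},
    index=[2000, 2001, 2002],
)

fig, ax = plt.subplots(figsize=(6, 3))

def draw_frame(year):
    # Same pattern as update(): clear the axes, then redraw that year's bars
    ax.clear()
    row = toy_counts.loc[year].sort_values(ascending=True)
    ax.barh(row.index, row.values)
    ax.set_xlim(0, toy_counts.max().max() * 1.1)  # fixed scale across frames
    ax.set_title(str(year))

anim = animation.FuncAnimation(fig, draw_frame, frames=list(toy_counts.index), repeat=False)

# Write the GIF to a temporary directory (hypothetical file name)
out_path = os.path.join(tempfile.mkdtemp(), 'toy_race.gif')
anim.save(out_path, writer='pillow', fps=2)
plt.close(fig)
print(out_path, os.path.getsize(out_path))
```

Each value in frames is passed to the drawing function in turn, so three years produce a three-frame GIF. At fps=2 that is roughly a second and a half of animation.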
Explanation of Key Components
- Data Preparation: We group the data by season and player, then create a cumulative sum to show total appearances over time.
- Color Scheme: We sample 20 colors from the ‘viridis’ colormap and assign them to players in column order. With more than 20 players the colors repeat, so they help tell bars apart but are not strictly unique.
- Update Function: This function clears the previous frame, plots the new data, and adds text labels and styling. It’s called for each year in our dataset.
- Animation: We use Matplotlib’s FuncAnimation to create the animation, calling our update function for each year.
- Saving: We save the animation as a GIF using the ‘pillow’ writer.
Customization
You can customize this chart in many ways:
- Change the color scheme by using a different colormap
- Adjust the number of players shown by modifying the nlargest() value
- Change the animation speed by adjusting the fps parameter when saving
- Modify text sizes, positions, and content to suit your needs
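One way to make these options easier to change is to pull them into named settings. The following is only a sketch: TOP_N, CMAP_NAME, make_color_dict, and top_for_year are hypothetical names that do not appear in the code above, and the data is made up.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

TOP_N = 5             # hypothetical setting: how many bars to show
CMAP_NAME = 'plasma'  # hypothetical setting: any registered colormap name

def make_color_dict(names, cmap_name=CMAP_NAME, n_colors=20):
    """Map each name to a color sampled from the chosen colormap."""
    cmap = plt.get_cmap(cmap_name)
    colors = cmap(np.linspace(0, 1, n_colors))
    return {name: colors[i % n_colors] for i, name in enumerate(names)}

def top_for_year(cumulative, year, n=TOP_N):
    """Return the n largest totals for a year, sorted for plotting with barh."""
    return cumulative.loc[year].nlargest(n).sort_values(ascending=True)

# Made-up single-year table with eight players: P0 has 0, P1 has 1, and so on
cumulative = pd.DataFrame({f'P{i}': [i] for i in range(8)}, index=[2000])
top = top_for_year(cumulative, 2000)
palette = make_color_dict(cumulative.columns)
print(list(top.index))
```

Inside update(), the nlargest(10) call could then be replaced by top_for_year(cumulative_counts, year), and the value passed as fps could be defined as a named setting in the same place.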
Getting the All-Star Data
If you want to recreate the animated plot, or just explore the NBA All-Star data, here is one way to pull it in Python using the nba_api package.
import time
import pandas as pd
from nba_api.stats.endpoints import playerawards
from nba_api.stats.static import players
from requests.exceptions import ReadTimeout
# Fetch all players
all_players = players.get_players()
# Function to get player awards with retry logic
def get_player_awards(player_id, retries=3, delay=5):
    for attempt in range(retries):
        try:
            awards = playerawards.PlayerAwards(player_id=player_id)
            awards_df = awards.get_data_frames()[0]
            return awards_df
        except ReadTimeout:
            if attempt < retries - 1:
                time.sleep(delay)
                delay *= 2
            else:
                raise
# Initialize a DataFrame to store all awards
all_awards_df = pd.DataFrame()
# Loop through all players and get awards
for player in all_players:
    player_id = player['id']
    try:
        player_awards_df = get_player_awards(player_id)
        player_awards_df['PLAYER_NAME'] = player['full_name']
        all_awards_df = pd.concat([all_awards_df, player_awards_df], ignore_index=True)
    except ReadTimeout:
        print(f"Timeout error for player {player['full_name']} (ID: {player_id}). Skipping.")
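# Filter for All-Star and MVP awards
all_star_mvp_awards = all_awards_df[all_awards_df['DESCRIPTION'].str.contains('All-Star|MVP', case=False, na=False)]
# Save to the CSV file that Step 1 reads
all_star_mvp_awards.to_csv('all_star_mvp_awards.csv', index=False)
# Display the first few rows
all_star_mvp_awards.head()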
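The retry logic in get_player_awards waits longer after each failed attempt. The sketch below reproduces that pattern without touching the network. It swaps in Python's built-in TimeoutError for ReadTimeout and uses a hypothetical flaky_fetch function plus an injectable sleep, so the waits can be recorded instead of actually slept.

```python
def retry_with_backoff(fetch, retries=3, delay=5, sleep=lambda seconds: None):
    """Call fetch(); on TimeoutError wait `delay` seconds, doubling each time."""
    for attempt in range(retries):
        try:
            return fetch()
        except TimeoutError:
            if attempt < retries - 1:
                sleep(delay)
                delay *= 2
            else:
                raise  # out of attempts: let the caller handle it

waits = []          # records each delay instead of sleeping
calls = {'n': 0}    # counts how many times the fake API was called

def flaky_fetch():
    # Hypothetical stand-in for the API call: fails twice, then succeeds
    calls['n'] += 1
    if calls['n'] < 3:
        raise TimeoutError('simulated timeout')
    return 'ok'

result = retry_with_backoff(flaky_fetch, sleep=waits.append)
print(result, waits)  # ok [5, 10]
```

With the default settings a player is tried at most three times, with waits of 5 and then 10 seconds in between. If the third attempt also times out, the exception propagates and the main loop skips that player.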
Conclusion
This tutorial showed you how to create an animated bar chart race using Python and Matplotlib. This technique can be applied to various datasets to create engaging, dynamic visualizations of changing rankings over time.
Remember to always ensure your data is clean and properly formatted, and don’t be afraid to experiment with different visual styles to find what works best for your specific dataset and audience.
Happy coding and animating!