“Data Visualization”: A Beaten-Up Topic with a Twist, Ep. 2 | Combo Challenge
4 min read · Nov 1, 2023
Level: Beginner — Intermediate
Background and Context: This is episode two of the series. We will focus on combo charts and how different charting packages stand up to the test. In this episode, I leave the generic package imports out of the code sections. For more background and context, please refer to Episode 1 below.
Matplotlib/Pyplot
Line Bar Combo
Code
# Assumes df_data1 / df_data2 are pandas Series of per-country counts (e.g. from value_counts())
df_data1.plot(kind='bar', color='b', alpha=0.5, label='Technology Billionaires')
df_data2.plot(kind='line', color='r', marker='o', label='Automotive Billionaires')
plt.xticks(rotation=90)
plt.legend(loc="upper left", ncol=2)
plt.xlabel("Country")
plt.ylabel("# of Billionaires")
plt.title('Line Bar Graph of Income Distribution of Billionaires by Industry')
plt.show()
Observations
- Code complexity is low: easy to implement, with little code required
- Static chart
- Not supported out of the box (OOTB): there is no OOTB method to plot a combo chart
- Some customization options are available
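The pandas calls above draw on whichever Axes happens to be current. As a minimal sketch of the same bar-plus-line idea with an explicit Axes, the block below uses made-up counts (the `tech` and `auto` Series are illustrative assumptions, not the article's data):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to display the chart
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical counts, for illustration only (not the article's dataset)
countries = ["China", "India", "Taiwan", "United States"]
tech = pd.Series([5, 3, 2, 8], index=countries)
auto = pd.Series([4, 2, 1, 3], index=countries)

fig, ax = plt.subplots(figsize=(8, 6))
# Bars and line share one Axes, so both use the same categorical x positions
ax.bar(tech.index, tech.values, color="b", alpha=0.5, label="Technology")
ax.plot(auto.index, auto.values, color="r", marker="o", label="Automotive")
ax.tick_params(axis="x", rotation=90)
ax.set_xlabel("Country")
ax.set_ylabel("# of Billionaires")
ax.legend(loc="upper left", ncol=2)
```

Passing the Axes around explicitly makes it clear which chart each call modifies, which helps once a notebook holds more than one figure.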
100% Stacked Charts
Code
# Prepare Data
# normalize="index" turns each country's row into proportions that sum to 1
cross_df = pd.crosstab(index=df['country'],
                       columns=df['category'],
                       normalize="index")
# Build Chart
ax = cross_df.plot(kind='bar',
                   stacked=True,
                   colormap='tab10',
                   figsize=(8, 6))
plt.legend(loc="lower left", ncol=2)
plt.xlabel("Country")
plt.ylabel("Proportion of Billionaires")
plt.title('100% Stacked Chart for Distribution of Billionaires by Industry and Country')
# Code to show the percentage of each stacked sub-bar
for n, x in enumerate([*cross_df.index.values]):
    # y_loc is the running total, i.e. the top edge of each sub-bar
    for (proportion, y_loc) in zip(cross_df.loc[x],
                                   cross_df.loc[x].cumsum()):
        plt.text(x=n - 0.17,
                 y=y_loc,
                 s=f'{np.round(proportion * 100, 1)}%',
                 color="black",
                 fontsize=7,
                 fontweight="bold")
plt.show()
Observations
- Data preparation is the key step for a 100% stacked output
- Complex coding is required to show the data values
- Not highly customizable
- Not supported in OOTB chart methods
- Static Chart
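On the point about labelling the data values: Matplotlib 3.4 and later provide `Axes.bar_label`, which can shorten the labelling loop. The sketch below assumes that version and uses a made-up `toy` DataFrame rather than the article's data:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to display the chart
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows, for illustration only
toy = pd.DataFrame({
    "country": ["China", "China", "China", "India", "India", "Taiwan"],
    "category": ["Technology", "Automotive", "Technology",
                 "Technology", "Automotive", "Technology"],
})
shares = pd.crosstab(index=toy["country"], columns=toy["category"], normalize="index")

ax = shares.plot(kind="bar", stacked=True, colormap="tab10", figsize=(8, 6))
# Each container holds the sub-bars of one category, one per country
for container in ax.containers:
    labels = [f"{bar.get_height() * 100:.1f}%" if bar.get_height() > 0 else ""
              for bar in container]
    ax.bar_label(container, labels=labels, label_type="center", fontsize=7)
ax.set_ylabel("Proportion of Billionaires")
```

With `label_type="center"` the text sits in the middle of each sub-bar, so no manual x or y offsets are needed.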
Seaborn
Line Bar Combo
Code
#Prepare Data
df = df[df['country'].isin(['United States','China','India','Taiwan'])]
df_data1 = df[df['category']=='Technology']['country'].value_counts()
df_data2 = df[df['category']=='Automotive']['country'].value_counts()
# rename_axis + reset_index(name=...) yields the same column names on any pandas version
df_data1 = df_data1.rename_axis("Country").reset_index(name="NoofBlns")
df_data2 = df_data2.rename_axis("Country").reset_index(name="NoofBlns")
#Build Chart
line1 = sns.lineplot(data=df_data1.sort_values(by='Country'), x='Country', y='NoofBlns', marker='s', color='b')
bar1 = sns.barplot(data=df_data2.sort_values(by='Country'), x='Country', y='NoofBlns', color='y')
# Add Legends (mpatches is matplotlib.patches)
line = mpatches.Patch(color='b', label='Technology')
bar = mpatches.Patch(color='y', label='Automotive')
plt.legend(handles=[line, bar])
plt.show()
Observations
- OOTB Support does not exist
- Data Preparation plays a major role
- Quite a bit of custom coding is necessary
- Static charts
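Since data preparation carries most of the weight here, one detail is worth spelling out: the column names produced by `value_counts().reset_index()` differ between pandas 1.x and 2.x. A minimal sketch of a version-independent way to get a tidy two-column frame, on made-up rows (the `toy` DataFrame is an assumption for illustration):

```python
import pandas as pd

# Hypothetical rows, for illustration only
toy = pd.DataFrame({
    "country": ["China", "China", "India", "Taiwan", "India"],
    "category": ["Technology", "Technology", "Technology", "Technology", "Automotive"],
})

counts = toy.loc[toy["category"] == "Technology", "country"].value_counts()

# Name the index and the value column explicitly instead of relying on defaults
tidy = (counts.rename_axis("Country")
              .reset_index(name="NoofBlns")
              .sort_values("Country", ignore_index=True))
print(tidy)
```

The resulting frame has exactly the `Country` and `NoofBlns` columns that the Seaborn calls expect.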
100% Stacked Charts
Code
# Prepare Data
# Keep only the two plotted categories so each country's shares sum to 1
df_two = df[df['category'].isin(['Technology','Automotive'])]
cross_df = pd.crosstab(index=df_two['country'], columns=df_two['category'], normalize="index").reset_index()
# Overlay trick: draw a full-height bar (1 = 100%) for Technology, then draw the Automotive share over it
cross_df['Technology'] = 1
# Build chart
bar2 = sns.barplot(x="country", y="Technology", data=cross_df, color='orange')
bar1 = sns.barplot(x="country", y="Automotive", data=cross_df, color='darkblue')
plt.title('Billionaire Population by Country and Industry')
# Add legend
top_bar = mpatches.Patch(color='orange', label='Technology')
bottom_bar = mpatches.Patch(color='darkblue', label='Automotive')
plt.legend(handles=[top_bar, bottom_bar])
plt.show()
Observations
- OOTB Support does not exist
- Data Preparation plays a major role
- Custom coding necessary
- Static charts
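The overlay approach above covers two categories. One way it could be extended to more is to plot cumulative shares and draw the tallest layer first. The sketch below uses plain Matplotlib bars and a made-up `toy` DataFrame; both are assumptions for illustration, and `sns.barplot` calls could be substituted inside the loop:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to display the chart
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows with three categories, for illustration only
toy = pd.DataFrame({
    "country": ["China", "China", "China", "China", "India", "India"],
    "category": ["Technology", "Technology", "Automotive", "Finance",
                 "Technology", "Automotive"],
})
shares = pd.crosstab(toy["country"], toy["category"], normalize="index")

# Running total across categories: the last column is always 1 (100%)
cumulative = shares.cumsum(axis=1)
# Draw the tallest cumulative layer first so shorter layers overlay it
draw_order = list(cumulative.columns[::-1])

fig, ax = plt.subplots()
for col in draw_order:
    ax.bar(cumulative.index, cumulative[col], label=col)
ax.legend()
```

Each visible band then equals that category's share, because everything below it is covered by the shorter layers drawn afterwards.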
Plotly
Line Bar Combo Chart
Code
import plotly.graph_objects as go
from plotly.subplots import make_subplots
# Prepare Data
# Combine both conditions into one boolean mask built from the same DataFrame
df = df[df['country'].isin(['United States','China','India','Taiwan']) & df['category'].isin(['Technology','Automotive'])]
cross_df = pd.crosstab(index=df['country'],
                       columns=df['category'])
# Build Chart
# Bars for Automotive on the primary y-axis
trace1 = go.Bar(
    x=cross_df.index,
    y=cross_df['Automotive'],
    name='Automotive',
    marker=dict(
        color='rgb(34,163,192)'
    )
)
# Line for Technology on the secondary y-axis
trace2 = go.Scatter(
    x=cross_df.index,
    y=cross_df['Technology'],
    name='Technology',
    yaxis='y2',
    marker=dict(
        symbol='star'
    )
)
fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(trace1)
fig.add_trace(trace2, secondary_y=True)
fig.update_layout(height=600, width=800,
                  title="Line Bar Graph for Billionaires' Distribution by Industry",
                  xaxis=dict(tickangle=-90))
fig.show()
Observations
- OOTB support does not exist
- Comparatively more coding required
- Various customizations are easy to build
- Interactive charts
100% Stacked Chart
Code
import plotly.express as px
# barnorm="percent" rescales each country's bar to 100%
fig = px.histogram(df, x='country', color="category", barnorm="percent",
                   text_auto=True,
                   title="100% Stacked Bar Chart of Billionaires' Distribution by Industry")
fig.show()
Observations
- OOTB support exists
- Minimal coding required
- Various customizations are easy to build
- Interactive charts