BerilC
6 min read · Aug 9, 2023
Photo by Om Kamath on Unsplash

In the realm of business analysis, delving into the intricacies of sales data reveals a wealth of insights that can guide strategic decisions. This article unveils the results of an in-depth analysis of Adidas US sales data for the years 2020-2021. Join us as we journey through the various stages of analysis and uncover key trends and patterns.

The data comes from a Kaggle dataset.

Our analysis commenced with a critical step: preprocessing the data. This phase involved examining and addressing missing data, identifying duplicate values, and converting data types for compatibility. You can see it here.
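As a rough illustration of those three preprocessing steps, here is a minimal sketch on a small made-up frame rather than the real file. The column names mirror the ones used later in this article, but the sample values, the `$` and comma formatting, and the `Invoice Date` column are assumptions for illustration only.

```python
import pandas as pd

# Toy data standing in for the real dataset (all values are invented).
raw = pd.DataFrame({
    "Retailer": ["Foot Locker", "Foot Locker", "West Gear", "Walmart"],
    "Invoice Date": ["2020-01-01", "2020-01-01", "2021-06-15", "2021-07-04"],
    "Price per Unit": ["$50.00", "$50.00", "$45.00", None],
    "Units Sold": ["1,200", "1,200", "850", "300"],
})

# 1. Count missing values per column.
missing = raw.isna().sum()

# 2. Count and drop exact duplicate rows.
n_duplicates = int(raw.duplicated().sum())
clean = raw.drop_duplicates().copy()

# 3. Convert text columns to numeric and datetime types.
clean["Price per Unit"] = pd.to_numeric(
    clean["Price per Unit"].str.replace("$", "", regex=False)
)
clean["Units Sold"] = pd.to_numeric(
    clean["Units Sold"].str.replace(",", "", regex=False)
)
clean["Invoice Date"] = pd.to_datetime(clean["Invoice Date"])

# Derive year and month-name columns like the ones the later groupbys use.
clean["year"] = clean["Invoice Date"].dt.year
clean["month"] = clean["Invoice Date"].dt.month_name()

print(missing)
print("duplicate rows:", n_duplicates)
print(clean.dtypes)
```

How missing values should be handled (dropped, filled, or left alone) depends on the column, so the sketch only counts them.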

Digging into the numbers, we found that the average price per unit hovered around $45. The average number of units sold per transaction stood at 256, contributing to an impressive average total sales figure of $93,273.

df[['Price per Unit','Units Sold','Total Sales','Operating Margin']].describe()

Zooming in on sales by retailer, West Gear emerged as the leader, with total sales of $242,964,333.

retailer = df.groupby('Retailer')['Total Sales'].sum().sort_values()
print(retailer)
plt.bar(retailer.index, retailer.values)
plt.title('Total sales by retailer')
plt.xlabel('retailer')
plt.ylabel('value')
plt.xticks(rotation=45)
plt.show()

Retailer
Walmart 74558410.0
Amazon 77698912.0
Kohl's 102114753.0
Sports Direct 182470997.0
Foot Locker 220094720.0
West Gear 242964333.0
Name: Total Sales, dtype: float64

The West emerged as the stronghold for West Gear: its $137,551,280 in total sales there was the highest of any retailer-region combination.

region_retail=df.groupby(['Retailer', 'Region'])['Total Sales'].sum().sort_values()
print(region_retail)
region_retail.plot(kind='bar')
plt.title('Retail Performance by Region')
plt.xlabel('retailer')
plt.ylabel('value')
plt.xticks(rotation=90)
plt.show()

Retailer       Region   
Amazon South 409091.0
Kohl's South 3552055.0
Walmart West 6791008.0
Foot Locker South 9307025.0
Amazon Southeast 10826333.0
Sports Direct West 12129045.0
Amazon West 13365025.0
Walmart Northeast 13712005.0
Kohl's Northeast 14031168.0
Amazon Midwest 16835873.0
West Gear Southeast 17491703.0
Walmart Southeast 21005539.0
Kohl's Midwest 22229415.0
West Gear Midwest 22540586.0
Sports Direct Northeast 24698097.0
Sports Direct Midwest 26207191.0
West Gear Northeast 32293733.0
Walmart South 33049858.0
West Gear South 33087031.0
Amazon Northeast 36262590.0
Foot Locker West 37804709.0
Foot Locker Midwest 47987394.0
Sports Direct Southeast 54178543.0
Foot Locker Southeast 59669118.0
Kohl's West 62302115.0
Sports Direct South 65258121.0
Foot Locker Northeast 65326474.0
West Gear West 137551280.0
Name: Total Sales, dtype: float64

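However, Foot Locker recorded the highest summed operating margin, 1102.09. Because this figure adds up the margin of every record, it also reflects how many records each retailer has, so it should not be read as a margin percentage.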

op_margin = df.groupby('Retailer')['Operating Margin'].sum().sort_values()
print(op_margin)
plt.bar(op_margin.index, op_margin.values)
plt.title('Operating Margin by Retailer')
plt.xlabel('Retailer')
plt.ylabel('value')
plt.xticks(rotation=45)
plt.show()

Retailer
Walmart 254.49
Amazon 396.56
Kohl's 431.87
Sports Direct 904.02
West Gear 991.99
Foot Locker 1102.09
Name: Operating Margin, dtype: float64
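Since a summed margin grows with the number of records, it can help to compare it with a per-record mean and with a sales-weighted margin (total operating profit divided by total sales). The sketch below uses invented rows for two hypothetical retailers, "A" and "B"; only the column names come from this article.

```python
import pandas as pd

# Invented rows: retailer A has many small records, B has few large ones.
toy = pd.DataFrame({
    "Retailer": ["A", "A", "A", "A", "B", "B"],
    "Total Sales": [100.0, 100.0, 100.0, 100.0, 1000.0, 1000.0],
    "Operating Profit": [30.0, 30.0, 30.0, 30.0, 500.0, 500.0],
    "Operating Margin": [0.30, 0.30, 0.30, 0.30, 0.50, 0.50],
})

by_retailer = toy.groupby("Retailer")

# Sum of per-record margins: scales with how many records a retailer has.
margin_sum = by_retailer["Operating Margin"].sum()

# Mean of per-record margins: comparable across retailers.
margin_mean = by_retailer["Operating Margin"].mean()

# Sales-weighted margin: total profit over total sales.
totals = by_retailer[["Operating Profit", "Total Sales"]].sum()
margin_weighted = totals["Operating Profit"] / totals["Total Sales"]

summary = pd.DataFrame({
    "margin_sum": margin_sum,
    "margin_mean": margin_mean,
    "margin_weighted": margin_weighted,
})
print(summary)
```

In this toy data the summed margin ranks A first simply because A has more rows, while the mean and weighted margins both rank B first.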

Meanwhile, the Southeast stood out with the highest average sales per record ($133,310), and Miami was the city with the highest average, at $219,450.

region = df.groupby('Region')['Total Sales'].mean().sort_values(ascending=False)
print(region)
plt.bar(region.index, region.values, color="green")
plt.title('Average Sales by Region')
plt.xlabel('Region')
plt.ylabel('values')
plt.xticks(rotation=45)
plt.show()

Region
Southeast 133309.833333
West 110270.907680
South 83717.118634
Northeast 78419.220118
Midwest 72542.980235
Name: Total Sales, dtype: float64

city = df.groupby('City')['Total Sales'].mean().sort_values(ascending=False)
city=city.head(10)
print(city)

plt.bar(city.index, city.values, color='green')
plt.title('Top 10 Cities by Average Sales')
plt.xlabel('city')
plt.ylabel('values')
plt.xticks(rotation=45)
plt.show()

City
Miami 219450.437500
New York 184264.976852
Seattle 182852.208333
Albany 169637.527778
Charlotte 166364.798611
San Francisco 159903.796296
Honolulu 154739.284722
Denver 145809.277778
Charleston 138801.378472
Detroit 129343.284722
Name: Total Sales, dtype: float64
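The region and city rankings above use `.mean()`, so they rank by average sales per record, not by total sales. A small sketch of how the two can diverge, using `agg` to put sum, mean, and record count side by side. The city names are borrowed from the output above, but the figures are invented for illustration.

```python
import pandas as pd

# Invented figures: one city with few large records, one with many smaller ones.
toy = pd.DataFrame({
    "City": ["Miami", "Miami", "New York", "New York", "New York", "New York"],
    "Total Sales": [200.0, 240.0, 150.0, 150.0, 150.0, 150.0],
})

# Compute total, average, and number of records per city in one pass.
by_city = toy.groupby("City")["Total Sales"].agg(["sum", "mean", "count"])

top_by_mean = by_city["mean"].idxmax()
top_by_sum = by_city["sum"].idxmax()

print(by_city)
print("highest average:", top_by_mean)
print("highest total:", top_by_sum)
```

Looking at all three columns together makes it easier to state which kind of "top" a chart is actually showing.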

Our scrutiny of product categories revealed that "Men’s Street Footwear" dominated, averaging about 369 units sold per record.

product=df.groupby('Product')['Units Sold'].mean().sort_values(ascending=False)
product

Product
Men's Street Footwear 368.521739
Men's Athletic Footwear 270.513043
Women's Apparel 269.792910
Women's Street Footwear 243.948383
Women's Athletic Footwear 197.531756
Men's Apparel 190.960772
Name: Units Sold, dtype: float64

sales_product = df.groupby('Product')['Total Sales'].sum().sort_values(ascending=False)

plt.bar(sales_product.index, sales_product.values, color='magenta')
plt.title('Total Sales by Product Categories')
plt.xlabel('Product')
plt.ylabel('Values')
plt.xticks(rotation=45)
plt.show()

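A year-on-year comparison shows that 2021 sales ($717.8 million) were nearly four times 2020 sales ($182.1 million). The difference between the two years equals 59.5% of combined two-year sales, which corresponds to growth of roughly 294% over 2020.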

year=df.groupby('year')['Total Sales'].sum()
print(year)
plt.bar(year.index, year.values, color='pink')
plt.title('Total sales in year')
plt.xlabel('Year')
plt.ylabel('values')
plt.show()

year
2020 182080675.0
2021 717821450.0
Name: Total Sales, dtype: float64

year20 =df[df['year'] == 2020]['Total Sales'].sum()
year21 =df[df['year'] == 2021]['Total Sales'].sum()
total =df['Total Sales'].sum()

# Difference between the two years, expressed as a share of combined sales
diff = year21 - year20
percent = (diff / total)*100
print(percent)

59.5332270162158
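The figure printed above divides the year-over-year difference by combined two-year sales. Year-over-year growth is more commonly measured against the earlier year alone. A short sketch of both calculations, using the two yearly totals printed earlier; the variable names are illustrative.

```python
# Yearly totals taken from the groupby output shown above.
sales_2020 = 182_080_675.0
sales_2021 = 717_821_450.0

change = sales_2021 - sales_2020

# Share of combined sales (the calculation used in this article).
share_of_total = change / (sales_2020 + sales_2021) * 100

# Growth relative to the 2020 base (conventional year-over-year growth).
yoy_growth = change / sales_2020 * 100

print(f"Change as share of combined sales: {share_of_total:.1f}%")
print(f"Growth over 2020: {yoy_growth:.1f}%")
```

The two numbers answer different questions, so it is worth stating which base is being used whenever a percentage increase is quoted.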

Delving deeper, we observed a significant surge in purchases during March-April of 2020, while 2021 witnessed substantial spikes in June, July, and August.

sale20 = df[df['year']==2020]

sales20 = sale20.groupby('month')['Total Sales'].sum()

all_months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August','September', 'October', 'November', 'December']

all_sale20 = [sales20.get(month, 0) for month in all_months]

plt.plot(all_months, all_sale20, marker='o', linestyle="-", color='red', label='2020')
plt.title('total sales 2020')
plt.xlabel('Months')
plt.ylabel('value')
plt.xticks(rotation=45)
plt.legend()

plt.tight_layout()
plt.show()

sale21 = df[df['year']==2021]

sales21 = sale21.groupby('month')['Total Sales'].sum()

all_months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August','September', 'October', 'November', 'December']

all_sale21 = [sales21.get(month, 0) for month in all_months]

plt.plot(all_months, all_sale21, marker='o', linestyle="-", color='red', label='2021')
plt.title('total sales 2021')
plt.xlabel('Months')
plt.ylabel('value')
plt.xticks(rotation=45)
plt.legend()

plt.tight_layout()
plt.show()
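The monthly plots rely on looking up each month name in the grouped result, because `groupby` sorts month names alphabetically. One alternative is `reindex`, which puts the series in calendar order and fills absent months in a single step. A minimal sketch on invented rows, assuming (as the code above does) that the `month` column holds full English month names:

```python
import pandas as pd

all_months = ['January', 'February', 'March', 'April', 'May', 'June',
              'July', 'August', 'September', 'October', 'November', 'December']

# Invented rows covering only three months.
toy = pd.DataFrame({
    "month": ["March", "April", "March", "January"],
    "Total Sales": [10.0, 20.0, 5.0, 7.0],
})

# groupby returns the months in alphabetical order (April, January, March);
# reindex restores calendar order and fills months with no sales with 0.
monthly = (
    toy.groupby("month")["Total Sales"].sum()
    .reindex(all_months, fill_value=0)
)
print(monthly)
```

The resulting series can be passed straight to `plt.plot(monthly.index, monthly.values)` with the labels and values guaranteed to line up.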

Remarkably, "Men’s Street Footwear" maintained its dominance throughout.

year20 = df[df['year'] == 2020]

all_months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August','September', 'October', 'November', 'December']

# Reindex to calendar order so each value lines up with its month label
# (groupby sorts month names alphabetically; months without data become NaN)
product_sales = year20.groupby(['month', 'Product'])['Total Sales'].sum().unstack().reindex(all_months)

for product in product_sales.columns:
    plt.plot(all_months, product_sales[product], marker='o', linestyle='-', label=product)

plt.title('Total Sales of Products in 2020')
plt.xlabel('Months')
plt.ylabel('Value')
plt.xticks(rotation=45)
plt.legend()

plt.tight_layout()
plt.show()

year21 = df[df['year'] == 2021]

all_months = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August','September', 'October', 'November', 'December']

# Reindex to calendar order so each value lines up with its month label
# (groupby sorts month names alphabetically; months without data become NaN)
product_sales = year21.groupby(['month', 'Product'])['Total Sales'].sum().unstack().reindex(all_months)

for product in product_sales.columns:
    plt.plot(all_months, product_sales[product], marker='o', linestyle='-', label=product)

plt.title('Total Sales of Products in 2021')
plt.xlabel('Months')
plt.ylabel('Value')
plt.xticks(rotation=45)
plt.legend()

plt.tight_layout()
plt.show()

Unsurprisingly, a positive relationship emerged between total sales and operating profit: the higher the sales, the larger the profit, a key insight for crafting future strategies.

plt.scatter(df['Total Sales'], df['Operating Profit'], alpha=0.5)
plt.title('Total Sales vs Operating Profit')
plt.xlabel('Total sales')
plt.ylabel('Operating Profit')
plt.show()
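A scatter plot shows the direction of the relationship; a correlation coefficient puts a number on its strength. A minimal sketch using pandas' `Series.corr`, which computes the Pearson coefficient by default. The five data points are invented, so the resulting value says nothing about the real dataset.

```python
import pandas as pd

# Invented points with a roughly linear relationship.
toy = pd.DataFrame({
    "Total Sales": [100.0, 200.0, 300.0, 400.0, 500.0],
    "Operating Profit": [35.0, 60.0, 120.0, 130.0, 210.0],
})

# Pearson correlation: +1 is a perfect positive linear relationship,
# 0 is no linear relationship, -1 is a perfect negative one.
r = toy["Total Sales"].corr(toy["Operating Profit"])
print(f"Pearson r = {r:.3f}")
```

Applied to the real `df`, the same call on `df['Total Sales']` and `df['Operating Profit']` would give the coefficient behind the scatter plot above.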

Our data indicated a pronounced inclination of customers toward shopping in physical stores, shedding light on consumer behavior.

sales = df.groupby('Sales Method')['Total Sales'].sum().sort_values(ascending=False)
print(sales)

plt.bar(sales.index, sales.values)
plt.title('Total Sales by Method')
plt.xlabel('method')
plt.ylabel('value')
plt.show()
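Totals per sales method are easier to compare when expressed as a share of overall sales. A short sketch on invented rows; the method labels `In-store`, `Online`, and `Outlet` are assumptions for illustration and may not match the labels in the actual dataset.

```python
import pandas as pd

# Invented rows; the method labels are hypothetical.
toy = pd.DataFrame({
    "Sales Method": ["In-store", "In-store", "Online", "Outlet"],
    "Total Sales": [500.0, 300.0, 150.0, 50.0],
})

sales = toy.groupby("Sales Method")["Total Sales"].sum().sort_values(ascending=False)

# Convert each method's total into a percentage of all sales.
share = sales / sales.sum() * 100
print(share.round(1))
```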

In conclusion, our data-driven odyssey through Adidas US sales data for 2020-2021 has unveiled a treasure trove of insights. From regional performance to product dominance, seasonal surges to the power of higher sales, this analysis equips decision-makers with a compass for navigating future endeavors. With these revelations in mind, businesses can chart a course for success in the dynamic world of retail.