# Deep Diving Pandas Groupby and Pivot


Pandas is a powerful data analysis library in Python that provides various functionalities to manipulate and analyze tabular data. Two important functions for data manipulation in Pandas are `groupby()` and `pivot()`. In this article, we will explore these two functions and provide examples to demonstrate their usage.

# Groupby()

The `groupby()` function in Pandas groups rows that share the same value in one or more columns. It is a powerful tool for data analysis because it lets you split the data by some criterion and then run a calculation, such as a sum or a mean, on each group.

Let's say we have a dataset containing information about employees in a company. We want to group the data by department and then calculate the average salary for each department. Here's how we can do that using the `groupby()` function:

```python
import pandas as pd

data = {'Employee': ['John', 'Anna', 'Peter', 'Samantha', 'David', 'Eric', 'Emily', 'Michael'],
        'Department': ['Sales', 'Marketing', 'Sales', 'Marketing', 'Sales', 'Marketing', 'Sales', 'Marketing'],
        'Salary': [60000, 65000, 55000, 70000, 50000, 75000, 45000, 80000]}

df = pd.DataFrame(data)

# Group rows by department, then average the 'Salary' column within each group
grouped = df.groupby('Department')['Salary'].mean()

print(grouped)
```

**Output:**

```
Department
Marketing    72500.0
Sales        52500.0
Name: Salary, dtype: float64
```

In this example, we first create a dictionary containing the data for our employees. We then create a DataFrame using this dictionary. Next, we group the DataFrame by the 'Department' column using the `groupby()` function. Finally, we calculate the average salary for each department using the `mean()` function.
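A group can also be summarized with several calculations at once. The following is a minimal sketch using named aggregation with `agg()` on the same employee data; the output column names (`headcount`, `mean_salary`, `max_salary`) are illustrative choices, not part of the original example.

```python
import pandas as pd

# Same employee data as in the example above.
df = pd.DataFrame({
    "Employee": ["John", "Anna", "Peter", "Samantha", "David", "Eric", "Emily", "Michael"],
    "Department": ["Sales", "Marketing"] * 4,
    "Salary": [60000, 65000, 55000, 70000, 50000, 75000, 45000, 80000],
})

# Named aggregation: each keyword becomes an output column, built from a
# (source column, aggregation function) pair. The names are illustrative.
summary = df.groupby("Department").agg(
    headcount=("Employee", "count"),
    mean_salary=("Salary", "mean"),
    max_salary=("Salary", "max"),
)

print(summary)
```

The result is a DataFrame with one row per department and one column per requested statistic, which is often easier to read than running `groupby()` several times.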

# Pivot()

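The `pivot()` function in Pandas reshapes data from long to wide format: the unique values of one column become the row index, the unique values of another become the column headers, and a third column fills the cells. Note that `pivot()` only rearranges values and does not aggregate them, so each index/column pair may appear only once in the data; duplicates raise a `ValueError`, and `pivot_table()` is the function to use when values need to be aggregated.

Let's say we have a dataset containing information about sales for a company, with one row per product category per month. We want to create a table that shows the sales for each product category by month. Here's how we can do that using the `pivot()` function: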

```python
import pandas as pd

data = {'Month': ['Jan', 'Feb', 'Mar', 'Jan', 'Feb', 'Mar', 'Jan', 'Feb', 'Mar'],
        'Category': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
        'Sales': [10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000]}

df = pd.DataFrame(data)

# One row per category, one column per month, cells filled from 'Sales'
pivot_table = df.pivot(index='Category', columns='Month', values='Sales')

print(pivot_table)
```

**Output:**

```
Month       Feb    Jan    Mar
Category
A         20000  10000  30000
B         50000  40000  60000
C         80000  70000  90000
```

In this example, we first create a dictionary containing the data for our sales. We then create a DataFrame using this dictionary. Next, we use the `pivot()` function to create a table that shows the sales for each product category by month. The `index` parameter specifies the column to use as the index, the `columns` parameter specifies the column to use as the column headers, and the `values` parameter specifies the column to use as the values in the table. Because the month names are strings, the columns come out in alphabetical order (Feb, Jan, Mar) rather than calendar order.
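If the data has more than one row for the same category and month, `pivot()` cannot decide which value to place in the cell. The sketch below uses hypothetical data (two January rows for category A, not taken from the example above) to show the failure, and then uses `pivot_table()` with `aggfunc="sum"` to compute actual totals.

```python
import pandas as pd

# Hypothetical data: category A has two separate sales rows for January.
df = pd.DataFrame({
    "Month": ["Jan", "Jan", "Feb", "Jan", "Feb"],
    "Category": ["A", "A", "A", "B", "B"],
    "Sales": [10000, 5000, 20000, 40000, 50000],
})

# pivot() only reshapes; with a duplicated (A, Jan) pair it raises ValueError.
try:
    df.pivot(index="Category", columns="Month", values="Sales")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() aggregates duplicates; aggfunc="sum" adds them up per cell.
totals = df.pivot_table(index="Category", columns="Month", values="Sales", aggfunc="sum")

print(pivot_failed)
print(totals)
```

Here the two January rows for category A are combined into a single cell, which is what "total sales per category per month" usually means in practice.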

# Groupby() and Pivot() together

The `groupby()` and `pivot()` functions can also be used together to perform more complex data analysis. Let's say we have the same sales dataset as before. We want to create a table that shows the sales for each product category by month, and also calculate the average monthly sales for each category. Here's how we can do that using the `groupby()` and `pivot()` functions together:

```python
import pandas as pd

data = {'Month': ['Jan', 'Feb', 'Mar', 'Jan', 'Feb', 'Mar', 'Jan', 'Feb', 'Mar'],
        'Category': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
        'Sales': [10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000]}

df = pd.DataFrame(data)

# Wide table: one row per category, one column per month
pivot_table = df.pivot(index='Category', columns='Month', values='Sales')

# Average sales per category, computed from the original long-format data
grouped = df.groupby('Category')['Sales'].mean()

print(pivot_table)

print(grouped)
```

**Output:**

```
Month       Feb    Jan    Mar
Category
A         20000  10000  30000
B         50000  40000  60000
C         80000  70000  90000
Category
A    20000.0
B    50000.0
C    80000.0
Name: Sales, dtype: float64
```

In this example, we first create a dictionary containing the data for our sales. We then create a DataFrame using this dictionary. Next, we use the `pivot()` function to create a table that shows the sales for each product category by month. We then group the original DataFrame (not the pivoted table) by the 'Category' column using the `groupby()` function. Finally, we calculate the average sales for each category using the `mean()` function.
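The two steps can also be chained into one pipeline. The following sketch, which is one possible approach rather than the only one, groups by both keys, sums the sales, and uses `unstack()` to move the months into columns; the `Average` column name is an illustrative choice.

```python
import pandas as pd

# Same sales data as in the example above.
df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar"] * 3,
    "Category": ["A"] * 3 + ["B"] * 3 + ["C"] * 3,
    "Sales": [10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000],
})

# Group by both keys, total the sales, then move 'Month' from the row index
# into the columns to get the same wide layout that pivot() produces.
wide = df.groupby(["Category", "Month"])["Sales"].sum().unstack("Month")

# Averaging across the month columns gives one value per category.
wide["Average"] = wide.mean(axis=1)

print(wide)
```

Because `sum()` runs before the reshape, this version also works when a category has several rows for the same month, a case in which `pivot()` alone would raise an error.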

The `groupby()` and `pivot()` functions are powerful tools for data manipulation and analysis in Pandas. By understanding how to use these functions, you can perform complex data analysis tasks with ease.
