Pandas - A Powerful Python Data Analysis Toolkit

Published in featurepreneur

Oct 8, 2022

In Python, a package is a directory that contains a group of modules and sub-packages. Pandas is one such package for the Python programming language.

The name comes from "panel data": PANDAS >> PAN[el] + DA[ta].

Pandas is an open-source data analytics tool that is easy to use and supports the following five processes:

Loading of data
Preparation of data
Data Manipulation
Data Modeling
Data Analysis

The package can be installed with the following command:

pip install pandas

The installed version can be checked using:

import pandas as pd
print(pd.__version__)

The two terms you will come across most often in pandas are Series and DataFrame.

SERIES: a one-dimensional labeled array holding values of one data type, comparable to a single column of a table.

import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a)
print(myvar)

DATAFRAME: a two-dimensional labeled dataset in pandas, comparable to a table with rows and columns.

import pandas as pd
data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}
myvar = pd.DataFrame(data)
print(myvar)

To view the data, use methods such as .head(), .tail(), and .info(), which show the first rows, the last rows, and a summary of the dataset.
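A minimal sketch of these viewing methods follows. The sample values are made up for illustration and reuse the column names from the DataFrame example.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "calories": [420, 380, 390, 410, 400, 395],
    "duration": [50, 40, 45, 55, 60, 35],
})

first_rows = df.head(3)   # first 3 rows (head() returns 5 rows by default)
last_rows = df.tail(2)    # last 2 rows (tail() returns 5 rows by default)

print(first_rows)
print(last_rows)
df.info()                 # prints column names, non-null counts, and dtypes
```

Calling .head() on a freshly loaded dataset is a quick way to check that the columns look as expected.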

Data cleaning involves removing outliers, that is, values that deviate strongly from the mean. Missing values can be filled in with the mean, median, or mode using fillna(), or the affected rows can be removed using dropna().
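The sketch below shows one possible way to handle missing values and outliers. The data is hypothetical, and the cutoff of two standard deviations is an assumption chosen for illustration, not a pandas default or a rule from this article.

```python
import pandas as pd

# Hypothetical data with gaps (None becomes NaN in a numeric column)
df = pd.DataFrame({
    "calories": [420.0, 380.0, None, 390.0],
    "duration": [50, 40, 45, None],
})

dropped = df.dropna()            # drop every row that has a missing value
filled = df.fillna(df.mean())    # or fill each gap with that column's mean

# Outlier filter: keep values within 2 standard deviations of the mean.
# The factor 2 is an illustrative assumption.
durations = pd.Series([50, 40, 45, 48, 52, 47, 43, 55, 49, 300])
distance = (durations - durations.mean()).abs()
cleaned = durations[distance <= 2 * durations.std()]

print(dropped)
print(filled)
print(cleaned.tolist())
```

Whether to drop or fill missing values depends on the dataset. Dropping is simpler, but it discards the other values in the affected rows.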

Correlations between columns can be computed using the corr() method.
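As a small illustration, the sketch below computes the correlation between two numeric columns. The values are made up.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "calories": [420, 380, 390, 450],
    "duration": [50, 40, 45, 60],
})

# corr() returns a matrix of pairwise correlations (Pearson by default)
corr_matrix = df.corr()
print(corr_matrix)
```

Each column correlates perfectly with itself, so the diagonal of the matrix is 1. The off-diagonal entries show how strongly the two columns move together.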

Matplotlib is a Python package that is often used together with pandas for visualizing data.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'Name': ['John', 'Sammy', 'Joe'], 'Age': [45, 38, 90]})
# Bar chart of Age for each Name
df.plot(x="Name", y="Age", kind="bar")
plt.show()

ADVANTAGES OF USING PANDAS

1) Pivoting datasets.

2) Reshaping datasets.

3) Label-oriented slicing.

4) Indexing and subsetting of large datasets.

5) High-performance, efficient merging of datasets.

6) Time-series functionality.
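A few of the features listed above can be sketched as follows. All table contents and variable names here are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical long-format data: one row per (name, month) pair
sales = pd.DataFrame({
    "name": ["John", "John", "Sammy", "Sammy"],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "amount": [10, 20, 30, 40],
})

# Pivot / reshape: one row per name, one column per month
wide = sales.pivot(index="name", columns="month", values="amount")

# Label-oriented slicing with .loc (row label, column label)
john_feb = wide.loc["John", "Feb"]

# Merge with a second table on the shared "name" column
ages = pd.DataFrame({"name": ["John", "Sammy"], "age": [45, 38]})
merged = sales.merge(ages, on="name")

# Time series: a date index allows slicing by date strings
ts = pd.Series([1, 2, 3], index=pd.date_range("2022-10-01", periods=3))
later = ts["2022-10-02":]

print(wide)
print(john_feb)
print(merged)
print(later)
```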

Libraries closely associated with pandas include NumPy and Matplotlib. Because it is so broadly useful, pandas is used across a wide range of data science work.


Written by Jeevitha M