# Descriptive Statistics in Pandas

## A guide on how to calculate descriptive statistics in Pandas.

# Loading Data

First of all, let’s import the libraries.

Let’s create a data frame named df. This dataset contains missing data.
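Here is a minimal sketch of the import and setup steps. The values, row labels, and column names are made up for illustration; any small data frame with a few NaN entries would do.

```python
import numpy as np
import pandas as pd

# Hypothetical values: a small frame with three missing entries.
df = pd.DataFrame(
    [[1.0, np.nan], [2.0, 4.0], [np.nan, np.nan], [3.0, 5.0]],
    index=["a", "b", "c", "d"],
    columns=["x", "y"],
)
print(df)
```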

You can calculate the sum of the columns with the sum method.
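As a rough example, assuming a small df with missing values (the numbers are invented for illustration), sum adds up each column and skips NaN by default.

```python
import numpy as np
import pandas as pd

# Hypothetical values, used only to show the call.
df = pd.DataFrame(
    [[1.0, np.nan], [2.0, 4.0], [np.nan, np.nan], [3.0, 5.0]],
    index=["a", "b", "c", "d"],
    columns=["x", "y"],
)

col_sums = df.sum()  # one value per column, NaN skipped
print(col_sums)      # x -> 6.0, y -> 9.0
```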

To sum across each row instead, pass axis="columns" or axis=1.
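A short sketch of the row-wise version, again on an invented df. Both spellings of the axis argument give the same result, and a row that is entirely NaN sums to 0.0 under the default settings.

```python
import numpy as np
import pandas as pd

# Hypothetical values, used only to show the call.
df = pd.DataFrame(
    [[1.0, np.nan], [2.0, 4.0], [np.nan, np.nan], [3.0, 5.0]],
    index=["a", "b", "c", "d"],
    columns=["x", "y"],
)

row_sums = df.sum(axis="columns")  # same as df.sum(axis=1)
print(row_sums)                    # a -> 1.0, b -> 6.0, c -> 0.0, d -> 8.0
```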

You can calculate the mean of each row using the mean method with axis=1.

Note that, by default, missing values are excluded from the mean. If you want missing values to be taken into account, pass skipna=False; any row that contains a missing value then returns NaN.
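To make the difference concrete, here is a small sketch on an invented df that compares the default behavior with skipna=False.

```python
import numpy as np
import pandas as pd

# Hypothetical values, used only to show the call.
df = pd.DataFrame(
    [[1.0, np.nan], [2.0, 4.0], [np.nan, np.nan], [3.0, 5.0]],
    index=["a", "b", "c", "d"],
    columns=["x", "y"],
)

row_means = df.mean(axis=1)                       # NaN skipped: a -> 1.0, b -> 3.0, c -> NaN, d -> 4.0
row_means_strict = df.mean(axis=1, skipna=False)  # NaN propagates: a -> NaN, b -> 3.0, c -> NaN, d -> 4.0
print(row_means)
print(row_means_strict)
```

Row c is NaN in both cases because it has no valid values at all.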

Let’s see the maximum values in rows and columns.
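A quick sketch with the max method, using an invented df. Without arguments it works down each column; with axis=1 it works across each row.

```python
import numpy as np
import pandas as pd

# Hypothetical values, used only to show the call.
df = pd.DataFrame(
    [[1.0, np.nan], [2.0, 4.0], [np.nan, np.nan], [3.0, 5.0]],
    index=["a", "b", "c", "d"],
    columns=["x", "y"],
)

col_max = df.max()        # x -> 3.0, y -> 5.0
row_max = df.max(axis=1)  # a -> 1.0, b -> 4.0, c -> NaN, d -> 5.0
print(col_max)
print(row_max)
```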

Let’s see the minimum values in rows and columns.
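The min method mirrors max. A sketch on the same kind of invented df:

```python
import numpy as np
import pandas as pd

# Hypothetical values, used only to show the call.
df = pd.DataFrame(
    [[1.0, np.nan], [2.0, 4.0], [np.nan, np.nan], [3.0, 5.0]],
    index=["a", "b", "c", "d"],
    columns=["x", "y"],
)

col_min = df.min()        # x -> 1.0, y -> 4.0
row_min = df.min(axis=1)  # a -> 1.0, b -> 2.0, c -> NaN, d -> 3.0
print(col_min)
print(row_min)
```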

Let’s calculate the cumulative sums.
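One way to do this is the cumsum method. In this sketch on an invented df, positions that hold NaN stay NaN in the result, while the running total carries on past them.

```python
import numpy as np
import pandas as pd

# Hypothetical values, used only to show the call.
df = pd.DataFrame(
    [[1.0, np.nan], [2.0, 4.0], [np.nan, np.nan], [3.0, 5.0]],
    index=["a", "b", "c", "d"],
    columns=["x", "y"],
)

cum = df.cumsum()  # running total down each column
print(cum)
# x: 1.0, 3.0, NaN, 6.0
# y: NaN, 4.0, NaN, 9.0
```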

You can use the describe method to see summary statistics of the dataset.
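As a sketch on an invented df, the summary includes the count of non-missing values, the mean, the standard deviation, the minimum, the quartiles, and the maximum for each numeric column.

```python
import numpy as np
import pandas as pd

# Hypothetical values, used only to show the call.
df = pd.DataFrame(
    [[1.0, np.nan], [2.0, 4.0], [np.nan, np.nan], [3.0, 5.0]],
    index=["a", "b", "c", "d"],
    columns=["x", "y"],
)

summary = df.describe()
print(summary)  # count is 3 for x and 2 for y, since NaN is not counted
```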

To find the correlation coefficient, let’s first import the famous iris dataset. You can download the iris dataset from here.

Let’s take a look at the first five rows of the iris dataset.
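The sketch below shows one way to load and preview the data. To keep it self-contained, a handful of iris rows are read from an in-memory string; with the downloaded file you would pass its path to read_csv instead. The header=None argument is an assumption that the file has no header row.

```python
import io
import pandas as pd

# A few rows standing in for the downloaded file (assumption: no header row).
raw = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""

iris = pd.read_csv(io.StringIO(raw), header=None)
print(iris.head())  # columns are labeled 0..4 because no names were given
```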

As you can see, the iris dataset has no column names. Let’s add them.

Let’s see the first five rows of the iris dataset again.
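A possible version of these two steps is sketched here. The column names are my own choice for illustration, and the inline rows stand in for the downloaded file.

```python
import io
import pandas as pd

# A few rows standing in for the downloaded file (assumption: no header row).
raw = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""
iris = pd.read_csv(io.StringIO(raw), header=None)

# Assumed column names, chosen for readability.
cols = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]
iris.columns = cols
print(iris.head())
```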

Let’s calculate the correlation between sepal length and sepal width.
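One way to do this is the corr method of a Series, which returns the Pearson correlation by default. The column names and the inline sample rows in this sketch are assumptions, so the number it prints will differ from the one for the full dataset.

```python
import io
import pandas as pd

# A few rows standing in for the downloaded file; column names are assumed.
raw = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""
cols = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]
iris = pd.read_csv(io.StringIO(raw), header=None, names=cols)

r = iris["sepal_length"].corr(iris["sepal_width"])
print(r)  # a single coefficient between -1 and 1
```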

You can use the corr method to see the pairwise correlations of all variables in a data frame.
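Here is a sketch on a few sample rows with assumed column names. The text column is dropped first, since newer pandas versions raise an error when corr meets a non-numeric column.

```python
import io
import pandas as pd

# A few rows standing in for the downloaded file; column names are assumed.
raw = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""
cols = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]
iris = pd.read_csv(io.StringIO(raw), header=None, names=cols)

# Keep only the numeric columns before computing the correlation matrix.
corr_matrix = iris.drop(columns="species").corr()
print(corr_matrix)  # 4 x 4 matrix with ones on the diagonal
```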

You can use the cov method to see the pairwise covariances of all variables.
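The same pattern works for covariance. In this sketch (sample rows and column names are assumptions), the diagonal of the result holds each column's sample variance.

```python
import io
import pandas as pd

# A few rows standing in for the downloaded file; column names are assumed.
raw = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""
cols = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]
iris = pd.read_csv(io.StringIO(raw), header=None, names=cols)

# Keep only the numeric columns before computing the covariance matrix.
cov_matrix = iris.drop(columns="species").cov()
print(cov_matrix)
```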

With the corrwith method, you can obtain the correlation between one variable and each of the other variables in the dataset.
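As an illustration, the sketch below correlates every numeric column with sepal length. The sample rows and column names are assumptions.

```python
import io
import pandas as pd

# A few rows standing in for the downloaded file; column names are assumed.
raw = """5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""
cols = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]
iris = pd.read_csv(io.StringIO(raw), header=None, names=cols)

numeric = iris.drop(columns="species")
with_sepal_length = numeric.corrwith(numeric["sepal_length"])
print(with_sepal_length)  # one coefficient per column; sepal_length itself is 1.0
```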

You can use the unique method to see the unique values. To show this, let’s create a Series named s.

Let’s use the unique method.
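A minimal sketch, assuming s holds a few repeated letters (the contents are invented for illustration):

```python
import pandas as pd

# Hypothetical Series with repeated values.
s = pd.Series(list("aabbbcccc"))

uniques = s.unique()  # distinct values in order of first appearance
print(uniques)        # contains 'a', 'b', 'c'
```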

You can use the value_counts method to see the frequency of the values.
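For example, on an invented Series of repeated letters, value_counts returns the counts sorted from most to least frequent.

```python
import pandas as pd

# Hypothetical Series with repeated values.
s = pd.Series(list("aabbbcccc"))

counts = s.value_counts()
print(counts)  # c -> 4, b -> 3, a -> 2
```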

To check whether values are in the dataset, you can use the isin method.

Let’s see the rows with these values.
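A sketch of both steps on an invented Series. The values being looked up ("b" and "c") are picked only for illustration. isin returns a boolean mask, and indexing with that mask keeps the matching rows.

```python
import pandas as pd

# Hypothetical Series with repeated values.
s = pd.Series(list("aabbbcccc"))

mask = s.isin(["b", "c"])  # True where the value is "b" or "c"
matches = s[mask]          # keep only the matching rows
print(mask)
print(matches)
```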

That’s it. I hope you enjoy this post. You can access the notebook here.