Data Visualization Plots with Seaborn
Seaborn is a Python data visualization library for drawing statistical graphics, built on top of matplotlib.
Using Seaborn, we can draw a wide variety of plots, such as:
- Distribution Plots
- Bar Charts and Count Plots (Seaborn has no pie chart function; pie charts are drawn with matplotlib)
- Scatter Plots
- Pair Plots
- Heat maps
To use Seaborn, import it first:
import seaborn as sns
The plots below use the drinking water data set from Kaggle.
Get your data
import pandas
df = pandas.read_csv('Your Data set')
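If the Kaggle file is not at hand, the plots in this article can still be tried on a stand-in. The sketch below builds a small synthetic DataFrame with the three columns used here ('ph', 'Hardness', 'Potability'). The function name make_demo_frame, the distributions, and every value are assumptions made up for illustration; they are not the real data.

```python
import numpy as np
import pandas as pd


def make_demo_frame(n_rows=200, seed=0):
    """Build a synthetic stand-in with the three columns used in this article.

    The values are random and purely illustrative, not the Kaggle data.
    """
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "ph": rng.normal(loc=7.0, scale=1.5, size=n_rows),
        "Hardness": rng.normal(loc=200.0, scale=30.0, size=n_rows),
        # 0 = not potable, 1 = potable (hypothetical labels)
        "Potability": rng.integers(0, 2, size=n_rows),
    })


demo_df = make_demo_frame()
print(demo_df.head())        # first few rows
print(demo_df.isna().sum())  # missing values per column
```

Checking for missing values right after loading is a good habit with real data too, since some plots silently drop or choke on them.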
1. Distplot
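A distribution plot gives us the histogram of the selected continuous variable. Note that sns.distplot has been deprecated since Seaborn 0.11; sns.histplot is its replacement for histograms, and kde = True adds the smooth density curve that distplot drew by default.
sns.histplot(df['ph'], bins = 10, kde = True)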
2. Joint Plot
It combines the distribution plots of two variables: the histogram of each variable appears on the margins, and a scatter plot in the center shows the relationship between them. We can switch the scatter plot to a hexagonal binning plot, in which darker hexagons contain more observations.
sns.jointplot(x = df['ph'], y = df['Hardness'], kind = 'scatter')
sns.jointplot(x = df['ph'], y = df['Hardness'], kind = 'hex')
3. Pair Plot
It takes all the numerical attributes of the data and draws a scatter plot for every pair of different variables, with a histogram of each variable on the diagonal.
sns.pairplot(df)
4. Count Plot
It counts the number of occurrences of each category of a categorical variable and draws them as bars.
sns.countplot(x = df['Potability'])
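The bar heights in a count plot are the same numbers that pandas reports with value_counts(). A minimal sketch, assuming a short made-up 'Potability' column (the values below are hypothetical, not taken from the data set):

```python
import pandas as pd

# Hypothetical stand-in for the 'Potability' column (0 = not potable, 1 = potable).
potability = pd.Series([0, 1, 0, 0, 1, 0, 1, 0], name="Potability")

# Count how often each category occurs; sort_index() orders the result by category.
counts = potability.value_counts().sort_index()
print(counts)
```

Printing the counts next to the plot is a quick way to read off exact numbers that are hard to judge from bar heights alone.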
5. Box Plot
It is a five-number summary plot. It shows the minimum, first quartile, median, third quartile, and maximum of a continuous variable. By default the whiskers stop at the most extreme points within 1.5 times the interquartile range of the box, and anything beyond them is drawn as an outlier.
We can plot this for a single continuous variable, or compare a continuous variable across the categories of a categorical variable.
sns.boxplot(y = df['ph'], x = df['Potability'])
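The numbers behind a box plot can also be computed directly. The sketch below works out the quartiles and the usual 1.5 x IQR outlier fences with pandas. The pH readings are hypothetical values chosen for illustration, with 14.0 included as a deliberate outlier.

```python
import pandas as pd

# Hypothetical pH readings; 14.0 is included as a deliberate outlier.
ph = pd.Series([6.1, 6.8, 7.0, 7.2, 7.4, 7.9, 8.3, 14.0], name="ph")

# First quartile, median, and third quartile.
q1, median, q3 = ph.quantile([0.25, 0.5, 0.75])

# Interquartile range and the 1.5 * IQR fences used to flag outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points outside the fences are the ones a box plot draws individually.
outliers = ph[(ph < lower_fence) | (ph > upper_fence)]

print("min:", ph.min(), "Q1:", q1, "median:", median, "Q3:", q3, "max:", ph.max())
print("outliers:", outliers.tolist())
```

Here only the 14.0 reading falls outside the fences, so it is the only point that would be flagged as an outlier.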
6. Violin Plot
It is similar to the box plot, but it also shows the shape of the distribution by drawing a kernel density estimate on each side.
sns.violinplot(y = df['ph'], x = df['Potability'])
7. Strip Plot
It plots a continuous variable against a categorical variable. It is drawn as a scatter plot in which one axis holds the categories, and a little random jitter is added along that axis so that the points overlap less.
sns.stripplot(y = df['ph'], x = df['Potability'])
8. Swarm Plot
It is similar to a strip plot, but the points are shifted along the categorical axis so that they do not overlap. Along with the individual data points, it therefore also shows the shape of their distribution, much as a violin plot does.
sns.swarmplot(y = df['ph'], x = df['Potability'])
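The difference between a strip plot and a swarm plot is easiest to see side by side. The sketch below draws both on the same synthetic data. The DataFrame, the point size, and the figure size are assumptions for illustration, and the Agg backend is selected only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data; the values are random, not the Kaggle data.
rng = np.random.default_rng(0)
demo_df = pd.DataFrame({
    "ph": rng.normal(7.0, 1.5, size=120),
    "Potability": rng.integers(0, 2, size=120),
})

# Two panels sharing the y axis: jittered points on the left, non-overlapping on the right.
fig, (ax_strip, ax_swarm) = plt.subplots(1, 2, figsize=(9, 4), sharey=True)
sns.stripplot(data=demo_df, x="Potability", y="ph", ax=ax_strip)
sns.swarmplot(data=demo_df, x="Potability", y="ph", size=3, ax=ax_swarm)
ax_strip.set_title("stripplot")
ax_swarm.set_title("swarmplot")
```

In an interactive session, calling plt.show() (without the Agg line) displays the figure. Swarm plots get slow and crowded with many points, so a strip plot or violin plot is usually the better choice for large data sets.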
9. Regression Plot
This is a more advanced statistical plot that draws a scatter plot along with a linear regression fit of the data. With hue set, a separate line is fitted for each category.
sns.lmplot(x = 'ph', y = 'Hardness', data = df, hue = 'Potability')
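To see what kind of line such a plot draws, a straight-line least-squares fit can be computed by hand with NumPy. This is a minimal sketch on synthetic data; the slope of 3.0, the intercept of 10.0, and the noise level are made-up assumptions and say nothing about the real relationship between 'ph' and 'Hardness'.

```python
import numpy as np

# Synthetic x and y with a known linear relationship plus noise (hypothetical values).
rng = np.random.default_rng(0)
x = rng.uniform(5.0, 9.0, size=100)
y = 3.0 * x + 10.0 + rng.normal(scale=0.5, size=100)

# Fit a degree-1 polynomial, i.e. a straight line y = slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)
print("slope:", slope, "intercept:", intercept)
```

The fitted slope and intercept should land close to the values used to generate the data, which is the same idea as the line overlaid on the scatter plot.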
I hope this article serves you as a tool for interrogating your data.
Thanks for reading!