Data Visualization using Matplotlib and Seaborn in Python

Arpit Pathak
ML_with_Arpit_Pathak
6 min read · Jun 2, 2020

Hello readers, this blog explains the basics of data visualization, the common types of graphs, and how to implement them in practice with Python libraries such as Matplotlib and Seaborn.

Data Visualization

Data visualization refers to presenting data in the form of graphs in order to get better insights into the data and analyze it in a smarter way.

It is widely accepted that presenting something visually lets us take in much more of it than reading it or listening to it. Similarly, when we visualize data and study it, we gain much better insight into it. Visualization makes it easier to understand “what the data is saying” or “what the purpose of the data is”. To study data through visualization, we use graphs.

Graphs

A graph is a tool for representing data visually: it turns the data into information by presenting it as a series of co-ordinates on a set of axes.

Let us see some basic graphs that can be used to visualize the data —

1. Line Graph

A line graph, also known as a line chart, is a series of points connected to each other. This type of graph is used to visualize how a value changes over time or against any other variable.

For example —

Line graph (source: here)

The figure here shows the change in the stock price of a company over seven days. The line is drawn by connecting the co-ordinates (day, price) with each other to form a line graph.

2. Bar Graph

A bar graph presents categorical data as rectangular bars, where the height of each bar depends on the frequency (total count) of that category in the data.

For example —

Bar graph (source: here)

The figure here shows a bar plot of the kinds of pets owned by people at a certain place. We can see that cats are the most commonly owned pets and rabbits are the least commonly owned.
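To make the idea of "height equals count" concrete, here is a minimal sketch that tallies categories before any plotting happens. The survey answers are invented for illustration and are not the data behind the figure above.

```python
from collections import Counter

# Hypothetical survey answers, invented for illustration only.
answers = ["cat", "dog", "cat", "fish", "cat",
           "dog", "rabbit", "fish", "dog", "cat"]

# Count how often each category occurs; each count is one bar's height.
pet_counts = Counter(answers)

# Print a rough text version of the bar graph, tallest bar first.
for pet, count in pet_counts.most_common():
    print(f"{pet:<7}{'#' * count} ({count})")
```

A bar graph of this data would simply draw one bar per key of `pet_counts`, with the counted value as its height.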

3. Histogram

A histogram shows how often the data falls within given ranges of values. This type of graph takes continuous-valued data, groups it into a number of ranges (bins) and then shows the frequency of the data that falls in each bin.

For example —

Histogram (source: here)

The figure here is a histogram of continuous weight data divided into 9 bins (131–133, 133–135, 135–137, … and so on till 149). The number of weights that fall in a particular bin is represented by the height of that bin.
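The binning step can be sketched with NumPy's `np.histogram`, which returns the count per bin together with the bin edges. The weights below are hypothetical values chosen to fit the 131 to 149 range described above; they are not the data from the figure.

```python
import numpy as np

# Hypothetical weights, invented for illustration only.
weights = [131.5, 132.0, 134.2, 134.8, 135.1, 136.9, 137.0, 138.4, 139.9,
           140.3, 141.0, 141.7, 142.5, 143.3, 144.8, 146.1, 148.9]

# Edges 131, 133, ..., 149 give 9 bins that are each 2 units wide.
edges = np.arange(131, 150, 2)

# counts[i] is the number of weights with edges[i] <= w < edges[i + 1]
# (the last bin also includes its right edge).
counts, edges = np.histogram(weights, bins=edges)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left}-{right}: {count}")
```

The `counts` array is exactly what a histogram draws: one bar per bin, with the count as its height.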

4. Scatter Graph

A scatter plot presents the data points on a multidimensional axis. It is a simple graph that represents each individual data point as a co-ordinate on the graph.

Scatter graph (source: here)

The figure here shows a scatter graph representing the data points as co-ordinates in a 2-dimensional space (x-axis, y-axis).

5. Heatmap

A heatmap represents the frequency or density of the data visually by using colour coding in a 2-dimensional space. The density of the data is shown as the intensity of the colour in the corresponding area of the graph.

For example —

Heatmap (source: here)

The figure here shows a heatmap of certain data plotted against the months and the years. We can see that the intensity of colour is at its maximum in the month of July in 1960, and hence the frequency of data is highest in that period.

6. Boxplot

A boxplot summarizes the data so that its distribution over the whole range of values can be seen. It is a standard way to spot outliers (odd values) in the data. The box plot displays the distribution through five values: ‘minimum’, ‘first quartile’ (Q1), ‘median’, ‘third quartile’ (Q3) and ‘maximum’. The span between Q1 and Q3 is known as the inter-quartile range (IQR) and contains the middle 50% of the data.

Boxplot (source: here)
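The five values behind a boxplot can be computed directly with NumPy, as in the rough sketch below. The sample values are invented for illustration. The 1.5 × IQR rule used to flag outliers is a common convention (and Matplotlib's default whisker length), but it is an assumption here, since the text above does not name a specific rule.

```python
import numpy as np

# Hypothetical sample, invented for illustration; 30 is an obvious odd value.
values = np.array([2, 4, 4, 5, 6, 7, 8, 9, 30])

# The five-number summary that a boxplot draws.
minimum, q1, median, q3, maximum = np.percentile(values, [0, 25, 50, 75, 100])

# The inter-quartile range covers the middle 50% of the data.
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the box count as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print("min, Q1, median, Q3, max:", minimum, q1, median, q3, maximum)
print("IQR:", iqr)
print("outliers:", outliers)
```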

Let us now see how we can plot these graphs using Matplotlib and Seaborn in Python…

Matplotlib

Matplotlib is a Python library used to draw graphs on 2-dimensional axes (Note: we can also plot 3-D graphs using its mplot3d toolkit). It can be used to create both static and interactive graphs, and it is the most basic and simple library for visualizing data in Python. It works closely with NumPy, Python's numerical library.

We will now use Matplotlib to create graphs from our data in a Jupyter notebook from Anaconda, which comes with Matplotlib pre-installed.

First import the Matplotlib plotting module to plot the data:

import matplotlib.pyplot as plt

Let us now see some graphs created using Matplotlib —

  • Line Graph
  • Bar Graph
  • Histogram
  • Scatter Graph
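The four graphs listed above can be sketched in a single figure as follows. This is a minimal example rather than the original notebook code: all of the data (stock prices, pet counts, weights and the scatter points) is hypothetical and was invented to mirror the examples described earlier.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical sample data, invented for illustration only.
days = [1, 2, 3, 4, 5, 6, 7]
prices = [102, 105, 101, 108, 110, 107, 112]   # stock price on each day
pets = ["Cats", "Dogs", "Fish", "Rabbits"]
owners = [12, 9, 6, 2]                         # number of owners per pet
rng = np.random.default_rng(seed=0)
weights = rng.uniform(131, 149, size=200)      # continuous values for binning
x = rng.normal(size=50)
y = 2 * x + rng.normal(scale=0.5, size=50)     # roughly linear relationship

# One figure with a 2x2 grid of axes, one graph per cell.
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Line graph: points (day, price) connected in order.
axes[0, 0].plot(days, prices, marker="o")
axes[0, 0].set_title("Line Graph")
axes[0, 0].set_xlabel("Day")
axes[0, 0].set_ylabel("Price")

# Bar graph: one bar per category, height equal to its count.
axes[0, 1].bar(pets, owners)
axes[0, 1].set_title("Bar Graph")
axes[0, 1].set_ylabel("Owners")

# Histogram: continuous values grouped into 9 bins.
axes[1, 0].hist(weights, bins=9, edgecolor="black")
axes[1, 0].set_title("Histogram")
axes[1, 0].set_xlabel("Weight")
axes[1, 0].set_ylabel("Frequency")

# Scatter graph: every data point drawn as an (x, y) co-ordinate.
axes[1, 1].scatter(x, y)
axes[1, 1].set_title("Scatter Graph")
axes[1, 1].set_xlabel("x")
axes[1, 1].set_ylabel("y")

fig.tight_layout()
```

In a Jupyter notebook the figure is usually displayed at the end of the cell; in a plain Python script, call `plt.show()` after `fig.tight_layout()`.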

Seaborn

Seaborn is a graphics library built on top of Matplotlib. It offers additional tools and ways to visualize our data in order to get better insight into it, including colour mapping and faceting of the data.

First import the seaborn library (conventionally with `import seaborn as sns`), then let us see some graphs created using Seaborn:

  • Heatmap
  • BoxPlot
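A minimal sketch of both Seaborn graphs is shown below, using `sns.heatmap` and `sns.boxplot`. The monthly counts and the group scores are randomly generated, hypothetical data laid out to resemble the examples described earlier (months against years for the heatmap, and a group with one odd value for the boxplot); they are not the data from the original figures.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

rng = np.random.default_rng(seed=0)

# Hypothetical monthly counts for four years (rows: months, columns: years).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
years = [1957, 1958, 1959, 1960]
counts = pd.DataFrame(rng.integers(100, 600, size=(12, 4)),
                      index=months, columns=years)

# Hypothetical scores for two groups; group B gets one deliberate odd value.
scores = pd.DataFrame({
    "group": ["A"] * 30 + ["B"] * 30,
    "score": np.concatenate([rng.normal(60, 5, size=30),
                             rng.normal(70, 8, size=29),
                             [120.0]]),
})

fig, (ax_heat, ax_box) = plt.subplots(1, 2, figsize=(12, 5))

# Heatmap: colour intensity encodes the value in each month/year cell.
sns.heatmap(counts, cmap="YlGnBu", annot=True, fmt="d", ax=ax_heat)
ax_heat.set_title("Heatmap")

# Boxplot: one box per group, showing quartiles, whiskers and outliers.
sns.boxplot(data=scores, x="group", y="score", ax=ax_box)
ax_box.set_title("BoxPlot")

fig.tight_layout()
```

Because Seaborn draws onto Matplotlib axes, the same `plt.show()` call displays the figure when running outside a notebook.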

That’s all about graphs in this blog. I hope it was informative. Thank you for reading!

GitHub link for your reference: here
