Matplotlib for Machine Learning

Paritosh Mahto
Published in MLpoint
20 min read · Aug 10, 2020

Matplotlib is one of the oldest and most popular plotting libraries in Python and is widely used in machine learning, where it helps make sense of large amounts of data through different visualisations.

Now, let us explore more about Matplotlib.

Contents
1. Introduction to Matplotlib
2. How to Install?
3. How to import?
4. Understanding the basics of Graph/Plots using Matplotlib
5. Important plots used in Machine Learning
6. Three-Dimensional Plotting with Matplotlib

1. Introduction to Matplotlib
Matplotlib is an open-source plotting library for Python, first released in 2003. It is a very comprehensive library, and its pyplot interface is designed so that most of the MATLAB plotting functions have a close equivalent in Python.

It provides several plot types, such as the line plot, bar plot, scatter plot and histogram, through which we can visualise various types of data.

2. How to Install?

#Windows, Linux and macOS users can install this library using the following command:
python -m pip install -U matplotlib
#To install Matplotlib in Jupyter Notebook run the following command:
pip install matplotlib
#To install Matplotlib in Anaconda Prompt use the following command:
conda install matplotlib

3. How to import?

#importing pyplot module from matplotlib
from matplotlib import pyplot as plt
#or
import matplotlib.pyplot as plt

4. Understanding the basics of Graph/Plots using Matplotlib

source:https://matplotlib.org/3.2.1/gallery/showcase/anatomy.html

a)Figure:
It is the bounded space within which one or more graphs can be plotted; in other words, it contains one or more axes (plots).

To create a new figure:

matplotlib.pyplot.figure( figsize=None, dpi=None, facecolor=None, edgecolor=None)

Creating a new figure:

import matplotlib.pyplot as plt
fig=plt.figure(figsize=[6.4, 4.8],facecolor='skyblue', edgecolor='black',dpi=120)
#If you run this code you will get this output
output:<Figure size 768x576 with 0 Axes>
#this means a new figure is created whose size is 768x576 pixels (figsize in inches multiplied by dpi)
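
As a quick check of how figsize and dpi combine, here is a minimal sketch (not from the original example). It assumes the figure keeps the dpi that was requested, which is the usual case outside HiDPI displays; with figsize=[6.4, 4.8] and dpi=120 the size should come out as 768 by 576 pixels.

```python
import matplotlib.pyplot as plt

# figsize is given in inches; dpi is dots (pixels) per inch
fig = plt.figure(figsize=[6.4, 4.8], dpi=120)

# size in pixels = size in inches * dpi
width_px, height_px = fig.get_size_inches() * fig.dpi
print(round(width_px), round(height_px))
plt.show()
```

Doubling dpi while keeping figsize fixed doubles both pixel dimensions, which is why a high dpi is commonly used when saving figures for print.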

b)Axes:
An Axes is a single plot inside a figure, and a figure can contain one or more Axes. Each Axes contains two Axis objects (or three in the case of three-dimensional plots) and has a title, an X-label, and a Y-label.

To add axes to the current figure:

matplotlib.pyplot.axes(arg=None,projection=None,xlabel=None,ylabel=None,title=None)

Example:

#adding axes to the above figure:
import matplotlib.pyplot as plt
fig=plt.figure(figsize=[6.4, 4.8],facecolor='skyblue', edgecolor='black',dpi=100)
#adding axes to fig
ax = fig.add_axes([0,0,1,1], projection='rectilinear',xlabel='X-label',ylabel='Y-label',title='Creating new Figure & Axes')

output:

Different parameters of figure and axes

c)Axis:
It is a fixed reference line for the measurement of coordinates.

To get or set the axis limits use matplotlib.pyplot.axis(*args, **kwargs)

parameter: xmin, xmax, ymin, ymax : float, optional

The axis limits to be set. Either none or all of the limits must be given.

ax.set(xlim=(xmin, xmax), ylim=(ymin, ymax))
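
To make the two ways of setting limits concrete, here is a small sketch with made-up data. It sets all four limits with plt.axis() and then reads them back from the Axes; the numbers are arbitrary and only serve as an illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [10, 20, 25, 30], color='black')

# set all four limits in one call: [xmin, xmax, ymin, ymax]
plt.axis([0, 5, 0, 40])

# the same limits can be read back from the Axes
print(ax.get_xlim(), ax.get_ylim())
plt.show()
```

Calling ax.set(xlim=(0, 5), ylim=(0, 40)) on the Axes object should have the same effect as the plt.axis() call here.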

d)Label:
It is used to assign names to x-axis and y-axis or title of the graph/plot. It is also used to represent edges and/or vertices of a graph with integers or names.

To assign labels use the following functions:
for x&y axis

matplotlib.pyplot.xlabel(xlabel, fontdict=None, labelpad=None)

matplotlib.pyplot.ylabel(ylabel, fontdict=None, labelpad=None)

e)X-tick &Y-tick:
Ticks are the markers which help to denote data points on the axes.

i)Marking data points with ticks using set_xticks()

#marking the data points at the given positions with ticks.
ax.set_xticks([2,4,6,8,10])
ax.set_yticks([1,2,3,4,5,6,7,8,9,10])

ii)Labelling the marked ticks using set_xticklabels() and set_yticklabels()

#labelling the marked ticks using set_xticklabels() and set_yticklabels() functions
ax.set_xticks([2,4,6,8,10])
ax.set_xticklabels(['Day2','Day4','Day6','Day8','Day10'])
ax.set_yticks([1,2,3,4,5,6,7,8,9,10])
ax.set_yticklabels(['Label1','Label2','Label3','Label4','Label5','Label6','Label7','Label8','Label9','Label10'])

f)Legend:
Legends are a useful way to label the data series plotted on a graph. They help readers to understand the plotted data.

To add legends use this function:

matplotlib.pyplot.legend(['label1','label2'],facecolor=None,loc='upper center',shadow=True,edgecolor=None,title=None)

To place the legend at a different location of the graph, set loc to one of the following strings:
best
upper right
upper left
lower left
lower right
right
center left
center right
lower center
upper center
center

Let us understand with a simple plot

import matplotlib.pyplot as  plt
import numpy as np
import math
fig=plt.figure(figsize=[6.4, 4.8],facecolor='skyblue', edgecolor='black',dpi=100)
ax = fig.add_axes([0,0,1,1], projection='rectilinear',xlabel='angle',ylabel='values',title='sine wave',facecolor='w')
x = np.arange(0, math.pi*2, 0.05)
y = np.sin(x)
y1=np.cos(x)
plt.plot(x,y,color='black')
plt.plot(x,y1,color='r')
plt.legend(['sine-wave','cosine-wave'],facecolor='w',loc='upper center',shadow=True,edgecolor='black',title='Example')

g)Grid lines:

Grid lines are lines that cross the chart plot to show axis divisions. These lines help viewers of the chart to see what value is represented by an unlabeled data point. Grid lines come in two types: major and minor.

To add grid lines in any plot use this function:

matplotlib.pyplot.grid(visible=None, which='major', axis='both', **kwargs)

Note: before Matplotlib 3.5 the first parameter was named b.

Example-adding grid lines in the above-plotted figure

#the first argument turns the grid on; passing it positionally works across Matplotlib versions
ax.grid(True,color='grey',linestyle='-', linewidth=1, which='both', axis='both')
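
Since grid lines come in two types, a short sketch of styling them separately may help. This is an illustrative example rather than part of the original figure: minor ticks are switched on first, because minor grid lines follow the minor tick positions. The line styles and widths are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), color='black')

# minor ticks are off by default; turn them on so minor grid lines have positions
ax.minorticks_on()
# solid lines for the major grid, thin dotted lines for the minor grid
ax.grid(True, which='major', color='grey', linestyle='-', linewidth=1)
ax.grid(True, which='minor', color='grey', linestyle=':', linewidth=0.5)
plt.show()
```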

h)Some important functions
i)matplotlib.pyplot.show() - This function is used to display the figure.
ii)matplotlib.pyplot.savefig() - This function is used to save any figure.
iii)matplotlib.pyplot.title() - This function is used to set the title for the axes in any figure.
iv)matplotlib.pyplot.annotate() - This function helps to add a comment, text, or arrows/marks to the figure.
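
The four functions above can be sketched together in one short example. The data, the annotation text and the file name 'sine.png' are placeholders chosen for illustration only.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)

fig, ax = plt.subplots()
plt.plot(x, y, color='black')
plt.title('sine wave')
# xy is the point being marked, xytext is where the text is placed
plt.annotate('maximum', xy=(np.pi / 2, 1.0), xytext=(3, 0.8),
             arrowprops=dict(arrowstyle='->'))
# save before calling show(); 'sine.png' is a placeholder file name
plt.savefig('sine.png', dpi=100, bbox_inches='tight')
plt.show()
```

Saving before show() is the safer order, because with interactive backends the figure may already be closed once show() returns.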

5. Important plots used in Machine Learning
There are many matplotlib plots which are used in Machine Learning for analysis and visualisation. The following plots are widely used in machine learning.

a)Line Plot:
A line plot is a type of chart or graph which displays information as a series of data points called ‘markers’ connected by straight line segments.

It can be plotted using matplotlib.pyplot.plot() function.

matplotlib.pyplot.plot(x,y, scalex=True, scaley=True, data=None, color=None)

i)Simple Line Plot

#importing library
import matplotlib.pyplot as plt
#data-points or markers
values = [1, 5, 8, 9, 7, 11, 8, 12, 14, 9]
plt.plot(values,color='red',alpha=0.6)
plt.savefig('plot14.png', dpi=300, bbox_inches='tight')
plt.show()

Example-1

#importing library
import matplotlib.pyplot as plt
#data-points or markers
sales1 = [1, 5, 8, 9, 7, 11, 8, 12, 14, 9, 5]
sales2 = [3, 7, 9, 6, 4, 5, 14, 7, 6, 16, 12]
#plotting line plots
line_chart1 = plt.plot(range(1,12), sales1,color='black')
line_chart2 = plt.plot(range(1,12), sales2,color='r',alpha=0.6)
#creating points (the colour comes from the color keyword, so the format string only sets the marker)
plt.plot(range(1,12), sales2,'o', alpha=0.5,color='black')
plt.plot(range(1,12), sales1,'o', alpha=0.5,color='orange')
#creating title,Y-label,X-label & legend
plt.title('Monthly sales of 2018 and 2019')
plt.ylabel('Sales')
plt.xlabel('Month')
plt.legend(['year 2018', 'year 2019'], loc=4)
#saving figure to my device in .png format
plt.savefig('plot15.png', dpi=300, bbox_inches='tight')
plt.show()

Note: A line plot is often used to visualize a trend in data over intervals of time.

Example -2: Simple Sine -Wave Plot

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = plt.axes()
ax.grid(color='black', linestyle='-', linewidth=2,alpha=0.1)
#creating numeric sequence with evenly spaced
x = np.linspace(0, 10, 1000)
plt.plot(x, np.sin(x - 0), color='black',alpha=0.8,label='sin x')
plt.plot(x, np.sin(x - 1), color='orange',label='sin(x-1)')
plt.plot(x, np.sin(x - 2),color='r' ,label='sin(x-2)')
plt.plot(x, np.sin(x - 3), color='pink',label='sin(x-3)')

plt.legend(loc='best')

plt.show()

b)Area plot:
It is based on the line plot and displays quantitative data graphically. The area between the axis and the line is commonly emphasized with colours, textures or hatchings. It helps to compare two or more quantities on the same graph.

An area plot can be plotted using the matplotlib.pyplot.fill_between() function, usually together with matplotlib.pyplot.plot() for the outline. Let us understand with a few examples.

Example-1

#importing libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
turnover = [12, 17, 14, 15, 20, 27, 30, 38, 35, 30,28,15]
plt.fill_between(np.arange(12), turnover,
    color="skyblue", alpha=0.5)
plt.plot(np.arange(12), turnover, color="black",
alpha=0.6, linewidth=2)
plt.title('Year 2016')
plt.tick_params(labelsize=12)
plt.xticks(np.arange(12), np.arange(1,13))
plt.xlabel('Month', size=12)
plt.ylabel('crude-oil(in metric tonnes)', size=12)
plt.ylim(bottom=0)
plt.show()

Example-2

import numpy as np
import matplotlib.pyplot as plt
year_n_1 = [15, 13, 10, 13, 22, 36, 30, 33, 24.5, 15, 16.5, 11.2]
year_n = [12, 17, 14, 16, 20, 27, 30, 38, 25, 18, 6, 11]
plt.fill_between(np.arange(12), year_n_1, color="lightpink",
alpha=0.5, label='year 2017')
plt.fill_between(np.arange(12), year_n, color="skyblue",
alpha=0.5, label='year 2018')
plt.title('Consumption of crude-oil(in metric tonnes) per year')
plt.tick_params(labelsize=12)
plt.xticks(np.arange(12), np.arange(1,13))
plt.xlabel('Month', size=12)
plt.ylabel('crude-oil(in metric tonnes)', size=12)
plt.ylim(bottom=0)
plt.legend()
plt.show()

c)Stacked Area Plot:
Stacked area plot is an extension of a basic area plot or chart. It displays the evolution of the value of several groups on the same graphic.

  • It represents cumulated total using numbers or percentages over time and visualizes part-to-whole relationships.
  • The values of each group are displayed on top of each other.
  • It also shows how each category contributes to the cumulative total.

It can be plotted using matplotlib.pyplot.stackplot() function.

matplotlib.pyplot.stackplot(x, *args, labels=(), colors=None, baseline='zero', data=None)

Let us understand with a few examples.


import numpy as np
import matplotlib.pyplot as plt

# FORMAT 1

# Your x and y axis
x=range(1,6)
y=[ [1,4,6,8,9], [2,2,7,10,12], [2,8,5,10,6] ]
# Basic stacked area chart.
plt.stackplot(x,y, labels=['A','B','C'],colors= ['orange','c','grey'])
plt.legend(loc='upper left')
plt.show()
#FORMAT 2
x=range(1,6)
y1=[1,4,6,8,9]
y2=[2,2,7,10,12]
y3=[2,8,5,10,6]

# Basic stacked area chart.
plt.stackplot(x,y1, y2, y3, labels=['A','B','C'])
plt.legend(loc='upper left')
plt.show()
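
Because a stacked area plot can also show percentages, here is a sketch of a percentage version of FORMAT 1. It is an illustrative variation, not part of the original examples: each column is divided by its total so that the three groups add up to 100 at every x.

```python
import numpy as np
import matplotlib.pyplot as plt

x = range(1, 6)
y = np.array([[1, 4, 6, 8, 9],
              [2, 2, 7, 10, 12],
              [2, 8, 5, 10, 6]], dtype=float)

# divide each column by its total so that the groups sum to 100 at every x
y_percent = y / y.sum(axis=0) * 100

plt.stackplot(x, y_percent, labels=['A', 'B', 'C'], colors=['orange', 'c', 'grey'])
plt.ylabel('share of total (%)')
plt.legend(loc='upper left')
plt.show()
```

In this form the top edge of the chart is flat at 100, so the eye is drawn to how each group's share changes rather than to the total.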

c)Scatter Plot:
It shows the relationship between two variables. It displays the value of 2 sets of data on two-dimensions. Each dot represents an observation.

It can be plotted using matplotlib.pyplot.scatter() function.

matplotlib.pyplot.scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, edgecolors=None, data=None)


Example-1

#importing libraries
import matplotlib.pyplot as plt
import numpy as np
# Fixing random state for reproducibility
np.random.seed(19680801)
N = 50
x = np.random.rand(N)
y = np.random.rand(N)
#creating scatter plot
plt.scatter(x, y,c='black', alpha=0.5)
plt.show()

d)Bubble Plot:
It is a type of scatter plot where a third dimension is added. The value of an additional variable is represented through the size of the dots.

Example-1

import matplotlib.pyplot as plt
import numpy as np
# Fixing random state for reproducibility
np.random.seed(19680801)
N = 50
x = np.random.rand(N)
y = np.random.rand(N)
area = (30 * np.random.rand(N))**2 # 0 to 15 point radii
plt.scatter(x, y, s=area, c='black', alpha=0.5)
plt.show()

In the above plot, you can observe the different sizes of the bubbles. It represents the third dimension.

Example-2

import matplotlib.pyplot as plt
import numpy as np
# Fixing random state for reproducibility
np.random.seed(19680801)
N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
area = (30 * np.random.rand(N))**2 # 0 to 15 point radii
plt.scatter(x, y, s=area, c=colors, alpha=0.5)
plt.show()

This plot is the same as above but bubbles are filled with different colours which represent the fourth dimension.

e) Bar Plot:
A bar plot is a plot that presents categorical data with rectangular bars with lengths proportional to the values that they represent. One axis of the plot shows the specific categories being compared, and the other axis represents a measured value.

i)Simple Bar Plot

import matplotlib.pyplot as plt
%matplotlib inline
#creating a new figure
fig = plt.figure()
#adding axes to the figure
ax = fig.add_axes([0,0,1,1])
langs = ['C', 'C++', 'Java', 'Python', 'PHP','R']
students = [5000,6300,10000,9000,3000,4000]
#Y-label
plt.ylabel('Students')
#X-label
plt.xlabel('Programming languages')
#title
plt.title('No. of Students learning Programming language')
#creating bar-plot
ax.bar(langs,students,width=0.5,color='black',alpha=0.7)
#saving the figure in .png format
plt.savefig('plot.png', dpi=300, bbox_inches='tight')
plt.show()

ii) Grouped Bar Plot:
It is a type of bar plot in which multiple sets of data items are compared, with a single colour used to denote a specific series across all sets.

#importing libraries
import numpy as np
import matplotlib.pyplot as plt
#data to be plotted
data = [[30, 25, 50, 20],
[40, 23, 51, 17],
[35, 22, 45, 19]]
label=['Match-1','Match-2','Match-3','Match-4']
X = np.arange(len(label))
y=[0.25,1.25,2.25,3.25]
#creating figure,axes,title&labels
fig = plt.figure()
ax = fig.add_axes([0,0,1,1])
plt.xlabel('Match')
plt.ylabel('Score')
plt.title('Comparing scores of three players in different cricket matches')
plt.xticks(y,label)
#creating bar plots
ax.bar(X + 0.00, data[0], color = 'black', width = 0.25,alpha=0.5,label='Ricky Ponting')
ax.bar(X + 0.25, data[1], color = 'orange', width = 0.25,alpha=0.6,label='Sachin Tendulkar')
ax.bar(X + 0.50, data[2], color = 'red', width = 0.25,alpha=0.5,label='Adam Gilchrist')
plt.legend()
plt.show()

iii)Stacked Bar Plot:
It is a type of bar plot. It is used to show how a larger category is divided into smaller categories and how each part relates to the total amount.

Example:

#importing libraries
import numpy as np
import matplotlib.pyplot as plt
male = (20, 35, 30, 35, 27)
female = (25, 32, 34, 20, 25)
ind = np.arange(5) # the x locations for the groups
plt.bar(ind, male, width= 0.35 ,color='black',alpha=0.8,label='Male')
plt.bar(ind, female, width= 0.35 ,bottom=male,color='orange',label='Female')
plt.ylabel('Age(in Years)')
plt.title('Age of employees by group and gender')
plt.xticks(ind, ('G1', 'G2', 'G3', 'G4', 'G5'))
plt.yticks(np.arange(0, 81, 10))
plt.legend()
plt.show()

iv)Bar Plot on Polar axis:
It is used to display data as bars drawn from the centre of an angle axis to a data value.

If more than one variable is used, the bars can be stacked on top of one another or they can be adjacent to each other. When adjacent bars are used the values are plotted side by side. When stacked bars are used, the total length of the stacked bars is equal to the sum of the data values.

import numpy as np
import matplotlib.pyplot as plt
# Fixing random state for reproducibility
np.random.seed(19680801)
# Compute pie slices
N = 20
theta = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
radii = 10 * np.random.rand(N)
width = np.pi / 4 * np.random.rand(N)
colors = plt.cm.viridis(radii / 10.)
ax = plt.subplot(111, projection='polar')
ax.bar(theta, radii, width=width, bottom=0.0, color=colors, alpha=0.5)
plt.show()

f)Histogram:
It is an approximate representation of the distribution of numerical data and gives an estimate of the probability distribution of a continuous variable. It looks like a bar graph, but each bar covers a range of values (a bin).

It can be plotted using matplotlib.pyplot.hist() function.

matplotlib.pyplot.hist(x, bins=None, range=None, density=None, weights=None, cumulative=False, bottom=None, histtype='bar', align='mid', orientation='vertical', rwidth=None, log=False, color=None, label=None, stacked=False, data=None)

Example-1


from matplotlib import pyplot as plt
import numpy as np
fig,ax = plt.subplots(1,1)
marks = np.array([22,87,5,43,56,73,55,54,11,20,51,5,79,31,27])
ax.hist(marks ,
bins =[0,20,40,60,80,100],color='green',alpha=0.4,edgecolor='black')
ax.set_title("Histogram of Result")
ax.set_xticks([0,20,40,60,80,100])
ax.set_xlabel('marks')
ax.set_ylabel('no. of students')

plt.show()

ii)Step type Histogram

from matplotlib import pyplot as plt
import numpy as np
fig,ax = plt.subplots(1,1)
marks = np.array([22,87,5,43,56,73,55,54,11,20,51,5,79,31,27])
ax.hist(marks , bins = [0,20,40,60,80,100],color='green',alpha=0.4,histtype='step')
ax.set_title("Histogram of Result")
ax.set_xticks([0,20,40,60,80,100])
ax.set_xlabel('marks')
ax.set_ylabel('no. of students')

plt.show()

This plot is a step-type histogram. It is the same as the plot above, but the bars are not filled with any colour.

iii)Multiple Histograms

from matplotlib import pyplot as plt
import numpy as np
fig,ax = plt.subplots(1,1)
value1=np.array([82,76,24,40,67,62,75,78,71,32,98,89,78,67,72,82,87,66,56,52])
value2=np.array([62,5,6,91,25,36,32,96,41,42,43,95,3,90,95,32,27,55,100,15,71,11,37,21])
ax.hist(value1 , bins = [0,10,20,30,40,50,60,70,80,90,100],color='black',alpha=0.6,histtype='bar',edgecolor='black')
ax.hist(value2 , bins = [0,10,20,30,40,50,60,70,80,90,100],color='skyblue',alpha=0.7,histtype='bar',edgecolor='black')
plt.title('Multiple Histograms')
plt.show()
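
Since a histogram can serve as an estimate of a probability distribution, a sketch with density=True may be useful. It reuses the marks data from Example-1 and checks that the total area of the bars is 1 after rescaling; the variable names heights and edges are just labels chosen for this illustration.

```python
from matplotlib import pyplot as plt
import numpy as np

marks = np.array([22, 87, 5, 43, 56, 73, 55, 54, 11, 20, 51, 5, 79, 31, 27])
bins = [0, 20, 40, 60, 80, 100]

fig, ax = plt.subplots(1, 1)
# density=True rescales the bar heights so that the total area of the bars is 1
heights, edges, patches = ax.hist(marks, bins=bins, density=True,
                                  color='green', alpha=0.4, edgecolor='black')
# area = sum of (bar height * bin width)
total_area = (heights * np.diff(edges)).sum()
print(total_area)
ax.set_xlabel('marks')
ax.set_ylabel('density')
plt.show()
```

With density=True the y-axis no longer shows counts, so the label is changed from 'no. of students' to 'density'.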

g)Density Plot:
Density Plot represents the distribution of any numerical data set. This plot is very useful in univariate analysis of data.

Example:

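from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
score= np.array([22,56,5,68,58,73,55,54,11,20,71,5,79,31,27])
data=pd.DataFrame(score)
#kind='density' draws a kernel density estimate; pandas needs SciPy installed for this
data.plot(kind='density', layout=(3,3),color='red',alpha=0.6)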
plt.legend(['Score'])
plt.show()

h)Pie Plot:
It is a circular statistical graphic, which is divided into slices to illustrate numerical proportion. In a pie chart, the arc length of each slice is proportional to the quantity it represents.

It can be plotted using matplotlib.pyplot.pie() function.

matplotlib.pyplot.pie(x, explode=None, labels=None, colors=None, autopct=None, pctdistance=0.6, shadow=False, labeldistance=1.1, radius=None, counterclock=True,center=(0, 0), frame=False, rotatelabels=False, data=None)

i)Basic Pie Plot

import matplotlib.pyplot as plt
# Pie chart, where the slices will be ordered and plotted counter-clockwise:
labels = 'English', 'Spanish', 'Russian', 'French','Chinese'
sizes = [55, 15, 8, 10,12]
explode = (0, 0, 0, 0.2,0) # only "explode" the 4th slice (i.e. 'French')
color=['grey', 'orange','red','c','pink'] # a list keeps the colours in order (a set does not)
fig1, ax1 = plt.subplots()
ax1.pie(sizes, explode=explode, colors=color,labels=labels, autopct='%1.1f%%',
shadow=False, startangle=90,radius=1)
ax1.axis('equal') # Equal aspect ratio ensures that pie is drawn as a circle.
plt.savefig('plot5.png', dpi=300, bbox_inches='tight')
plt.show()

ii)Nested Pie Plot:
It is also known as a multi-level pie chart or plot. It is used to display multiple series of a pie chart with the help of a single visualisation.

import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
size = 0.3
vals = np.array([[60., 32.], [37., 40.], [29., 10.]])
cmap = plt.get_cmap("tab20c")
outer_colors = cmap(np.arange(3)*4)
inner_colors = cmap(np.array([1, 2, 5, 6, 9, 10]))
ax.pie(vals.sum(axis=1), radius=1, colors=outer_colors,
wedgeprops=dict(width=size, edgecolor='w'))
ax.pie(vals.flatten(), radius=1-size, colors=inner_colors,
wedgeprops=dict(width=size, edgecolor='w'))
ax.set(aspect="equal", title='Nested Pie plot')
plt.show()

i)Box Plot:
Box Plot is one of the most important plots used in machine learning. It is used in Exploratory Data Analysis (EDA) to show the distribution and skewness of numerical data by displaying the minimum and maximum values, the quartiles (or percentiles), etc.

Understanding Box-Plot

i)Basic Box Plot:
It can be plotted using matplotlib.pyplot.boxplot() function.

matplotlib.pyplot.boxplot(x,patch_artist=False,notch=False)

import matplotlib.pyplot as plt
import numpy as np
values = np.random.normal(100, 10, 200)

box_plot_data=[values]
plt.boxplot(box_plot_data,patch_artist=False)
plt.show()

-Notched Box Plot

import matplotlib.pyplot as plt
import numpy as np
values = np.random.normal(100, 10, 200)

box_plot_data=[values]
plt.boxplot(box_plot_data,patch_artist=False,notch=True)
plt.show()

ii)Multiple Box Plots

a)Unfilled Box Plots

import matplotlib.pyplot as plt

value1=[82,76,24,40,67,62,75,78,71,32,98,89,78,67,72,82,87,66,56,52]
value2=[62,5,91,25,36,32,96,95,3,90,95,32,27,55,100,15,71,11,37,21]
value3=[23,89,12,78,72,89,25,69,68,86,19,49,15,16,16,75,65,31,25,52]
value4=[59,73,70,16,81,61,88,98,10,87,29,72,16,23,72,88,78,99,75,30]

box_plot_data=[value1,value2,value3,value4]
#patch_artist=False leaves the boxes unfilled
plt.boxplot(box_plot_data,patch_artist=False,labels=['course1','course2','course3','course4'])
plt.show()
Unfilled Box Plot

b)Box Plots filled with the same colour

Box Plot filled with the same colour
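
No code accompanies this case in the text, so the following is only a sketch of one way to do it, reusing the data from the unfilled example. It assumes that patch_artist=True is what turns the boxes into filled patches, and 'skyblue' is an arbitrary colour chosen for illustration.

```python
import matplotlib.pyplot as plt

value1 = [82,76,24,40,67,62,75,78,71,32,98,89,78,67,72,82,87,66,56,52]
value2 = [62,5,91,25,36,32,96,95,3,90,95,32,27,55,100,15,71,11,37,21]
value3 = [23,89,12,78,72,89,25,69,68,86,19,49,15,16,16,75,65,31,25,52]
value4 = [59,73,70,16,81,61,88,98,10,87,29,72,16,23,72,88,78,99,75,30]

box_plot_data = [value1, value2, value3, value4]
fig, ax = plt.subplots()
# patch_artist=True draws the boxes as filled patches
box = ax.boxplot(box_plot_data, patch_artist=True)
ax.set_xticklabels(['course1', 'course2', 'course3', 'course4'])
# give every box the same face colour ('skyblue' is an arbitrary choice)
for patch in box['boxes']:
    patch.set_facecolor('skyblue')
plt.show()
```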

c)Box Plots filled with different colours

import matplotlib.pyplot as plt

value1=[82,76,24,40,67,62,75,78,71,32,98,89,78,67,72,82,87,66,56,52]
value2= [62,5,91,25,36,32,96,95,3,90,95,32,27,55,100,15,71,11,37,21]
value3=[23,89,12,78,72,89,25,69,68,86,19,49,15,16,16,75,65,31,25,52]
value4=[59,73,70,16,81,61,88,98,10,87,29,72,16,23,72,88,78,99,75,30]

box_plot_data=[value1,value2,value3,value4]
box=plt.boxplot(box_plot_data,patch_artist=True,labels=['course1','course2','course3','course4'])
colors = ['orange', 'c', 'grey', 'pink']
for patch, color in zip(box['boxes'], colors):
    patch.set_facecolor(color)

plt.show()

k)Violin Plot:
Violin Plot is similar to the box plot but in addition, it shows the probability distribution of data at different values.

It can be plotted using matplotlib.pyplot.violinplot() function.

matplotlib.pyplot.violinplot(dataset, positions=None, vert=True, widths=0.5, showmeans=False, showextrema=True, showmedians=False, points=100, bw_method=None, *, data=None)

i)Basic Violin Plot

import matplotlib.pyplot as plt
import numpy as np
np.random.seed(10)
collectn_1 = np.random.normal(100, 10, 200)
# Create a figure instance
fig = plt.figure()
# Create an axes instance
ax = fig.add_axes([0,0,1,1])
# Create the violin plot
ax.violinplot(collectn_1,vert=True)

plt.show()

Violin Plot with Box Plot

import matplotlib.pyplot as plt
import numpy as np
np.random.seed(10)
collectn_1 = np.random.normal(100, 10, 200)
## combine these different collections into a list
data_to_plot = [collectn_1]
# Create a figure instance
fig = plt.figure()
# Create an axes instance
ax = fig.add_axes([0,0,1,1])
# Create the boxplot
ax.violinplot(collectn_1 ,vert=True,showmedians=False)
ax.boxplot(data_to_plot,patch_artist=False)
ax.annotate('Violin Plot',(1.135,110))
ax.annotate('Box Plot',(1.09,100))
plt.title('Violin Plot with Box Plot')

plt.show()

ii)Multiple Violin Plots

import matplotlib.pyplot as plt
import numpy as np
np.random.seed(10)
collectn_1 = np.random.normal(100, 10, 200)
collectn_2 = np.random.normal(80, 30, 200)
collectn_3 = np.random.normal(90, 20, 200)
collectn_4 = np.random.normal(70, 25, 200)
## combine these different collections into a list
data_to_plot = [collectn_1, collectn_2, collectn_3, collectn_4]
# Create a figure instance
fig = plt.figure()
# Create an axes instance
ax = fig.add_axes([0,0,1,1])
# Create the violinplot
bp = ax.violinplot(data_to_plot)
plt.show()

l)Marginal Plot:
In general, a marginal plot is a scatter plot (at the centre) that has histograms, box plots, dot plots or density plots in the margins of the x- and y-axes. It is used to display the relationship between two variables and examine their distributions.

i)Marginal histogram plot

import numpy as np
import matplotlib.pyplot as plt
# Fixing random state for reproducibility
np.random.seed(19680801)
# the random data
x = np.random.randn(1000)
y = np.random.randn(1000)
# definitions for the axes
left, width = 0.1, 0.65
bottom, height = 0.1, 0.65
spacing = 0.005
rect_scatter = [left, bottom, width, height]
rect_histx = [left, bottom + height + spacing, width, 0.2]
rect_histy = [left + width + spacing, bottom, 0.2, height]
# start with a rectangular Figure
plt.figure(figsize=(8, 8))
ax_scatter = plt.axes(rect_scatter)
ax_scatter.tick_params(direction='in', top=True, right=True)
ax_histx = plt.axes(rect_histx)
ax_histx.tick_params(direction='in', labelbottom=False)
ax_histy = plt.axes(rect_histy)
ax_histy.tick_params(direction='in', labelleft=False)
# the scatter plot:
ax_scatter.scatter(x, y,c='r',alpha=0.3)
# now determine nice limits by hand:
binwidth = 0.25
lim = np.ceil(np.abs([x, y]).max() / binwidth) * binwidth
ax_scatter.set_xlim((-lim, lim))
ax_scatter.set_ylim((-lim, lim))
bins = np.arange(-lim, lim + binwidth, binwidth)
ax_histx.hist(x, bins=bins,color='r',edgecolor='black',alpha=0.6)
ax_histy.hist(y, bins=bins, orientation='horizontal',color='r',edgecolor='black',alpha=0.6)
ax_histx.set_xlim(ax_scatter.get_xlim())
ax_histy.set_ylim(ax_scatter.get_ylim())
plt.show()

ii)Marginal Box Plot

import pandas as pd
from matplotlib import pyplot as plt
import numpy as np
x_data = np.random.randn(100)
y_data = -x_data + np.random.randn(100)*0.5
df = pd.DataFrame()
df['vcnt'] = x_data
df['ecnt'] = y_data
left = 0.1
bottom = 0.1
top = 0.8
right = 0.8
main_ax = plt.axes([left,bottom,right-left,top-bottom])
# create axes to the top and right of the main axes and hide them
top_ax = plt.axes([left,top,right - left,1-top])
plt.axis('off')
right_ax = plt.axes([right,bottom,1-right,top-bottom])
plt.axis('off')
main_ax.plot(df['vcnt'], df['ecnt'], 'ko', alpha=0.5)
# Save the default tick positions, so we can reset them..
tcksx = main_ax.get_xticks()
tcksy = main_ax.get_yticks()
right_ax.boxplot(df['ecnt'], positions=[0],widths=1.)
top_ax.boxplot(df['vcnt'], positions=[0], vert=False,widths=1.)
main_ax.set_yticks(tcksy) # pos = tcksy
main_ax.set_xticks(tcksx) # pos = tcksx
main_ax.set_yticklabels([int(j) for j in tcksy])
main_ax.set_xticklabels([int(j) for j in tcksx])
main_ax.set_ylim([min(tcksy-1),max(tcksy)])
main_ax.set_xlim([min(tcksx-1),max(tcksx)])
# set the limits to the box axes
top_ax.set_xlim(main_ax.get_xlim())
top_ax.set_ylim(-1,1)
right_ax.set_ylim(main_ax.get_ylim())
right_ax.set_xlim(-1,1)

plt.show()

m)Stem Plot:
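A stem plot draws a vertical line (the stem) from a baseline to each data value, with a marker at the tip. It is related in spirit to the Stem and Leaf Plot used in statistics, although the two are not the same thing. It is used to display quantitative data, generally from small data sets (50 or fewer observations), and lets the individual values be compared in much the same way as a frequency distribution table or a histogram.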

It can be plotted using matplotlib.pyplot.stem() function.

matplotlib.pyplot.stem(x,y, linefmt=None, markerfmt=None, basefmt=None, bottom=0, label=None, use_line_collection=False, data=None)

i)Simple Stem Plot

# import matplotlib.pyplot library 
import matplotlib.pyplot as plt
data = [16, 25, 47, 56, 23, 45, 19, 55, 44, 27,24,23,12,33,32,14,65,43,41,43,21,10,15,13,65]
# separating the stem parts
stems = [1, 1, 2, 2, 2,3,3,3,3,3,3, 4, 4, 4, 5, 5,6,6,6,7,7,8,8,9,9]
plt.ylabel('Data') # for label at y-axis
plt.xlabel('stems') # for label at x-axis
plt.xlim(0, 10) # limit of the values at x axis
# use_line_collection is omitted here: it is deprecated and recent Matplotlib releases no longer accept it
plt.stem(stems,data,markerfmt='C3o',linefmt='C7-',basefmt='C7-') # required plot

plt.show()

ii)Plotting different functions with Stem Plot

a) exp(sin(x))

import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0.1, 2 * np.pi, 41)
y = np.exp(np.sin(x))
# use_line_collection is omitted here: it is deprecated and recent Matplotlib releases no longer accept it
plt.stem(x,y,markerfmt='C7o',linefmt='C7', basefmt='C7--')
plt.show()

b)exp(sin(x)) and exp(cos(x))

import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0.1, 2 * np.pi, 45)
y = np.exp(np.sin(x))
z = np.exp(np.cos(x))
# use_line_collection is omitted here: it is deprecated and recent Matplotlib releases no longer accept it
plt.stem(x, y, markerfmt='C7o',linefmt='C7-',basefmt='C7--')
plt.stem(x, z, markerfmt='C6o',linefmt='C6-',basefmt='C6o')

plt.show()

n)Step Plot:
A Step chart is a Line chart that does not use the shortest distance to connect two data points. Instead, it uses vertical and horizontal lines to connect the data points in a series forming a step-like progression.

It can be plotted using matplotlib.pyplot.step() function.

matplotlib.pyplot.step(x, y, label=None,color=None,where='pre', data=None, **kwargs)

i)Simple Step Plot

import numpy as np
import matplotlib.pyplot as plt
x = np.array([2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,2018,2019,2020])
y = np.array([5267,6227,3288,7222,2342,5233,7678,8735,4920,2000,6839,4352,6034,5029,2339])
plt.step(x, y, label='Salary',color='r')
plt.plot(x, y, 'o', alpha=0.5,color='black')
plt.xlabel('Year')
plt.ylabel('Salary (in Rs)')
plt.title('Step Plot')
plt.legend(title='Parameter where:')

plt.show()

corresponding line Plot

import numpy as np
import matplotlib.pyplot as plt
x = np.array([2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,2018,2019,2020])
y = np.array([5267,6227,3288,7222,2342,5233,7678,8735,4920,2000,6839,4352,6034,5029,2339])
plt.plot(x, y, 'o', alpha=0.5,color='black')
plt.plot(x, y, alpha=0.5,color='c')
plt.xlabel('Year')
plt.ylabel('Salary (in Rs)')
plt.title('Line Plot')

plt.show()
Line Plot Vs Step Plot

ii)Multiple Step Plots

import numpy as np
import matplotlib.pyplot as plt
x = np.arange(14)
y = np.sin(x / 2)
plt.step(x, y + 2, label='pre (default)',color='r')
plt.plot(x, y + 2, 'o', alpha=0.5,color='black')
plt.step(x, y + 1, where='mid', label='mid',color='black')
plt.plot(x, y + 1, 'o', alpha=0.5,color='orange')
plt.step(x, y, where='post', label='post',color='orange')
plt.plot(x, y, 'o', alpha=0.5,color='black')
plt.legend(title='Parameter where:')

plt.show()

o)Contour Plot:
Contour plots (also known as Level Plots) are used to show a three-dimensional surface on a two-dimensional plane. It is useful in multivariate data analysis.

One variable is represented on the horizontal axis and a second variable is represented on the vertical axis. The third variable is represented by a colour gradient and isolines (lines of constant value).

These plots are often useful in data analysis, especially when you are searching for minimums and maximums in a set of trivariate data.

It can be plotted using matplotlib.pyplot.contour() or contourf() function.

for unfilled contour plots:
matplotlib.pyplot.contour(x,y,z)

for filled contour plots:
matplotlib.pyplot.contourf(x,y,z)

i)Unfilled Contour plot

import matplotlib.pyplot as plt
import numpy as np
w = 4
h = 3
d = 70
plt.figure(figsize=(w, h), dpi=d)
x = np.arange(-2, 2, 0.25)
y = np.arange(-2, 2, 0.25)
x, y = np.meshgrid(x, y)
z = np.sin(x * np.pi / 2) + np.cos(y * np.pi / 3)
plt.contour(x, y, z)
plt.show()

ii)Filled Contour Plot

import numpy as np
import matplotlib.pyplot as plt
xlist = np.linspace(-3.0, 3.0, 100)
ylist = np.linspace(-3.0, 3.0, 100)
X, Y = np.meshgrid(xlist, ylist)
Z = np.sqrt(X**2 + Y**2)
fig,ax=plt.subplots(1,1)
cp = ax.contourf(X, Y, Z)
fig.colorbar(cp) # Adding a colorbar to a plot
ax.set_title('Filled Contours Plot')
ax.set_xlabel('x (cm)')
ax.set_ylabel('y (cm)')
plt.show()

p)Heatmap:
It is a 2-D visualisation technique in which numerical data (the individual values contained in a matrix) are represented using different colours. It is a very useful technique for multivariate data analysis.

It can display variance across multiple variables, the correlation between them or can reveal any patterns. It can be plotted using matplotlib.pyplot.imshow().

Heatmap using numerical data

i)Simple Heatmap

import matplotlib.pyplot as plt
import numpy as np
data = np.array([[0.8, 2.4, 2.5, 3.9, 0.0, 4.0, 0.0],
[2.4, 0.0, 4.0, 1.0, 2.7, 0.0, 0.0],
[1.1, 2.4, 0.8, 4.3, 1.9, 4.4, 0.0],
[0.6, 0.0, 0.3, 0.0, 3.1, 0.0, 0.0],
[0.7, 1.7, 0.6, 2.6, 2.2, 6.2, 0.0],
[1.3, 1.2, 0.0, 0.0, 0.0, 3.2, 5.1],
[0.1, 2.0, 0.0, 1.4, 0.0, 1.9, 6.3]])
plt.imshow(data,cmap='Blues' ,interpolation='nearest')
plt.colorbar()
plt.show()
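
Since a heatmap can also display the correlation between variables, here is a sketch of a correlation heatmap. The four variables are synthetic and made up for illustration; np.corrcoef computes the correlation matrix and imshow displays it, with the colour scale fixed to the range -1 to 1.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(19680801)
# synthetic data: 200 observations of 4 variables (made up for illustration)
a = np.random.randn(200)
b = a + 0.5 * np.random.randn(200)    # positively related to a
c = -a + 0.5 * np.random.randn(200)   # negatively related to a
d = np.random.randn(200)              # generated independently of a
names = ['a', 'b', 'c', 'd']

# each row passed to corrcoef is one variable, so corr is a 4x4 matrix
corr = np.corrcoef([a, b, c, d])

fig, ax = plt.subplots()
im = ax.imshow(corr, cmap='coolwarm', vmin=-1, vmax=1)
ax.set_xticks(range(4))
ax.set_xticklabels(names)
ax.set_yticks(range(4))
ax.set_yticklabels(names)
fig.colorbar(im)
plt.show()
```

A diverging colour map such as 'coolwarm' is a reasonable choice here because correlations have a natural midpoint at zero.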

q)Time Series Plot:
A time series plot is a type of line plot where some unit of time is on the x-axis (a.k.a. the time axis) and the variable being measured is on the y-axis.
A time series plot is used to find patterns in the data and to use the data for predictions.

i)Time Series Plot of V-Mart Retail Ltd Stock Price(NSE)

import pandas as pd
import matplotlib.pyplot as plt
#on Matplotlib 3.6 and later this style is named 'seaborn-v0_8-whitegrid'
plt.style.use('seaborn-whitegrid')
data= pd.read_csv(r"C:\Users\parit\Downloads\VMART.NS.csv")
data.iloc[0:,0]=pd.to_datetime(data.iloc[0:,0])
fig = plt.figure(figsize=(10,8))
ax = plt.axes()
plt.plot(data.iloc[0:,0],data.iloc[0:,1],color='r',alpha=0.4,label='NSE: VMART')
plt.xlabel('Days')
plt.ylabel('NSE Real Time Price(Currency in INR)')
plt.legend()
plt.title("V-Mart Retail Ltd" )

plt.show()

ii)Comparison between Stock Price of FRETAIL & SPENCERS

import pandas as pd
import matplotlib.pyplot as plt
#on Matplotlib 3.6 and later this style is named 'seaborn-v0_8-whitegrid'
plt.style.use('seaborn-whitegrid')
data = pd.read_csv(r"C:\Users\parit\Downloads\FRETAIL.NS.csv")
data1= pd.read_csv(r"C:\Users\parit\Downloads\SPENCERS.NS.csv")
data.iloc[0:,0]=pd.to_datetime(data.iloc[0:,0])
data1.iloc[0:,0]=pd.to_datetime(data1.iloc[0:,0])
fig = plt.figure(figsize=(10,8))
ax = plt.axes()
plt.plot(data.iloc[0:,0],data.iloc[0:,1],color='darkorange',alpha=1,label='NSE: FRETAIL')
plt.plot(data1.iloc[0:,0],data1.iloc[0:,1],color='black',alpha=0.6,label='NSE: SPENCERS')
plt.xlabel('Days')
plt.ylabel('NSE Real Time Price(Currency in INR)')
plt.legend()
plt.title("Future Retail Ltd Vs Spencer's Retail Ltd" )

plt.show()
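
The stock-price examples read CSV files from a local machine, so they cannot be run as they are. The sketch below is a self-contained stand-in that uses synthetic data: the dates are generated with NumPy and the "price" is a random walk, not real market data.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(19680801)
# 120 consecutive days starting on 2020-01-01
dates = np.arange(np.datetime64('2020-01-01'), np.datetime64('2020-04-30'))
# synthetic prices: a random walk starting near 100 (not real market data)
price = 100 + np.cumsum(np.random.randn(len(dates)))

fig = plt.figure(figsize=(10, 8))
ax = plt.axes()
plt.plot(dates, price, color='r', alpha=0.4, label='synthetic price')
fig.autofmt_xdate()  # rotate the date labels so that they do not overlap
plt.xlabel('Days')
plt.ylabel('Price')
plt.legend()
plt.title('Time series plot of synthetic data')
plt.show()
```

Matplotlib recognises the datetime values on the x-axis and chooses date-based tick positions automatically.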

6. Three-Dimensional Plotting with Matplotlib
Matplotlib is extensively used for 2-D plotting but 3-D plots can also be plotted using this library. Three-dimensional plots can be plotted by importing the mplot3d toolkit, included with the Matplotlib package.

i)Simple 3-D line Plot

from mpl_toolkits import mplot3d
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = plt.axes(projection='3d')
z = np.linspace(0, 1, 100)

y=z * np.sin(20 * z)
x= z * np.cos(20 * z)

ax.plot3D(x, y, z, 'darkorange')
ax.set_title('3D line plot')

plt.show()
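
The same 3-D axes can hold other plot types as well. As an illustrative sketch (not part of the original example), the following draws a 3-D scatter plot of points spread around the same spiral; the noise level and the colour map are arbitrary choices.

```python
from mpl_toolkits import mplot3d  # needed on older Matplotlib versions to register the '3d' projection
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(19680801)
z = np.linspace(0, 1, 100)
# points scattered around the spiral used in the line plot
x = z * np.cos(20 * z) + 0.05 * np.random.randn(100)
y = z * np.sin(20 * z) + 0.05 * np.random.randn(100)

fig = plt.figure()
ax = plt.axes(projection='3d')
# colour each point by its height z
ax.scatter3D(x, y, z, c=z, cmap='viridis')
ax.set_title('3D scatter plot')
plt.show()
```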

Matplotlib is a very large library, and covering all of its functions in one post is not possible; even if I tried, it would make for a very lengthy blog. I have tried to cover the most used functions of matplotlib in a simple way so that readers can grasp its essence and use it more effectively to visualize their data.

Happy Coding 😊
