Working with Matplotlib
Data Visualization: The Heart of Project Presentation

Good plots and charts can speak for themselves, without a long explanation. They are produced by code that runs behind the scenes.
However good that code is, clients and other non-technical people usually cannot read it. What they see is the presentation, so the way to make an impression is to present your results well.
Python has many libraries for data visualization. The most widely used are matplotlib, seaborn and plotly. In this part, I will discuss the basics of matplotlib and its attributes.
A few points about Matplotlib
- It is the most popular plotting library in Python, and it was designed to feel similar to MATLAB's graphical plotting. MATLAB is itself a programming language and numerical computing environment.
- Matplotlib gives you control over every aspect of a figure, and it works well with numpy and pandas. The official documentation is available on the matplotlib website (matplotlib.org).
Installation
It can be installed through either Anaconda (conda) or the Python package manager (pip):
Through Anaconda: conda install matplotlib
Through pip: pip install matplotlib
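A quick way to confirm that the installation worked is to import the package and print its version. This is only a small check, and the version you see will depend on your environment.

```python
import matplotlib

# If this import succeeds, matplotlib is installed in the active environment.
print(matplotlib.__version__)
```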
What are matplotlib.pyplot and %matplotlib inline?

matplotlib.pyplot is a collection of command-style functions that make matplotlib work like MATLAB.
As shown in Fig-1, we import matplotlib.pyplot under the alias plt, so that we can write plt instead of matplotlib.pyplot everywhere we use it. matplotlib is a package, and pyplot is a module inside it.
In brief, packages are a way of structuring Python's module namespace by using “dotted module names”. A module is a file containing Python statements and definitions, and we can import the functions and definitions it contains. For example, a file named pyplot.py is a module whose module name is “pyplot”.
pyplot is matplotlib's plotting interface. Each pyplot function makes some change to a figure: it creates a figure, creates a plotting area in the figure, plots lines in the plotting area, decorates the plot with labels, and so on.
%matplotlib inline is a magic command that renders plots inside a Jupyter notebook, directly below the cell, without an explicit call to plt.show(). It is IPython syntax rather than Python, so it works only in IPython-based environments such as Jupyter Notebook or JupyterLab. When you run a plain Python script, for example from an IDE or the terminal, call plt.show() to display the plots.
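Put together, the import described above looks roughly like this. The magic command is kept as a comment because it is valid only inside a notebook cell, not in a regular Python file.

```python
# In a Jupyter notebook cell, the magic command would come first:
# %matplotlib inline

import matplotlib.pyplot as plt  # alias the pyplot module as plt

# pyplot is a module inside the matplotlib package
print(plt.__name__)

# In a plain Python script, call plt.show() after plotting to open the window.
```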
matplotlib plots
There are two ways of creating matplotlib plots:
(i) functional method: we call pyplot (plt) functions directly.
(ii) object-oriented method: we create a Figure object and use it, together with its axes, to modify the figure or to create subplots. This is the recommended method.
Dataframe (which will be used throughout this blog)

In Fig-2, we create a dummy dataframe, which will be used throughout this blog. I haven't used a built-in dataset because it is difficult to find a clean linear relationship in publicly available datasets. To begin with, it is better to have a linear relationship in the data so that the concepts are easy to follow.
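A dummy dataframe with a linear relationship could be built as in the sketch below. The column names x and y and the relationship y = 2x + 1 are assumptions made for illustration; they are not necessarily the values used in Fig-2.

```python
import numpy as np
import pandas as pd

# Hypothetical data: y depends linearly on x (y = 2x + 1)
x = np.arange(0, 10)
df = pd.DataFrame({"x": x, "y": 2 * x + 1})

print(df.head())
```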
Creating plots

The pyplot module has a plot function, which usually takes the x-axis values and the y-axis values. Each can be a list, a series or a column of a dataframe. In Fig-3, both x and y values are passed to plot, which plots y against x.

If we pass only one list or series, it is treated as the y values, and its index is used for the x-axis (for a plain list, the positions 0, 1, 2 and so on). This can be seen in Fig-4, where the index of the dataframe is plotted on the x-axis.
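The two calling styles can be compared with a small sketch. The dataframe here is a made-up stand-in with hypothetical columns x and y, and each call is drawn on its own figure.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data with a linear relationship (y = 2x + 1)
df = pd.DataFrame({"x": [10, 20, 30, 40], "y": [21, 41, 61, 81]})

# Both x and y given: y is plotted against x
plt.figure()
(line_xy,) = plt.plot(df["x"], df["y"])

# Only one series given: it becomes y, and its index becomes x
plt.figure()
(line_y,) = plt.plot(df["y"])

print(list(line_xy.get_xdata()))  # x values taken from the x column
print(list(line_y.get_xdata()))   # x values taken from the dataframe index

# plt.show()  # needed outside Jupyter to display the figures
```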
Adding labels to the plots


On the left of Fig-5, it is not clear what the x-axis and y-axis represent. The plot on the right, which has labels, is far more informative.
The same applies to Fig-6, which clearly shows how important labels are to a plot.
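With the functional method, labels and a title can be added with plt.xlabel(), plt.ylabel() and plt.title(). The data and the label texts below are placeholders chosen for this sketch.

```python
import matplotlib.pyplot as plt

# Placeholder data (y = 2x + 1)
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]

plt.figure()
plt.plot(x, y)
plt.xlabel("x values")    # label for the x-axis
plt.ylabel("y values")    # label for the y-axis
plt.title("y against x")  # title above the plot

# plt.show()  # needed outside Jupyter to display the figure
```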


Creating subplots

We use subplots to place several plots on the same canvas. Think of a canvas as a rectangular area intended for drawing pictures or other complex layouts.
Basic syntax: plt.subplot(number_of_rows, number_of_columns, plot_number)
The first two arguments define the grid, and plot_number (counted from 1, row by row) selects the cell to draw in. Each subplot is created individually, as shown in Fig-7.
plt.tight_layout(): automatically adjusts subplot parameters to give the specified padding, which helps avoid overlap between plots. It is recommended to call this function at the end of your plotting code.
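A minimal sketch of two side-by-side subplots with the functional method is shown here. The data, colors and titles are placeholders, not the ones from Fig-7.

```python
import matplotlib.pyplot as plt

# Placeholder data (y = 2x + 1)
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]

plt.figure()

plt.subplot(1, 2, 1)   # grid of 1 row and 2 columns, first cell
plt.plot(x, y, "r")    # red line
plt.title("y against x")

plt.subplot(1, 2, 2)   # same grid, second cell
plt.plot(y, x, "b")    # blue line
plt.title("x against y")

plt.tight_layout()     # adjust spacing so titles and ticks do not overlap

# plt.show()  # needed outside Jupyter to display the figure
```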
matplotlib with object-oriented method
First a Figure object is instantiated, and then all the methods and attributes are called on that object. This is the more flexible approach, and it gives you more control while using matplotlib.

In Fig-8, a Figure object has been created; it can be thought of as an imaginary blank canvas. Axes must be added to this canvas before anything can be plotted. The add_axes() method takes the dimensions [left, bottom, width, height] of the new axes. All quantities are fractions of the figure width and height, i.e. values between 0 and 1. For example, add_axes([0.1,0.1,0.8,0.8]) places the axes 10% in from the left and 10% up from the bottom, with a width of 80% of the figure width and a height of 80% of the figure height. The image shows an empty plot because no values were passed to the plot() method.
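The add_axes() call described above can be tried with a short sketch. The plotted data is a placeholder; leaving out the plot() call would give the empty axes described for Fig-8.

```python
import matplotlib.pyplot as plt

fig = plt.figure()  # the blank canvas

# [left, bottom, width, height] as fractions of the figure size
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8])

# Placeholder data (y = 2x + 1); without this call the axes stay empty
axes.plot([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])

# plt.show()  # needed outside Jupyter to display the figure
```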
Adding labels

In this method, labels are added through the axes object. As shown in the figure, we set a label for each axis, as well as a title, to produce a meaningful plot.
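In the object-oriented method, the labelling functions are methods of the axes object: set_xlabel(), set_ylabel() and set_title(). The label texts and data in this sketch are placeholders.

```python
import matplotlib.pyplot as plt

fig = plt.figure()
axes = fig.add_axes([0.1, 0.1, 0.8, 0.8])

# Placeholder data (y = 2x + 1)
axes.plot([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])

axes.set_xlabel("x values")    # label for the x-axis
axes.set_ylabel("y values")    # label for the y-axis
axes.set_title("y against x")  # title above the axes

# plt.show()  # needed outside Jupyter to display the figure
```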
Using subplots() instead of add_axes()

To create subplots using add_axes(), we need to define the axes for each subplot manually and adjust their positions, as shown in Fig-10. There is another way of creating subplots, which is given below.
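As an illustration of placing axes by hand, the sketch below puts a small set of axes inside a larger one. The positions and data are my own choices and may differ from the layout in Fig-10.

```python
import matplotlib.pyplot as plt

# Placeholder data (y = 2x + 1)
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]

fig = plt.figure()

# Each set of axes is positioned manually: [left, bottom, width, height]
main_axes = fig.add_axes([0.1, 0.1, 0.8, 0.8])
inset_axes = fig.add_axes([0.2, 0.5, 0.3, 0.3])

main_axes.plot(x, y)
main_axes.set_title("main plot")

inset_axes.plot(y, x)
inset_axes.set_title("inset plot")

# plt.show()  # needed outside Jupyter to display the figure
```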

plt.subplots() is the easiest way to create subplots, compared with defining and positioning axes manually. As shown in Fig-11, we only need to state how many plots to create, as a number of rows and a number of columns. When more than one plot is requested, axes is a NumPy array of Axes objects (something like a list), so it can be iterated over and indexed. figsize is a customization parameter that defines the size of the figure.
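A short sketch of plt.subplots() with one row and two columns follows. The data, titles and figure size are placeholders, so the result will not match Fig-11 exactly.

```python
import matplotlib.pyplot as plt

# Placeholder data (y = 2x + 1)
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]

# One row, two columns; figsize is (width, height) in inches
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))

# axes is an array of Axes objects, so it can be iterated over
for ax in axes:
    ax.plot(x, y)

# ... and indexed
axes[0].set_title("first plot")
axes[1].set_title("second plot")

fig.tight_layout()

# plt.show()  # needed outside Jupyter to display the figure
```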
Few customization parameters
matplotlib gives complete control over the figure. Each plot can be customized by setting figsize and dpi, saving the figure in a required format, creating legends, and defining line styles, a color for each line, and much more.
figsize
figsize is a parameter, not a function. It is usually provided when the figure object is created, as shown in Fig-11. It accepts a tuple of width and height in inches: figsize=(width, height)
savefig() and dpi

Plots can be saved to the working directory with the savefig() method. The file extension (.jpeg, .png, .jpg, etc.) given in the file name determines the format. dpi (dots per inch), the resolution at which the image is saved, can also be set in this function.
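To see how figsize and dpi work together, here is a small sketch that saves a figure and reads it back. The file name linear_plot.png is a hypothetical choice; the saved image should measure figsize times dpi in pixels.

```python
import matplotlib.pyplot as plt

# figsize is in inches: 6 wide, 4 high
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])  # placeholder data

# Hypothetical file name; the .png extension selects the format.
# At dpi=200 the image is 6 * 200 by 4 * 200 pixels.
fig.savefig("linear_plot.png", dpi=200)

# Read the file back to check its size (rows are height, columns are width)
image = plt.imread("linear_plot.png")
print(image.shape[:2])
```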
legend()

To add a legend to a plot, use the .legend() function. Each line needs a label, set through the label argument of plot(). If no labels are defined, legend() emits a warning. To keep the legend from overlapping the plot, use the loc argument, as shown in Fig-13. loc=0 (the same as loc='best') lets matplotlib choose the location for the legend that overlaps the plot the least. There are many other loc codes; they are listed in the documentation of legend().
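A minimal legend example could look like the sketch below. The two lines and their label texts are made up for illustration.

```python
import matplotlib.pyplot as plt

# Placeholder data
x = [0, 1, 2, 3, 4]

fig, ax = plt.subplots()

# The label argument is what the legend displays for each line
ax.plot(x, [2 * v + 1 for v in x], label="y = 2x + 1")
ax.plot(x, [v ** 2 for v in x], label="y = x squared")

# loc=0 is the same as loc="best": matplotlib picks the position
legend = ax.legend(loc=0)

# plt.show()  # needed outside Jupyter to display the figure
```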
color, linewidth, alpha and linestyle


Plots can be customized as shown in Fig-14. On the left, only the color parameter is used; it accepts several kinds of values, such as common color names (green, blue, red, etc.) or hex codes. On the right of Fig-14, the linewidth, alpha and linestyle parameters are added to the existing plot.
linewidth: defines the thickness of the line in points. The default is 1.5 in current matplotlib versions (it was 1.0 before version 2.0), so linewidth=10 is several times thicker than the default.
alpha: defines the transparency of the line, from 0 (fully transparent) to 1 (fully opaque). This is the reason for the reduced color intensity in the second plot.
linestyle: selects the type of line (dotted, dashed, or a custom dash pattern). On the right of Fig-14, the line is drawn in the form of steps using this parameter. Note that recent matplotlib versions no longer accept a steps value for linestyle; step lines are drawn with the drawstyle parameter or with plt.step().
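These line parameters can be combined as in the sketch below. The values are arbitrary choices for illustration, and the step line uses drawstyle, which is how recent matplotlib versions produce steps.

```python
import matplotlib.pyplot as plt

# Placeholder data (y = 2x + 1)
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]

fig, ax = plt.subplots()

# A dashed, semi-transparent orange line, 3 points thick
(dashed,) = ax.plot(x, y, color="#FF8C00", linewidth=3, alpha=0.5, linestyle="--")

# The same data drawn as steps, using drawstyle rather than linestyle
(stepped,) = ax.plot(x, y, color="green", drawstyle="steps-post")

# plt.show()  # needed outside Jupyter to display the figure
```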
marker, markersize, markerfacecolor, markeredgewidth, markeredgecolor

A marker highlights each data point, i.e. each (x, y) pair in the data. The marker shape can be chosen (circle, * or many others). The size of each marker, as well as its face color, edge width and edge color, can also be set, as shown in Fig-15.
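The marker parameters named in the heading can all be passed to plot(), as in this sketch. The colors and sizes are arbitrary and may differ from those in Fig-15.

```python
import matplotlib.pyplot as plt

# Placeholder data (y = 2x + 1)
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]

fig, ax = plt.subplots()

(line,) = ax.plot(
    x,
    y,
    color="blue",
    marker="o",                # circle at every data point
    markersize=10,             # marker size in points
    markerfacecolor="yellow",  # fill color of the marker
    markeredgewidth=2,         # thickness of the marker outline
    markeredgecolor="red",     # color of the marker outline
)

# plt.show()  # needed outside Jupyter to display the figure
```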
Conclusion
Most of the basics of the matplotlib library have been discussed in this blog. A good way to start learning data visualization in Python is with matplotlib, since several other plotting libraries, such as seaborn and the pandas plotting functions, are built on top of it. It can create almost any type of plot.
Thank you for reading so far. I am committed to improving my style and presentation methods, so if you have any suggestions or something to share, feel free to comment or contact me through LinkedIn.

