Working with Matplotlib

Data Visualization: Heart of Project Presentation

Chamanth mvs
Nov 5 · 8 min read
Data Visualization using python

Beautiful plots and charts can speak for themselves, without a lengthy explanation. Behind each of them is code written in the backend.

No matter how good the backend code is, clients and other non-technical people will not read it. The way to reach them is to present your results visually and clearly.

Python has many libraries for data visualization. The most widely used are matplotlib, seaborn and plotly. In this part, I will discuss the basics of matplotlib and its attributes.

A few points about Matplotlib

  1. It is one of the most popular plotting libraries in Python, and it was designed to feel similar to MATLAB’s graphical plotting. MATLAB is a separate programming language and numerical computing environment.

Installation

It can be installed through either Anaconda or the Python package manager (pip).

Through Anaconda: conda install matplotlib

Through pip: pip install matplotlib

What are matplotlib.pyplot and %matplotlib inline?

Fig-1: matplotlib.pyplot and %matplotlib inline

matplotlib.pyplot is a collection of command-style functions that make matplotlib work like MATLAB.

As shown in Fig-1, we import matplotlib.pyplot and alias it as plt, so that we do not have to write matplotlib.pyplot everywhere we use it. matplotlib is a package, and pyplot is a module within it.

In brief, packages are a way of structuring Python’s module namespace by using “dotted module names”. A module is a file containing Python statements and definitions, and we can import the functions and definitions it contains. For example, a file named pyplot.py is a module whose module name is “pyplot”.

pyplot is matplotlib’s plotting interface, and each pyplot function makes some change to a figure: it creates a figure, creates a plotting area in the figure, plots lines in the plotting area, decorates the plot with labels, and so on.

%matplotlib inline is an IPython magic command that renders plots directly below the code cell in a Jupyter notebook, without an explicit call to plt.show(). It works only in IPython-based environments such as Jupyter Notebook or JupyterLab. If you are running a plain Python script, from Spyder or any other IDE, call plt.show() to display the plots.
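Since the screenshot is not reproduced here, the import step can be sketched as follows. This is a minimal sketch, not the original code: the sample values are made up, and the magic command is shown as a comment because it is valid only inside IPython or Jupyter.

```python
import matplotlib
import matplotlib.pyplot as plt  # pyplot is a module inside the matplotlib package

# In a Jupyter notebook, the following magic (not plain Python) renders plots inline:
# %matplotlib inline

print(matplotlib.__version__)  # confirms that the installation worked

plt.plot([1, 2, 3], [2, 4, 6])  # a pyplot function draws on the current figure
# In a plain script, finish with plt.show() to open the plot window.
```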

matplotlib plots

There are two ways of creating matplotlib plots:

(i) functional method: we call pyplot (plt) functions directly

(ii) object-oriented method: we create an object for each matplotlib figure and use that object to modify the figure or to create subplots. This is the recommended method.
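The difference between the two methods is easiest to see side by side. The sketch below draws the same made-up data twice, once with each method; the variable names and titles are illustrative only.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 30, 40]

# (i) Functional method: pyplot functions act on the "current" figure.
plt.figure()
plt.plot(x, y)
plt.title("Functional method")

# (ii) Object-oriented method: keep explicit references to the figure and axes.
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_title("Object-oriented method")
```

With the object-oriented method, it is always explicit which figure and which axes a call modifies, which matters once several plots are open at the same time.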

Dataframe (which will be used throughout this blog)

Fig-2: creating a dummy dataframe

In Fig-2, we create a dummy dataframe, which will be used throughout this blog. I haven’t used a built-in dataset because it is hard to find a clean linear relationship in publicly available datasets. When starting out, data with a linear relationship makes the concepts easier to follow.
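The exact dataframe from the screenshot is not recoverable, so the following is a hypothetical stand-in. The column names x, y and z and the formulas are assumptions, chosen only so that one column depends linearly on another.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the article's dataframe (column names are assumed).
x = np.arange(1, 11)
df = pd.DataFrame({
    "x": x,
    "y": 2 * x + 1,   # exact linear relationship with x
    "z": x ** 2,      # a non-linear column, for contrast
})

print(df.head())
```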


Creating plots

Fig-3: using the plot function (which takes two arguments, x and y)

The pyplot module has a plot function, which usually takes the x-axis values followed by the y-axis values. Each input can be a list, a series or a dataframe column. In Fig-3, both are passed to plot, which plots y against x.

Fig-4: providing only one column to plot

If we provide only one list or series, plot treats it as the y-axis values and uses the index as the x-axis. This can be seen in Fig-4, in which the index of the dataframe is plotted on the x-axis.
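Both call forms can be sketched with plain lists standing in for the dataframe columns (the values are made up). With a list, the "index" is simply the position 0, 1, 2, and so on; with a pandas Series, the Series index would be used instead.

```python
import matplotlib.pyplot as plt

x = [10, 20, 30, 40]
y = [1, 4, 9, 16]

plt.figure()
(line_xy,) = plt.plot(x, y)   # two arguments: y is plotted against x

plt.figure()
(line_y,) = plt.plot(y)       # one argument: x defaults to the positions 0, 1, 2, ...

# The x values that matplotlib filled in for the second plot:
x_used = list(line_y.get_xdata())
```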

Adding labels to the plots

Fig-5: plot without labels vs. plot with labels

On the left of Fig-5, it is not clear what the x-axis and y-axis represent. The plot on the right is far more informative and meaningful.

The same applies to Fig-6, which shows how important labels are to a plot.

Fig-6: plot without labels vs. plot with labels
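With the functional method, labels and a title are added through pyplot functions. A minimal sketch, with made-up data and label text:

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

plt.figure()
plt.plot(x, y)
plt.xlabel("x values")     # label for the horizontal axis
plt.ylabel("y values")     # label for the vertical axis
plt.title("y against x")   # title shown above the plot
```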

Creating subplots

Fig-7: creating subplots

We use subplots to place several plots on the same canvas. Think of the canvas as a rectangular area intended for drawing pictures or other complex layouts.

Basic syntax: plt.subplot(number_of_rows, number_of_columns, plot_number)

The first two arguments define the grid, and plot_number (counted from 1) selects the cell to draw in. Each subplot is created individually, as shown in Fig-7.

plt.tight_layout(): automatically adjusts the subplot parameters to add padding, which helps to avoid overlap between plots. It is recommended to call this function at the end of your plotting code.
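As an illustration, a 1 x 2 grid of subplots could be built as follows. The data, colors and titles are made up for the sketch.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]

plt.figure()

plt.subplot(1, 2, 1)      # 1 row, 2 columns, first plot
plt.plot(x, y, "r")       # "r" is shorthand for a red line
plt.title("y against x")

plt.subplot(1, 2, 2)      # 1 row, 2 columns, second plot
plt.plot(y, x, "b")       # "b" is shorthand for a blue line
plt.title("x against y")

plt.tight_layout()        # called last, once all subplots exist
```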


matplotlib with object-oriented method

First a figure object is instantiated, and then all methods and attributes are called on that object. This approach gives you more control when using matplotlib.

Fig-8: matplotlib figure object and adding axes

In Fig-8, a “Figure” object has been created; this object can be thought of as an imaginary blank canvas. Axes must be added to this canvas before anything can be plotted. The add_axes() method takes the dimensions [left, bottom, width, height] of the new axes. All quantities are fractions of the figure width and height, i.e., values between 0 and 1. For example, add_axes([0.1,0.1,0.8,0.8]) places the axes 10% in from the left and 10% up from the bottom, with a width of 80% of the figure width and a height of 80% of the figure height. The plot in the image is empty because no values are passed to the plot() method.
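A minimal sketch of these two steps is given below. Unlike the empty plot described above, it passes a few made-up values to plot() so that a line appears.

```python
import matplotlib.pyplot as plt

fig = plt.figure()                        # the blank canvas
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])   # [left, bottom, width, height] as fractions

ax.plot([1, 2, 3], [1, 4, 9])             # draw on the axes, not through plt

# The position of the axes inside the figure, again as fractions:
left, bottom, width, height = ax.get_position().bounds
```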

Adding labels

Fig-9: adding labels to the plot

In this method, labels are added through the axes object. As shown in the figure, we set a label for each axis and a title to produce a meaningful plot.
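In the object-oriented method, the label functions become set_ methods on the axes object. A short sketch with made-up data and label text:

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

fig = plt.figure()
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
ax.plot(x, y)

ax.set_xlabel("x values")    # counterpart of plt.xlabel
ax.set_ylabel("y values")    # counterpart of plt.ylabel
ax.set_title("y against x")  # counterpart of plt.title
```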

using subplots() instead of add_axes()

Fig-10: creating subplots using add_axes()

To create subplots using add_axes(), we need to define the axes for each subplot manually and adjust their positions, as shown in Fig-10. There is another way of creating subplots, described next.
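One way to place two axes manually is sketched below. The layout of the original figure is not known, so the positions here (a large main plot with a smaller inset) are an assumption chosen only to show that every set of axes needs its own rectangle.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]

fig = plt.figure()

# Each rectangle is [left, bottom, width, height] in figure fractions.
ax_main = fig.add_axes([0.1, 0.1, 0.8, 0.8])    # large main plot
ax_inset = fig.add_axes([0.2, 0.5, 0.3, 0.3])   # smaller plot drawn on top of it

ax_main.plot(x, y)
ax_main.set_title("main plot")

ax_inset.plot(y, x)
ax_inset.set_title("inset plot")
```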

Fig-11: creating subplots directly using plt.subplots() without defining axes

plt.subplots() is an easier way to create subplots than manually defining and positioning axes. As shown in Fig-11, we just specify how many plots to create, as a number of rows and a number of columns. When more than one plot is requested, the returned axes is an array of axes objects (something like a list), which can be iterated over and indexed. figsize is a customization parameter that defines the size of the figure.
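A minimal sketch of plt.subplots() with one row and two columns follows. The data, titles and figure size are made up; the loop and the indexing show the two ways of reaching the individual axes.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]

# One call creates the figure and a 1 x 2 array of axes.
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(8, 3))

for ax in axes:                 # the array can be iterated over...
    ax.plot(x, y)

axes[0].set_title("first plot")    # ...and indexed
axes[1].set_title("second plot")

fig.tight_layout()              # avoid overlap between the two plots
```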

Few customization parameters

matplotlib gives extensive control over the figure. Each plot can be customized by setting figsize and dpi, saving the figure in a required format, creating legends, and defining line styles and colors for each line, among other options.

figsize

figsize is a parameter usually provided when the figure object is created, as shown in Fig-11. It accepts a tuple of width and height in inches: figsize=(width, height)
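As a small illustration, the sketch below creates a wide, short figure and reads its size back. The particular size of 8 by 2 inches is an arbitrary choice.

```python
import matplotlib.pyplot as plt

# figsize is passed as a keyword argument: (width, height) in inches.
fig = plt.figure(figsize=(8, 2))
ax = fig.add_axes([0.1, 0.2, 0.8, 0.7])
ax.plot([1, 2, 3], [1, 4, 9])

width_in, height_in = fig.get_size_inches()  # reads the size back, in inches
```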

savefig() and dpi

Fig-12: savefig and dpi

Plots can be saved to the working directory using the savefig() method. The file extension (.jpeg, .png, .jpg, etc.) should be given in the file name when saving the plot. dpi (dots per inch), the resolution at which the image is saved, can also be set in this function.
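Saving could look like the sketch below. The file name my_plot.png and the dpi value are made up, and the sketch writes to a temporary folder rather than the working directory so that it leaves nothing behind; passing a bare file name would save next to the script instead.

```python
import os
import tempfile

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [1, 4, 9])

out_dir = tempfile.mkdtemp()                      # temporary folder, used only for this demo
out_path = os.path.join(out_dir, "my_plot.png")   # the extension selects the file format

fig.savefig(out_path, dpi=200)                    # higher dpi gives a larger, sharper image
```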

legend()

Fig-13: adding legend to the plot

To add a legend to the plot, use the .legend() function. A label should be defined for each line within the plot() function; if none is defined, legend() emits a warning. To keep the legend from overlapping the plotted lines, use the loc argument as shown in Fig-13. loc=0 (equivalent to loc='best') lets matplotlib choose the location that overlaps the data least. There are many other loc codes, which are listed in the matplotlib legend documentation.
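A short sketch of a labelled plot with a legend, using made-up data and label text:

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]

fig, ax = plt.subplots()
ax.plot(x, [v ** 2 for v in x], label="x squared")  # label is what the legend shows
ax.plot(x, [v ** 3 for v in x], label="x cubed")

legend = ax.legend(loc=0)   # loc=0 is the numeric code for loc="best"
```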

color, linewidth, alpha and linestyle

Fig-14: without lw, alpha, ls vs. with lw, alpha, ls

Plots can be customized as shown in Fig-14. On the left, only the color parameter is used; it accepts several kinds of values, such as common color names (green, blue, red, etc.) or hex codes. On the right of Fig-14, the linewidth, alpha and linestyle parameters are added to the same plot.

linewidth: defines the thickness of the line in points. linewidth=10 draws a line 10 points thick; the default is 1.5 in Matplotlib 2.0 and later.

alpha: defines the transparency of the line, from 0 (fully transparent) to 1 (fully opaque). This is why the color is less intense in the second plot.

linestyle: selects the type of line (solid, dashed, dotted or a custom dash pattern). On the right of Fig-14, the line is drawn in the form of steps using this parameter; note that recent Matplotlib versions select step-style lines through the drawstyle parameter instead.
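These parameters can be combined in a single plot() call. The sketch below uses made-up data and arbitrary values; it also shows the short aliases lw and ls, and a hex code in place of a color name.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
y2 = [v ** 3 for v in x]

fig, ax = plt.subplots()

# Full parameter names: a thick, half-transparent, dashed green line.
(line,) = ax.plot(x, y, color="green", linewidth=3, alpha=0.5, linestyle="--")

# Short aliases: lw for linewidth, ls for linestyle; the color is a hex code.
(hex_line,) = ax.plot(x, y2, color="#FF8C00", lw=2, ls=":")
```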

marker, markersize, markerfacecolor, markeredgewidth, markeredgecolor

Fig-15: marker and related attributes

A marker highlights each exact data point, i.e. each (x, y) pair, on the line. The marker shape can be chosen (circle, * or other types), and each marker’s size, face color, edge width and edge color can also be set, as shown in Fig-15.
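All five marker parameters can be passed to plot() together. A minimal sketch, with made-up data and arbitrary colors and sizes:

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]

fig, ax = plt.subplots()
(line,) = ax.plot(
    x, y,
    color="purple",
    marker="o",                # circle at every data point
    markersize=10,             # marker size in points
    markerfacecolor="yellow",  # fill color of the marker
    markeredgewidth=2,         # thickness of the marker outline
    markeredgecolor="green",   # color of the marker outline
)
```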


Conclusion

This blog has covered most of the basics of the matplotlib library. Matplotlib is a good starting point for learning data visualization in Python, since other plotting libraries such as seaborn are built on top of it. It can create almost any type of plot.

Thank you for reading so far. I am committed to improving my style and presentation methods, so if you have any suggestions or something to share, feel free to comment or contact me through LinkedIn here.

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com

