Data Visualization using Python Part-I

Tanvi Penumudy
Published in Analytics Vidhya · 5 min read · Jan 1, 2021

Data visualization is the discipline of trying to understand data by placing it in a visual context so that patterns, trends and correlations that might not otherwise be detected can be exposed.

Python offers several great graphing libraries, such as Matplotlib and Seaborn, that come packed with lots of different features. Let us spend the next few minutes exploring Matplotlib!

Data is only as good as it’s presented — Source

Matplotlib

Matplotlib is a widely used Python visualization library for 2D plots of arrays. It is a multi-platform library built on NumPy arrays and designed to work with the broader SciPy stack. It was originally created by John Hunter.

Matplotlib Installation

Matplotlib can be installed on your local machine with pip, from a command prompt:

python -m pip install -U pip
python -m pip install -U matplotlib

Importing Matplotlib

from matplotlib import pyplot as plt
# or, equivalently
import matplotlib.pyplot as plt

General Concepts in Matplotlib

A Matplotlib figure can be broken down into several parts:

Figure: It is the whole figure, which may contain one or more Axes (plots). You can think of a Figure as a canvas which contains plots.

Axes: It is what we generally think of as a plot. A Figure can contain many Axes. Each Axes contains two (or three, in the case of 3D) Axis objects and has a title, an x-label and a y-label.

Axis: These are the number-line-like objects that take care of the graph limits and of generating the ticks.

Artist: Everything that one can see on the figure is an Artist, such as Text objects, Line2D objects and collection objects. Most Artists are tied to an Axes.
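To make these terms concrete, here is a minimal sketch of how the pieces relate to each other. It uses plt.subplots(), which is not covered elsewhere in this post, and the data values are made up purely for illustration.

```python
import matplotlib.pyplot as plt

# Figure: the canvas. Axes: the individual plots drawn on that canvas.
fig, axes = plt.subplots(nrows=1, ncols=2)  # one Figure holding two Axes
left, right = axes

# Each Axes carries its own title and axis labels.
left.set_title("Left plot")
left.set_xlabel("x")
left.set_ylabel("y")

# plot() returns the Line2D artists that it adds to the Axes.
(line,) = left.plot([1, 2, 3], [1, 4, 9])

# Each Axes owns an x-Axis and a y-Axis object that manage limits and ticks.
print(type(left.xaxis).__name__)
print(len(fig.axes))
print(line in left.lines)

plt.show()
```

Here fig is the Figure, left and right are Axes, left.xaxis is an Axis, and line is an Artist attached to the left Axes.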

Getting Started with Pyplot

Making a simple plot

import matplotlib.pyplot as plt
import numpy as np

A few points to note:

  • We pass two arrays as input arguments to Pyplot's plot() method and call the show() method to display the plot.
  • The first array appears on the x-axis and the second array appears on the y-axis of the plot.
  • Once the first plot is ready, we can add a title and name the x-axis and y-axis using the methods title(), xlabel() and ylabel() respectively.
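The points above can be sketched roughly as follows. The data values and label strings are placeholders chosen for illustration, not the ones used in the original example.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data (an assumption, not taken from the original post).
x = np.array([1, 2, 3, 4])
y = np.array([1, 4, 9, 16])

# First array goes on the x-axis, second array on the y-axis.
# plot() returns a list of the Line2D objects it created.
lines = plt.plot(x, y)

plt.title("First Plot")
plt.xlabel("X label")
plt.ylabel("Y label")
plt.show()  # display the figure
```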

We can also specify the size of the figure using the method figure(), passing a tuple of the form (width, height), measured in inches, to the argument figsize.
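As a small sketch, assuming we want a figure that is 10 inches wide and 4 inches tall (arbitrary values picked for illustration):

```python
import matplotlib.pyplot as plt

# figsize is (width, height) in inches; these numbers are arbitrary.
fig = plt.figure(figsize=(10, 4))

plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
plt.title("A wider figure")
plt.show()
```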

We can also plot multiple sets of data in one call by passing several pairs of X and Y arrays to plot().
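A minimal sketch of this, with made-up data, could look like the following. Each consecutive (x, y) pair becomes its own line.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 5)  # [1, 2, 3, 4], illustrative values only

# Two (x, y) pairs in one call give two separate lines.
lines = plt.plot(x, x ** 2, x, x ** 3)

plt.legend(lines, ["x squared", "x cubed"])
plt.show()
```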

Different Visualizations using Pyplot

Bar Plots

Pyplot provides the method bar() to make bar graphs. It takes the categorical variables, their values and, optionally, a colour as arguments.
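Here is a short sketch of bar() in action. The division names and marks are hypothetical data invented for this example.

```python
import matplotlib.pyplot as plt

# Hypothetical categories and values, for illustration only.
divisions = ["Div-A", "Div-B", "Div-C", "Div-D"]
average_marks = [70, 82, 73, 65]

# Categories first, then their values; colour is optional.
bars = plt.bar(divisions, average_marks, color="green")

plt.title("Bar graph")
plt.xlabel("Division")
plt.ylabel("Average marks")
plt.show()
```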

You can also make horizontal bar graphs using the method barh(). In addition, we can pass the argument xerr (or yerr, in the case of the vertical bar graphs above) to draw error bars that depict the variance in our data.
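A sketch of a horizontal bar graph with error bars might look like this. The data and the spread values are hypothetical.

```python
import matplotlib.pyplot as plt

# Hypothetical data, for illustration only.
divisions = ["Div-A", "Div-B", "Div-C", "Div-D"]
average_marks = [70, 82, 73, 65]
spread = [5, 8, 7, 6]  # assumed variation around each value

# The bars run along the x-axis here, so the error bars use xerr.
bars = plt.barh(divisions, average_marks, xerr=spread, color="green")

plt.title("Horizontal bar graph")
plt.xlabel("Average marks")
plt.ylabel("Division")
plt.show()
```

For a vertical bar graph made with bar(), the same idea applies with yerr instead of xerr.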

To place bar graphs side by side (stacked horizontally), we call the bar() method twice and pass the index (position) and width of the bars, offsetting the second set so that it sits next to the first.

Also notice the use of two other methods: legend(), which shows the legend of the graph, and xticks(), which labels the x-axis based on the position of our bars.
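One way to sketch side-by-side bars is shown below. The two series (boys and girls) and their values are hypothetical, and the offset of one bar width is just one reasonable choice.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data, for illustration only.
divisions = ["Div-A", "Div-B", "Div-C", "Div-D"]
boys = [68, 67, 77, 61]
girls = [72, 97, 69, 69]

index = np.arange(len(divisions))  # one position per division
width = 0.30                       # width of each bar

# Shift the second series by one bar width so the bars sit side by side.
boys_bars = plt.bar(index, boys, width, color="green", label="Boys")
girls_bars = plt.bar(index + width, girls, width, color="blue", label="Girls")

# Centre each tick label between the two bars of its group.
plt.xticks(index + width / 2, divisions)
plt.legend(loc="best")
plt.show()
```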

Similarly, to stack the bar graphs vertically, we pass the argument bottom and give it the values of the bar graph that should sit underneath.
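A minimal sketch of vertical stacking, again with hypothetical data:

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data, for illustration only.
divisions = ["Div-A", "Div-B", "Div-C", "Div-D"]
boys = [68, 67, 77, 61]
girls = [72, 97, 69, 69]

index = np.arange(len(divisions))
width = 0.30

boys_bars = plt.bar(index, boys, width, color="green", label="Boys")
# bottom=boys makes each "girls" bar start where the "boys" bar ends.
girls_bars = plt.bar(index, girls, width, color="blue", label="Girls", bottom=boys)

plt.xticks(index, divisions)
plt.legend(loc="best")
plt.show()
```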

Pie Chart

A pie chart can be made using the method pie(). We can also pass in arguments to customize the chart: shadow to add a shadow, explode to pull a wedge out from the rest, and startangle to rotate the chart by an angle.
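The following is a rough sketch of such a customized pie chart. The firm names and market shares are invented for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical data, for illustration only.
firms = ["Firm A", "Firm B", "Firm C", "Firm D", "Firm E"]
market_share = [20, 25, 15, 10, 30]
explode = [0, 0.1, 0, 0, 0]  # pull the second wedge out slightly

wedges, texts = plt.pie(
    market_share,
    explode=explode,
    labels=firms,
    shadow=True,
    startangle=45,  # rotate the start of the first wedge by 45 degrees
)

plt.axis("equal")  # keep the pie circular
plt.title("Market share")
plt.show()
```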

Histogram

Histograms are a special form of bar chart where the data represent continuous rather than discrete categories. This means that in a histogram there are no gaps between the columns representing the different categories.

In Matplotlib, a histogram is drawn with the hist() method, which groups the data into bins and plots the count in each bin.
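As a sketch, we can draw a histogram of synthetic data. The normally distributed sample and the choice of 20 bins are assumptions made only for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: 1,000 draws from a standard normal distribution.
rng = np.random.default_rng(seed=0)
values = rng.normal(size=1000)

# hist() returns the bin counts, the bin edges and the drawn patches.
counts, bin_edges, patches = plt.hist(values, bins=20)

plt.title("Histogram")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()
```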

Scatter Plots

Scatter plots are widely used graphs, and they come in especially handy when visualizing a regression problem.

For example, we can take arbitrarily created height and weight data and plot them against each other with scatter(). We can use the xlim() and ylim() methods to set the limits of the X-axis and Y-axis respectively.
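A sketch of such a scatter plot is given below. The height and weight values, and the axis limits, are arbitrary numbers made up for illustration.

```python
import matplotlib.pyplot as plt

# Arbitrary illustrative values: height in cm, weight in kg.
height = [150, 155, 160, 165, 170, 175, 180, 185]
weight = [52, 55, 60, 63, 68, 72, 78, 83]

points = plt.scatter(height, weight)

# Set the visible range of each axis.
plt.xlim(140, 195)
plt.ylim(45, 90)

plt.title("Height vs. weight")
plt.xlabel("Height (cm)")
plt.ylabel("Weight (kg)")
plt.show()
```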

3-D Plotting

The above scatter plot can also be visualized in three dimensions. To use this functionality, we first import the module mplot3d:

from mpl_toolkits import mplot3d

Once the module is imported, a three-dimensional Axes is created by passing the keyword projection='3d' to Pyplot's axes() method. Once the Axes object is created, we pass our data, such as height and weight, to its scatter3D() method.
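Here is a minimal sketch of a 3-D scatter plot. The post only mentions height and weight, so the third variable (age) is a hypothetical addition to give the points a z-coordinate; all values are made up.

```python
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d  # needed on older Matplotlib versions to register the '3d' projection

# Arbitrary illustrative values.
height = [150, 155, 160, 165, 170, 175, 180, 185]
weight = [52, 55, 60, 63, 68, 72, 78, 83]
age = [18, 21, 25, 30, 34, 40, 45, 50]  # hypothetical third variable

# Create a three-dimensional Axes.
ax = plt.axes(projection="3d")
points = ax.scatter3D(height, weight, age)

ax.set_xlabel("Height (cm)")
ax.set_ylabel("Weight (kg)")
ax.set_zlabel("Age (years)")
plt.show()
```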

We can also create 3-D graphs of other types, such as line graphs, surfaces, wireframes and contours. To draw the above example as a simple line graph, we use the method plot3D() instead of scatter3D().
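A sketch of the line-graph version follows. As before, the data is made up, and the third variable (age) is a hypothetical addition used to supply the z-coordinate.

```python
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d  # needed on older Matplotlib versions to register the '3d' projection

# Arbitrary illustrative values.
height = [150, 155, 160, 165, 170, 175, 180, 185]
weight = [52, 55, 60, 63, 68, 72, 78, 83]
age = [18, 21, 25, 30, 34, 40, 45, 50]  # hypothetical third variable

ax = plt.axes(projection="3d")
# plot3D() joins the points with a line instead of drawing markers.
lines = ax.plot3D(height, weight, age)

ax.set_xlabel("Height (cm)")
ax.set_ylabel("Weight (kg)")
ax.set_zlabel("Age (years)")
plt.show()
```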

I know that’s a lot to take in at once! But you made it until the end! Kudos on that!

Additional Resources

Matplotlib has been around for a while and there are a lot of other good resources if you’re still interested in getting the most out of this library.

For complete code, visit the following link —

Also, do not forget to go through the Matplotlib documentation and Part II of this blog.
