Matplotlib

r.aruna devi
5 min read · Sep 15, 2019


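  • Matplotlib is a popular Python package for data visualization.
  • Matplotlib builds on NumPy and accepts NumPy arrays or any array-like sequence. A 1D array can be plotted on its own, against its index, or paired with a second 1D array of the same length as x and y values; 2D arrays can be plotted column by column or shown as images.
  • Install it using pip: pip install matplotlib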

Simple Example :

Types of Plots :

Let's see how to import it and practice with a few examples.

  • from matplotlib import pyplot as plt (or) import matplotlib.pyplot as plt

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 30)  # 30 evenly spaced points from -3 to 3
y = x**2
plt.plot(x, y, 'r+')  # red color and + symbol
plt.show()

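### the third argument is a format string that combines a color letter (for example 'r' red, 'g' green, 'b' blue, 'k' black) with a marker symbol (for example '+', 'o', '*', '.')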
  1. Let's see a few examples of creating a figure and different ways of plotting.
  • create a figure instance and add axes to it…
  • add title by set_title
  • add x-axis label by set_xlabel
  • add y-axis label by set_ylabel
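The steps above can be sketched as follows. This is a minimal example, not the only way to do it: the data, the title text and the axes rectangle [0.1, 0.1, 0.8, 0.8] are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 30)
y = x**2

fig = plt.figure()                       # create a figure instance
# add axes as [left, bottom, width, height], in fractions of the figure size
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
ax.plot(x, y, 'r+')                      # red + markers
ax.set_title('y = x squared')            # title of the plot
ax.set_xlabel('x')                       # x-axis label
ax.set_ylabel('y')                       # y-axis label
```

Finish with plt.show() to display the figure.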

2. Add a grid to your plot by calling grid(True)
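A small sketch of turning the grid on, assuming a sine curve as sample data:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x))
ax.grid(True)   # draw grid lines at the major ticks of both axes
```

Finish with plt.show() to display the figure. Calling ax.grid(False) switches the grid off again.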

3. Draw multiple plots in one figure using subplot
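Here is one possible sketch with two plots side by side. The arguments of plt.subplot are (rows, columns, index), and the index starts at 1. The two curves are sample data chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 30)

fig = plt.figure()

ax1 = plt.subplot(1, 2, 1)   # 1 row, 2 columns, first plot
ax1.plot(x, x**2, 'r')
ax1.set_title('x squared')

ax2 = plt.subplot(1, 2, 2)   # 1 row, 2 columns, second plot
ax2.plot(x, x**3, 'b')
ax2.set_title('x cubed')
```

Finish with plt.show() to display the figure.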

4. Set the axis limits with set_xlim and set_ylim

## this sets the range of values shown on the x-axis and y-axis
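As a rough illustration, the sketch below plots a parabola and then restricts the view to part of it. The limit values are arbitrary assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 30)
y = x**2

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlim(0, 3)    # show only x values from 0 to 3
ax.set_ylim(0, 10)   # show only y values from 0 to 10
```

Finish with plt.show() to display the figure. The data itself is unchanged; only the visible window moves.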

5. Ticks are the markers denoting data points on an axis: xticks, yticks

## instead of the default 0,2,4,6… choose your own tick positions and labels for the x-axis and y-axis
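A minimal sketch of custom ticks, assuming made-up tick positions and labels. plt.xticks and plt.yticks act on the current axes; ax.set_xticks and ax.set_yticks are the equivalent methods on an axes object.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 11)

fig, ax = plt.subplots()
ax.plot(x, x**2)

# tick positions first, optional text labels second
plt.xticks([0, 5, 10], ['zero', 'five', 'ten'])
plt.yticks([0, 25, 50, 75, 100])
```

Finish with plt.show() to display the figure.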

6. Dual axis… using twinx

## give each plot a different color and add a legend so it is clear which color stands for which plot.
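One way to sketch a dual axis: two curves with very different scales share the x-axis, and each gets its own y-axis. The functions exp and log are sample data chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(1, 11)

fig, ax1 = plt.subplots()
line1, = ax1.plot(x, np.exp(x), 'r-', label='exp(x)')
ax1.set_ylabel('exp(x)')

ax2 = ax1.twinx()            # second y-axis that shares the same x-axis
line2, = ax2.plot(x, np.log(x), 'b-', label='log(x)')
ax2.set_ylabel('log(x)')

# one legend listing the lines from both axes
ax1.legend(handles=[line1, line2], loc='upper left')
```

Finish with plt.show() to display the figure.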

Let's discuss the types of plots:

  1. BAR PLOT :
  • It represents categorical data with rectangular bars. The bars can be either horizontal or vertical.
  • There are three types of bar plot: simple bar chart, stacked bar plot, and multiple bar plots.
  • ax.bar(x, height, width, bottom, align='center'), where align is either 'center' or 'edge'
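The three kinds of bar plot can be sketched side by side. The categories and values here are made up for illustration; the stacked version relies on the bottom argument and the multiple version shifts each series by half a bar width.

```python
import numpy as np
import matplotlib.pyplot as plt

categories = ['A', 'B', 'C']          # made-up categories
first = np.array([3, 5, 2])           # made-up values, series 1
second = np.array([4, 1, 6])          # made-up values, series 2
positions = np.arange(len(categories))
width = 0.35

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 4))

# simple bar chart: one series
ax1.bar(positions, first, width)
ax1.set_title('Simple')

# stacked bar plot: the second series starts where the first one ends
ax2.bar(positions, first, width, label='first')
ax2.bar(positions, second, width, bottom=first, label='second')
ax2.set_title('Stacked')

# multiple bar plots: shift each series left or right of the position
ax3.bar(positions - width / 2, first, width, label='first')
ax3.bar(positions + width / 2, second, width, label='second')
ax3.set_title('Multiple')

for ax in (ax1, ax2, ax3):
    ax.set_xticks(positions)
    ax.set_xticklabels(categories)
```

Finish with plt.show() to display the figure. For horizontal bars, ax.barh takes the positions and bar lengths in the same way.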

2. Histogram :

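  • It represents numerical data.
  • It shows how many data points fall within each range of values.
  • It is a kind of bar plot.
  • ax.hist(x, bins=[value1, value2, …])
  • bins gives the edges of the intervals that the values are counted into; an integer instead asks for that many equal-width bins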
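A small sketch of a histogram with explicit bin edges. The data values are made up so that the counts are easy to check by hand.

```python
import numpy as np
import matplotlib.pyplot as plt

data = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 7, 8, 9])   # made-up values
bin_edges = [0, 2, 4, 6, 8, 10]   # five bins: 0-2, 2-4, 4-6, 6-8, 8-10

fig, ax = plt.subplots()
# hist returns the count per bin, the bin edges, and the drawn bars
counts, edges, bars = ax.hist(data, bins=bin_edges)
ax.set_xlabel('value')
ax.set_ylabel('number of points')
```

Finish with plt.show() to display the figure. Each bin includes its left edge and excludes its right edge, except the last bin, which includes both.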

3. Pie Chart :

  • Pie charts show the size of items (called wedges) in one data series, proportional to the sum of the items.
  • ax.pie(x, labels=labels, autopct='%1.2f%%'), where autopct formats the percentage printed on each wedge
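A minimal pie chart sketch. The sizes and labels are made up; the sizes sum to 100 here only to keep the percentages obvious, since Matplotlib normalizes them anyway.

```python
import matplotlib.pyplot as plt

sizes = [40, 30, 20, 10]          # made-up values
labels = ['A', 'B', 'C', 'D']     # made-up wedge labels

fig, ax = plt.subplots()
# '%1.2f%%' prints each share with two decimals and a percent sign
ax.pie(sizes, labels=labels, autopct='%1.2f%%')
ax.axis('equal')                  # equal aspect ratio keeps the pie circular
```

Finish with plt.show() to display the figure.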

4. Scatter Plot :

  • Scatter plots place data points on horizontal and vertical axes to show how much one variable is affected by another.
  • Scatter plots are used when you want to show the relationship between two variables.
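A short scatter plot sketch. The two variables, hours studied and score, are invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])             # made-up x values
scores = np.array([52, 55, 61, 64, 70, 74, 79, 85])    # made-up y values

fig, ax = plt.subplots()
ax.scatter(hours, scores, c='g', marker='o')   # one green dot per (x, y) pair
ax.set_xlabel('hours studied')
ax.set_ylabel('score')
```

Finish with plt.show() to display the figure.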

5. Line Plot

  • A line plot connects successive data points with straight line segments, which shows how a value changes along a continuous axis.
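A minimal line plot sketch, assuming made-up monthly temperatures as the data:

```python
import numpy as np
import matplotlib.pyplot as plt

months = np.arange(1, 13)
# made-up average temperatures, one per month
temperature = np.array([5, 7, 11, 15, 19, 23, 26, 25, 21, 15, 10, 6])

fig, ax = plt.subplots()
ax.plot(months, temperature, 'b-o')   # blue line with a circle at each point
ax.set_xlabel('month')
ax.set_ylabel('temperature')
```

Finish with plt.show() to display the figure.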

6. Box Plot :

A box and whisker plot (sometimes called a boxplot) is a graph that presents information from a five-number summary. Box and whisker plots are also very useful when large numbers of observations are involved and when two or more data sets are being compared.

A box and whisker plot is a way of summarizing a set of data measured on an interval scale. It is often used in exploratory data analysis. This type of graph is used to show the shape of the distribution, its central value, and its variability.

In a box and whisker plot:

  • the ends of the box are the upper and lower quartiles
  • the median is marked by a line inside the box
  • the whiskers are the two lines outside the box that extend to the highest and lowest observations.
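A small sketch comparing two data sets with box plots. The two groups are made up. Note that with Matplotlib's default setting (whis=1.5) the whiskers stop at the farthest points within 1.5 times the interquartile range of the box, and anything beyond is drawn as a separate point.

```python
import numpy as np
import matplotlib.pyplot as plt

group_a = np.array([2, 4, 4, 5, 6, 7, 8, 9, 10, 12])    # made-up data
group_b = np.array([1, 3, 5, 5, 6, 6, 7, 9, 11, 20])    # made-up data

fig, ax = plt.subplots()
# boxplot returns a dict with the drawn parts: boxes, medians, whiskers, ...
parts = ax.boxplot([group_a, group_b])
ax.set_xticklabels(['group A', 'group B'])
```

Finish with plt.show() to display the figure. The boxes are vertical by default, so the median shows up as a horizontal line inside each box.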

When to use which plot?

  1. Bar Plot :
  • When there is no continuity between the values of the X variable.
  • To compare things between different groups or to track changes over time.
  • Categorical values can be plotted in a bar plot.

2. Histogram :

  • Used to show distributions of variables.
  • It presents numerical data.
  • It provides a visual interpretation of numerical data by showing the number of data points that lie within each range of values. The higher the bar, the greater the frequency of data values in that bin.

3. Pie chart :

  • Pie charts are less likely to be useful, because they display proportions of a whole, and when the proportions are close to each other it can be difficult to tell whether one slice is bigger than another.

4. Scatter plot :

  • When you want to show the relationship between two variables.

5. Line plot :

  • The best use of a line graph is for data that changes over time (a time series dataset).
  • To display relationships in continuous, periodic data.
  • If you have continuous data that you would like to represent through a chart, then a line chart is a good option.

6. Box plot :

  • Used in EDA (exploratory data analysis)
  • To summarize data from multiple sources and display the results in a single graph.
  • The procedure to develop a box and whisker plot comes from the five statistics below.
  1. Minimum value: The smallest value in the data set
  2. First quartile: The value below which the lower 25% of the data are contained
  3. Median value (second quartile): The middle number in a range of numbers
  4. Third quartile: The value above which the upper 25% of the data are contained
  5. Maximum value: The largest value in the data set
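The five statistics can be computed with NumPy, for example as sketched below. The data values are made up, and np.percentile is used here with its default linear interpolation; other quartile conventions can give slightly different numbers.

```python
import numpy as np

data = np.array([2, 4, 4, 5, 6, 7, 8, 9, 10, 12])   # made-up data

minimum = data.min()                                  # smallest value
q1, median, q3 = np.percentile(data, [25, 50, 75])    # quartiles
maximum = data.max()                                  # largest value

print(minimum, q1, median, q3, maximum)
```

These are the same quantities a box plot draws: the box spans the first to third quartile and the line inside it marks the median.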
