Create Some Not-So-Basic Graphs

Gabriella Valdez
10 min read · Jan 31, 2023

--

Python’s beginner-friendly graphing using Matplotlib and Seaborn

Python is simply my favorite coding language, but there is always more to learn. One major area I had not focused on much was data visualization. Previously, I learned how to write user-defined functions in JupyterLab for drawing linear regressions, circles, and regular geometric shapes, but I wanted to create more interesting graphs. After reviewing some old notes and reading the official sites for Python's libraries, I want to share some beginner-friendly visualizations with helpful descriptions.

So, here are some visuals that are quick to draft and are more appealing than a basic line plot.

Quick Reminder:

Some functions and methods are used with both Matplotlib and Seaborn, and it may help to understand them before checking out the examples below, especially if Python is new to you or you need to brush up. These are the ones used, with brief descriptions:

  1. arange( ): A function that creates an array of evenly spaced numbers in increasing order, from a starting number up to (but not including) a stopping number. The numbers can be whole numbers or decimals. It takes three inputs: np.arange(Starting #, Stopping #, Spacing #)
  2. axis( ): A method used to control the shape of the figure with respect to the axes. It takes a string value and usually looks like: ax.axis('value')
  3. concat( ): A function used for appending, or merging, Series and DataFrames, which are passed together in a list. An important thing to note is that axis=0 stacks them by row, and axis=1 places them side by side as columns. An example of its basic structure: pd.concat( [DataFrame, DataFrame], axis= (0 or 1) )
  4. Dictionary: Usually written as dict. It is a bit different from a list or a tuple but incredibly helpful in creating graphs and data frames. It can be described as a set of pairs: a key (often a string) followed by a colon, then a value such as a string, a list, or a number. Each key on the left of the colon should appear only once; if a key is repeated, only the last value typed is kept. A basic structure example could be: {'name': [list], 'another name': 'string', 'last name': (tuple)}
  5. grid( ): Optional function to add lines in the background of the graph.
  6. legend( ): A function that displays a box describing the data being graphed. Its shape, position, font, color, and frame can be changed.
  7. random.normal( ): For creating random values from the normal distribution. It takes three inputs: np.random.normal( # to center around (the mean), # for how spread out the values are around the center (the standard deviation), Size of the array).
  8. subplots( ): One of the functions needed to graph. It has other uses, such as arranging several graphs in rows and columns, but for these examples it is used with its defaults: figure, ax = plt.subplots()
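A few of the helpers above can be tried together in a short sketch. The seed value and the column names ('hour', 'temp') are placeholders chosen for illustration and are not part of the later examples.

```python
import numpy as np
import pandas as pd

# arange: evenly spaced values; the stopping number itself is left out
x = np.arange(1, 11, 1)            # 1, 2, ..., 10

# random.normal: 10 values centered on 5 with a standard deviation of 1
# (the seed is an assumption, used only so the output repeats)
np.random.seed(0)
y = np.random.normal(5, 1, size=10)

# Dictionary: each key appears once and maps to a list of values
data = {'hour': list(x), 'temp': list(y)}
frame = pd.DataFrame(data)

# concat: the data frames go inside a list
by_row = pd.concat([frame, frame], axis=0)      # stacked: 20 rows, 2 columns
by_column = pd.concat([frame, frame], axis=1)   # side by side: 10 rows, 4 columns

print(by_row.shape, by_column.shape)
```

Printing the shapes is a quick way to check which direction concat combined the data.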

Running ?function_name (the function in question, without the parentheses) in a Jupyter cell will also give you a detailed description of how to format it, with examples. If it is not found, it may need np., pd., or plt. in front of the function name.
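The ? shortcut is a Jupyter/IPython feature. In a plain Python script or console, the built-in help() function shows the same documentation, as in this small sketch:

```python
import numpy as np

# help() prints the documentation for the function passed in
# (note there are no parentheses after arange)
help(np.arange)

# The same text is stored on the function as a string
first_lines = np.arange.__doc__.strip().splitlines()[:3]
print(first_lines)
```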

The specific colors used throughout this blog came from Colorkit, a site that helps inspire more colorful graphs.

Matplotlib:

To create some graphs, we will have to import, or call on, special libraries. Each library must be imported at least once before it is used. A couple of libraries used for these examples are matplotlib and NumPy. The library matplotlib allows users to create basic to 3D visuals (Matplotlib). As for NumPy, it is used for creating and doing vector/array operations, including matrices and other related math functions (NumPy). Another common library is Pandas, which is used to read data. Although a data frame would work in the following examples, most were created with arrays so that the outputs vary and you can quickly re-create them.

1. Stack plot

This type of graph is helpful for drawing attention to a visual breakdown of several sets over a period of time. These sets can be written as lists or arrays, or stored as columns in a data frame. Both approaches work, so choose whichever matches the data you have. Below is the code for each: first with randomly generated arrays, then with a data frame.

Figures: Using Randomly Generated Values; Using a Data Frame
#Getting started
import matplotlib.pyplot as plt
import numpy as np

#Creating array from 1 to 10
x=np.arange(1,11,1)

#Creating 6 different lists (10 values each) in one array
y=np.random.normal(5, 1, size=(6,10))

#Creating the figure
figure, ax = plt.subplots() #needed to create graph

#Using a list to color y
#Starts with first to last list in y
addcolor=['#798b9e','#a5b1be','#084c80','#6493bd','#bae0fb','#6F63A3']

ax.stackplot(x, #x gives the graph values on the x-axis
y, #y gives lists(each line) to plot on the graph
colors=addcolor) #colors is optional

plt.show() #needed to display figure

####################################

#Needed for the data frame
import pandas as pd

#Creates a data frame with 3 columns and 4 rows
example=pd.DataFrame({'a':[2,3,1,3],'b':[1,2,2,3],'c':[3,2,2,1]})

x=[1,2,3,4]

figure, ax = plt.subplots() #needed to create graph

addcolor=['#336783','#8b8a90','#deab99']

ax.stackplot(x,example['a'],example['b'],example['c']
,colors=addcolor)

plt.show() #needed to display figure
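When a data frame has many columns, listing each one by hand gets long. As a sketch of one alternative, the block below passes every column at once and uses the data frame's index for the x-axis. The labels= argument and the legend are additions for illustration and are not part of the example above.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same data frame as in the example above
example = pd.DataFrame({'a': [2, 3, 1, 3], 'b': [1, 2, 2, 3], 'c': [3, 2, 2, 1]})

figure, ax = plt.subplots()

addcolor = ['#336783', '#8b8a90', '#deab99']

# example.index supplies the x values (0, 1, 2, 3)
# example.T flips the table so that each column becomes one layer of the stack
layers = ax.stackplot(example.index,
                      example.T.to_numpy(),
                      labels=list(example.columns),
                      colors=addcolor)

ax.legend(loc='upper left')  # uses the column names passed to labels=

plt.show()
```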

2. The Stem plot

It's quite like a simple line plot, but instead of being connected to each other, the points sit on top of thin vertical lines (stems) rising from a base line. The shape on top of each stem is called a marker, and markers also work in the functions for some other graphs. Every line in a stem plot can be changed in color, including the base line.

Some shapes to try: D (diamond), o (circle), s (square), v (downward triangle), X (thick x mark), and p (pentagon). For more shapes, run ?plt.plot and see the Markers list.

import matplotlib.pyplot as plt
import numpy as np

#Creating a sine wave for a smooth shape
x=np.arange(0 , 3*np.pi, .4)
y=np.sin(x/2) + 1

#Creating figure
figure,ax= plt.subplots()
#Adding color is optional
ax.stem(x,y,'darkolivegreen', #color for the stem lines (vertical lines)
markerfmt='gh', #color:g= green, shape:h=hexagon
basefmt='darkkhaki' #base line color
)

#Labels for x and y axis
plt.xlabel('x-axis')
plt.ylabel('y-axis')

#Adding grid lines, it is optional
plt.grid()

plt.show()
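To compare the marker shapes listed earlier side by side, a quick sketch like the one below draws one row of points per shape. The row layout and the variable names are only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Marker codes mentioned earlier and their shapes
shapes = {'D': 'diamond', 'o': 'circle', 's': 'square',
          'v': 'downward triangle', 'X': 'thick x mark', 'p': 'pentagon'}

x = np.arange(1, 6, 1)

figure, ax = plt.subplots()

# Each shape gets its own horizontal row of points
for row, (code, name) in enumerate(shapes.items()):
    ax.plot(x, np.full(len(x), row),
            marker=code, markersize=12,
            linestyle='none',   # points only, no connecting line
            label=name)

ax.legend(loc='center left', bbox_to_anchor=(1, 0.5))  # legend outside, to the right

plt.show()
```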

3. Scatter plot

For a more statistical focus on your data, the scatter plot marks just the points with no connecting line. Scatter plots can compare one variable or several. The example below plots the age and height of three kinds of pets using randomly generated numbers. Each group varies in size and color to make the differences and similarities between the variables easier to compare.

import matplotlib.pyplot as plt
import numpy as np

#Creating arrays
x=np.arange(1,21,1)

#Short cut for creating random arrays for multiple variables
#3 variables 3 lists with 20 random elements each
y,h,p= np.random.normal(5, 1, size=(3,20))

#Titles for variables y, h, and p
names= ['Rabbits','Dogs', 'Cats']

#Notes for optional changes of the plotted points
#s means size
#alpha means transparency
#edgecolors means the outer lining color of each point

#Creating figure
#calling scatter more than once plots everything on a single graph
plt.scatter(x, y, color='lightseagreen',s=180,alpha=.4,edgecolors='grey')
plt.scatter(x, h, color='crimson',s=250,alpha=.45,edgecolors='grey')
plt.scatter(x, p, color='darkslateblue',s=480,alpha=.4,edgecolors='grey')

#Adding legend box(optional)
plt.legend(names, #the list of names to base the legend on
loc= 1, #positions the legend at the top right, numbers range from 0 to 10
framealpha=1.0)#the level of transparency, from 0 to 1

plt.grid()
plt.xlabel('Age')
plt.ylabel('Height (inches)')

plt.show()
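Another way to build the legend, sketched below, is to attach a label= to each scatter call so that the names stay next to the data they describe. The seed is an assumption added so the random points repeat between runs.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(1)  # assumption: fixed seed so the figure is repeatable

x = np.arange(1, 21, 1)
y, h, p = np.random.normal(5, 1, size=(3, 20))

figure, ax = plt.subplots()

# label= ties each name to its own set of points
ax.scatter(x, y, color='lightseagreen', s=180, alpha=.4, edgecolors='grey', label='Rabbits')
ax.scatter(x, h, color='crimson', s=250, alpha=.45, edgecolors='grey', label='Dogs')
ax.scatter(x, p, color='darkslateblue', s=480, alpha=.4, edgecolors='grey', label='Cats')

ax.legend(loc='upper right')  # with no list given, legend() collects the labels above

ax.grid()
ax.set_xlabel('Age')
ax.set_ylabel('Height (inches)')

plt.show()
```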

4. Horizontal & Grouped Bars

Instead of a vertical bar chart, a horizontal version is just as fitting. For those with more data to share, both vertical and horizontal bars can be grouped like below. This was quite new to me and required some learning from Matplotlib's official site for horizontal and grouped bars.

If you prefer the regular vertical bar, simply follow the same format as the code below, but replace barh with bar and replace height with width.

To keep things beginner-friendly, the legend is created a bit more manually, using lists inside the function. One of the other options is to use legend(handles, labels), but using lists is easier and less work since there are only two colors.

Figures: First Graph; Second Graph
import matplotlib.pyplot as plt

#Create arrays
teams=['a','b','c','d']
y=[10,23,34,45]
w=[5,20,30,40]

#Notes on creating the bar's position
#Height is the thickness of each bar; with align='edge', a negative height places the bar on the other side of the tick mark
#Align is to line bars with respect to the tick mark
#Line style is the border of the bars (optional)
#Edgecolor is the color of the border (optional)

#Creating first graph
y2=plt.barh(teams,y, color = '#5da4cd',height= -.4, #darker blue bars
linestyle='-',edgecolor= '#5877b2')

y1=plt.barh(teams,w, color = '#91c5c0',height=.4, #Blue-green bars
align='edge',
linestyle='-',edgecolor= '#5877b2')

#Adding a title on top of the graph (optional)
plt.title('Comparison')

plt.xlabel('Avg. Games Won')
plt.ylabel('Team')

plt.grid(alpha=.4, #changes the lines transparency
axis='x') #Only using vertical lines, axis can be set to 'y' or 'both'

plt.legend([y1, y2], #list of barh
['Year 1', 'Year 2']) #list of names for barh

############################# Use separate cell

#Creating second graph
#teams (defined in the first cell) gives the bar positions
y2=plt.barh(teams,y, color = '#5da4cd',height= -.4, #darker blue bars
align='edge',
linestyle='-',edgecolor= '#5877b2')

y1=plt.barh(teams,w, color = '#91c5c0',height=.4, #Blue-green bars
align='edge',
linestyle='-',edgecolor= '#5877b2')

#Adding a title on top of the graph (optional)
plt.title('Comparison')

plt.xlabel('Avg. Games Won')
plt.ylabel('Team')

plt.grid(alpha=.4, #changes the lines transparency
axis='x') #Only using vertical lines, axis can be set to 'y' or 'both'

plt.legend([y1, y2], #list of barh
['Year 1', 'Year 2']) #list of names for barh
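For the vertical version mentioned earlier, a sketch might look like the following: barh becomes bar, height becomes width, and the axis labels and grid direction swap. It reuses the same made-up team data.

```python
import matplotlib.pyplot as plt

teams = ['a', 'b', 'c', 'd']
y = [10, 23, 34, 45]
w = [5, 20, 30, 40]

figure, ax = plt.subplots()

# With align='edge', a negative width puts the bar to the left of the tick
y2 = ax.bar(teams, y, color='#5da4cd', width=-.4,
            align='edge', linestyle='-', edgecolor='#5877b2')

# A positive width puts the bar to the right of the tick
y1 = ax.bar(teams, w, color='#91c5c0', width=.4,
            align='edge', linestyle='-', edgecolor='#5877b2')

ax.set_title('Comparison')
ax.set_xlabel('Team')
ax.set_ylabel('Avg. Games Won')

ax.grid(alpha=.4, axis='y')  # horizontal grid lines suit vertical bars

ax.legend([y1, y2], ['Year 1', 'Year 2'])

plt.show()
```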

5. Pie chart

A classic pie chart is always a nice visual to show the proportions of each variable. This common visual is a bit lengthy compared to all the examples above but definitely worth it.

Creating the pie chart may look messy, so let's break it down a bit. Starting with the lists, x holds the numbers that set the proportion of each slice. The pull_slice list sets how far each slice is pulled away from the center by the explode parameter. The first slice starts at 0 degrees, but the chart can start at another angle by using startangle= (any number from 0 to 360). For axis, I recommend the 'equal' value for a circular shape and a more centered position.

Other parameters inside ax.pie(), such as colors, radius, center, and wedgeprops, affect how the pie chart looks. Using a dictionary in wedgeprops, we can alter how thick the linewidth (border) of each slice is and add a color to it. As for autopct, it shows the percentage of each slice.

import matplotlib.pyplot as plt
import numpy as np

#Creating array
x = [5,10,20,25,40]

#Titles for array
names = ['Pencils','Erasers','Highlighters','Graphing paper', 'Pens']

#Coloring and adjusting pie slices
addcolor= ['#4acead','#ffbaa2','#c49ff4','#9cbf60','#ff8180']
pull_slice= [.5,.2,.01,.05,0] #Starts at 0 degrees (teal slice) and moves counterclockwise

#Creating Figure
figure, ax = plt.subplots()

ax.pie(x, colors=addcolor,explode=pull_slice,
radius=4, #How large the pie chart is from the center to edge
center=(4, 4), #Center positions pie chart in graph
wedgeprops={"linewidth": 1, "edgecolor": "white"}, #use dictionary to edit border of slices
frame=False,autopct='%1.1f%%') #frame=False hides the axes frame

ax.legend(names)
ax.axis('equal')

plt.title('Office Supplies', #Displayed name
loc='left', #Location of name
fontdict={'fontsize':20, #Using dict for font size and font color
'color':'#2c4875'})

plt.show()
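The startangle option described above is not used in the example, so here is a small sketch of it. The value 90 is an arbitrary choice that makes the first slice begin at the top of the circle instead of on the right.

```python
import matplotlib.pyplot as plt

x = [5, 10, 20, 25, 40]
names = ['Pencils', 'Erasers', 'Highlighters', 'Graphing paper', 'Pens']
addcolor = ['#4acead', '#ffbaa2', '#c49ff4', '#9cbf60', '#ff8180']

figure, ax = plt.subplots()

# With autopct set, pie returns the slices, their labels, and the percentage texts
slices, labels, percents = ax.pie(x, colors=addcolor,
                                  startangle=90,      # first slice begins at 90 degrees
                                  autopct='%1.1f%%',
                                  wedgeprops={"linewidth": 1, "edgecolor": "white"})

ax.legend(names, loc='center left', bbox_to_anchor=(1, 0.5))
ax.axis('equal')

plt.show()
```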

Seaborn:

Seaborn is arguably the most visually appealing graphing library, thanks to the detail in its various types of graphs. At first glance the graphs look like they would be more complex than the prior examples, yet they are quite a bit simpler to create. With Seaborn, it is easier if your data is in a data frame rather than in the arrays used above.

Additionally, if you like the gridlines from Seaborn, they are usable in the matplotlib graphs as well. Just import Seaborn and call sns.set_theme() before creating the figure.
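As a sketch of that idea, the block below applies Seaborn's theme to a plain matplotlib line plot. The sine wave is just placeholder data.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

sns.set_theme()  # switch on Seaborn's default look, including grid lines

x = np.arange(0, 3 * np.pi, .1)
y = np.sin(x)

# A regular matplotlib figure, created after the theme is set
figure, ax = plt.subplots()
ax.plot(x, y, color='#084c80')

ax.set_xlabel('x-axis')
ax.set_ylabel('y-axis')

plt.show()
```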

6. Regression plots

If you are new to these graphs, you may be wondering, “well, what’s the difference between the scatterplot and these regression plots?” They are quite alike. The left figure is a scatterplot, but it can also become a line plot by removing hue and adding kind='line'. Similarly, the right figure is a scatterplot with a linear regression model that estimates the line of best fit for the data.

import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt #needed for plt.title
sns.set_theme() #Shows grid lines

#Left Figure
#Creating random data
exampleL=pd.DataFrame(np.random.normal(5, .5, size=(201,2)))

exampleL.columns=['hour','temp'] #naming the 2 columns

sns.relplot(data=exampleL,x='hour',y='temp',hue='temp')

plt.title('Temperatures Across the State',
fontsize=15) #Changing size of title

############################# Use separate cell

#Right Figure
#Creating random data
exampleR=pd.DataFrame(np.random.normal(20, 2, size=(20,2)))

exampleR.columns=['Day','Costs'] #naming the 2 columns

sns.lmplot(data=exampleR,x='Day',y='Costs')

plt.title('Expected Costs',
fontsize=15) #Changing size of title
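To get a feel for what the line of best fit in the right figure represents, the sketch below estimates a slope and intercept with NumPy's polyfit. This is a stand-in for illustration, not how Seaborn is called in the example, and the data (a true slope of 2 and intercept of 5 plus random noise) is made up.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumption: fixed seed so the numbers repeat

# Made-up data: costs rise by about 2 per day, starting near 5
day = np.arange(1, 21, 1)
costs = 2 * day + 5 + rng.normal(0, 1, size=20)

# Degree 1 fits a straight line by least squares
slope, intercept = np.polyfit(day, costs, 1)

# Points on the fitted line, one per day
fitted = slope * day + intercept

print(round(slope, 2), round(intercept, 2))
```

The printed slope and intercept should land close to 2 and 5, since that is how the made-up data was built.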

7. Bar

Lastly, there is the normal bar graph. Seaborn’s aesthetic palettes and gridlines really make a difference compared to a simple vertical bar graph. On Seaborn’s official site, you can see how this graph’s color gradient is created, along with other adjustments.

A couple of other gradient palettes: “blend:#7AB,#EDA” for a blue to yellow, and “ch:s=.25,rot=-.25” for a blue to dark blue.

import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt #needed for plt.subplots and plt.title

#Create dataframes
x=pd.DataFrame(np.arange(1,11,1))
a=pd.DataFrame(np.arange(50,150,10))

#Renaming data frames
x.columns=['hour']
a.columns=['temp']

#Combine them into one dataframe
example=pd.concat([x,a],axis=1)

#Creating figure
sns.set_theme()

fig,ax=plt.subplots() #figure and fig both work

sns.barplot(x='hour',y='temp',data=example, palette='flare')

plt.title('Temperatures in Desert',
fontdict={'fontsize':15,
'color':'#310118'})
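The two gradient palettes mentioned above can be previewed before using them. This sketch asks Seaborn for 10 colors from each, one per bar, and hands one list to a plain matplotlib bar graph. Using ax.bar here is a swap for sns.barplot, made so the sketch does not depend on a particular Seaborn version.

```python
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

sns.set_theme()

hours = np.arange(1, 11, 1)
temps = np.arange(50, 150, 10)

# Ask for 10 colors from each gradient, one per bar
blue_to_yellow = sns.color_palette("blend:#7AB,#EDA", 10)
blue_to_dark = sns.color_palette("ch:s=.25,rot=-.25", 10)

print(blue_to_yellow.as_hex())  # the colors as hex codes

fig, ax = plt.subplots()

# A list of colors is applied to the bars in order
bars = ax.bar(hours, temps, color=blue_to_dark)

ax.set_xlabel('hour')
ax.set_ylabel('temp')

plt.show()
```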

Additional Tips:

  • Clicking the lower side bar will condense the output (result of code), which is very helpful if you want to keep your ?Help information without the clutter.
  • If you are using data frames in matplotlib and want to plot all the columns, try dataframe.index for the x values.
  • When using the code for the regression plots and the horizontal and grouped bars, put the code for each figure in a separate cell. Otherwise, it will try to plot on the same graph or may give you an error.
  • Used in help creating the examples above and has many more examples to refer to: seaborn: statistical data visualization — seaborn 0.12.2 documentation (pydata.org) and Plot types — Matplotlib 3.6.3 documentation
