Seaborn Tutorial 🖼

Part 1

Mulbah Kallen
Analytics Vidhya
4 min read · Aug 16, 2019

--

Just like when we were using Matplotlib, let’s go ahead and import the necessary libraries.
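The imports would look roughly like this. The aliases pd, plt and sns are the usual community conventions, assumed here rather than taken from the original post.

```python
# Imports assumed for the rest of this tutorial.
import pandas as pd               # reading and reshaping the dataset
import matplotlib.pyplot as plt   # Seaborn draws on top of Matplotlib
import seaborn as sns             # statistical plotting functions
```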

Now let’s import a simple dataset to work with. This will give us the introductory practice we are going to need in order to tackle bigger datasets. Feel free to search for pokemon.csv and grab any Pokemon dataset, or use this dataset: Pokemon_Data.

Here we are using Pandas to read in our dataset and assigning it to a variable df. Feel free to change df to any variable name you prefer. This will allow us to call our dataset whenever we need to view it, as well as modify it to better suit our needs.
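A minimal sketch of that step. So that it runs on its own, it reads a few illustrative rows from an in-memory string instead of a downloaded file. The file name Pokemon.csv, the column names and the use of index_col=0 are assumptions about the dataset you grabbed, so adjust them to match your copy.

```python
import io
import pandas as pd

# Illustrative stand-in for the downloaded file.
# The column layout is an assumption about the dataset.
csv_text = """#,Name,Type 1,Total,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
1,Bulbasaur,Grass,318,45,49,49,65,65,45,1,False
4,Charmander,Fire,309,39,52,43,60,50,65,1,False
7,Squirtle,Water,314,44,48,65,50,64,43,1,False
"""

# With a real download this would be something like:
#   df = pd.read_csv("Pokemon.csv", index_col=0)   # hypothetical file name
df = pd.read_csv(io.StringIO(csv_text), index_col=0)
```

Passing index_col=0 tells Pandas to use the first column (the Pokemon number) as the row index instead of creating a new one.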

Go ahead and run the code df.head(). This will give you an idea of what you are working with.
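As a quick sketch of what that call does, here is head() on a small stand-in DataFrame. The rows and column names are illustrative assumptions, not the full dataset.

```python
import pandas as pd

# Illustrative stand-in for the Pokemon dataset (assumed column names).
df = pd.DataFrame({
    "Name":    ["Bulbasaur", "Ivysaur", "Venusaur", "Charmander",
                "Charmeleon", "Charizard", "Squirtle"],
    "Attack":  [49, 62, 82, 52, 64, 84, 48],
    "Defense": [49, 63, 83, 43, 58, 78, 65],
})

# head() returns the first five rows by default; pass a number for more or fewer.
first_rows = df.head()
print(first_rows)
```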

You may have guessed that the dataset we are working with contains information on the combat stats of different Pokemon. In our previous lessons we used only Matplotlib to generate our plots. With Seaborn you will notice that we are able to achieve the same visualizations with less code.

Let’s begin by plotting a scatter plot that compares the Attack and Defense stats of our Pokemon.
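A minimal sketch of that plot using lmplot, assuming the columns are named Attack and Defense. The DataFrame is a small illustrative stand-in so the example runs without the CSV file.

```python
import pandas as pd
import seaborn as sns

# Illustrative stand-in rows; column names are assumed.
df = pd.DataFrame({
    "Attack":  [49, 62, 82, 52, 64, 84, 48],
    "Defense": [49, 63, 83, 43, 58, 78, 65],
})

# x and y are column names in df; lmplot draws the points
# and, by default, a fitted regression line through them.
g = sns.lmplot(x="Attack", y="Defense", data=df)
```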

See that our x variable and y variable are set to the names of the columns that we want to compare against each other for each individual Pokemon. It’s important to note that lmplot is not a dedicated scatter plot function. It is Seaborn’s function for fitting and plotting a regression line, hence the line. (Seaborn 0.9 and later also provide a dedicated scatterplot function.) Either way, we can tweak the lmplot function.
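One possible tweak, sketched here as an assumption about what the original code did, is to switch the regression line off with fit_reg=False so that only the points remain. The data is again an illustrative stand-in.

```python
import pandas as pd
import seaborn as sns

# Illustrative stand-in rows; column names are assumed.
df = pd.DataFrame({
    "Attack":  [49, 62, 82, 52, 64, 84, 48],
    "Defense": [49, 63, 83, 43, 58, 78, 65],
})

# fit_reg=False keeps the scatter points but skips the regression fit.
g = sns.lmplot(x="Attack", y="Defense", data=df, fit_reg=False)
```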

You may have noticed that none of our Pokemon’s Attack and Defense stats fall below 0, yet our axis limits do. Let’s go ahead and fix this.
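A sketch of one way to do this: lmplot returns a FacetGrid, and its set method can pin the lower axis limits to 0 while leaving the upper limits automatic. The data is an illustrative stand-in, and using set on the grid is a choice made for this example (calling plt.xlim and plt.ylim right after the plot is another common route).

```python
import pandas as pd
import seaborn as sns

# Illustrative stand-in rows; column names are assumed.
df = pd.DataFrame({
    "Attack":  [49, 62, 82, 52, 64, 84, 48],
    "Defense": [49, 63, 83, 43, 58, 78, 65],
})

g = sns.lmplot(x="Attack", y="Defense", data=df, fit_reg=False)

# (0, None) fixes the lower bound at 0 and keeps the current upper bound.
g.set(xlim=(0, None), ylim=(0, None))
```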

This is a great way to visualize our data but there are plenty of other ways to visualize our dataset in order to gain a deeper understanding of how all of our battle stats compare to one another.

Box plots are used to depict groups within a dataset through their quartiles. This means that a box plot consists of 5 things: the minimum, the first quartile, the median, the third quartile and the maximum.

Note that we have created a box plot, but there are a few things within our plot that we don’t need. Total, Generation and Legendary are not needed for this plot because the information they contain is not related to battle stats, our primary focus.

We can achieve a better plot by creating a new data frame that no longer contains the Total, Generation and Legendary columns. This can be done by creating a new variable and assigning to it the result of a function that removes the unwanted columns.
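A minimal sketch of that step using the Pandas drop method, assuming the three columns are named exactly as in the text. The rows are an illustrative stand-in.

```python
import pandas as pd

# Illustrative stand-in rows; column names are assumed.
df = pd.DataFrame({
    "Total":      [318, 405, 525],
    "HP":         [45, 60, 80],
    "Attack":     [49, 62, 82],
    "Defense":    [49, 63, 83],
    "Generation": [1, 1, 1],
    "Legendary":  [False, False, False],
})

# drop returns a new DataFrame without the listed columns; df is unchanged.
stats_df = df.drop(["Total", "Generation", "Legendary"], axis=1)
```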

Now we can plot our box plots again using the stats_df dataframe.

You may have noticed that the background of our plots has always been a shaded gray (darkgrid), but this can be changed if you so desire. Let’s go ahead and change our theme to whitegrid and also try out a new plot: a violin plot.
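A sketch of both changes. Grouping Attack by the Type 1 column is an assumption about what the original plot showed, and the rows are an illustrative stand-in.

```python
import pandas as pd
import seaborn as sns

# Illustrative stand-in rows; column names are assumed.
df = pd.DataFrame({
    "Type 1": ["Grass", "Grass", "Grass",
               "Fire", "Fire", "Fire",
               "Water", "Water", "Water"],
    "Attack": [49, 62, 82, 52, 64, 84, 48, 63, 83],
})

# Switch the theme; this affects every plot drawn afterwards.
sns.set_style("whitegrid")

# One violin per category on the x axis, showing the spread of Attack.
ax = sns.violinplot(x="Type 1", y="Attack", data=df)
```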

You’ll notice that the x-axis labels are overlapping. We can fix this by rotating each label by 45 degrees in order to make them all visible.

Assign the result of the violin plot call to a new variable called v_plot. We then get the x labels from this variable and rotate them by 45 degrees.
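A sketch of that step. violinplot returns the Matplotlib Axes it drew on, so v_plot exposes get_xticklabels. Using plt.setp to apply the rotation is one option chosen for this example, and the data and column names are illustrative assumptions.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Illustrative stand-in rows; column names are assumed.
df = pd.DataFrame({
    "Type 1": ["Grass", "Grass", "Grass",
               "Fire", "Fire", "Fire",
               "Water", "Water", "Water"],
    "Attack": [49, 62, 82, 52, 64, 84, 48, 63, 83],
})

sns.set_style("whitegrid")

# Keep a handle on the Axes returned by the plotting call.
v_plot = sns.violinplot(x="Type 1", y="Attack", data=df)

# Fetch the x tick labels and rotate each one by 45 degrees.
plt.setp(v_plot.get_xticklabels(), rotation=45)
```

Calling v_plot.tick_params(axis="x", labelrotation=45) is an alternative that sets the rotation without fetching the labels first.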

Violin plots do a fantastic job at visualizing distributions.

In Part 2 we will go over a few more plots that will give you a great grasp on the variety that Seaborn has to offer.
