Data Visualization with Python and Seaborn — Part 2: Controlling Aesthetics

Random Nerd
6 min read · Aug 17, 2018

I hope you have already gone through Part 1 of this series, which covered loading built-in datasets and other data files with the Pandas library, along with managing the aesthetics of our plots. In this article, we shall compare the lines of code needed to draw a particular plot using Matplotlib and then Seaborn. The difference becomes even more evident when drawing a statistical plot, which is often the case in the Data Science domain. For this purpose, today we shall take the example of a Linear Regression plot.

But before we get into action, let me highlight another important aspect of the aesthetics of a Seaborn plot. If you have installed Seaborn version 0.8 or above (which will certainly be the case if you installed or updated Seaborn any time after June 2017), the plots no longer seem to separate the value representations automatically. Let me show what I mean by plotting one such figure:

The warning above is harmless: it just shows that a particular argument (normed) has been deprecated in the current version of Matplotlib, which Seaborn calls internally. The thing to observe in this histogram is that each bar is attached to the next, and though this might be the latest enhancement, many professionals still like the old Matplotlib style of bar separation. Here we shall try to draw separators between each bar:

Now each bar is visibly separated, which some professionals find more appealing, so if you want this look, you now know how to get it. As for the parameters, you don’t need to worry about them for now, because I shall cover these distribution plots in detail in upcoming articles. Another important change in recent versions concerns the limits of the axes. Sounds confusing, right? Let me show you an example of what I mean:

You may notice in the above plot that the X-axis does not start at 0 at the bottom-left corner. Often we would leave it this way, but if for some reason we strictly want the plot to begin at 0, it requires a minor tweak:
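A minimal sketch of that tweak, assuming a small made-up dataset whose X values start well above 0 (the data and the upper limit of 12 are illustrative choices, not taken from the original figure):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to use an interactive window
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Made-up data: x lies between 2 and 10, so the default axis would not start at 0
rng = np.random.default_rng(1)
x = rng.uniform(2, 10, size=50)
y = 3 * x + rng.normal(size=50)

ax = sns.scatterplot(x=x, y=y)

# Force the X-axis to run from 0 to 12: the list is [lower, upper]
plt.xlim([0, 12])
```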

The list passed defines the lower and upper limits of our X-axis. If required, we may similarly set limits for our Y-axis by adding another line of code: plt.ylim([lower, upper]). That pretty much covers the leftovers in terms of the aesthetics of our plot. I shall keep guiding you through as much plot customization as I can in the coming articles of this series, but in a general professional setting this should be good enough to begin with.

Now we shall look into our long-awaited code length competition (just kidding, guys!). Let us begin with our Matplotlib plot for regression, but before I do, let me be very clear that we shall not delve into discussing the Matplotlib code here, because our sole agenda is to observe the convenience we get with Seaborn. So here it is:
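A minimal sketch of a regression plot in plain Matplotlib, assuming synthetic data and a NumPy `polyfit` line. The original listing may have computed the fit differently, so treat the variable names and data here as illustrative:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to use an interactive window
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: a noisy straight line with true slope 2 and intercept 1
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=x.size)

# Degree-1 polynomial fit returns the least-squares slope and intercept
slope, intercept = np.polyfit(x, y, deg=1)

# The scatter and the fitted line have to be drawn and labelled by hand
fig, ax = plt.subplots()
ax.scatter(x, y, label="observations")
ax.plot(x, slope * x + intercept, color="red", label="least-squares fit")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
```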

Although it isn’t our topic of interest, just for this example I shall illustrate how we could have mathematically computed quantities like the slope, intercept, r-value (correlation coefficient), p-value and standard error of the estimated slope using SciPy:
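A minimal sketch using `scipy.stats.linregress`, again on synthetic data (an assumption, since the original values are not reproduced here):

```python
import numpy as np
from scipy import stats

# Synthetic data: a noisy straight line with true slope 2 and intercept 1
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=x.size)

# linregress returns the five quantities in this order
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)

print(f"slope     = {slope:.3f}")
print(f"intercept = {intercept:.3f}")
print(f"r-value   = {r_value:.3f}")
print(f"p-value   = {p_value:.3g}")
print(f"std error = {std_err:.3f}")  # standard error of the estimated slope
```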

Linear regression, in general, attempts to model the relationship between two variables (x and y here) by fitting a linear equation to observed data. One variable is considered to be an explanatory variable, and the other is considered to be a dependent variable. For example, a Data Analyst might want to relate the weight of individuals to their height using a linear regression model. In our example, we are simply after a straight-line fit. Let’s now get this done in Seaborn:
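A minimal sketch of the Seaborn version, assuming `sns.regplot` and the same kind of synthetic data as before. Whether the original listing used `regplot` or another Seaborn function is not known, so this is one possible way to do it:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to use an interactive window
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Synthetic data: a noisy straight line with true slope 2 and intercept 1
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=x.size)

# One call draws the scatter, the fitted line and a confidence band
ax = sns.regplot(x=x, y=y)
```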

Both plots are correct for their respective data points; what differs is the control we get with a single line of code. I must agree that SciPy has added a lot of features and things look good, but Seaborn always has the edge with its visually appealing plots. Let me also give you a quick preview with different colors for this plot:
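A minimal sketch of recoloring the regression plot, assuming `sns.regplot` with its `color` and `line_kws` parameters. The particular color names are arbitrary choices for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to use an interactive window
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Synthetic data: a noisy straight line with true slope 2 and intercept 1
rng = np.random.default_rng(42)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=x.size)

# color sets the overall plot color (used here for the points);
# a color given in line_kws overrides it for the fitted line only
ax = sns.regplot(x=x, y=y, color="seagreen", line_kws={"color": "darkorange"})
line_color = mcolors.to_hex(ax.lines[0].get_color())
```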

Before we move on to our next article, let me show a Seaborn plot against a Matplotlib plot. It might be a little complex at this stage, but as we keep progressing, at least the Seaborn part of it will get clearer for you. Our sole agenda is just to view the plot and not to focus on the associated code. So let me present it without any further delay:

Now let us try to do the same using Matplotlib. The difference in aesthetics should be a great morale booster for you to stay glued to this statistical visualization bible with Python:

There is quite a bit of difference between the two plots above. Matplotlib does offer immense opportunities for customization (which you should try on your own), but that requires more lines of code, or rather more parameters. Seaborn gives us the flexibility to achieve the same with a minimum number of lines of code, and this efficiency in statistical representation is what we shall observe throughout this series.

In our next article we shall look into the color options that are available with Seaborn and backed by Matplotlib. There are multiple tips and tricks regarding colors that I will keep sharing, in addition to the next lecture, as we plot various types of Seaborn figures. Till then, Happy Visualizing!

EDIT: Here is the Resource Content (a list of all the parts of this series).

Data Visualization with Python and Seaborn — Part 1

Data Visualization with Python and Seaborn — Part 3
