Setting limits on ggplot
If you want to plot a chart with a few outliers in ggplot, you might be tempted to use ylim. The problem with ylim is that it drops every data point that falls outside the limits before the chart is drawn.
For instance, if you have the following data:
library(ggplot2)
data <- data.frame(x = 1:20, y = c(rnorm(19), 500))
If you plot this with no changes, this is what you get:
ggplot(data, aes(x,y)) + geom_line()
It’s kind of hard to see what’s going on at any point other than the last, so some zooming comes in handy. If you use ylim, this is what happens:
ggplot(data, aes(x,y)) + geom_line() + ylim(-5, 5)
Granted, you get to see what’s going on, but the last part of the chart is just gone! A better alternative is to use the ylim parameter of coord_cartesian, which zooms the view without discarding any data:
ggplot(data, aes(x,y)) +
  geom_line() +
  coord_cartesian(ylim = c(-5, 5))
This makes it much clearer that something’s going on at the end of that chart.
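If you want to check the difference without looking at the charts, one option is to inspect the data ggplot prepares for each layer. The snippet below is a minimal sketch, assuming ggplot2 is installed; the fixed seed and the variable names (base, clipped, zoomed) are just illustrative choices and not part of the example above. It uses ggplot_build to pull out the layer data for both versions of the chart and counts the missing values.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the numbers are reproducible
data <- data.frame(x = 1:20, y = c(rnorm(19), 500))

base <- ggplot(data, aes(x, y)) + geom_line()

# ylim() sets the scale limits: out-of-range values are turned into NA
clipped <- ggplot_build(base + ylim(-5, 5))$data[[1]]

# coord_cartesian() only zooms the view: the values are left untouched
zoomed <- ggplot_build(base + coord_cartesian(ylim = c(-5, 5)))$data[[1]]

n_missing_clipped <- sum(is.na(clipped$y))  # the outlier has been replaced by NA
n_missing_zoomed  <- sum(is.na(zoomed$y))   # nothing is missing
max_zoomed        <- max(zoomed$y)          # the outlier is still in the data
```

With ylim, the outlier shows up as a missing value in the layer data, which is why the line stops short. With coord_cartesian the value 500 is still there, so the line keeps heading towards it and simply runs off the top of the panel.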