Recently, at work, I wanted to make a heatmap showing activity by day. I thought displaying the data as a calendar would be the most useful for other team members. I typed up this tutorial a) as documentation for myself and b) for other people looking to make a similar thing.
library(tidyverse)
library(viridis)
library(lubridate)
The data source looks like this:
date        n
2020-09-01  4
2020-09-02  1
2020-09-03  3
2020-09-05  4
2020-09-07  1
2020-09-11  506
...
The first step is to fill in the dates that have no data. This is done with complete(), which is a really neat tidyr function: give it the full set of values a column should contain and it adds a row for every value that is missing, with NA in the other columns. Here, I pass it a sequence of dates from September 1st to the end of the year, generated with seq.Date().
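df <- data %>%
  # add a row (with n = NA) for every date missing from the data
  complete(date = seq.Date(as.Date("2020-09-01"), as.Date("2020-12-31"), by = "day")) %>%
  # break each date into the pieces the calendar layout needs
  mutate(month = month(date, label = TRUE),
         wday = wday(date, label = TRUE),
         day = day(date),
         week = epiweek(date))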
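To see what complete() does on its own, here is a minimal sketch on a made-up two-row tibble (the name toy and its values are invented for illustration, not taken from the real data):

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: counts for Sept 1 and Sept 3 only
toy <- tibble(
  date = as.Date(c("2020-09-01", "2020-09-03")),
  n    = c(4, 3)
)

# Ask for every day from Sept 1 to Sept 4; missing days get n = NA
filled <- complete(
  toy,
  date = seq.Date(as.Date("2020-09-01"), as.Date("2020-09-04"), by = "day")
)

filled
```

The result has four rows, with NA in n for September 2nd and 4th. Those NA values are what the plot later draws as empty tiles.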
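Once the dates are filled in, it's a matter of using lubridate functions to turn them into something easier to manage. You'll want to use epiweek() rather than the regular week(), because week() counts 7-day periods starting from January 1st, so its week number can change in the middle of a calendar week. epiweek() and isoweek() are similar to each other, but epiweek() starts its weeks on Sunday while isoweek() starts them on Monday, so epiweek() produces a calendar more familiar to Americans.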
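A quick way to see the difference between the three functions is to run them side by side on a few dates around a weekend. This is only an illustrative sketch; the dates were picked by hand and the data frame name weeks is made up:

```r
library(lubridate)

# Saturday, Sunday, Monday and Wednesday in early September 2020
d <- as.Date(c("2020-09-05", "2020-09-06", "2020-09-07", "2020-09-09"))

weeks <- data.frame(
  date    = d,
  wday    = wday(d, label = TRUE),
  week    = week(d),     # 7-day blocks counted from Jan 1st
  isoweek = isoweek(d),  # weeks start on Monday
  epiweek = epiweek(d)   # weeks start on Sunday
)

weeks
```

In this example epiweek() moves to the next week number on Sunday the 6th, isoweek() moves on Monday the 7th, and week() does not move until Wednesday the 9th, because January 1st, 2020 was a Wednesday.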
Next, I created a tag to indicate which dates are in the future, so that the text for those dates can be in a different color. For the ggplot portion, geom_tile() and geom_text() do most of the heavy lifting. You want the day of the week to be your X axis and the week number to be your Y axis. There's a custom theme that was applied to this, but I'm pretty sure theme_bw() or theme_void() would give you similar results.
df %>%
  # "1" = future date, "0" = today or earlier
  mutate(color_tag = case_when(date > Sys.Date() ~ "1",
                               TRUE ~ "0")) %>%
  ggplot(aes(x = wday, y = week)) +
  # in ggplot2 >= 3.4.0, use linewidth instead of size for the tile border
  geom_tile(aes(fill = n), color = "black", size = .5) +
  geom_text(aes(label = day, color = color_tag)) +
  labs(title = "Messages by date", x = "", y = "") +
  scale_fill_viridis(option = "magma",
                     direction = -1,
                     name = "# messages",
                     na.value = "white") +
  scale_x_discrete(position = "top") +
  scale_y_continuous(trans = "reverse") +
  scale_color_manual(values = c("black", "grey")) +
  facet_wrap(~month, scales = "free_y") +
  guides(color = "none") +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        axis.ticks.x = element_blank())
Important parts:
facet_wrap(~month, scales = "free_y") puts only the relevant weeks into each month block. If you don't set scales = "free_y", every month block shows every week, and that's just unreadable.
na.value in scale_fill_viridis() draws the days with no data as white tiles, so they still show up on the plot.
scale_y_continuous(trans = "reverse") puts the weeks in ascending order from the top, which is what we're more used to seeing.
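To make the free_y point concrete, here is a small sketch that computes which week numbers each month actually covers for the same date range. The names dates and week_ranges are made up for this illustration:

```r
library(dplyr)
library(lubridate)

# Every day from Sept 1 to Dec 31, 2020, as in the tutorial
dates <- tibble(
  date = seq.Date(as.Date("2020-09-01"), as.Date("2020-12-31"), by = "day")
)

# First and last epiweek that appears in each month
week_ranges <- dates %>%
  mutate(month = month(date, label = TRUE),
         week  = epiweek(date)) %>%
  group_by(month) %>%
  summarise(first_week = min(week),
            last_week  = max(week))

week_ranges
```

Each month only touches five week numbers (36 to 40 for September, up to 49 to 53 for December), while the whole range runs from 36 to 53. With a shared Y axis every month block would reserve rows for all 18 weeks, most of them empty.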
Congrats, you have a calendar heatmap now!