Automating Excel Tasks with R dplyr
When I first learned the basics of data wrangling with R dplyr library, I immediately got a feeling that I can apply this knowledge to my daily work for the tasks that used to be handled in Excel.
After several months of practice I noticed that I tend to spend more time in R Studio than in Excel — and definitely spend less time on some previously tedious tasks, namely:
1) Comparing datasets
2) Combining datasets
3) Adding new columns
Comparing datasets
Every now and then I need to compare two datasets by ids to see which records are appearing in both and which are different.
The old way of doing things: getting two datasets into one Excel workbook. Adding a column to match dataset A to dataset B, filtering for records that only appear in A, marking them as such. Adding a column to match dataset B to dataset A, filtering for records that only appear in B, marking them as such…
The new way of doing things (assuming we have both datasets in Excel):
# This script is created to compare two lists and returns:
# 1) What is in List A but not list B?
# 2) What is in List B but not list A?
# 3) What is in both lists?library(readxl)
library(dyplr)