Automating Excel Tasks with R dplyr

Julia Belkin
4 min readFeb 23, 2022

When I first learned the basics of data wrangling with R dplyr library, I immediately got a feeling that I can apply this knowledge to my daily work for the tasks that used to be handled in Excel.

After several months of practice I noticed that I tend to spend more time in R Studio than in Excel — and definitely spend less time on some previously tedious tasks, namely:

1) Comparing datasets

2) Combining datasets

3) Adding new columns

Comparing datasets

Every now and then I need to compare two datasets by ids to see which records are appearing in both and which are different.

The old way of doing things: getting two datasets into one Excel workbook. Adding a column to match dataset A to dataset B, filtering for records that only appear in A, marking them as such. Adding a column to match dataset B to dataset A, filtering for records that only appear in B, marking them as such…

The new way of doing things (assuming we have both datasets in Excel):

# This script is created to compare two lists and returns:
# 1) What is in List A but not list B?
# 2) What is in List B but not list A?
# 3) What is in both lists?
library(readxl)
library(dyplr)

--

--

Julia Belkin

Data Analyst. BS in Economics, MBA in Strategic Management. Mostly posting about T-SQL, R, Excel, and the books I read.