Cummulate and concatenate
Published in
2 min readOct 31, 2023
Excel BI’s Excel Challenge #313 — solved in R
Defining the Puzzle:
Today we need to transform table into its cummulative (someway) version. Every column have to include content of column lagged by 2.
Generate the result table.
Here, Tn = T(n-2) & Tn where & is concatenation operator.
Loading Data from Excel:
Lets start loading data and libraries:
library(tidyverse)
library(readxl)
library(data.table)
input = read_excel(“Scan3.xlsx”, range = “A2:G5”, col_names = c(“X1”, “X2”, “X3”, “X4”, “X5”, “X6”, “X7”))
test = read_excel(“Scan3.xlsx”, range = “I2:O5”, col_names = c(“X1”, “X2”, “X3”, “X4”, “X5”, “X6”, “X7”))
Approach 1: Tidyverse with purrr
process_columns <- function(df) {
num_cols <- ncol(df)
if (num_cols < 3) {
return(df)
}
for (i in 3:num_cols) {
df <- df %>%
mutate(across(all_of(names(df)[i]), ~ paste0(df[[i — 2]], .)))
}
return(df)
}
result = process_columns(input)
Approach 2: Base R
process_columns_base_R <- function(df) {
num_cols <- ncol(df)
if (num_cols < 3) {
return(df)
}
for (i in 3:num_cols) {
df[[i]] <- paste0(df[[i — 2]], df[[i]])
}
return(df)
}
result_base_R = process_columns_base_R(input)
Approach 3: Data.table
process_columns_data_table <- function(df) {
setDT(df)
num_cols <- ncol(df)
if (num_cols < 3) {
return(df)
}
cols_to_modify <- names(df)[3:num_cols]
for (i in cols_to_modify) {
df[, (i) := paste0(df[[which(names(df) == i) — 2]], df[[i]])]
}
return(as_tibble(as.data.frame(df)))
}
result_dt = process_columns_data_table(input)
Validating Our Solutions:
identical(result, test)
# [1] TRUE
identical(result_base_R, test)
# [1] TRUE
identical(result_dt, test)
# [1] TRUE
If you like my publications or have your own ways to solve those puzzles in R, Python or whatever tool you choose, let me know.