PowerQuery Puzzle solved with R
# 137–138
Puzzles:
PQ_137: content file
PQ_138: content file
PQ_137
Our goal today is to find out some data properties in table, but unfortunatelly they are weirdly nested in cells. It doesn’t look scary because we already know that many structured are nested like JSONs or XMLs. So today we have to unfold bowels of one table and join it with the second one to get proper data. Let’s do it.
Load libraries and data
library(tidyverse)
library(readxl)
T1 = read_excel("PQ_Challenge_137.xlsx", range = "A1:B5")
T2 = read_excel("PQ_Challenge_137.xlsx", range = "A9:B21")
test = read_excel("PQ_Challenge_137.xlsx", range = "E1:H9")
Data transformation
T1_1 = T1 %>%
separate_rows(Company, sep = ";|,") %>%
mutate(Company = str_remove_all(Company, "[:space:]")) %>%
separate(Company, into = c("ID","Company"),sep = ":") %>%
mutate(ID = as.numeric(ID))
result = T1_1 %>%
left_join(T2, by = "Company") %>%
arrange(Group, Company)
Validation
identical(result, test)
#> [1] TRUE
PQ_138
We have something like tape of letters with corresponding numbers. I don’t know what is the sense and purpose of it, but I know what we need to achieve. We are gonna take letters and cutting it by 10 make them columns. Whats weirder after transposition we need letters to be by letters and number by numbers. Some cyphers are really interesting.
Lets tranform this tables as our host wanted.
Load libraries and data
library(tidyverse)
library(readxl)
input = read_excel("PQ_Challenge_138.xlsx", range = "A1:F9")
test = read_excel("PQ_Challenge_138.xlsx", range = "H1:K10")
Data transformation
result <- input %>%
group_by(group_id = (row_number() - 1) %/% 2) %>%
group_map(~ .x) %>%
set_names(seq_along(.))
a1 = result %>%
map(., ~ as_tibble(t(.))) %>%
bind_rows() %>%
drop_na() %>%
add_row(V1 = NA_character_, V2 = NA_character_)
a2_L = matrix(a1$V1, ncol=2, byrow = TRUE)
a2_D = matrix(a1$V2, ncol=2, byrow = TRUE)
a2 = bind_cols(a2_L, a2_D) %>%
as_tibble() %>%
rename(Group1 = ...1, Group2 = ...2, Value1 = ...3, Value2 = ...4) %>%
mutate(Value1 = as.numeric(Value1),
Value2 = as.numeric(Value2))
Validation
identical(a2, test)
#> [1] TRUE
We’ve seen some nice puzzles considering time changes and it shows little bit out of the box thinking. Feel free to ask, like and share.
Lets be in touch for next article about the functions.