Winning Teams Point Balances
Excel BI’s Excel Challenge #304 — solved in R
Defining the Puzzle
The puzzle revolves around sports team statistics. Each team's record lists its Wins (W), Draws (D), and Losses (L), which carry weights of 1, 0, and -1, respectively. The challenge is to determine the top three teams by total points. For example, if the Golden State Warriors have the record “38W 75D 37L”, their total points are:
38 × 1 + 75 × 0 + 37 × (−1) = 1
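The same arithmetic can be checked directly in R. This is only a minimal sketch: `stat_points` is a hypothetical helper name, not part of the solutions further down, and it assumes every stat string lists wins, draws, and losses in that order.

```r
# Hypothetical helper: turn a stat string such as "38W 75D 37L" into points.
# Assumes the three numbers always appear in the order wins, draws, losses.
stat_points <- function(stat) {
  counts <- as.integer(regmatches(stat, gregexpr("\\d+", stat))[[1]])
  weights <- c(1, 0, -1)  # W, D, L weights from the puzzle statement
  sum(counts * weights)
}

stat_points("38W 75D 37L")
# [1] 1
```

Because draws carry a weight of 0, the score reduces to wins minus losses.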
Loading Data from Excel
The puzzle data is provided in an Excel file. Typically, the data is divided into two sections: the input, which contains the data to operate on, and the test, which holds the expected answers. R has several packages for loading Excel files; this post uses readxl.
library(tidyverse)
library(readxl)
library(data.table)
# Columns A:B hold the input; column C holds the expected answer
input = read_excel("Winning Team.xlsx", range = "A1:B10")
test = read_excel("Winning Team.xlsx", range = "C1:C4")
Approach 1: Tidyverse with purrr
# Pull the three numbers out of a stat string such as "38W 75D 37L"
extract_values <- function(string) {
  values <- str_extract_all(string, "\\d+")[[1]]
  tibble(
    wins = as.integer(values[1]),
    draws = as.integer(values[2]),
    loses = as.integer(values[3])
  )
}
# Parse every row, score it, and keep the three highest-scoring teams
result = input %>%
  mutate(
    wins = map(Stat, extract_values) %>% map_dbl("wins"),
    draws = map(Stat, extract_values) %>% map_dbl("draws"),
    loses = map(Stat, extract_values) %>% map_dbl("loses"),
    points = wins * 1 + draws * 0 + loses * -1
  ) %>%
  arrange(desc(points)) %>%
  head(3) %>%
  select(Teams)
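The pipeline above parses each stat string three times, once per column. As a possible variation (not the approach used in this post), `tidyr::extract()` can split the string in a single pass. The sketch below runs on made-up toy data rather than the puzzle file, and the team names, the `toy` and `top3` objects, and the regex are assumptions for illustration only.

```r
library(dplyr)
library(tidyr)

# Toy data with made-up teams; the real puzzle input comes from the Excel file.
toy <- tibble(
  Teams = c("Team A", "Team B", "Team C", "Team D"),
  Stat  = c("38W 75D 37L", "50W 10D 20L", "12W 5D 40L", "30W 30D 25L")
)

top3 <- toy %>%
  # One regex pass per row; convert = TRUE turns the captured digits into integers.
  tidyr::extract(Stat, into = c("wins", "draws", "loses"),
                 regex = "(\\d+)W\\s*(\\d+)D\\s*(\\d+)L", convert = TRUE) %>%
  mutate(points = wins * 1 + draws * 0 + loses * -1) %>%
  arrange(desc(points)) %>%
  head(3) %>%
  select(Teams)

top3$Teams
# [1] "Team B" "Team D" "Team A"
```

The regex here assumes the exact "W ... D ... L" layout; a row that deviates from it would come back as NA rather than raise an error.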
Approach 2: Base R
# Return the three numbers in a stat string as a named integer vector
extract_values_base <- function(string) {
  values <- as.integer(regmatches(string, gregexpr("\\d+", string))[[1]])
  c(wins = values[1], draws = values[2], loses = values[3])
}
values_list <- lapply(input$Stat, extract_values_base)
wins <- sapply(values_list, `[[`, "wins")
draws <- sapply(values_list, `[[`, "draws")
loses <- sapply(values_list, `[[`, "loses")
points <- wins * 1 + draws * 0 + loses * -1
result_df <- data.frame(Teams = input$Teams, Wins = wins, Draws = draws, Loses = loses, Points = points)
# Sort by points in descending order and keep the top three team names
top_teams_baseR <- head(result_df[order(-result_df$Points), ], 3)$Teams
Approach 3: data.table
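input_dt <- as.data.table(input)
# Split on runs of non-digits so only the three counts remain
input_dt[, c("wins", "draws", "loses") := tstrsplit(Stat, "\\D+", type.convert = TRUE)]
input_dt[, points := wins * 1 + draws * 0 + loses * -1]
# Sort by points in descending order and keep the top three team names
result_dt <- input_dt[order(-points)][1:3, Teams]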
Validating Our Solutions
After computing the solutions using the three methods, it’s essential to validate our results against the expected answers. This ensures the accuracy and reliability of our methodologies.
identical(result$Teams, test$`Answer Expected`)
# [1] TRUE
identical(result_dt, test$`Answer Expected`)
# [1] TRUE
identical(top_teams_baseR, test$`Answer Expected`)
# [1] TRUE
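One thing worth keeping in mind is that `identical()` is strict: it compares type and order as well as values. The short sketch below illustrates this with a made-up expected answer (the team names and the `expected` vector are placeholders, not the puzzle's data).

```r
expected <- c("Team B", "Team D", "Team A")  # made-up expected answer

same_vector <- identical(c("Team B", "Team D", "Team A"), expected)
as_factor   <- identical(factor(c("Team B", "Team D", "Team A")), expected)
reordered   <- identical(c("Team D", "Team B", "Team A"), expected)

c(same_vector, as_factor, reordered)
# [1]  TRUE FALSE FALSE
```

So a TRUE from each check above means every approach returned the same three names, in the same order, as a character vector.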
I hope you found this exploration into different R methodologies insightful! I’d love to hear your thoughts on these solutions. Maybe you have a more optimized solution or another unique approach? Feel free to share in the comments below.