Excel BI’s Excel Challenge #315 — solved in R
Despite what the title suggests, today's task is not to manipulate planets but to remove from our target texts every character that appears in the names of the Solar System planets and the Sun.
Remove the English letters appearing in the Planets column from the Author column, regardless of case.
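To make the task concrete before touching the workbook, here is a minimal sketch of the idea on toy data. The two "planet" names and three author names are made up for illustration and are not taken from the challenge file; the variable names are likewise hypothetical.

```r
# Toy data (hypothetical, not the challenge workbook)
planets <- c("Mars", "Sun")
authors <- c("Samuel Beckett", "Anna Run", "Leo Tolstoy")

# Collect the unique letters of the planet names, in both cases
letters_found <- unique(unlist(strsplit(planets, "", fixed = TRUE)))
letters_both <- unique(c(toupper(letters_found), tolower(letters_found)))

# Build a regex character class and strip every matching letter
pattern <- paste0("[", paste0(letters_both, collapse = ""), "]")
cleaned <- trimws(gsub(pattern, "", authors))

# Names that lose all their letters become NA
cleaned[!nzchar(cleaned)] <- NA_character_
cleaned
# [1] "el Beckett" NA           "Leo Toltoy"
```

"Anna Run" consists only of letters found in "Mars" and "Sun", so nothing but a space survives, and after trimming it turns into NA.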
Let's start by loading the libraries and the data:
library(tidyverse)
library(readxl)
library(data.table)

planets = read_excel("Letters Removal.xlsx", range = "A1:A9")
input = read_excel("Letters Removal.xlsx", range = "B1:B10")
test = read_excel("Letters Removal.xlsx", range = "C1:C10")
# Tidyverse approach
# Split every planet name into single characters and keep the unique ones
planet_letters = planets %>%
  mutate(planets_letters = map(Planets, ~strsplit(.x, "")) %>% flatten()) %>%
  select(-Planets) %>%
  unnest(planets_letters) %>%
  unique()

# Cover both upper and lower case
PL_upper = str_to_upper(planet_letters$planets_letters)
PL_lower = str_to_lower(planet_letters$planets_letters)
PL = c(PL_upper, PL_lower)

# Remove the letters with a regex character class; emptied names become NA
result = input %>%
  rowwise() %>%
  mutate(Author = str_remove_all(Author, paste0("[", paste0(PL, collapse = ""), "]")) %>% trimws()) %>%
  ungroup() %>%
  mutate(Author = if_else(str_length(Author) == 0, NA_character_, Author))
# Base R approach (modifies input in place)
planet_letters <- unique(unlist(strsplit(as.character(planets$Planets), "", fixed = TRUE)))
PL_upper <- toupper(planet_letters)
PL_lower <- tolower(planet_letters)
PL <- c(PL_upper, PL_lower)

# Strip the letters, trim whitespace, and turn emptied names into NA
input$Author <- gsub(pattern = paste0("[", paste0(PL, collapse = ""), "]"),
                     replacement = "",
                     x = as.character(input$Author),
                     perl = TRUE)
input$Author <- trimws(input$Author)
input$Author[lengths(strsplit(as.character(input$Author), NULL)) == 0] <- NA
# data.table approach (converts both tables by reference and updates input in place)
setDT(planets)
setDT(input)

planet_letters <- unique(unlist(strsplit(planets$Planets, "", fixed = TRUE)))
PL_upper <- toupper(planet_letters)
PL_lower <- tolower(planet_letters)
PL <- c(PL_upper, PL_lower)

# Strip the letters, trim whitespace, and turn emptied names into NA
input[, Author := gsub(pattern = paste0("[", paste0(PL, collapse = ""), "]"),
                       replacement = "",
                       x = Author,
                       perl = TRUE)]
input[, Author := trimws(Author)]
input[, Author := ifelse(nchar(Author) == 0, NA_character_, Author)]
# Validation: the tidyverse answer is stored in result
identical(result$Author, test$`Answer Expected`)
# [1] TRUE

# The base R and data.table code overwrite input$Author in place, so check that column
identical(input$Author, test$`Answer Expected`)
# [1] TRUE
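All three approaches share the same core step: build a character class from the planet letters and delete every match. That step could be wrapped in one reusable function, sketched below. The function name `remove_source_letters()` and its arguments are hypothetical and not part of the original solution. It also swaps in two variations: `ignore.case = TRUE` in `gsub()` instead of pasting upper and lower case letters together, and a letters-only filter so that a space, hyphen or bracket in a source word cannot distort the character class.

```r
# Hypothetical helper: remove from `text` every letter found in `source_words`
remove_source_letters <- function(text, source_words) {
  chars <- unique(unlist(strsplit(as.character(source_words), "", fixed = TRUE)))
  # Keep letters only; this also drops NA entries
  chars <- chars[grepl("^[[:alpha:]]$", chars)]

  out <- as.character(text)
  if (length(chars) > 0) {
    pattern <- paste0("[", paste0(chars, collapse = ""), "]")
    out <- gsub(pattern, "", out, ignore.case = TRUE)
  }

  # Trim leftover whitespace and mark emptied strings as NA
  out <- trimws(out)
  out[!nzchar(out)] <- NA_character_
  out
}

# Usage on made-up values
remove_source_letters(c("Samuel Beckett", "Anna Run"), c("Mars", "Sun"))
# [1] "el Beckett" NA
```

With the challenge data loaded, a call such as `remove_source_letters(input$Author, planets$Planets)` should follow the same logic as the solutions above, provided it is run on the untouched `input`.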
If you like my publications or have your own ways to solve those puzzles in R, Python or whatever tool you choose, let me know.
