    Remove Solar System

    Numbers around us, published 2023-10-31 13:03:10
    [This article was first published on Numbers around us - Medium, and kindly contributed to R-bloggers.]

    Excel BI’s Excel Challenge #315 — solved in R

    Defining the Puzzle:

    Despite what the title suggests, today's task is not to manipulate planets. Instead, we remove from our target texts every letter that appears in the names of the Solar System planets and the Sun.

    Remove the English letters appearing in Planets column from Author Column.
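    To make the task concrete, here is a minimal base R sketch of the idea on two names. The planet list and the author names are hypothetical stand-ins (the real values live in the Excel file), and `strip_planet_letters` is an illustrative helper, not part of the original solution.

```r
# Hypothetical stand-in for the Planets column; the real values come from the Excel file.
planet_names <- c("Sun", "Mercury", "Venus", "Earth", "Mars",
                  "Jupiter", "Saturn", "Uranus", "Neptune")

# Distinct letters used in the planet names, compared case-insensitively.
letters_used <- unique(tolower(unlist(strsplit(planet_names, ""))))

# Illustrative helper: drop every character whose lowercase form is in the set,
# then trim leftover leading and trailing spaces.
strip_planet_letters <- function(author) {
  chars <- strsplit(author, "")[[1]]
  trimws(paste(chars[!tolower(chars) %in% letters_used], collapse = ""))
}

strip_planet_letters("Douglas Adams")  # "Dogl d"
strip_planet_letters("Isaac Asimov")   # "o"
```

    With this stand-in list only the letters b, d, f, g, k, l, o, q, w, x and z survive, which is why so little of each name is left.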

    Loading Data from Excel:

    Let's start by loading the libraries and the data:

    library(tidyverse)
    library(readxl)
    library(data.table)
    
    planets = read_excel("Letters Removal.xlsx", range = "A1:A9")
    input = read_excel("Letters Removal.xlsx", range = "B1:B10")
    test = read_excel("Letters Removal.xlsx", range = "C1:C10")

    Approach 1: Tidyverse with purrr

    planet_letters = planets %>% 
      mutate(planets_letters = map(Planets, ~strsplit(.x, "")) %>% flatten()) %>%
      select(-Planets) %>%
      unnest(planets_letters) %>%
      unique()
    
    PL_upper = str_to_upper(planet_letters$planets_letters)
    PL_lower = str_to_lower(planet_letters$planets_letters)
    PL = c(PL_upper, PL_lower)
    
    result = input %>%
      rowwise() %>%
      mutate(Author = str_remove_all(Author, paste0("[", paste0(PL, collapse = ""), "]")) %>% trimws()) %>%
      ungroup() %>%
      mutate(Author = if_else(str_length(Author) == 0, NA_character_, Author))

    Approach 2: Base R

    planet_letters <- unique(unlist(strsplit(as.character(planets$Planets), "", fixed = TRUE)))
    
    PL_upper <- toupper(planet_letters)
    PL_lower <- tolower(planet_letters)
    PL <- c(PL_upper, PL_lower)
    
    input$Author <- gsub(pattern = paste0("[", paste0(PL, collapse = ""), "]"), 
                         replacement = "", 
                         x = as.character(input$Author), 
                         perl = TRUE)
    
    input$Author <- trimws(input$Author)
    input$Author[lengths(strsplit(as.character(input$Author), NULL)) == 0] <- NA
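    As a possible variation on the base R approach, the upper-case and lower-case letter vectors can be replaced by a single case-insensitive match via the `ignore.case` argument of `gsub()`. The sketch below is an assumption-laden rewrite rather than the author's code: `remove_letters` is a hypothetical helper and the sample names are made up. It returns a new vector instead of overwriting `input$Author`.

```r
# Hypothetical helper: remove from `text` every letter that occurs in `words`,
# ignoring case, then trim and turn empty strings into NA.
remove_letters <- function(text, words) {
  letters_used <- unique(unlist(strsplit(tolower(words), "", fixed = TRUE)))
  # Build a character class such as "[sunearthm]".
  # Assumes `words` contains only letters; characters like "]", "^" or "-"
  # would need escaping before being placed in the class.
  pattern <- paste0("[", paste0(letters_used, collapse = ""), "]")
  out <- trimws(gsub(pattern, "", text, ignore.case = TRUE))
  out[!is.na(out) & out == ""] <- NA_character_
  out
}

# Made-up sample values, not the data from the challenge.
remove_letters(c("Mark Twain", "Bob Dylan", "Sam Harte"),
               c("Sun", "Earth", "Mars"))
# [1] "k wi"    "Bob Dyl" NA
```

    Returning a fresh vector also keeps the original `input` untouched, so several approaches can be run on the same data.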

    Approach 3: data.table

    setDT(planets)
    setDT(input)
    
    planet_letters <- unique(unlist(strsplit(planets$Planets, "", fixed = TRUE)))
    
    PL_upper <- toupper(planet_letters)
    PL_lower <- tolower(planet_letters)
    PL <- c(PL_upper, PL_lower)
    
    input[, Author := gsub(pattern = paste0("[", paste0(PL, collapse = ""), "]"), 
                           replacement = "", 
                           x = Author, 
                           perl = TRUE)]
    
    input[, Author := trimws(Author)]
    input[, Author := ifelse(nchar(Author) == 0, NA_character_, Author)]

    Validating Our Solutions:

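    # Approach 1 (tidyverse) stores its output in `result`
    identical(result$Author, test$`Answer Expected`)
    # [1] TRUE
    
    # Approaches 2 and 3 overwrite input$Author in place, so check it after running each of them
    identical(input$Author, test$`Answer Expected`)
    # [1] TRUE
    
    identical(input$Author, test$`Answer Expected`)
    # [1] TRUE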
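    One way to make such a comparison less error-prone is to keep each approach's output under its own name and check them all in one pass. The following is only a sketch under stated assumptions: the data are made-up stand-ins for the Excel columns, and `clean_regex` and `clean_chars` are hypothetical helpers that mimic a regex-based and a character-filtering solution.

```r
# Made-up stand-ins for the Planets, Author and Answer Expected columns.
planet_names <- c("Sun", "Earth", "Mars")
authors      <- c("Mark Twain", "Bob Dylan", "Sam Harte")
expected     <- c("k wi", "Bob Dyl", NA)

# Hypothetical regex-based cleaner.
clean_regex <- function(text, words) {
  letters_used <- unique(unlist(strsplit(tolower(words), "")))
  pattern <- paste0("[", paste0(letters_used, collapse = ""), "]")
  out <- trimws(gsub(pattern, "", text, ignore.case = TRUE))
  out[out == ""] <- NA_character_
  out
}

# Hypothetical cleaner that filters characters by set membership instead.
clean_chars <- function(text, words) {
  letters_used <- unique(unlist(strsplit(tolower(words), "")))
  out <- vapply(strsplit(text, ""), function(ch) {
    trimws(paste(ch[!tolower(ch) %in% letters_used], collapse = ""))
  }, character(1))
  out[out == ""] <- NA_character_
  out
}

# Each result has its own slot, so no approach overwrites another's input.
results <- list(
  regex = clean_regex(authors, planet_names),
  chars = clean_chars(authors, planet_names)
)

# One identical() check per approach, returned as a named logical vector.
checks <- vapply(results, identical, logical(1), y = expected)
stopifnot(all(checks))
```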

    If you like my publications or have your own ways to solve these puzzles in R, Python or whatever tool you choose, let me know.



