    Data Frame Merging in R (With Examples)

    Steven P. Sanderson II, MPH, published on 2024-04-08 04:00:00
    [This article was first published on Steve's Data Tips and Tricks, and kindly contributed to R-bloggers.]

    Introduction

    Combining multiple data frames is a core skill in data manipulation, whether your datasets are small or large. In this tutorial, we’ll walk through several ways to combine data frames in R, from base functions such as cbind() and rbind() to helpers from the purrr package, using one small example throughout.

    Understanding the Data

    Before we dive into merging data frames, let’s look at the data. We have a list named random_list that holds three samples (sample1, sample2, and sample3). Each sample consists of 50 random numbers drawn from a standard normal distribution with rnorm(). No seed is set, so the exact values will differ from run to run.

    random_list <- list(
      sample1 = rnorm(50), 
      sample2 = rnorm(50), 
      sample3 = rnorm(50)
    )
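
Because rnorm() draws new values on every run, the numbers printed in this post will not match yours. A minimal sketch, assuming an arbitrary seed of 123 (the seed is not part of the original post), makes the list reproducible and confirms its shape:

```r
# Assumption: the seed value 123 is arbitrary and not from the original post
set.seed(123)

random_list <- list(
  sample1 = rnorm(50),
  sample2 = rnorm(50),
  sample3 = rnorm(50)
)

# Inspect the structure: a named list of three numeric vectors
str(random_list)

# Confirm that every element holds 50 values
lengths(random_list)
```

Any other seed works equally well; the point is only that repeated runs produce the same values.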

    Method 1: Using cbind() and rbind()

    One way to combine data frames is to join them column-wise with cbind() or stack them row-wise with rbind().

    # Creating data frames from the list
    df1 <- data.frame(ID = 1:50, Value = random_list$sample1)
    df2 <- data.frame(ID = 1:50, Value = random_list$sample2)
    df3 <- data.frame(ID = 1:50, Value = random_list$sample3)
    
    # Merging data frames column-wise
    cbined_df <- cbind(df1, df2$Value, df3$Value)
    head(cbined_df)
      ID       Value  df2$Value   df3$Value
    1  1 -0.73158290 -0.4735007 -0.22953833
    2  2 -0.42439178 -1.4862395  0.03323658
    3  3  1.45058519 -0.8133220  0.54887013
    4  4  2.17907980 -0.3940885 -1.05173127
    5  5 -0.13482633  1.6784921  0.64882226
    6  6  0.05271133  0.4767549  0.68464619
    
    # Merging data frames row-wise
    rbined_df <- rbind(df1, df2, df3)
    head(rbined_df)
      ID       Value
    1  1 -0.73158290
    2  2 -0.42439178
    3  3  1.45058519
    4  4  2.17907980
    5  5 -0.13482633
    6  6  0.05271133

    In the first example, cbind() places the Value columns of df2 and df3 next to df1, creating a new data frame, cbined_df. Rows are matched by position, not by ID, so the data frames must already be in the same order. In the second example, rbind() stacks df1, df2, and df3 row-wise, appending the rows to create rbined_df.
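
The column names df2$Value and df3$Value in the cbind() output are awkward to work with, and the stacked result no longer records which data frame each row came from. A minimal sketch of one way to handle both, assuming the hypothetical names Value2, Value3, and Source (none of which appear in the original post):

```r
# Assumption: the seed and the names Value2, Value3, and Source are illustrative
set.seed(123)
df1 <- data.frame(ID = 1:50, Value = rnorm(50))
df2 <- data.frame(ID = 1:50, Value = rnorm(50))
df3 <- data.frame(ID = 1:50, Value = rnorm(50))

# Name the added columns explicitly when binding column-wise
wide_df <- cbind(df1, Value2 = df2$Value, Value3 = df3$Value)
names(wide_df)

# Tag each data frame with its source before stacking row-wise
long_df <- rbind(
  data.frame(Source = "sample1", df1),
  data.frame(Source = "sample2", df2),
  data.frame(Source = "sample3", df3)
)
table(long_df$Source)
```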

    Method 2: Using purrr::map() and data.frame()

    With the purrr package, you can convert each element of a list into a data frame using map() and data.frame(), and then bind the results together.

    library(purrr)
    
    # Converting each list element (a numeric vector) into a one-column data frame
    merged_list <- map(random_list, data.frame)
    
    # Combining data frames row-wise
    combined_df <- do.call(rbind, merged_list)
    head(combined_df)
                  .x..i..
    sample1.1 -0.73158290
    sample1.2 -0.42439178
    sample1.3  1.45058519
    sample1.4  2.17907980
    sample1.5 -0.13482633
    sample1.6  0.05271133

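    Here, map() iterates over each element of random_list and converts it into a one-column data frame using data.frame(). Then do.call(rbind, merged_list) stacks those data frames row-wise, creating combined_df. The column name .x..i.. is what data.frame() derives from the unnamed expression that map() passes to it, and the row names (sample1.1, sample1.2, and so on) combine each list name with a row index.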
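
The result above is easier to use with a readable column name and with the sample label stored in a column instead of the row names. A minimal sketch, assuming the hypothetical column names value and sample:

```r
library(purrr)

# Assumption: the seed and the column names "value" and "sample" are illustrative
set.seed(123)
random_list <- list(
  sample1 = rnorm(50),
  sample2 = rnorm(50),
  sample3 = rnorm(50)
)

# Give the column an explicit name while converting each vector
named_frames <- map(random_list, \(x) data.frame(value = x))

# Stack row-wise; row names look like "sample1.1", "sample1.2", ...
combined_named <- do.call(rbind, named_frames)

# Recover the list name by stripping the trailing ".<row index>"
combined_named$sample <- sub("[.][0-9]+$", "", rownames(combined_named))
rownames(combined_named) <- NULL

head(combined_named)
```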

    Method 3: Using purrr::map_df()

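    Another purrr function, map_df(), applies a function to each list element and combines the results into a single data frame in one step. Note that map_df() has been superseded since purrr 1.0.0 in favor of map() followed by list_rbind() or list_cbind(); it still works, but it is no longer the recommended approach.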

    # Combining the list elements into a single tibble
    combined_df <- map_df(random_list, cbind)
    head(combined_df)
    # A tibble: 6 × 3
      sample1[,1] sample2[,1] sample3[,1]
            <dbl>       <dbl>       <dbl>
    1     -0.732       -0.474     -0.230 
    2     -0.424       -1.49       0.0332
    3      1.45        -0.813      0.549 
    4      2.18        -0.394     -1.05  
    5     -0.135        1.68       0.649 
    6      0.0527       0.477      0.685 

    By using map_df() with cbind, each vector in random_list is turned into a one-column matrix, and the results are combined into combined_df, a single tibble with one column per sample.
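
The headers sample1[,1], sample2[,1], and sample3[,1] in the output above suggest that each column holds a one-column matrix, not a plain numeric vector. When all list elements have the same length, a base R alternative gives ordinary columns. This is a sketch of a swapped-in technique, not the method used in the original post:

```r
# Assumption: the seed is illustrative and not from the original post
set.seed(123)
random_list <- list(
  sample1 = rnorm(50),
  sample2 = rnorm(50),
  sample3 = rnorm(50)
)

# Each list element becomes one plain numeric column, named after the element
wide_df <- as.data.frame(random_list)

dim(wide_df)
head(wide_df)
```

This only works when every element has the same length; otherwise as.data.frame() stops with an error.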

    Encouragement to Try on Your Own

    Now that you’ve explored different methods of combining data frames in R, I encourage you to experiment with your own datasets. Practice merging data frames on different columns and explore how each method influences the resulting data frame. The more hands-on experience you gain, the more proficient you’ll become in data manipulation with R.
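
As a starting point for that practice, base R's merge() joins data frames on a key column instead of by position. The sketch below uses two small hypothetical data frames, df_a and df_b, whose ID ranges only partly overlap, to show how the all and all.x arguments change the result:

```r
# Assumption: df_a, df_b, and their columns are made up for illustration
set.seed(123)
df_a <- data.frame(ID = 1:5, A = rnorm(5))
df_b <- data.frame(ID = 3:7, B = rnorm(5))

# Inner join: only IDs present in both data frames (3, 4, 5)
inner_df <- merge(df_a, df_b, by = "ID")

# Left join: every row of df_a, with NA in B where df_b has no match
left_df <- merge(df_a, df_b, by = "ID", all.x = TRUE)

# Full join: every ID from either data frame (1 through 7)
full_df <- merge(df_a, df_b, by = "ID", all = TRUE)

nrow(inner_df)
nrow(left_df)
nrow(full_df)
```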

    In conclusion, merging multiple data frames in R is a foundational skill for any data analyst or scientist. By understanding the principles behind various merge methods and experimenting with real datasets, you’ll enhance your data manipulation capabilities and streamline your workflow.

    Happy coding!

    Bonus Section

    Here is one more method, and I think I like this one the best. It’s very simple, and it adds the name of each list item as a value in a column.

    stacked_list <- utils::stack(random_list)
    head(stacked_list)
           values     ind
    1 -0.73158290 sample1
    2 -0.42439178 sample1
    3  1.45058519 sample1
    4  2.17907980 sample1
    5 -0.13482633 sample1
    6  0.05271133 sample1
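
Because the ind column records the source of every value, the stacked data frame is convenient for grouped summaries, and utils::unstack() can reverse the operation. A minimal sketch, assuming an arbitrary seed:

```r
# Assumption: the seed is illustrative and not from the original post
set.seed(123)
random_list <- list(
  sample1 = rnorm(50),
  sample2 = rnorm(50),
  sample3 = rnorm(50)
)

stacked <- utils::stack(random_list)

# Mean of the values within each sample
tapply(stacked$values, stacked$ind, mean)

# Reverse the stacking: one column per sample again
wide_again <- utils::unstack(stacked, values ~ ind)
head(wide_again)
```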

    One more option uses map() together with list_rbind(), which is available in purrr 1.0.0 and later. Each vector is converted into a one-column data frame, and list_rbind() then stacks those data frames row-wise.

    # Converting each vector into a data frame, then binding the rows
    mapped_list <- map(random_list, \(x) data.frame(x)) |>
      list_rbind()
    head(mapped_list)
                x
    1 -0.73158290
    2 -0.42439178
    3  1.45058519
    4  2.17907980
    5 -0.13482633
    6  0.05271133
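
The output above no longer shows which sample each row came from. list_rbind() has a names_to argument that stores the list names in a column. A minimal sketch, assuming purrr 1.0.0 or later and the hypothetical column names sample and value:

```r
library(purrr)

# Assumption: the seed and the column names "sample" and "value" are illustrative
set.seed(123)
random_list <- list(
  sample1 = rnorm(50),
  sample2 = rnorm(50),
  sample3 = rnorm(50)
)

# names_to adds a column holding the name of the originating list element
labelled_df <- map(random_list, \(x) data.frame(value = x)) |>
  list_rbind(names_to = "sample")

head(labelled_df)
table(labelled_df$sample)
```

This gives the same kind of long, labelled layout as utils::stack(), while staying within a purrr pipeline.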