    Filtering Rows in R Where Column Value is Between Two Values

    Steven P. Sanderson II, MPH, published 2024-03-01 05:00:00
    [This article was first published on Steve's Data Tips and Tricks, and kindly contributed to R-bloggers.]

    Introduction

    Filtering data frames in R is a common task in data analysis. Often we want to subset a data frame to only keep rows that meet certain criteria. A useful filtering technique is keeping rows where a column value falls between two specified values.

    In this post, we’ll walk through how to filter rows in R where a column value is between two values using base R syntax.

    Filtering with bracket notation

    One way to filter rows is by using bracket notation [] and specifying a logical vector.

    Let’s create a sample data frame:

    df <- data.frame(
      id = 1:10,
      value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10)
    )

    We can filter df to only keep rows where value is between 5 and 8 with:

    df[df$value >= 5 & df$value <= 8,]
      id value
    1  1     5
    3  3     6
    7  7     7
    9  9     8

    This keeps rows where value is greater than or equal to 5 (df$value >= 5) AND less than or equal to 8 (df$value <= 8). The logical vector sits in the row position, before the comma; leaving the column position after the comma empty returns all columns.
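
To make the mechanics visible, here is a minimal sketch that stores the condition in an intermediate variable before indexing. The names keep, in_range, and in_range_values are illustrative and do not appear in the original post.

```r
df <- data.frame(
  id = 1:10,
  value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10)
)

# One TRUE/FALSE per row: TRUE where value lies in the inclusive range [5, 8]
keep <- df$value >= 5 & df$value <= 8

# Logical vector in the row position; the empty column position keeps all columns
in_range <- df[keep, ]

# The same rows, restricted to a single column (returned as a plain vector)
in_range_values <- df[keep, "value"]
```

Printing keep shows one logical value per row, which is exactly what the row position of the brackets consumes.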

    Filtering with subset()

    Another option is using the subset() function:

    subset(df, value >= 5 & value <= 8)
      id value
    1  1     5
    3  3     6
    7  7     7
    9  9     8

    subset() takes a data frame as its first argument and a logical expression as its second. Inside that expression, columns can be referenced by name (value) without the df$ prefix.
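
One practical difference between the two approaches shows up with missing values. The sketch below uses a made-up data frame, df_na, that is not part of the original post. Bracket indexing returns an all-NA row wherever the condition evaluates to NA, while subset() drops those rows. Wrapping the condition in which() is one way to get the same behavior from bracket notation.

```r
# Hypothetical data with a missing value in the filtered column
df_na <- data.frame(
  id = 1:4,
  value = c(5, NA, 7, 10)
)

# The condition is TRUE, NA, TRUE, FALSE for the four rows
in_range <- df_na$value >= 5 & df_na$value <= 8

# Bracket notation keeps a placeholder row of NAs for the NA condition
bracket_result <- df_na[in_range, ]

# subset() treats an NA condition as FALSE and drops the row
subset_result <- subset(df_na, value >= 5 & value <= 8)

# which() returns only the positions that are TRUE, so the NA is skipped
which_result <- df_na[which(in_range), ]
```

If the column can contain NA, it is worth deciding up front which of these behaviors is wanted.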

    Additional examples

    We can filter on different columns and value ranges:

    # id between 3 and 7
    df[df$id >= 3 & df$id <= 7,]
      id value
    3  3     6
    4  4     9
    5  5     2
    6  6     4
    7  7     7

    # value less than 5
    subset(df, value < 5)
      id value
    2  2     3
    5  5     2
    6  6     4
    8  8     1

    It’s also possible to keep rows outside a range by negating the condition with !, or to filter on just one side of a range:

    # id NOT between 3 and 7
    df[!(df$id >= 3 & df$id <= 7),]
       id value
    1   1     5
    2   2     3
    8   8     1
    9   9     8
    10 10    10

    # value greater than 5
    subset(df, value > 5)
       id value
    3   3     6
    4   4     9
    7   7     7
    9   9     8
    10 10    10
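
The negated condition can also be written out directly by flipping each comparison and replacing & with |. The sketch below checks that both spellings select the same rows; the variable names negated and flipped are illustrative only.

```r
df <- data.frame(
  id = 1:10,
  value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10)
)

# Negate the whole "between" condition
negated <- df[!(df$id >= 3 & df$id <= 7), ]

# Equivalent form: below the lower bound OR above the upper bound
flipped <- df[df$id < 3 | df$id > 7, ]

# Both approaches return the same data frame
same_rows <- identical(negated, flipped)
```

The ! form is handy when the range condition already exists in the code, while the | form can be easier to read on its own.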

    Summary

    Filtering data frames where a column is between two values is straightforward in R. The key steps are:

    • Use bracket notation df[logical,] or subset(df, logical)
    • Create a logical expression with & and >=, <= operators
    • Specify the column name and range of values to filter between
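
The steps above can be collected into a small helper. This is only a sketch: filter_between() is a hypothetical name, not a base R function, and treating NA values as out of range is an assumption made here rather than something the original post specifies.

```r
# Hypothetical helper: keep rows where `column` lies in [lower, upper]
filter_between <- function(data, column, lower, upper) {
  x <- data[[column]]
  # TRUE only for non-missing values inside the inclusive range
  keep <- !is.na(x) & x >= lower & x <= upper
  # drop = FALSE keeps the result a data frame even for one-column inputs
  data[keep, , drop = FALSE]
}

df <- data.frame(
  id = 1:10,
  value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10)
)

# Same filter as the first example in this post
result <- filter_between(df, "value", 5, 8)
```

Passing the column name as a string keeps the helper in plain base R and avoids the non-standard evaluation that subset() relies on.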

    I encourage you to try filtering data frames on your own! Subsetting by logical expressions is an important skill for efficient R programming.
