Filtering data frames in R is a common task in data analysis. Often we want to subset a data frame to only keep rows that meet certain criteria. A useful filtering technique is keeping rows where a column value falls between two specified values.
In this post, we’ll walk through how to filter rows in R where a column value is between two values using base R syntax.
One way to filter rows is by using bracket notation [] and specifying a logical vector.
Let’s create a sample data frame:
df <- data.frame( id = 1:10, value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10) )
We can filter df to only keep rows where value is between 5 and 8 with:
df[df$value >= 5 & df$value <= 8,]
  id value
1  1     5
3  3     6
7  7     7
9  9     8
This filters for rows where value is greater than or equal to 5 (df$value >= 5) AND less than or equal to 8 (df$value <= 8). The comma after the logical vector separates the row index from the column index; leaving the column position empty returns every column for the matching rows.
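To make the logical vector visible, the sketch below stores it in a variable before indexing. The names keep, in_range and ids_in_range are illustrative choices, not part of the original example.

```r
df <- data.frame(id = 1:10, value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10))

# Logical vector with one TRUE/FALSE per row of df
keep <- df$value >= 5 & df$value <= 8
print(keep)

# Row position gets the logical vector; the empty column position keeps all columns
in_range <- df[keep, ]

# Naming a column after the comma returns only that column for the kept rows
ids_in_range <- df[keep, "id"]
```

Storing the condition in a variable also makes it easy to reuse the same filter on several data frames with matching rows.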
Another option is using the subset() function:
subset(df, value >= 5 & value <= 8)
  id value
1  1     5
3  3     6
7  7     7
9  9     8
subset() takes a data frame as the first argument, then a logical expression like the one used with bracket notation. Inside subset(), columns can be referenced by name without the df$ prefix.
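The two approaches behave differently when the filtered column contains missing values. The sketch below uses a small data frame, df_na, invented here for illustration, to show the difference: an NA in a bracket index produces a row of NAs, while subset() treats NA as FALSE. Wrapping the condition in which() is one common way to get the same behaviour from bracket notation.

```r
# Hypothetical data with one missing value (not part of the original example)
df_na <- data.frame(id = 1:4, value = c(5, NA, 7, 9))

# The comparison propagates the NA: TRUE, NA, TRUE, FALSE
cond <- df_na$value >= 5 & df_na$value <= 8

bracket_rows <- df_na[cond, ]          # the NA position becomes a row of NAs
which_rows   <- df_na[which(cond), ]   # which() keeps only the TRUE positions
subset_rows  <- subset(df_na, value >= 5 & value <= 8)  # NA is treated as FALSE
```

If the column may contain NA, checking nrow() on the result is a quick way to spot unexpected all-NA rows.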
We can filter on different columns and value ranges:
# id between 3 and 7
df[df$id >= 3 & df$id <= 7,]
  id value
3  3     6
4  4     9
5  5     2
6  6     4
7  7     7
# value less than 5
subset(df, value < 5)
  id value
2  2     3
5  5     2
6  6     4
8  8     1
It’s also possible to filter rows outside a range by negating the condition with !, or to filter with a one-sided comparison:
# id NOT between 3 and 7
df[!(df$id >= 3 & df$id <= 7),]
   id value
1   1     5
2   2     3
8   8     1
9   9     8
10 10    10
# value greater than 5
subset(df, value > 5)
   id value
3   3     6
4   4     9
7   7     7
9   9     8
10 10    10
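The negated range filter can also be written by literally flipping the operators: >= becomes <, <= becomes >, and & becomes |. The sketch below compares the two forms on the same data; the variable names negated and flipped are illustrative.

```r
df <- data.frame(id = 1:10, value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10))

# Negating the whole "between" condition
negated <- df[!(df$id >= 3 & df$id <= 7), ]

# Equivalent form with each comparison flipped and & replaced by |
flipped <- df[df$id < 3 | df$id > 7, ]

# Compare the two results
print(all.equal(negated, flipped))
```

Which form to use is mostly a readability choice; the negated version stays closer to the original "between" condition.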
Filtering data frames where a column is between two values is straightforward in R. The key steps are:
Use bracket notation df[logical,] or subset(df, logical) to select the rows.
Build the logical condition by combining the >= and <= operators with &.
I encourage you to try filtering data frames on your own! Subsetting by logical expressions is an important skill for efficient R programming.
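As a starting point for experimenting, the steps can be wrapped in a small function. The sketch below is only one possible approach: filter_between() and its arguments are hypothetical names, not part of base R, and it assumes an inclusive range and that rows with a missing value should be dropped.

```r
# filter_between() is a hypothetical helper, not a base R function
filter_between <- function(data, column, lower, upper) {
  values <- data[[column]]
  # Keep rows inside the inclusive range; rows with NA are dropped (an assumption)
  keep <- !is.na(values) & values >= lower & values <= upper
  data[keep, , drop = FALSE]
}

df <- data.frame(id = 1:10, value = c(5, 3, 6, 9, 2, 4, 7, 1, 8, 10))
result <- filter_between(df, "value", 5, 8)
print(result)
```

Passing the column name as a string keeps the helper in plain base R and avoids the non-standard evaluation that subset() relies on.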