What is the select() function in R?
Overview
The select() function from the dplyr package picks specific variables (columns) of a data frame or tibble. Columns can be selected by name or with conditions such as contains, matches, starts with, and ends with.
Note: The num_range(), matches(), contains(), starts_with(), and ends_with() functions are selection helpers that can be used as filters inside the select() function of the dplyr R package.
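As a minimal sketch of the helpers named in the note, the snippet below applies each one to a small data frame. The data frame df and its column names are made up for illustration and are not part of any dataset.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data frame, invented for this illustration
df <- data.frame(
  id = 1:3,
  x1 = c(0.1, 0.2, 0.3),
  x2 = c(1.1, 1.2, 1.3),
  x3 = c(2.1, 2.2, 2.3),
  height_cm = c(170, 165, 180),
  weight_kg = c(70, 60, 85)
)

by_prefix   <- df %>% select(starts_with("x"))    # names that begin with "x"
by_suffix   <- df %>% select(ends_with("_kg"))    # names that end with "_kg"
by_substr   <- df %>% select(contains("eight"))   # names containing "eight"
by_regex    <- df %>% select(matches("^x[12]$"))  # names matching a regular expression
by_numrange <- df %>% select(num_range("x", 2:3)) # x2 and x3

names(by_numrange)
```

Each helper returns a set of column positions, so the results keep the original column order of df.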
Syntax
select(.data, ...)
Parameter values
The select() function takes the following argument values:
.data: This can be a data frame, a tibble, or a lazy data frame.
...: One or more unquoted expressions separated by commas. These can be variable names or expressions like x:y that select a range of variables.
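To illustrate how several expressions can be passed through ..., here is a small sketch that combines a single name, a range, and a helper in one call. It assumes the starwars dataset that ships with dplyr.

```r
library(dplyr, warn.conflicts = FALSE)

# Three comma-separated expressions in one call:
# a single column name, a range, and a selection helper
picked <- starwars %>% select(name, height:mass, ends_with("color"))

names(picked)
```

The columns come back in the order the expressions select them, so name appears first, followed by the range and then the color columns.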
Return value
This function returns an object of the same type as .data, containing only the selected columns.
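A quick way to check the return type is to compare the class of the result for different inputs. The sketch below assumes the built-in iris data frame and the starwars tibble from dplyr.

```r
library(dplyr, warn.conflicts = FALSE)

# iris is a base data.frame; starwars is a tibble
from_df     <- select(iris, Species)
from_tibble <- select(starwars, name)

class(from_df)                   # still a data.frame, even with one column
inherits(from_tibble, "tbl_df")  # still a tibble
```

Note that selecting a single column does not drop the result to a vector, unlike iris[, "Species"] in base R.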
Example
Here are four use cases of the select() function in R:
# load dplyr library
library(dplyr, warn.conflicts = FALSE, quietly = TRUE)
# select only height feature from starwars dataset
starwars %>% select(height)
- Line 2: We load the dplyr package in the program with the warn.conflicts argument set to FALSE. This suppresses warnings about name conflicts with already attached packages.
- Line 4: We invoke the select() function to pick the height feature from the starwars dataset.
# load dplyr library
library(dplyr, warn.conflicts = FALSE, quietly = TRUE)
# select name to skin_color features
starwars %>% select(name:skin_color)
- Line 4: We select the feature columns from name to skin_color, inclusive, from the starwars dataset.
# load dplyr library
library(dplyr, warn.conflicts = FALSE, quietly = TRUE)
# select excluding the range name:mass
starwars %>% select(!(name:mass))
- Line 4: select(!(name:mass)) selects every feature of the starwars dataset except those in the range name:mass.
# load dplyr library
library(dplyr, warn.conflicts = FALSE, quietly = TRUE)
# select columns whose names don't end with "Width"
iris %>% select(!ends_with("Width"))
- Line 4: select(!ends_with("Width")) only selects feature columns of the iris dataset whose labels don't end with the keyword "Width".
Note: %>% is the forward pipe operator, which dplyr makes available from the magrittr package. It allows for command chaining by forwarding the result of one expression into the next expression as its first argument.
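To make the forwarding behavior concrete, the following sketch compares a piped call with the equivalent direct call, again assuming the starwars dataset from dplyr.

```r
library(dplyr, warn.conflicts = FALSE)

# The pipe passes its left-hand side as the first argument (.data)
piped  <- starwars %>% select(name, height)
direct <- select(starwars, name, height)

identical(piped, direct)

# Chaining: the result of select() is forwarded to head()
first_rows <- starwars %>% select(name, height) %>% head(3)
```

Both forms produce the same result; the piped form tends to read more naturally once several steps are chained.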