What is the update() function in R?
Functions in R are reusable blocks of code that help maintain organization and prevent code repetition. Functions are either built-in (predefined functions that are available in R) or user-defined (written by the user).
The update() function is a built-in function that refits a model after modifying specific parts of its original call, such as the formula or the data, while keeping every other argument unchanged. By using the update() function, we can extend and customize an existing model without having to rewrite the fitting call from scratch.
Syntax
The syntax of the update() function is as follows:
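update(object, formula., ...)
object: The fitted model object we wish to modify.
formula.: The new formula for the model, or the changes to apply to the original formula. Note that the argument name ends with a dot.
...: Additional arguments to add to, or change in, the original call (for example, data).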
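As a minimal sketch of how the formula. argument can be written, the snippet below uses the built-in mtcars data set (an assumption made for illustration; it does not appear in the examples that follow). Inside the formula passed to update(), a dot on either side of ~ stands for the corresponding part of the original formula, so terms can be added without retyping the whole formula. Passing evaluate = FALSE returns the modified call without refitting it, which can help when checking what update() is about to run.

```r
# Fit a starting model on the built-in mtcars data set (illustrative choice)
fit <- lm(mpg ~ wt, data = mtcars)

# "." stands for the matching side of the original formula,
# so this refits the model as mpg ~ wt + hp
fit_more <- update(fit, . ~ . + hp)

# Show the formula of the refitted model
formula(fit_more)

# Return the modified call instead of evaluating it
new_call <- update(fit, . ~ . + hp, evaluate = FALSE)
new_call
```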
Usage
We will explore the usage of the update() function using the following examples:
Example 1: Updating a model with a new formula
The following code shows how to update a linear model with a new formula:
# Create a vector of independent variables
x <- c(1, 2, 3, 4, 5, 6)

# Create a vector of dependent variables
y <- c(42, 43, 44, 45, 43, 47)

# Fit a linear model
model <- lm(y ~ x)

# Update the model with a new formula
new_model <- update(model, y ~ x + I(x^2))

# Print the model summary
summary(new_model)
Code explanation
Line 2: This creates a vector x containing the values of the independent variable: 1, 2, 3, 4, 5, and 6.
Line 5: This creates a vector y containing the values of the dependent variable: 42, 43, 44, 45, 43, and 47.
Line 8: This fits a linear model using the lm() function, where y is regressed on x. This means it attempts to find a linear relationship between x and y.
Line 11: This updates the model created in line 8 using the update() function. The updated model, stored in the new_model variable, includes an additional term I(x^2). Wrapping x^2 in I() tells R to treat ^ as arithmetic exponentiation rather than as a formula operator, so x is squared before being included in the model.
Line 14: This prints a summary of new_model using the summary() function. The summary provides various statistics about the fitted model, including coefficient estimates, standard errors, t-values, p-values, and the overall model fit.
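One way to check what the update changed is to compare the two fits side by side. The following is a sketch rather than part of the original example: it repeats the data from Example 1, lists the coefficient names of both models, and uses anova() to compare the nested models.

```r
# Same data as in Example 1
x <- c(1, 2, 3, 4, 5, 6)
y <- c(42, 43, 44, 45, 43, 47)

# Original model and its updated version
model <- lm(y ~ x)
new_model <- update(model, y ~ x + I(x^2))

# The updated model has one extra coefficient for the squared term
names(coef(model))
names(coef(new_model))

# F-test comparing the nested models
anova(model, new_model)
```

With only six observations, the comparison is mainly useful for seeing the mechanics, not for drawing conclusions about the quadratic term.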
Example 2: Removing a variable from a model
The following code shows how to remove a variable using the update() function:
# Sample data points
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)
z <- c(3, 6, 9, 12, 15)

# Create a data frame using the sample data points
data <- data.frame(x, y, z)

# Fit a linear regression model
model <- lm(y ~ x + z, data = data)

# Print the summary of the initial model
summary(model)

# Remove the variable `z` from the model
reduced_model <- update(model, y ~ x - z)

# Print the summary of the reduced model
summary(reduced_model)
Code explanation
Lines 2–4: These lines define sample data points for the variables x, y, and z.
Line 7: This creates a data frame named data using the data.frame() function, where x, y, and z are assigned as columns of the data frame.
Line 10: This fits a linear regression model using the lm() function. The formula y ~ x + z specifies that y is the response variable, and x and z are the predictor variables, using the data from the data data frame.
Line 13: This line uses the summary() function to obtain a summary of the initial model, including its coefficients, standard errors, p-values, and goodness-of-fit measures.
Line 16: This line uses the update() function to create a reduced model by removing the variable z from the original model. The updated formula, y ~ x - z, indicates that z should be excluded from the model.
Line 19: Finally, the summary() function is used to print the summary of the reduced model. This allows us to analyze how removing a variable affects the model.
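In the sample data above, y equals 2x and z equals 3x exactly, so x and z are perfectly collinear; R cannot estimate a separate coefficient for z (it is reported as NA), and summary() will likely warn about an essentially perfect fit. The sketch below uses hypothetical data (data2, with values invented purely for illustration) in which z is not a multiple of x, and removes z with the dot shorthand . ~ . - z, which keeps the rest of the original formula as it is.

```r
# Hypothetical data: z is deliberately not a multiple of x,
# and y is not an exact linear function of x
data2 <- data.frame(
  x = c(1, 2, 3, 4, 5),
  z = c(2, 1, 4, 3, 6),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1)
)

# Fit the full model with both predictors
full_model <- lm(y ~ x + z, data = data2)

# Drop z while keeping everything else from the original formula
reduced_model <- update(full_model, . ~ . - z)

# Inspect the resulting formula and coefficients
formula(reduced_model)
coef(reduced_model)
```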
Example 3: Updating a model with new data
The following code shows how to update a linear model with new data:
# Create a vector of independent variables
x <- c(1, 2, 3, 4, 5, 6)

# Create a vector of dependent variables
y <- c(42, 43, 44, 45, 43, 47)

# Fit a linear model
model <- lm(y ~ x)

# Create a new data frame with new observations
new_data <- data.frame(x = c(11, 12, 13, 14, 15, 16), y = c(52, 54, 56, 58, 60, 61))

# Update the model with the new data
updated_model <- update(model, data = new_data)

# Print the model summary
summary(updated_model)
Code explanation
Line 11: This creates a new data frame named new_data using the data.frame() function. The data frame has two variables: x and y. The x variable contains the values 11, 12, 13, 14, 15, and 16, and the y variable contains the values 52, 54, 56, 58, 60, and 61. Essentially, it creates a new set of observations for the independent variable x and the dependent variable y.
Line 14: This updates the previously created model, model, using the update() function. By specifying the data = new_data argument, the same formula, y ~ x, is refitted using the observations in the new_data data frame. Note that the refit uses only new_data; the original six observations are not combined with the new ones. The result is stored in the updated_model variable.
Line 17: The summary() function is applied to the updated model to obtain a summary of the model's statistical information based on the new data. This allows us to analyze how the model's parameters and statistical measures change when fitting the model to a different dataset.
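Passing data = new_data refits the model on new_data alone. If the goal is to fit the model on the old and new observations together, one possible approach (a sketch; the names old_data, refit_new, and refit_all are introduced here for illustration) is to keep the original observations in a data frame and pass the combined rows to update():

```r
# Original observations, stored in a data frame this time
x <- c(1, 2, 3, 4, 5, 6)
y <- c(42, 43, 44, 45, 43, 47)
old_data <- data.frame(x, y)
model <- lm(y ~ x, data = old_data)

# New observations, as in Example 3
new_data <- data.frame(x = c(11, 12, 13, 14, 15, 16),
                       y = c(52, 54, 56, 58, 60, 61))

# Refit on the new observations only
refit_new <- update(model, data = new_data)

# Refit on the old and new observations together
refit_all <- update(model, data = rbind(old_data, new_data))

# Number of observations used by each fit
nobs(refit_new)  # 6
nobs(refit_all)  # 12
```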