How to perform UPDATE from SELECT statement
Updating data
An UPDATE statement in SQL is to modify existing record(s) in a table. It allows us to change the values in one or more columns.
-- update tableUPDATE table_nameSET column1 = value1, column2 = value2, ...WHERE condition;
Example
Note: We will be using the following two tables to demonstrate all the examples below.
Users table
ID | Username | Birthdate | Amount_spent | |
1 | 'john_doe' | john@example.com | 1990-05-15 | 225.80 |
2 | 'jane_smith' | jane@example.com | 1998-10-22 | 341.25 |
3 | 'alex_brown' | alex@example.com | 1995-02-08 | 0 |
4 | 'emily_watson' | emily@example.com | 1992-07-30 | 0 |
5 | 'micheal_jackson' | micheal@example.com | 1985-08-29 | 0 |
6 | 'susan_walker' | susan@example.com | NULL | 0 |
7 | 'lisa_jones' | lisa@example.com | 1998-03-17 | 0 |
Orders table
order_id | user_id | order_date | total_payment |
101 | 1 | 2023-08-01 | 150.00 |
102 | 2 | 2023-08-02 | 220.50 |
103 | 1 | 2023-08-03 | 75.80 |
104 | 3 | 2023-08-04 | 350.00 |
105 | 4 | 2023-08-05 | 45.25 |
106 | 2 | 2023-08-06 | 120.75 |
107 | 2 | 2023-08-07 | 85.40 |
Let’s look at an example of an UPDATE query being used to modify a record in the users table.
-- birthdate of username: susan_walker beforeSELECT username, birthdateFROM usersWHERE username = 'susan_walker';UPDATE usersSET birthdate = '1969-12-15'WHERE username = 'susan_walker';-- birthdate of username: susan_walker afterSELECT username, birthdateFROM usersWHERE username = 'susan_walker';
Here we are using a constant value (i.e. '1969-12-15') to change the birthdate of susan_walker.
What if we want to update a record using value(s) from another table in the database? This is where we would require an UPDATE from SELECT query. There are a couple of ways to do this.
SELECT subquery method
A subquery, also known as a nested query, is a query that is embedded into another query. Subqueries can be placed in different parts of the SQL statement depending on their purpose. For example, in the code below, we have used subquery in the SET clause.
Example
Another order was placed:
order_id | user_id | order_date | total_amount |
108 | 7 | 2023-08-08 | 50.65 |
The amount_spent attribute in the users table represents the sum of the total_payment of all orders for each user. Let's update and add the total_amount of this order to the amount_spent for user with ID number 7 in the users table.
-- amount_spent by id number 7 beforeSELECT id, amount_spentFROM usersWHERE id = 7;-- UPDATE from SELECTUPDATE usersSET amount_spent =(SELECT SUM(total_payment) FROM orders WHERE user_id = 7)WHERE id = 7;-- amount_spent by id number 7 afterSELECT id, amount_spentFROM usersWHERE id = 7;
There are also a few drawbacks to this method. The readability of the code will decrease as the complexity of the subquery increase, thus making it harder to keep track. Moreover, with large datasets, the execution time and space requirements increase significantly because the subquery is executed for each row being updated.
Join method
A Join is used to combine data from two or more tables based on a condition. Inner Join (which we will be using here) is a type of join that outputs only the rows that have matching values in both tables.
The tables are joined based on an attribute present in both tables. The attribute may have different names in both tables, but the values that it represents have to be the same.
In the code below, the joining condition we have used is the user identity number in both tables - id in users and user_id in orders.
Example
The amount_spent column for all the users is currently set to 0. We will use INNER JOIN to sum the total_amount from the orders table for each user and then update the amount_spent in the users table.
-- amount_spent user beforeselect username, amount_spentFROM users;-- UPDATE from SELECTUPDATE usersINNER JOIN (SELECT user_id, SUM(total_payment) AS total_sumFROM ordersGROUP BY user_id) AS order_totals ON users.id = order_totals.user_idSET users.amount_spent = order_totals.total_sum;-- amount_spent users afterselect username, amount_spentFROM users;
A drawback of this method is its inability to handle unmatched instances between the two tables. Based on the join condition, if the match is not found, no update will occur for that row. This can result in inconsistency if the query was meant to update all the records regardless of a match.
MERGE method
Also known as an “upsert”, the MERGE statement is the combination of the UPDATE and INSERT commands. It modifies existing records in the table and adds records if they do not exist. Based on the query, the MERGE synchronizes the records in the target table using the source table.
Example
Note: Since
MERGEstatement is not available in MySQL, we will be using Microsoft SQL Server.
We will be performing the same operation as in the previous example.
MERGE INTO users AS uUSING (SELECT user_id, SUM(total_payment) AS total_sumFROM ordersGROUP BY user_id) AS order_totals ON u.id = order_totals.user_idWHEN MATCHED THENUPDATE SET u.amount_spent = order_totals.total_sum;select * from users;
A drawback of the MERGE statement is that it is not supported on the majority of database systems available, thus limiting the choice of the user. Furthermore, as the size of the source and target tables increase, so does the time taken for its execution.
Conclusion
The UPDATE from SELECT statement enables data modifications in an SQL database table based on information retrieved from another table using SELECT. The following three methods were discussed to achieve this:
The first one uses an UPDATE statement with a SELECT subquery, which allows modifying data based on values from other tables.
The second approach involves an
INNER JOINto combine data for updates, offering better performance but reducing readability.The third and last method combines
UPDATEandINSERT, which provides a comprehensive way to synchronize data. However, it it essential to take into account its complexity and lack of support of the majority of database platforms.
Selecting the appropriate method to implement UPDATE from SELECT depends on the specific scenario's complexity, readability, and performance considerations.
Free Resources