An UPDATE
statement in SQL is to modify existing record(s) in a table. It allows us to change the values in one or more columns.
-- update tableUPDATE table_nameSET column1 = value1, column2 = value2, ...WHERE condition;
Note: We will be using the following two tables to demonstrate all the examples below.
ID | Username | Birthdate | Amount_spent | |
1 | 'john_doe' | john@example.com | 1990-05-15 | 225.80 |
2 | 'jane_smith' | jane@example.com | 1998-10-22 | 341.25 |
3 | 'alex_brown' | alex@example.com | 1995-02-08 | 0 |
4 | 'emily_watson' | emily@example.com | 1992-07-30 | 0 |
5 | 'micheal_jackson' | micheal@example.com | 1985-08-29 | 0 |
6 | 'susan_walker' | susan@example.com | NULL | 0 |
7 | 'lisa_jones' | lisa@example.com | 1998-03-17 | 0 |
order_id | user_id | order_date | total_payment |
101 | 1 | 2023-08-01 | 150.00 |
102 | 2 | 2023-08-02 | 220.50 |
103 | 1 | 2023-08-03 | 75.80 |
104 | 3 | 2023-08-04 | 350.00 |
105 | 4 | 2023-08-05 | 45.25 |
106 | 2 | 2023-08-06 | 120.75 |
107 | 2 | 2023-08-07 | 85.40 |
Let’s look at an example of an UPDATE
query being used to modify a record in the users
table.
-- birthdate of username: susan_walker beforeSELECT username, birthdateFROM usersWHERE username = 'susan_walker';UPDATE usersSET birthdate = '1969-12-15'WHERE username = 'susan_walker';-- birthdate of username: susan_walker afterSELECT username, birthdateFROM usersWHERE username = 'susan_walker';
Here we are using a constant value (i.e. '1969-12-15'
) to change the birthdate
of susan_walker
.
What if we want to update a record using value(s) from another table in the database? This is where we would require an UPDATE from SELECT
query. There are a couple of ways to do this.
A subquery, also known as a nested query, is a query that is embedded into another query. Subqueries can be placed in different parts of the SQL statement depending on their purpose. For example, in the code below, we have used subquery in the SET
clause.
Another order was placed:
order_id | user_id | order_date | total_amount |
108 | 7 | 2023-08-08 | 50.65 |
The amount_spent
attribute in the users
table represents the sum of the total_payment
of all orders for each user. Let's update and add the total_amount
of this order to the amount_spent
for user with ID number 7 in the users
table.
-- amount_spent by id number 7 beforeSELECT id, amount_spentFROM usersWHERE id = 7;-- UPDATE from SELECTUPDATE usersSET amount_spent =(SELECT SUM(total_payment) FROM orders WHERE user_id = 7)WHERE id = 7;-- amount_spent by id number 7 afterSELECT id, amount_spentFROM usersWHERE id = 7;
There are also a few drawbacks to this method. The readability of the code will decrease as the complexity of the subquery increase, thus making it harder to keep track. Moreover, with large datasets, the execution time and space requirements increase significantly because the subquery is executed for each row being updated.
A Join is used to combine data from two or more tables based on a condition. Inner Join (which we will be using here) is a type of join that outputs only the rows that have matching values in both tables.
The tables are joined based on an attribute present in both tables. The attribute may have different names in both tables, but the values that it represents have to be the same.
In the code below, the joining condition we have used is the user identity number in both tables - id
in users
and user_id
in orders
.
The amount_spent
column for all the users is currently set to 0. We will use INNER JOIN
to sum the total_amount
from the orders
table for each user and then update the amount_spent
in the users
table.
-- amount_spent user beforeselect username, amount_spentFROM users;-- UPDATE from SELECTUPDATE usersINNER JOIN (SELECT user_id, SUM(total_payment) AS total_sumFROM ordersGROUP BY user_id) AS order_totals ON users.id = order_totals.user_idSET users.amount_spent = order_totals.total_sum;-- amount_spent users afterselect username, amount_spentFROM users;
A drawback of this method is its inability to handle unmatched instances between the two tables. Based on the join condition, if the match is not found, no update will occur for that row. This can result in inconsistency if the query was meant to update all the records regardless of a match.
Also known as an “upsert”, the MERGE statement is the combination of the UPDATE
and INSERT
commands. It modifies existing records in the table and adds records if they do not exist. Based on the query, the MERGE
synchronizes the records in the target table using the source table.
Note: Since
MERGE
statement is not available in MySQL, we will be using Microsoft SQL Server.
We will be performing the same operation as in the previous example.
MERGE INTO users AS uUSING (SELECT user_id, SUM(total_payment) AS total_sumFROM ordersGROUP BY user_id) AS order_totals ON u.id = order_totals.user_idWHEN MATCHED THENUPDATE SET u.amount_spent = order_totals.total_sum;select * from users;
A drawback of the MERGE
statement is that it is not supported on the majority of database systems available, thus limiting the choice of the user. Furthermore, as the size of the source and target tables increase, so does the time taken for its execution.
The UPDATE from SELECT
statement enables data modifications in an SQL database table based on information retrieved from another table using SELECT
. The following three methods were discussed to achieve this:
The first one uses an UPDATE statement with a SELECT subquery, which allows modifying data based on values from other tables.
The second approach involves an INNER JOIN
to combine data for updates, offering better performance but reducing readability.
The third and last method combines UPDATE
and INSERT
, which provides a comprehensive way to synchronize data. However, it it essential to take into account its complexity and lack of support of the majority of database platforms.
Selecting the appropriate method to implement UPDATE from SELECT
depends on the specific scenario's complexity, readability, and performance considerations.
Free Resources