How to remove duplicate rows from an SQL table
Overview
In SQL, we can follow the following two steps to remove duplicate values:
- Group all the rows using the target column(s).
- Find all the row IDs using the
MIN()function. This will give us a list of all the row IDs. In case of duplicate rows, only the ID of the first instance will be taken. - Delete all the rows that are not present in the list returned in step 2.
Example
Let's create a table and insert some values into it.
/* Create table Student */CREATE TABLE Student(ID INT NOT NULL,NAME VARCHAR (20) NOT NULL,AGE INT NOT NULL,PRIMARY KEY (ID));/* Insert values in table Student */INSERT INTO Student (ID, NAME, AGE)VALUES (1, 'Shubh', 22);INSERT INTO Student (ID, NAME, AGE)VALUES (2, 'Parth', 21);INSERT INTO Student (ID, NAME, AGE)VALUES (3, 'Shubh', 22);INSERT INTO Student (ID, NAME, AGE)VALUES (4, 'Pratik', 23);/* Display table data */SELECT * FROM Student;
Explanation
- Lines 2 to 7: We create a table,
Student. - Lines 10 to 20: We insert a few values in the
Studenttable. - Line 23: We display the data present in the
Studenttable.
Now let's find and delete all the duplicate rows from the Student table.
/* Create table Student */CREATE TABLE Student(ID INT NOT NULL,NAME VARCHAR (20) NOT NULL,AGE INT NOT NULL,PRIMARY KEY (ID));/* Insert values in table Student */INSERT INTO Student (ID, NAME, AGE)VALUES (1, 'Shubh', 22);INSERT INTO Student (ID, NAME, AGE)VALUES (2, 'Parth', 21);INSERT INTO Student (ID, NAME, AGE)VALUES (3, 'Shubh', 22);INSERT INTO Student (ID, NAME, AGE)VALUES (4, 'Pratik', 23);/* Removing duplicate rows from 'Student' table */DELETE FROM Student WHERE ID NOT IN (SELECT ID FROM (SELECT MIN(ID) AS ID FROM StudentGROUP BY NAME, AGE) AS S);/* Display table data */SELECT * FROM Student;
Explanation
- Lines 2 to 7: We create a table,
Student. - Lines 10 to 20: We insert a few values in the
Studenttable. - Lines 23 to 28: We find a list of all the row IDs in the
Studenttable using theMIN()function and delete those rows that are not present in this list. - Line 31: We display the data present in the
Studenttable.