How To Delete Duplicate Rows in MySQL
[1]. Prepare sample data
The following script creates table contacts
and inserts sample data into the contacts
table for the demonstration.
DROP TABLE IF EXISTS contacts;
CREATE TABLE contacts (
id INT PRIMARY KEY AUTO_INCREMENT,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(255) NOT NULL
);
INSERT INTO contacts (first_name,last_name,email)
VALUES ('Carine ','Schmitt','carine.schmitt@verizon.net'),
('Jean','King','jean.king@me.com'),
('Peter','Ferguson','peter.ferguson@google.com'),
('Janine ','Labrune','janine.labrune@aol.com'),
('Jonas ','Bergulfsen','jonas.bergulfsen@mac.com'),
('Janine ','Labrune','janine.labrune@aol.com'),
('Susan','Nelson','susan.nelson@comcast.net'),
('Zbyszek ','Piestrzeniewicz','zbyszek.piestrzeniewicz@att.net'),
('Roland','Keitel','roland.keitel@yahoo.com'),
('Julie','Murphy','julie.murphy@yahoo.com'),
('Kwai','Lee','kwai.lee@google.com'),
('Jean','King','jean.king@me.com'),
('Susan','Nelson','susan.nelson@comcast.net'),
('Roland','Keitel','roland.keitel@yahoo.com');
Note that you can execute this script to recreate test data after you execute a DELETE
statement.
This query returns data from the contacts table:
SELECT * FROM contacts
ORDER BY email;

The following query returns the duplicate emails in the contacts
table:
SELECT
email, COUNT(email)
FROM
contacts
GROUP BY
email
HAVING
COUNT(email) > 1;

As you can see, we have four rows with duplicate emails.
[2]. Delete duplicate rows using DELETE JOIN
statement
MySQL provides you with the DELETE JOIN
statement that allows you to remove duplicate rows quickly.
The following statement deletes duplicate rows and keeps the highest id:
DELETE t1 FROM contacts t1
INNER JOIN contacts t2
WHERE
t1.id < t2.id AND
t1.email = t2.email;
This query references the contacts table twice, therefore, it uses the table alias t1 and t2.
The output is:
Query OK, 4 rows affected (0.10 sec)
It indicated that four rows have been deleted. You can execute the query that find duplicate emails again to verify the delete:
SELECT
email,
COUNT(email)
FROM
contacts
GROUP BY
email
HAVING
COUNT(email) > 1;
The query returns an empty set, which means that the duplicate rows have been deleted.
Let’s verify data from the contacts
table:
SELECT * FROM contacts;

The rows with id 2, 4, 7, and 9 have been deleted.
Comments
Post a Comment