Delete Duplicate Records using SQL

By duplicate record I mean that every field in one record is identical to every field in a different record, i.e. a duplicate is where there is no way of telling two or more records apart. If you just need to remove records which are similar (i.e. one or more fields are identical but there are one or more fields which are different) then instead refer to how to delete similar records.

To check that you have duplicate records in your table do the following:

select count(*) from MyTable

and

select distinct * from MyTable

unfortunately SQL does not permit a select count(distinct). You have duplicate records if the number of records returned by the second query is less than the number of records returned by the first.
Unfortunately there is no way in SQL to delete one of these duplicates without deleting all of them. They are identical after all, so there is no SQL query that you could put together which could distinguish between them.
What you can do is to copy all the distinct records into a new table:

select distinct *

into NewTable
from MyTable

This query will create a new table (NewTable in my example) containing all the records in the original table but without any records being duplicated. It will therefore preserve a single copy of those records which were duplicated.
Because this query creates (and populates) a new table, it will fail if the table already exists.

Leave a Reply

Your email address will not be published. Required fields are marked *