Jul 8, 2011

How to remove duplicate records from a table using SQL?

There are several possible ways of doing this, but the two most common are:

Simpler and faster approach - In this approach we simply create another table containing only the distinct rows of the original table, drop the original table, and finally rename the new table to the original name. Voila! We're done. Note that SELECT DISTINCT * removes only rows that are identical in every column. Also remember that dropping a table drops its indexes and privileges as well, so you'll need to create them again.

CREATE TABLE NEW_TABLE AS SELECT DISTINCT * FROM ORIGINAL_TABLE;
DROP TABLE ORIGINAL_TABLE;
RENAME NEW_TABLE TO ORIGINAL_TABLE;

...create indexes/privileges on ORIGINAL_TABLE now...
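As a rough illustration of that last step, recreating an index and a grant might look like the sketch below. The names ORIGINAL_TABLE_IDX, KEY_COL and APP_USER are placeholders invented for this example; use whatever indexes and grants the original table actually had.

```sql
-- Hypothetical names: ORIGINAL_TABLE_IDX, KEY_COL and APP_USER are placeholders.
-- Recreate an index that was lost when the original table was dropped.
CREATE INDEX ORIGINAL_TABLE_IDX ON ORIGINAL_TABLE (KEY_COL);

-- Re-grant the privileges that users or roles had on the original table.
GRANT SELECT, INSERT, UPDATE, DELETE ON ORIGINAL_TABLE TO APP_USER;
```

It is worth noting down the existing indexes and grants before running the DROP, since they cannot be read back from the table afterwards.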

The standard ROWID approach - here we compare the ROWIDs of the records having the same key values (the duplicate records) and keep only one row from each group of duplicates, the one having either the min or the max ROWID, deleting the rest. Don't worry, ROWIDs are system generated and unique within a table, so a group will never have more than one min (or max).

DELETE FROM ORIGINAL_TABLE T1 WHERE ROWID > (SELECT MIN(ROWID) FROM ORIGINAL_TABLE T2 WHERE T1.KEY = T2.KEY);

OR

DELETE FROM ORIGINAL_TABLE T1 WHERE ROWID < (SELECT MAX(ROWID) FROM ORIGINAL_TABLE T2 WHERE T1.KEY = T2.KEY);

Here KEY represents the set of columns based on which we're deciding the duplicates.
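When KEY is more than one column, every column has to appear in the comparison. The sketch below assumes a two-column key; COL1 and COL2 are hypothetical column names used only for illustration. It first lists the duplicated keys so you can check what will be affected, and then deletes all but the min-ROWID row in each group.

```sql
-- COL1 and COL2 are hypothetical column names standing in for KEY.
-- Step 1: list each duplicated key and how many rows share it.
SELECT COL1, COL2, COUNT(*) AS ROW_COUNT
FROM ORIGINAL_TABLE
GROUP BY COL1, COL2
HAVING COUNT(*) > 1;

-- Step 2: keep the row with the smallest ROWID for each (COL1, COL2) pair
-- and delete the others.
DELETE FROM ORIGINAL_TABLE T1
WHERE ROWID > (SELECT MIN(ROWID)
               FROM ORIGINAL_TABLE T2
               WHERE T1.COL1 = T2.COL1
                 AND T1.COL2 = T2.COL2);
```

One caveat: the equality comparison never matches NULL values, so rows whose key columns contain NULL are not treated as duplicates by this DELETE, even though GROUP BY in the first query puts them in the same group.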
