MVSFORUMS.com

vini · Posted: Fri Jan 30, 2004 9:05 am Post subject: Delete Duplicates

Hi ,

Can anyone tell me what the SQL would look like to delete all but one of each set of duplicate records in a table? If not SQL, is there any other method to achieve the same?

If following is the Data in a Table for Col1, Col2, Col3
x, y, z
x, y, z
x, y, z
a, b, c
a, b, c
l, m, n

After the Delete it should be
x, y, z
a, b, c
l, m, n

Actually a friend asked me this one ..

Thanks.
Vini.

kolusu · Posted: Fri Jan 30, 2004 9:50 am Post subject:

Vini,

Deleting the duplicates while retaining the first record is not that easy with SQL. However, there are other ways to delete the rows.

1. Unload the table, run a utility (SORT) to remove the duplicate records, and reload it.
2. Code a COBOL program that opens a cursor, performs the logic of determining the duplicates, and deletes those records.
3. Generate SQL statements using Easytrieve or any other utility to delete the duplicate records.
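As a rough illustration of option 1 (unload, remove duplicates, reload), here is a minimal sketch in Python. It assumes an in-memory SQLite database standing in for DB2 and a hypothetical table `t` with columns `col1`, `col2`, `col3`; the helper name `dedupe_by_reload` is made up for this example. On the mainframe the same steps would be done with an unload utility, SORT, and a load utility rather than with code like this.

```python
import sqlite3

def dedupe_by_reload(conn, table, columns):
    """Unload every row, drop repeated rows, then reload the table.

    `table` and `columns` are trusted identifiers here (hypothetical names),
    so they are interpolated directly into the SQL text.
    """
    cols = ", ".join(columns)
    rows = conn.execute(f"SELECT {cols} FROM {table}").fetchall()  # "unload"
    unique_rows = list(dict.fromkeys(rows))  # keeps the first copy of each row
    placeholders = ", ".join("?" for _ in columns)
    with conn:  # commit on success, roll back if the reload fails
        conn.execute(f"DELETE FROM {table}")
        conn.executemany(
            f"INSERT INTO {table} ({cols}) VALUES ({placeholders})",
            unique_rows,
        )
    return len(rows) - len(unique_rows)

# Sample data from the first post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("x", "y", "z"), ("x", "y", "z"), ("x", "y", "z"),
     ("a", "b", "c"), ("a", "b", "c"), ("l", "m", "n")],
)

removed = dedupe_by_reload(conn, "t", ["col1", "col2", "col3"])
remaining = conn.execute("SELECT col1, col2, col3 FROM t").fetchall()
print(removed, remaining)
```

This holds the whole table in memory, so it only suits small tables; for anything large, a sort utility is the more realistic choice.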

Hope this helps...

Cheers,

Kolusu

kolusu · Posted: Fri Jan 30, 2004 10:17 am Post subject:

Vini,

Try this sql

vini · Posted: Fri Jan 30, 2004 11:29 am Post subject:

kolusu ,
I think that SQL will eliminate all the duplicates and leave only those records which did not have any duplicates,
i.e. it would leave
l,m,n

vini.

kolusu · Posted: Mon Feb 02, 2004 6:49 am Post subject:

vini,

In your first post you said

vini · Posted: Mon Feb 02, 2004 5:16 pm Post subject:

kolusu,

I wanted to say that I think "l,m,n" would be the result of the query you suggested. The output I want is still as in the first post.

However, I could not run that query for real, as I could not find a table in which I could easily create some test data without affecting the other developers' testing or involving the DBA. On second thoughts, I think I can use the DEPT sample DB2 table. Most probably no one should have a problem with that.

That must work if you have run it at your end. It's just that when I pondered on it, I could not figure out how it could get the desired result,

because
HAVING COUNT(*) > 1
should return all rows which are duplicates, and the DELETE would then eliminate all of these. That would leave behind only those rows which originally had no duplicates.

Thanks
Vini
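The reasoning in the post above can be checked with a small sketch. This is not the query kolusu posted (that SQL is not shown in this thread); it is only an assumed form of a delete driven by HAVING COUNT(*) > 1, run against an in-memory SQLite database rather than DB2, on a hypothetical table `t`. The row-value IN syntax needs SQLite 3.15 or later.

```python
import sqlite3

# Sample data from the first post, in a hypothetical table t.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("x", "y", "z"), ("x", "y", "z"), ("x", "y", "z"),
     ("a", "b", "c"), ("a", "b", "c"), ("l", "m", "n")],
)

# The subquery returns one row per duplicated group; every table row that
# matches such a group is deleted, including the copy one would like to keep.
conn.execute("""
    DELETE FROM t
    WHERE (col1, col2, col3) IN (
        SELECT col1, col2, col3
        FROM t
        GROUP BY col1, col2, col3
        HAVING COUNT(*) > 1
    )
""")

left_over = conn.execute("SELECT col1, col2, col3 FROM t").fetchall()
print(left_over)  # [('l', 'm', 'n')]
```

With this form of the delete, only the row that never had a duplicate survives, which is the outcome described in the post.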

kolusu · Posted: Tue Feb 03, 2004 6:48 am Post subject:

vini,

The query I posted earlier will give you the following results.

vini · Posted: Tue Feb 03, 2004 4:35 pm Post subject:

kolusu,

Here developers don't have authority to create tables. Besides, even DEPT did not help; I got -536.

Finally took a small sized infrequently used table in Test region and both the Queries provided by you worked PERFECT !!!

I still have to figure them out though, for my own understanding that is.

Thanks
vini

danm · Intermediate Joined: 29 Jun 2004 Posts: 170 Topics: 73

Kolusu,

Your first query did give the correct result. But I need some explanation on how it works.

cp · Beginner Joined: 21 Oct 2005 Posts: 6 Topics: 1

Hi Kolusu,

I have tested the query you gave in post 3 (you mentioned there that the query would keep one row for each group of duplicates), but it is deleting all the duplicate rows.
Can you guide us on how to retain one row for each group of duplicates?
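One possible way to retain a single row per group, sketched here under clear assumptions: the table needs some value that tells otherwise identical rows apart. SQLite's implicit `rowid` plays that role in this example; it is SQLite-specific, and DB2 would need its own row identifier or a generated key instead. The table `t` and its columns are hypothetical, and this is not the query from post 3.

```python
import sqlite3

# Sample data from the first post, in a hypothetical table t.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("x", "y", "z"), ("x", "y", "z"), ("x", "y", "z"),
     ("a", "b", "c"), ("a", "b", "c"), ("l", "m", "n")],
)

# Keep the row with the lowest rowid in each group; delete every other row.
conn.execute("""
    DELETE FROM t
    WHERE rowid NOT IN (
        SELECT MIN(rowid)
        FROM t
        GROUP BY col1, col2, col3
    )
""")

kept = conn.execute(
    "SELECT col1, col2, col3 FROM t ORDER BY rowid"
).fetchall()
print(kept)  # [('x', 'y', 'z'), ('a', 'b', 'c'), ('l', 'm', 'n')]
```

The point of the design is that the delete condition differs between copies of the same row, so one copy per group fails the condition and stays.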

chandrankk · Beginner Joined: 06 Dec 2005 Posts: 8 Topics: 0

I am just wondering why a simple DISTINCT would not do in this case.

Of course, it will not work if you are considering only a partial set of the columns as the key.

chandrankk · Beginner Joined: 06 Dec 2005 Posts: 8 Topics: 0

Sorry, I understood the question wrongly.
I tested the query by Kolusu and am getting the same result cp is getting. That is, it is deleting all the duplicate rows. I don't know what is wrong here.

a_seshu

Hi Kolusu,

The reason I think all the duplicate rows get deleted, even in the case of a correlated query, is that the delete operation is performed at the set level.

Taking the same data example

x, y, z
x, y, z
x, y, z
a, b, c
a, b, c
l, m, n
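A small sketch of that set-level behaviour, assuming an in-memory SQLite database in place of DB2 and a hypothetical table `t`. The correlated subquery below is an assumed shape, not the query from post 3. Each row's duplicate count is judged against the table as it stood before the delete, so all three copies of x, y, z see a count of 3 and all of them qualify.

```python
import sqlite3

# The same sample data, in a hypothetical table t.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("x", "y", "z"), ("x", "y", "z"), ("x", "y", "z"),
     ("a", "b", "c"), ("a", "b", "c"), ("l", "m", "n")],
)

# Correlated form: for each row, count the rows that match it on all columns.
# The count is not re-evaluated as earlier copies disappear, so no copy of a
# duplicated row is left behind.
conn.execute("""
    DELETE FROM t
    WHERE (
        SELECT COUNT(*)
        FROM t AS other
        WHERE other.col1 = t.col1
          AND other.col2 = t.col2
          AND other.col3 = t.col3
    ) > 1
""")

survivors = conn.execute("SELECT col1, col2, col3 FROM t").fetchall()
print(survivors)  # [('l', 'm', 'n')]
```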

The query you posted in the post 3