MVSFORUMS.com
A Community of and for MVS Professionals
 

Handling Duplicates using Syncsort

 
raj_m
Beginner


Joined: 24 May 2005
Posts: 5
Topics: 2

Posted: Mon Sep 12, 2005 11:06 am    Post subject: Handling Duplicates using Syncsort

I have two files that need to be compared to produce the desired output using Syncsort.
Code:

Case 1
Input file:
key1 key2 key3
111  001  123
111  001  123
111  001  123
111  001  999
222  002  234
222  002  345
333  003  123

My desired output
222  002  234
222  002  345
333  003  123

Case 2
Input file:
key1 key2 key3
111  001  123
111  001  123
111  001  123
222  002  234
222  002  345
333  003  123

My desired output
111  001  123
222  002  234
222  002  345
333  003  123


In brief, what I need to do is:

* Sort on KEY1, KEY2 and KEY3 and eliminate duplicates.
* Then sort on KEY1 and KEY2 again. If duplicates are found, eliminate all records with matching KEY1 and KEY2 entries; otherwise retain the unique entry.
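
For readers who want to check the rule against the sample data, here is a minimal Python sketch of the two steps as they are worded above. It is only a model of the logic, not Syncsort syntax. The column positions (KEY1 in columns 1-3, KEY2 in 6-8, KEY3 in 11-13) are assumed from the sample layout, and the function names are made up for illustration. Note that applying the rule literally also drops the 222 002 records, which the desired output listings above keep; that discrepancy comes up later in the thread.

```python
from collections import Counter


def key12(record: str) -> tuple[str, str]:
    # Assumed layout: KEY1 in columns 1-3, KEY2 in columns 6-8.
    return record[0:3], record[5:8]


def key123(record: str) -> tuple[str, str, str]:
    # Assumed layout: KEY3 in columns 11-13.
    return record[0:3], record[5:8], record[10:13]


def two_step_filter(records: list[str]) -> list[str]:
    # Step 1: sort on KEY1, KEY2, KEY3 and keep one record per full key.
    first_by_key: dict[tuple[str, str, str], str] = {}
    for rec in sorted(records, key=key123):
        first_by_key.setdefault(key123(rec), rec)
    step1 = list(first_by_key.values())

    # Step 2: drop every record whose (KEY1, KEY2) occurs more than once.
    counts = Counter(key12(rec) for rec in step1)
    return [rec for rec in step1 if counts[key12(rec)] == 1]


case1 = ["111  001  123", "111  001  123", "111  001  123", "111  001  999",
         "222  002  234", "222  002  345", "333  003  123"]
case2 = ["111  001  123", "111  001  123", "111  001  123",
         "222  002  234", "222  002  345", "333  003  123"]

print(two_step_filter(case1))
print(two_step_filter(case2))
```

In Case 2 the three identical 111 001 123 records collapse to one in step 1, so that key survives step 2; in Case 1 the extra 111 001 999 record makes the key a duplicate and both are dropped.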

This is the master file. I need to do similar processing for the transaction file, and then use the two files with SPLICE to split the records into:
A. Match in File A & File B
B. In File A but not in File B
C. In File B but not in File A
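
The three-way split can be pictured with a short Python sketch. This only models the matching logic on KEY1 and KEY2; it is not SPLICE syntax. The sample records, the function names, and the choice to emit File A's record for a match are all assumptions made for illustration.

```python
def key12(record: str) -> tuple[str, str]:
    # Assumed layout: KEY1 in columns 1-3, KEY2 in columns 6-8.
    return record[0:3], record[5:8]


def split_by_key(file_a: list[str], file_b: list[str]):
    keys_a = {key12(rec) for rec in file_a}
    keys_b = {key12(rec) for rec in file_b}

    # A. Key present in both files (File A's record is emitted here).
    matched = [rec for rec in file_a if key12(rec) in keys_b]
    # B. Key present in File A only.
    only_a = [rec for rec in file_a if key12(rec) not in keys_b]
    # C. Key present in File B only.
    only_b = [rec for rec in file_b if key12(rec) not in keys_a]
    return matched, only_a, only_b


# Hypothetical master and transaction records, already de-duplicated.
master = ["111  001  123", "333  003  123"]
transaction = ["333  003  777", "444  004  111"]

matched, only_a, only_b = split_by_key(master, transaction)
print(matched, only_a, only_b)
```

Because both files are assumed to hold at most one record per key after the duplicate handling, a plain set lookup is enough; no many-to-many matching is modelled.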

I could do this in multiple sort steps, but wanted to find out whether it can be done more efficiently.

Any help would be greatly appreciated.

Thanks in advance,
Raj
kolusu
Site Admin


Joined: 26 Nov 2002
Posts: 12369
Topics: 75
Location: San Jose

Posted: Mon Sep 12, 2005 11:21 am

Quote:

* Sort on KEY1, KEY2 and KEY3 and eliminate duplicates.
* Then sort on KEY1 and KEY2 again. If duplicates are found, eliminate all records with matching KEY1 and KEY2 entries; otherwise retain the unique entry.

Raj_m,

Hmm, why do you need to sort first on key1, key2 and key3, and then sort a second time on just key1 and key2?

You can achieve that just by sorting on key1 and key2 and eliminating dups:
Code:

//STEP0100 EXEC PGM=SORT                                   
//SYSOUT   DD SYSOUT=*                                     
//SORTIN   DD *                                             
111  001  123                                               
111  001  123                                               
111  001  123                                               
111  001  999                                               
222  002  234                                               
222  002  345                                               
333  003  123                                               
//SORTOUT  DD SYSOUT=*                                     
//SYSIN    DD *                                             
  OPTION EQUALS              $ ENSURES TO PICK THE FIRST REC
  SORT FIELDS=(1,3,CH,A,     $ SORT ON KEY1                 
               6,3,CH,A)     $ SORT ON KEY2                 
  SUM FIELDS=NONE            $ ELIMINATE DUPS               
/*                                                         


The output from the above step is:
Code:

111  001  123
222  002  234
333  003  123
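
As a rough cross-check of that output, the effect of the control statements can be modelled in a few lines of Python. This is a sketch of the behaviour, not of how the sort product works internally: a stable sort stands in for OPTION EQUALS, and keeping the first record per key stands in for SUM FIELDS=NONE. The function name is made up for illustration.

```python
def key12(record: str) -> tuple[str, str]:
    # Same fields as SORT FIELDS=(1,3,CH,A,6,3,CH,A): columns 1-3 and 6-8.
    return record[0:3], record[5:8]


def sort_keep_first(records: list[str]) -> list[str]:
    kept = []
    seen = set()
    # Python's sort is stable, so records with equal keys stay in input
    # order, which is what OPTION EQUALS asks for.
    for rec in sorted(records, key=key12):
        if key12(rec) not in seen:
            # First record for this key is kept; later ones are dropped.
            seen.add(key12(rec))
            kept.append(rec)
    return kept


sortin = ["111  001  123", "111  001  123", "111  001  123", "111  001  999",
          "222  002  234", "222  002  345", "333  003  123"]

for rec in sort_keep_first(sortin):
    print(rec)
```

The sketch keeps one record per KEY1/KEY2 pair, so 111 001 999 and 222 002 345 are dropped because an earlier record with the same key was already kept.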


Quote:

This is the master file. I need to do similar processing for the transaction file, and then use the two files with SPLICE to split the records into:
A. Match in File A & File B
B. In File A but not in File B
C. In File B but not in File A


Did you check this topic? See the solution posted by me

http://www.mvsforums.com/helpboards/viewtopic.php?t=4723&highlight=joinkeys

Hope this helps...

Cheers
_________________
Kolusu
www.linkedin.com/in/kolusu
raj_m
Beginner


Joined: 24 May 2005
Posts: 5
Topics: 2

Posted: Mon Sep 12, 2005 11:52 am

Kolusu,

That does not give me the required output for Case 1. That is probably why I need to sort on KEY1, KEY2 and KEY3 first.
Let me explain.

Code:

Case 1
Input file:
key1 key2 key3
111  001  123
111  001  123
111  001  123
111  001  999
222  002  234
222  002  345
333  003  123

My desired output
222  002  234
222  002  345
333  003  123


Sorting on KEY1, KEY2 and KEY3 and eliminating duplicates would give me:
Code:

111  001  123 
111  001  999
222  002  234
222  002  345
333  003  123


Now records 1 and 2 are duplicates if we consider only KEY1 and KEY2. If I find duplicates, I need to eliminate all of the duplicated records (not keep one unique record out of them). In this case, record 1 and record 2 should not be in the output.

Code:

My desired output
222  002  234
222  002  345
333  003  123


For Case 2, since record 1 is unique on KEY1 and KEY2, I need to retain it in the output.

Code:

111  001  123
222  002  234
222  002  345
333  003  123



Thanks
Raj
kolusu
Site Admin


Joined: 26 Nov 2002
Posts: 12369
Topics: 75
Location: San Jose

Posted: Mon Sep 12, 2005 12:02 pm

Quote:

Sorting on KEY1, KEY2 and KEY3 and eliminating duplicates would give me:
Code:

111  001  123
111  001  999
222  002  234
222  002  345
333  003  123



Now records 1 and 2 are duplicates if we consider only KEY1 and KEY2. If I find duplicates, I need to eliminate all of the duplicated records (not keep one unique record out of them). In this case, record 1 and record 2 should not be in the output.

Code:

My desired output
222  002  234
222  002  345
333  003  123



Raj_m,

Something still does not seem right. In your final output, why did you pick the
Code:
222 002
records? They are also duplicates on key1 and key2. You eliminated
Code:

111  001  123
111  001  999


on the condition that they are duplicates on key1 and key2. Why doesn't the same rule apply to the 222 records?
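
The point is easy to confirm by counting records per KEY1/KEY2 pair. The following Python sketch does that for the five records quoted above; it is only an illustration, and the function name and assumed column positions (KEY1 in columns 1-3, KEY2 in 6-8) are not from any sort product.

```python
from collections import Counter


def key12(record: str) -> tuple[str, str]:
    # Assumed layout: KEY1 in columns 1-3, KEY2 in columns 6-8.
    return record[0:3], record[5:8]


def duplicated_keys(records: list[str]) -> list[tuple[str, str]]:
    # Count how many records share each (KEY1, KEY2) pair and report
    # the pairs that occur more than once.
    counts = Counter(key12(rec) for rec in records)
    return sorted(key for key, n in counts.items() if n > 1)


after_first_sort = ["111  001  123", "111  001  999",
                    "222  002  234", "222  002  345", "333  003  123"]

print(duplicated_keys(after_first_sort))
```

Both the 111 001 and the 222 002 pairs are reported, so a rule that removes every record with a duplicated KEY1/KEY2 leaves only 333 003 123.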

Kolusu
raj_m
Beginner


Joined: 24 May 2005
Posts: 5
Topics: 2

Posted: Mon Sep 12, 2005 12:07 pm

I'm sorry for the confusion, Kolusu. You are right, even those need to be eliminated.
I was just keying in some dummy values and missed that.
kolusu
Site Admin


Joined: 26 Nov 2002
Posts: 12369
Topics: 75
Location: San Jose

Posted: Mon Sep 12, 2005 12:22 pm

Raj_m,

If your intention is to eliminate all dups then check this link

http://www.mvsforums.com/helpboards/viewtopic.php?t=8&highlight=nodups

Hope this helps...

Cheers

Kolusu
All times are GMT - 5 Hours