MVSFORUMS.com
A Community of and for MVS Professionals
 

Merging two Files - Removing Duplicates

singhnarender79 (Beginner)
Posted: Mon May 09, 2005 12:21 pm

Hi Guys,

I have a requirement like this.

I have two input files that contain the same kind of records (as you can see below), and I want to merge them into one. Some records will be common to both files; for those, I want to give priority to the records in File 2 and remove the matching ones from File 1. The record from Type1 to the next Type1 forms one Record, and the Record key is 12345, 67890, 12561, etc.

One more thing: can I do this using JCL (DFSORT)?

Input 1 :

Type1 asdcaf12345
Type2
Type3
Type4
Type5
Type1 jdfvnd12561
Type2
Type3
Type4
Type5
Type6
Type7
Type1 asdfgg67890
Type2
Type3
Type4
Type5
Type6

Input 2:

Type1 asdfgg67890
Type2
Type3
Type4
Type5
Type6
Type1 asdcaf12345
Type2
Type3
Type4
Type5
Type1 jdfvnd98765
Type2
Type3
Type4

The output file should look like this (Giving priority to records in File 2):

Output :

Type1 asdcaf12345
Type2
Type3
Type4
Type5
Type1 jdfvnd12561
Type2
Type3
Type4
Type5
Type6
Type7
Type1 asdfgg67890
Type2
Type3
Type4
Type5
Type6
Type1 jdfvnd98765
Type2
Type3
Type4
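
To check the intended result off the mainframe, the requirement above can be sketched in Python. This is only an illustration of the merge logic, not a DFSORT solution. It assumes the toy layout used in this post (a header line starting with "Type1 " whose last token is the key), and the names make_group, split_groups, group_key and merge_with_priority are made up for the example.

```python
def make_group(key, last_type):
    """Build a toy group: a 'Type1 <key>' header followed by Type2..Type<last_type>."""
    return [f"Type1 {key}"] + [f"Type{n}" for n in range(2, last_type + 1)]


# The two sample inputs from the post.
FILE1 = make_group("asdcaf12345", 5) + make_group("jdfvnd12561", 7) + make_group("asdfgg67890", 6)
FILE2 = make_group("asdfgg67890", 6) + make_group("asdcaf12345", 5) + make_group("jdfvnd98765", 4)


def split_groups(records):
    """Split a flat record list into groups, each starting at a Type1 header."""
    groups = []
    for rec in records:
        if rec.startswith("Type1 "):
            groups.append([rec])
        elif groups:
            groups[-1].append(rec)
        # Assumption: detail records before the first header are ignored.
    return groups


def group_key(group):
    # Assumption: in this toy layout the key is the last token of the header line.
    return group[0].split()[-1]


def merge_with_priority(file1, file2):
    """Merge whole groups; a File 2 group replaces the File 1 group with the same key."""
    merged = {group_key(g): g for g in split_groups(file1)}
    for g in split_groups(file2):
        # Existing keys keep their File 1 position; new keys are appended.
        merged[group_key(g)] = g
    return [rec for g in merged.values() for rec in g]


if __name__ == "__main__":
    for rec in merge_with_priority(FILE1, FILE2):
        print(rec)
```

Because a Python dict keeps insertion order, the groups come out in File 1 order with the File 2-only groups at the end, which is the order shown in the sample output.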

Thanks,
Naren
_________________
"Hold fast to dreams, for if dreams die, life is a broken winged bird that cannot fly."

-- Langston Hughes

kolusu (Site Admin)
Posted: Mon May 09, 2005 12:29 pm

Naren,

A couple of questions. I assume that these are header and detail records. Is there a field on Detail records that shows they are linked to the particular header?

ex:
Code:

Type1 asdcaf12345
Type2
Type3
Type4
Type5


How can you tell that Type2, 3, 4 and 5 belong to the header record asdcaf12345?

If you can differentiate them, then it is very easy.

Please post the LRECL, RECFM and position of the key.

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu

Frank Yaeger (Sort Forum Moderator)
Posted: Mon May 09, 2005 2:16 pm

Naren,

Do the records actually have the strings 'Type1', 'Type2', etc in positions 1-5?
If not, what do the records really look like?
_________________
Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

singhnarender79 (Beginner)
Posted: Tue May 10, 2005 7:22 am

Hi Kolusu,

The LRECL is 90 and the RECFM is VB. The key starts at position 25 and is a combination of character and packed decimal fields (8 bytes in total). The record key structure looks like this:

MP-POL-DEPT PIC S9 COMP-3.
MP-POL-LET PIC XX.
MP-POL-YEAR PIC S9(3) COMP-3.
MP-POL-SERIAL PIC S9(5) COMP-3.
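
As a cross-check on the "8 bytes" figure, the four fields occupy 1 + 2 + 2 + 3 bytes (COMP-3 stores two digits per byte plus a sign nibble). A minimal Python sketch of decoding such a key follows. The function names and the sample values are made up, and it assumes that position 25 is 1-based and counted against the same byte image that is passed in (the thread does not say whether that count includes the 4-byte RDW of the VB record).

```python
def unpack_comp3(raw: bytes) -> int:
    """Decode a packed-decimal (COMP-3) field: two digits per byte, sign in the last nibble."""
    nibbles = []
    for b in raw:
        nibbles.append(b >> 4)
        nibbles.append(b & 0x0F)
    sign = nibbles.pop()  # 0xD (or 0xB) means negative; 0xC / 0xF mean positive
    value = 0
    for d in nibbles:
        value = value * 10 + d
    return -value if sign in (0x0B, 0x0D) else value


def decode_key(record: bytes, position: int = 25):
    """Return (dept, let, year, serial) from the 8-byte key starting at a 1-based position."""
    start = position - 1                 # assumption: position is 1-based on this byte image
    key = record[start:start + 8]
    dept = unpack_comp3(key[0:1])        # MP-POL-DEPT   PIC S9    COMP-3 -> 1 byte
    let = key[1:3].decode("cp037")       # MP-POL-LET    PIC XX           -> 2 bytes (EBCDIC)
    year = unpack_comp3(key[3:5])        # MP-POL-YEAR   PIC S9(3) COMP-3 -> 2 bytes
    serial = unpack_comp3(key[5:8])      # MP-POL-SERIAL PIC S9(5) COMP-3 -> 3 bytes
    return dept, let, year, serial


if __name__ == "__main__":
    # Made-up sample: 24 filler bytes, then dept=1, let='AB', year=105, serial=12345.
    sample = bytes(24) + b"\x1C" + "AB".encode("cp037") + b"\x10\x5C" + b"\x12\x34\x5C"
    print(decode_key(sample))
```

EBCDIC code page 037 is also an assumption; a site using a different code page would need a different codec name.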

Type 1 is the header record and the rest are detail records. Types 2, 3, 4 and 5 belong to the preceding Type 1 record, up to the next Type 1 record. I am afraid there is no field linking Type 1 to Types 2, 3, 4, 5, etc.

But there is one thing (see the actual records below):

The first two bytes of each record look like this in hex:

X'0101'SMOPHRT
X'0202'SMOPHNAM
X'0302'SMOPHADD
X'0302'SMOPHADD
X'0402'SMOPHOCC
X'0101'SMOPHRT
X'0202'SMOPHNAM
X'0302'SMOPHADD
X'0302'SMOPHADD
X'0302'SMOPHADD

As you can see, records of the same type have the same hex value, and as the record type increases, so does the hex value: X'0502', X'0602', etc.
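
Since X'0101' only appears on the root record, it can serve as the group delimiter. A small Python sketch of that idea, using the sample records above, is shown here. The names ROOT_PREFIX and split_groups are made up, and the record names are kept in ASCII for readability, whereas the real file would hold EBCDIC.

```python
ROOT_PREFIX = b"\x01\x01"  # X'0101' marks an SMOPHRT root record in the sample above

# The ten sample records, with the two-byte type prefix in front of each name.
SAMPLE = [
    b"\x01\x01SMOPHRT",
    b"\x02\x02SMOPHNAM",
    b"\x03\x02SMOPHADD",
    b"\x03\x02SMOPHADD",
    b"\x04\x02SMOPHOCC",
    b"\x01\x01SMOPHRT",
    b"\x02\x02SMOPHNAM",
    b"\x03\x02SMOPHADD",
    b"\x03\x02SMOPHADD",
    b"\x03\x02SMOPHADD",
]


def split_groups(records):
    """Start a new group at every root record; attach detail records to the current group."""
    groups = []
    for rec in records:
        if rec[:2] == ROOT_PREFIX:
            groups.append([rec])
        elif groups:
            groups[-1].append(rec)
        else:
            raise ValueError("detail record found before the first root record")
    return groups


if __name__ == "__main__":
    for number, group in enumerate(split_groups(SAMPLE), start=1):
        print(number, len(group))
```

Each detail record is tied to its root only by its position in the file, so any processing that reorders records has to carry the group identity along with every record first.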


Frank,

The records actually look like this
(I substituted Type1, Type2, etc. to make them easier to understand).
Only the HRT record contains the key; the rest are detail records (without the key).
The same pattern repeats for each new record key.

..SMOPHRT (Root Record)
..SMOPHNAM (Name record)
..SMOPHADD (Address Record)
..SMOPHADD (Address Record)
..SMOPHOCC (Occupation Record)
..SMOPHRT
..SMOPHNAM
..SMOPHADD
..SMOPHADD
..SMOPHADD
..SMOPHADD
..SMOPHRT
..SMOPHNAM
..SMOPHADD
..SMOPHADD
..SMOPHADD
..SMOPHOCC
..SMOPHOAG

The first two bytes are the hex values shown above.

Naren

Frank Yaeger (Sort Forum Moderator)
Posted: Tue May 10, 2005 10:46 am

Naren,

Please run the following job and post the //SYSOUT messages (even if you get an error message):

Code:

//S1 EXEC PGM=ICEMAN
//SYSOUT DD SYSOUT=*
//SORTIN DD *
//SORTOUT DD DUMMY
//SYSIN DD *
   OPTION COPY
   INREC OVERLAY=(5:C'A')
/*

singhnarender79 (Beginner)
Posted: Tue May 10, 2005 11:13 am

Frank,

This is what I got in the SYSOUT:

Code:

ICE143I 0 BLOCKSET COPY TECHNIQUE SELECTED
ICE000I 1 - CONTROL STATEMENTS FOR 5740-SM1, DFSORT REL 14.0 - 17:07 ON TUE MAY 10, 2005 -
OPTION COPY
INREC OVERLAY=(5:C'A')

Frank Yaeger (Sort Forum Moderator)
Posted: Tue May 10, 2005 2:24 pm

Naren,

I think I could come up with a solution for this using DFSORT's new IFTHEN function. But the job I had you run shows me that you don't have the PTF with IFTHEN installed - DFSORT R14 PTF UQ95213 (Dec, 2004). So any solution I came up with along those lines wouldn't do you any good until you installed that PTF.

Alain Benveniste (Beginner)
Posted: Wed May 11, 2005 4:36 am

Naren says
Code:

The record from Type1 to the next Type1 forms one Record

I'm not sure I understand this correctly.
My interpretation is that if a group is present in file2, then the same group in file1 must be removed completely and replaced by the one from file2; the number of records in the two groups can differ.
If that's what you need, I did this 2 weeks ago to merge VM directories.
As Frank says, it's too tricky to work something out without the new DFSORT PTF installed.

Alain

singhnarender79 (Beginner)
Posted: Wed May 11, 2005 8:27 am

Frank/Alain,

In that case, let me talk with our system programmers here, and if they are willing to install the new PTF, I will ask you for the solution.

Thanks,
Naren

singhnarender79 (Beginner)
Posted: Thu May 19, 2005 6:46 am

Hi Frank/Alain,

We have confirmation from the system maintenance team that they are going to install PTF UQ95213 on 29th May.

Could you please devise a solution based on DFSORT's new IFTHEN function?

Thanks,
Naren

Frank Yaeger (Sort Forum Moderator)
Posted: Fri May 20, 2005 12:02 pm

Naren,

In your example of the input and output records, the groups with a match have the same number of records in input file1 and input file2 and those records are identical. That is, the asdcaf12345 group has five records (type1-5) in both files and the asdfgg67890 has six records (type1-6) in both files. Do the matching groups always have the same number of identical records, or can they have different records or a different number of records? For example, could the 'key1' group have a group record, a name record and two address records in input file1 and a group record, a name record, three address records and an occupation record in input file2? If so, what would you want the output for that group to look like?

When you say "Giving priority to records in File 2", do you mean that for a matching group, you want to remove all of the input1 records, and keep all of the input2 records, or do you want to do something else? This goes along with my question above about whether the groups can have different records or a different number of records in the two input files.

singhnarender79 (Beginner)
Posted: Fri May 20, 2005 12:46 pm

Frank,

Matching groups will not necessarily have the same number of records or the same record types. They can differ.

So when I say "Giving priority to records in File 2", I mean that for a matching group, I want to remove all of the input1 records and keep all of the input2 records.

For example, the 'Key1' group can have 3 records in File 1 (root, name and address) and 5 records in File 2 (root, name, 2 address and occupation). In that case, my output file needs to contain the 5 records from File 2 for the 'Key1' group.

Naren

Frank Yaeger (Sort Forum Moderator)
Posted: Fri May 20, 2005 2:55 pm

Another question: For your output, you show the groups in the following key order:

asdcaf12345
jdfvnd12561
asdfgg67890
jdfvnd98765

Would it be ok to have the groups in their actual sorted key order, that is:

asdcaf12345
asdfgg67890
jdfvnd12561
jdfvnd98765

If not, what exactly are the rules for the order you want the keys sorted in?
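
The difference between the two orders can be sketched in Python as follows. This is only an illustration with made-up names (sort_groups, header_key) and shortened toy groups. It compares the keys as plain strings, whereas the real key mixes character and packed decimal fields and would be compared field by field.

```python
def header_key(group):
    # Assumption: toy layout where the key is the last token of the header line.
    return group[0].split()[-1]


def sort_groups(groups, key_of):
    """Order whole groups by their header key; records inside a group keep their order."""
    return sorted(groups, key=key_of)


# Groups in the order shown in the sample output (detail records shortened).
GROUPS = [
    ["Type1 asdcaf12345", "Type2", "Type3"],
    ["Type1 jdfvnd12561", "Type2", "Type3", "Type4"],
    ["Type1 asdfgg67890", "Type2"],
    ["Type1 jdfvnd98765", "Type2", "Type3"],
]

if __name__ == "__main__":
    for group in sort_groups(GROUPS, header_key):
        print(group[0], len(group))
```

Sorting groups rather than individual records matters here, because the detail records carry no key of their own.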

singhnarender79 (Beginner)
Posted: Mon May 23, 2005 3:42 am

Hi Frank,

As I mentioned in the 4th post from the top, the key in the record is as follows.

The key starts at position 25 and is a combination of character and packed decimal fields (8 bytes in total). The record key structure looks like this:

MP-POL-DEPT PIC S9 COMP-3.
MP-POL-LET PIC XX.
MP-POL-YEAR PIC S9(3) COMP-3.
MP-POL-SERIAL PIC S9(5) COMP-3.

and I want it to be sorted on this key.

Frank Yaeger (Sort Forum Moderator)
Posted: Mon May 23, 2005 10:42 am

Naren,

Yes, I understood what the key was and where it was. I was asking about the order of the output records according to the key. As I said, your output did not show the records ordered according to the key, but I'll take your statement that "I want it to be sorted on this key" to mean that the output records should be ordered according to the key.

I've figured out conceptually how to do this. Now I just need to find some time to put the actual job together. Hopefully, I'll be able to post the solution sometime today.

All times are GMT - 5 Hours