MVSFORUMS.com

rasprasads

How can I eliminate duplicates when using the OUTFIL statement in SYNCSORT?

My SYSIN would be like:

Mike Tebb · Posted: Fri Dec 13, 2002 3:04 am

SUM FIELDS=NONE will remove all records where the sort fields are duplicated.
_________________
Cheers - Mike
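
To make the effect of SUM FIELDS=NONE concrete, here is a minimal Python sketch of the idea (it is not SYNCSORT syntax). The two-character key and the sample records are made up for illustration. The sketch keeps the first record of each key in input order; which duplicate the real utility retains depends on its options, so treat that choice as an assumption.

```python
def sum_fields_none(records, key):
    """Sort by key and keep one record per distinct key value."""
    seen = set()
    kept = []
    # sorted() is stable, so records with equal keys stay in input order.
    for rec in sorted(records, key=key):
        k = key(rec)
        if k not in seen:
            seen.add(k)
            kept.append(rec)
    return kept


# Hypothetical records: the first two characters act as the sort field.
records = ["B2 second", "A1 first", "B2 again", "A1 repeat", "C3 only"]
deduped = sum_fields_none(records, key=lambda r: r[:2])
print(deduped)
```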

rasprasads · Posted: Fri Dec 13, 2002 3:48 am

Mike,
Thanks for your response, but what I wanted to know was how to eliminate the duplicates when using OUTFIL.

I have tried

kolusu · Posted: Fri Dec 13, 2002 5:35 am

Rasprasad,

The following JCL will give you the desired results. You cannot code a SUM FIELDS statement on OUTFIL, so we first write the out2 records to a temp file and then sort that file to remove the duplicates.
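
The two-step flow described here can be sketched in Python to show the logic (again, not SYNCSORT syntax). The tuple layout `(record type, sort key, payload)` is a hypothetical stand-in for the real fixed-position fields, and keeping the first record per key is an assumption.

```python
def split_then_dedupe(records, rec_type, sort_key):
    """Step 1 splits by record type; step 2 de-duplicates only the '20' file."""
    # Step 1: one sorted pass that routes records by type (the OUTFIL role).
    ordered = sorted(records, key=sort_key)
    out1 = [r for r in ordered if rec_type(r) == "10"]
    temp = [r for r in ordered if rec_type(r) == "20"]

    # Step 2: drop duplicate keys from the temp file only
    # (the role of the second sort with SUM FIELDS=NONE).
    out2, seen = [], set()
    for r in temp:
        k = sort_key(r)
        if k not in seen:
            seen.add(k)
            out2.append(r)
    return out1, out2


# Hypothetical records: (record type, sort key, payload).
records = [
    ("20", "K2", "a"),
    ("10", "K1", "b"),
    ("20", "K2", "c"),
    ("10", "K1", "d"),
    ("20", "K3", "e"),
]
out1, out2 = split_then_dedupe(
    records, rec_type=lambda r: r[0], sort_key=lambda r: r[1]
)
print(out1)
print(out2)
```

Only the '20' records go through the de-duplication pass; the '10' records are written out untouched, duplicates included.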

Mike Tebb · Posted: Fri Dec 13, 2002 6:24 am

Kolusu,

I had come up with the following:

kolusu · Posted: Fri Dec 13, 2002 6:38 am

Mike,

Technically speaking, your solution is okay, but efficiency-wise it is not. The second pass does a sum sort on the entire file instead of only the specific records.

Let us assume that the input file has 1 million records, of which 800,000 have '10' in position 35 and only 200,000 have '20' in position 35. Out of these 200,000, only 50,000 are duplicates.

So we split the file into 2 different files first and then sum sort only the smaller file.

Your solution sum sorts the entire file, which hurts performance.

For relatively small files you won't even notice the difference.
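
The difference in volume can be worked out from the figures assumed above. This is only back-of-the-envelope arithmetic on record counts entering the de-duplication pass, not a measurement of run time.

```python
# Figures taken from the example in the post.
total_records = 1_000_000
type_10 = 800_000
type_20 = 200_000
assert type_10 + type_20 == total_records

# Records that enter the de-duplication (sum sort) pass in each approach.
whole_file_pass = total_records   # sum sort over the entire file
split_first_pass = type_20        # sum sort over the '20' records only

saved = whole_file_pass - split_first_pass
share = split_first_pass / whole_file_pass

print(f"Records spared from the sum sort: {saved:,}")
print(f"Split-first pass handles {share:.0%} of the whole-file volume")
```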

Even in my posted solution, the first step can be a COPY instead of a SORT. But I assumed rasprasad wanted both files sorted on the field at 509.

I hope I explained it clearly. Let me know if you have any questions.

Thanks

Kolusu

Mike Tebb · Posted: Fri Dec 13, 2002 6:40 am

That is exactly what I was guessing would happen.

Thanks (as ever) for your advice.

kolusu · Posted: Fri Dec 13, 2002 6:55 am

Mike,

Well, this is another version to implement your logic. We first sum sort the entire file to eliminate the duplicates, but we write the duplicates to the SORTXSUM file. The XSUM parameter of the SUM statement sends all deleted records to the //SORTXSUM DD.

The second step reads the SORTXSUM file, includes only the '10' records, and appends them to the out1 file.
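
A rough Python model of this XSUM variant follows; it mimics the logic only and is not SYNCSORT syntax. The tuple layout `(record type, sort key, payload)` is hypothetical, the sample data assumes no key is shared between '10' and '20' records, and keeping the first record per key is an assumption.

```python
def sum_with_xsum(records, sort_key):
    """De-duplicate on sort_key; return (kept, discarded duplicates)."""
    kept, xsum, seen = [], [], set()
    for r in sorted(records, key=sort_key):
        k = sort_key(r)
        if k in seen:
            xsum.append(r)      # what XSUM would send to SORTXSUM
        else:
            seen.add(k)
            kept.append(r)
    return kept, xsum


# Hypothetical records: (record type, sort key, payload).
records = [
    ("20", "K2", "a"),
    ("10", "K1", "b"),
    ("20", "K2", "c"),
    ("10", "K1", "d"),
    ("20", "K3", "e"),
]

# Step 1: sum sort the whole file, then split the survivors by type.
kept, sortxsum = sum_with_xsum(records, sort_key=lambda r: r[1])
out1 = [r for r in kept if r[0] == "10"]
out2 = [r for r in kept if r[0] == "20"]

# Step 2: append the discarded '10' records back onto out1.
out1 += [r for r in sortxsum if r[0] == "10"]

print(out1)
print(out2)
```

Because the restored '10' records are appended at the end, out1 is not guaranteed to stay in key order after step 2; in this tiny sample it happens to, since there is only one '10' key.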

Mike Tebb · Posted: Fri Dec 13, 2002 9:20 am

For what it's worth, my first thought had been to use two sort steps.
My first step would output all the type 010 and 020 records to the relevant file:

kolusu · Posted: Fri Dec 13, 2002 9:38 am

Mike,

Your idea of 2 sorts is better than my idea of appending data using the XSUM feature. Maybe my brain had not started working at 6 am :(

Kolusu

Frank Yaeger · Posted: Fri Dec 13, 2002 11:14 am

Mike said "It should be noted for DFSORT users that XSUM is only available in SYNCSORT, as I have discovered when trying to 'advise' users of that product".

Mike,

Although DFSORT does not support XSUM, it does support the same function (and more) through ICETOOL's SELECT with DISCARD feature (which Syncsort does not support). For more information on that, see the "Keep dropped duplicate records (XSUM)" Smart DFSORT Trick at:

http://www.ibm.com/servers/storage/support/software/sort/mvs/tricks/

That should help you advise users of DFSORT/ICETOOL correctly.

Note that the DFSORT and ICETOOL documentation are freely available on the Web for reference at:

http://www.ibm.com/servers/storage/support/software/sort/mvs/srtmpub.html
_________________
Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

Mike Tebb · Posted: Fri Dec 13, 2002 11:27 am

Frank,

Rest assured that I was merely making the point that XSUM is a SYNCSORT-only statement, in the context of the solutions given in this (SYNCSORT) thread.

I am certainly not qualified to talk about the alternative options in DFSORT as I do not have access to your product.

Frank Yaeger · Posted: Fri Dec 13, 2002 1:45 pm

Mike,

I guess I misinterpreted what you said.

At any rate, even if your site doesn't have access to a DFSORT license, you still have access to the online DFSORT books in case you're ever curious about DFSORT/ICETOOL/ICEGENER.

And if anybody is interested in the DFSORT Team's analysis of DFSORT's advantages, contact me offline (yaeger@us.ibm.com) and I'll send you a document on that.