MVSFORUMS.com

karupps · Beginner Joined: 18 May 2005 Posts: 11 Topics: 3

I want to remove the duplicate records in the two files (if a record is present in both
files, don't write it to the output file).

All the files are variable-length record files. One record may have a length of 70, another
a length of 80, etc.

INFILE1
*******

kolusu · Posted: Wed May 18, 2005 5:27 am Post subject:

karupps,

What is the key to determine if the record is a duplicate? Is it the entire record (all 70 bytes of infile1)?

Using VLEN as the ON parameter of SELECT will only eliminate dups which are of the same length. In your case infile1 and infile2 have different LRECLs, so you will never have a duplicate.

Kolusu

Ps : Please do NOT send emails seeking help. All questions are to be posted on helpboards only
_________________
Kolusu
www.linkedin.com/in/kolusu

karupps · Beginner Joined: 18 May 2005 Posts: 11 Topics: 3

Kolusu,

Thank you very much

In my case both files have variable-length records. I want to remove the records of the same record length.

In both files, all the records have different lengths.

It is not possible to give the ON(p,m,f) format, because each record has a different length.

That is the only reason I tried VLEN.

If we use VLEN, it should remove the records (duplicates) of the same record length (whatever it is: 67, 68, 69, 70, ...).

Let me know if this is still not clear.

Thanks in advance

Karupps

Frank Yaeger · Posted: Wed May 18, 2005 10:25 am Post subject:

Karupps,

What is the LRECL of your input file?
_________________
Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

kolusu · Posted: Wed May 18, 2005 11:01 am Post subject:

Frank,

I am guessing that all the records have the full LRECL (VLTRIM off), so the OP was not able to eliminate the duplicate records on VLEN.

I can only think of creating 2 temp files using VLTRIM on OUTFIL, concatenating these 2 temp files, and then eliminating the dups on VLEN.

Kolusu

Frank Yaeger · Posted: Wed May 18, 2005 1:20 pm Post subject:

Kolusu,

Using VLEN would not be a reliable way to remove duplicates as there isn't necessarily any correspondence between the length of the records and whether they are dups. Any number of records could have the same length.

Do you mean VLFILL rather than VLTRIM? VLFILL could be used to pad the short records out so they could be compared and then VLTRIM could be used to remove the fill characters. That's why I wanted to know the LRECL. I don't know what you mean by creating 2 temp files using VLTRIM - that would only remove a specific character at the end of the records - I don't see how that applies unless you used VLFILL first.

kolusu · Posted: Wed May 18, 2005 1:26 pm Post subject:

Frank,

I meant VLTRIM only. I am guessing that the OP has datasets with trailing spaces on all records. So unless you remove the trailing spaces, you don't get the exact LRECL.

ie.

Frank Yaeger · Posted: Wed May 18, 2005 2:16 pm Post subject:

Kolusu,

Interesting assumption. But even if it's true that all of the records have trailing blanks and you remove them, how would just looking at the record length distinguish between:

AAA
AAA
BBB
CCC

All four will have a length of 7 (3 data bytes plus the 4-byte RDW), but only the AAA records are duplicates. I don't see how just comparing the record lengths would ever give you an accurate check for duplicates.

My idea was to use VLFILL to pad out all the records to the LRECL with a character (e.g. X'FF') that doesn't appear in the data. Then you could compare the entire padded record to identify the duplicates. Then you could use VLTRIM to remove the pad character (X'FF').
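
To illustrate the point with the four records above, here is a minimal Python sketch. It is not DFSORT syntax; the function names, the pad byte, and the LRECL of 10 are assumptions chosen only for illustration. It contrasts grouping by length alone with padding every record to a common length, comparing the whole padded record, and then trimming the pad byte again.

```python
from collections import Counter

PAD = b"\xff"  # assumed pad byte that never occurs in the data


def dups_by_length(records):
    """Return records whose length is shared with at least one other record."""
    counts = Counter(len(r) for r in records)
    return [r for r in records if counts[len(r)] > 1]


def dups_by_content(records, lrecl):
    """Pad each record to lrecl, compare whole records, then strip the padding."""
    padded = [r.ljust(lrecl, PAD) for r in records]
    counts = Counter(padded)
    return [p.rstrip(PAD) for p in padded if counts[p] > 1]


records = [b"AAA", b"AAA", b"BBB", b"CCC"]

# Length alone flags all four records, because they are all the same length.
print(dups_by_length(records))

# Whole-record comparison flags only the two AAA records.
print(dups_by_content(records, lrecl=10))
```

The pad byte has to be one that cannot appear in the data, otherwise trimming it afterwards would also remove genuine trailing bytes.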

kolusu · Posted: Wed May 18, 2005 3:15 pm Post subject:

Frank,

I assumed that OP does not care about the contents. He only needs unique LRECL records irrespective of the contents on the records.

Kolusu

Frank Yaeger · Posted: Wed May 18, 2005 4:34 pm Post subject:

Kolusu,

Oh, I see. Well, now that I read back through the posts, it's certainly not clear whether he wants to remove the records of the same length, or the records with duplicate content. He seems to ask for both in different posts. However, he states several times that the records are of different lengths, so if that's the case, I would think that his original job would do it. Those statements would contradict the assumption that all of the records are the same length. But then his posts are full of contradictions, so who knows.

Karupps,

If you're still interested in a solution, please tell us whether you want to

(1) eliminate records with the same length and different content. For example:

karupps · Beginner Joined: 18 May 2005 Posts: 11 Topics: 3

Hi Kolusu & Frank,

Thank you very much for all your suggestions.

I will state my problem clearly:

All my files are always VB.

Ex. (LRECL = 256)

INFILE1:
00000000,20010912,00095044,4794A,HE,FRA,16,1618,C,20010912,1,10,5
00000000,20010913,00095044,4794A,HE,FRA,1,2146,C,20010913,1,,,
00000000,20010916,00095044,4794A,HE,FRA,16,1804,C,20010916,19,,,,

INFILE2:
00000000,20010912,00095044,4794A,HE,FRA,16,1618,C,20010912,1,10,5
00000000,20010913,00095044,4794A,HE,FRA,12,2146,C,20010913,12,5,,
00000000,20010916,00095044,4794A,HE,FRA,16,1804,C,20010916,19,,,,

In both files the first and third records are the same, but there is a difference in the second records.

My output file should look like this (the first and third records removed):

OUTFILE

00000000,20010913,00095044,4794A,HE,FRA,1,2146,C,20010913,1,,,
00000000,20010913,00095044,4794A,HE,FRA,12,2146,C,20010913,12,5,,

I tried the ON(VLEN) option in SELECT, but it is not selecting any records to OUTFILE.

Let me know if my problem is still not clear.

Karupps
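
The output karupps describes (drop every record that appears in both files, keep the rest) can be sketched in Python as follows. This is only an illustration of the whole-record comparison, not the DFSORT/ICETOOL job; the function name is hypothetical, and the sketch assumes that a record repeated within a single file should be dropped as well.

```python
from collections import Counter


def records_without_duplicates(*files):
    """Return the records that occur exactly once across all input files."""
    all_records = [rec for f in files for rec in f]
    counts = Counter(all_records)
    # Keep input order; a record seen more than once anywhere is dropped.
    return [rec for rec in all_records if counts[rec] == 1]


infile1 = [
    "00000000,20010912,00095044,4794A,HE,FRA,16,1618,C,20010912,1,10,5",
    "00000000,20010913,00095044,4794A,HE,FRA,1,2146,C,20010913,1,,,",
    "00000000,20010916,00095044,4794A,HE,FRA,16,1804,C,20010916,19,,,,",
]
infile2 = [
    "00000000,20010912,00095044,4794A,HE,FRA,16,1618,C,20010912,1,10,5",
    "00000000,20010913,00095044,4794A,HE,FRA,12,2146,C,20010913,12,5,,",
    "00000000,20010916,00095044,4794A,HE,FRA,16,1804,C,20010916,19,,,,",
]

outfile = records_without_duplicates(infile1, infile2)
for rec in outfile:
    print(rec)
```

With the sample data, only the two differing second records remain, which matches the OUTFILE listing above.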

Frank Yaeger · Posted: Thu May 19, 2005 10:38 am Post subject:

Karupps,

Assuming that you want to compare the entire record, here's a DFSORT/ICETOOL job that will do what you asked for:

karupps · Beginner Joined: 18 May 2005 Posts: 11 Topics: 3

Hi Frank & Kolusu,

Thank you very much for your help.

Now my problem is solved.

Thanks,
Karupps

Mervyn · Posted: Mon Sep 26, 2005 8:52 am Post subject:

I'm trying to use this code to check some larger files (LRECL 14834). Here's my JCL:

Phantom · Posted: Mon Sep 26, 2005 9:01 am Post subject:

Mervyn,

Your LRECL is the problem. There are upper limits on the length of CH and BI fields. As far as I know, Syncsort does not support more than 4093 characters for CH and BI. I think this is the same with DFSORT. I am not sure if their latest version supports more than this.

Anyway, I don't think any sort products support 14,000 bytes at a stretch.

Hope this helps,

Thanks,
Phantom