Posted: Tue Feb 21, 2006 5:09 am Post subject: compare several columns in 2 files and write out difference
I need to compare 2 files (both LRECL=80 and FB for this example)
and check the values in file1 with file2. The key is in postion 15
This check has to be done for all columns starting with postion 45
I will have about 45 columns. For convenience sake I have just shown
a few.
My idea was to use SELECT FROM ... ON ... NODUPS. Since SELECT only
permits 10 ON's, this wont work for me.
Is there a better way of doing this?. Any change in any columns
for a key,starting position 45,should cause the record in
file1 with reversed amount written to file3 and file2 record
also written to file3. My problem now is how can I check
several columns and write out the result.
2950.00 1111 20050701-20050930 OPEN TECH HOST
950.00 2222 20050701-20050930 OPEN TECH PC
150.00 3333 20040701-20040930 BOOK PAYM AS400
file2
2950.00 1111 20060101-20060101 OPEN TECH HOST
950.00 2222 20050701-20050930 BOOK TECH PC
150.00 3333 20040701-20040930 BOOK PAYM UNIX
EXPECTED
------------
-950.00 2222 20050701-20050930 OPEN TECH PC
950.00 2222 20050701-20050930 BOOK TECH PC
-150.00 3333 20040701-20040930 BOOK PAYM AS400
150.00 3333 20040701-20040930 BOOK PAYM UNIX
1. The same files will not have duplicates, if any duplicates found
across files, they can be discarded.
2. If file1 record not found in file2, discard this record and if file2 record
not found in file1, this should appear in out file.
3 I have Z/OS DFSORT V1R5
On second thoughts, is it possible to have two output files one with
file1 records and the second with file2 records based on the following
criteria.
If file1 keys not in file2, discard file1 record
If file2 keys not in file1, write file2 records to OUT2
If file2 keys found in file1 and any column values starting with position
45 are different, then file1 record in OUT1 and file2 record to OUT2
If file2 keys found in file1 and all the col. values are the same starting
with position 45, then discard file1 and file2 records
Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
Posted: Tue Feb 21, 2006 11:37 am Post subject:
Are you asking now how to do your "second thoughts" requirement? If so, please show an example of the records in each input file and the records you expect for output, illustrating each of your conditions.
I don't understand your problem with the limit of 10 ON fields. A CH ON field can have up to 1500 bytes. Do you really have more than 10 separate fields as the key, or are you confusing the number of columns with the number of keys? _________________ Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
2950.00 1111 20050701-20050930 OPEN TECH HOST 1
950.00 2222 20050701-20050930 OPEN TECH PC 2
150.00 3333 20040701-20040930 BOOK WIRE AS400 3
100.00 8888 20050701-20050930 SETTL TECH PC 5
File2
Code:
----+----1----+----2----+----3----+----4----+----5----+----6----+----7--
2950.00 1111 20060101-20060101 OPEN TECH HOST 1
950.00 2222 20050701-20050930 BOOK TECH PC 2
150.00 3333 20040701-20040930 BOOK PAYM UNIX 3
350.00 6666 20040701-20040930 BOOK PAYM UNIX 4
Rule 1 duplicate in file 1 and 2 starting with col 45(discard from both out files )
Rule 2 pos 45-48 different in file 1 and file2, out1 with file1 and out2 with file2 record
Rule 3 pos 45-48 and 65-69 different in both files, out1 with file1 and out2 with file2 record
Rule 4 if File2 record not in file1, file2 record to Out2
Rule 5 if file1 record not in file2 discard from out1
Code:
Out1
------
950.00 2222 20050701-20050930 OPEN TECH PC 2
150.00 3333 20040701-20040930 BOOK WIRE AS400 3
Out2
------
950.00 2222 20050701-20050930 BOOK TECH PC 2
150.00 3333 20040701-20040930 BOOK PAYM UNIX 3
350.00 6666 20040701-20040930 BOOK PAYM UNIX 4
I tried to achieve this result, in the following manner but with different
col. position as in my real situation and got the following error. My idea was to check all cols in both files and keep no duplicates. I have about 40 to 45 possible columns to be verified. I am surely doing something wrong here. What is the best way to compare several columns in 2 files?
SELECT FROM(IN1) TO(OUT) -
ON(164,32,CH) -
ON(230,32,CH) -
ON(296,32,CH) -
ON(362,32,CH) -
ON(428,32,CH) -
ON(428,32,CH) -
ON(494,32,CH) -
ON(494,32,CH) -
ON(560,32,CH) -
ON(626,32,CH) -
ON(692,32,CH) -
$
MAXIMUM NUMBER OF ON KEYWORDS EXCEEDED
NODUPS
Joined: 26 Nov 2002 Posts: 12376 Topics: 75 Location: San Jose
Posted: Tue Feb 21, 2006 2:52 pm Post subject:
Shuko,
Since the fields you are checking aren't conscuetive , you need to use an inrec/outrec and put all the fields you want to check at the end of every record and then use the select statement on the new padded fields.
let us say you have a 80 byte file and you want to check the 2 byte fileds at pos 20, 24, 28, ...
Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
Posted: Tue Feb 21, 2006 4:18 pm Post subject:
Shuko,
The way I approach this kind of problem is to splice each pair of records together and then use INCLUDE logic to determine which record goes to which file according to the "rules" you need. Here's a DFSORT job to do that based on your example and my interpretation of your stated rules. Although I got the output you asked for, I may not have the INCLUDE conditions exactly the way you want them for your rules, but you can adjust them as needed.
Code:
//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN1 DD *
2950.00 1111 20050701-20050930 OPEN TECH HOST
950.00 2222 20050701-20050930 OPEN TECH PC
150.00 3333 20040701-20040930 BOOK WIRE AS400
100.00 8888 20050701-20050930 SETTL TECH PC
//IN2 DD *
2950.00 1111 20060101-20060101 OPEN TECH HOST
950.00 2222 20050701-20050930 BOOK TECH PC
150.00 3333 20040701-20040930 BOOK PAYM UNIX
350.00 6666 20040701-20040930 BOOK PAYM UNIX
/*
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//OUT1 DD SYSOUT=*
//OUT2 DD SYSOUT=*
//TOOLIN DD *
* Reformat IN1 records to:
* IN1 key/data|11|blanks..|
COPY FROM(IN1) TO(T1) USING(CTL1)
* Reformat IN2 records to:
* IN2 key/data|22|IN2 data|
COPY FROM(IN2) TO(T1) USING(CTL2)
* SPLICE records on key to get:
* IN1 key/data|id|IN2 data|
* id = 21 if key is in file1 and file2
* 11 if key is in file1 only
* 22 if key is in file2 only
SPLICE FROM(T1) TO(OUT1) ON(15,4,CH) WITH(81,1) WITH(85,36) -
KEEPNODUPS USING(CTL3)
/*
//CTL1CNTL DD *
OUTREC FIELDS=(1,80,81:C'11',165:X)
//CTL2CNTL DD *
OUTREC FIELDS=(1,80,81:C'22',85:45,36)
//CTL3CNTL DD *
* If id = 21 (record in both files) and 45-80 from file1 does
* not match 45-80 from file2, write record to file1.
OUTFIL FNAMES=OUT1,
INCLUDE=(81,2,CH,EQ,C'21',AND,45,35,CH,NE,85,35,CH),
* Remove IN2 key/data
OUTREC=(1,80)
* If id = 22 (record in file2 only) or
* id=21 (record in both files) and 45-80 from file1 does
* not match 45-80 from file2, write record to file2.
OUTFIL FNAMES=OUT2,
INCLUDE=(81,2,CH,EQ,C'22',OR,
(81,2,CH,EQ,C'21',AND,45,35,CH,NE,85,35,CH)),
OUTREC=(1,80)
/*
OUT1 has:
Code:
950.00 2222 20050701-20050930 OPEN TECH PC
150.00 3333 20040701-20040930 BOOK WIRE AS400
OUT2 has:
Code:
950.00 2222 20050701-20050930 OPEN TECH PC
150.00 3333 20040701-20040930 BOOK WIRE AS400
350.00 6666 20040701-20040930 BOOK PAYM UNIX
_________________ Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Thank you
I tried both the codes, Kolusu code in statement ON(81,40,CH) returned an error "FIELD BEYOND MAXIMUM RECORD LENGTH."
CTL2CNTL was declared in the JCL but not used, is there a statement missing?. Frank your code gave me the expected
result,except that I had the record with key 1111 in both the OUT files.
Joined: 26 Nov 2002 Posts: 12376 Topics: 75 Location: San Jose
Posted: Wed Feb 22, 2006 10:44 am Post subject:
Quote:
I tried both the codes, Kolusu code in statement ON(81,40,CH) returned an error "FIELD BEYOND MAXIMUM RECORD LENGTH."
Shuko,
That was just an example to show that I have 40 bytes of fields to validate. So you need to change to number of bytes you have padded at the end.
Quote:
CTL2CNTL was declared in the JCL but not used, is there a statement missing?. Frank your code gave me the expected
result,except that I had the record with key 1111 in both the OUT files.
yes it was a typo, it should be on the SELECT statement
Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
Posted: Wed Feb 22, 2006 1:12 pm Post subject:
Quote:
Frank your code gave me the expected
result,except that I had the record with key 1111 in both the OUT files.
Then you changed something or your input is not what you showed (wrong columns).
If you look at the OUTFIL statements, you'll see that I have this condition for both:
81,2,CH,EQ,C'21',AND,45,35,CH,NE,85,35,CH
The spliced 1111 record would have '21' in 81,2 but 45,35 and 85,35 would be equal, so we wouldn't include it.
Note that I didn't get the 1111 records in the output files when I ran the job with the SORTIN data shown.
To see what's going on with your version, temporarily remove USING(CTL3) from the SPLICE operator. OUT1 will then show you the spliced records before the INCLUDE conditions are applied, so you can see what's in 81,2 and 45,35 and 85,35 for each record. I suspect you just have data in the wrong columns relative to what you actually need. _________________ Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Frank,
I tried to implement your working example, for my actual situation.
LRECL=8000, 3 keys and check to be done from position 362 in both files.
I get the error ICE221A 1 INVALID FIELD OR CONSTANT IN OUT1 IFTHEN 0 CONDITION 2. Obviously I have an error in the second realational condition in OUT1 and cannot figure out the reason.
The code
Code:
* Reformat IN1 records to:
* IN1 key/data|11|blanks..|
COPY FROM(IN1) TO(T1) USING(CTL1)
* Reformat IN2 records to:
* IN2 key/data|22|IN2 data|
COPY FROM(IN2) TO(T1) USING(CTL2)
* SPLICE records on key to get:
* IN1 key/data|id|IN2 data|
* id = 21 if key is in file1 and file2
* 11 if key is in file1 only
* 22 if key is in file2 only
SPLICE FROM(T1) TO(OUT1) -
ON(98,32,CH) -
ON(230,32,CH) -
ON(296,32,CH) -
WITH(8001,1) WITH(8005,7639) -
KEEPNODUPS USING(CTL3)
/*
//CTL1CNTL DD *
OUTREC FIELDS=(1,8000,8001:C'11',16005:X)
//CTL2CNTL DD *
OUTREC FIELDS=(1,8000,8001:C'22',8005:362,7639)
//CTL3CNTL DD *
* If id = 21 (record in both files) and 362-8000 from file1 does
* not match 362-8000 from file2, write record to file1.
OUTFIL FNAMES=OUT1,
INCLUDE=(8001,2,CH,EQ,C'21',AND,362,7638,CH,NE,8005,7638,CH),
* Remove IN2 key/data
OUTREC=(1,8000)
* If id = 22 (record in file2 only) or
* id=21 (record in both files) and 362-8000 from file1 does
* not match 362-8000 from file2, write record to file2.
OUTFIL FNAMES=OUT2,
INCLUDE=(8001,2,CH,EQ,C'22',OR,
(8001,2,CH,EQ,C'21',AND,362,7638,CH,NE,8005,7638,CH)),
OUTREC=(1,8000)
/*
ICE650I 0 VISIT http://www.ibm.com/storage/dfsort FOR ICETOOL PAPERS,
ICE632I 0 SOURCE FOR ICETOOL STATEMENTS: TOOLIN
ICE630I 0 MODE IN EFFECT: STOP
* Reformat IN1 records to:
* IN1 key/data|11|blanks..|
COPY FROM(IN1) TO(T1) USING(CTL1)
ICE606I 0 DFSORT CALL 0001 FOR COPY FROM IN1 TO T1 USING CTL1CNTL
ICE602I 0 OPERATION RETURN CODE: 00
* Reformat IN2 records to:
* IN2 key/data|22|IN2 data|
COPY FROM(IN2) TO(T1) USING(CTL2)
ICE606I 0 DFSORT CALL 0002 FOR COPY FROM IN2 TO T1 USING CTL2CNTL
ICE602I 0 OPERATION RETURN CODE: 00
* SPLICE records on key to get:
* IN1 key/data|id|IN2 data|
* id = 21 if key is in file1 and file2
* 11 if key is in file1 only
* 22 if key is in file2 only
SPLICE FROM(T1) TO(OUT1) -
ON(98,32,CH) -
ON(230,32,CH) -
ON(296,32,CH) -
WITH(8001,1) WITH(8005,7639) -
KEEPNODUPS USING(CTL3)
ICE606I 0 DFSORT CALL 0003 FOR SORT FROM T1 TO OUT1 USING CTL3CNTL TER
ICE602I 0 OPERATION RETURN CODE: 16
ICE200I 0 IDENTIFIER FROM CALLING PROGRAM IS 0001
ICE143I 0 BLOCKSET COPY TECHNIQUE SELECTED
ICE250I 0 VISIT http://www.ibm.com/storage/dfsort FOR DFSORT PAPERS, EXA
ICE000I 0 - CONTROL STATEMENTS FOR 5694-A01, Z/OS DFSORT V1R5 - 10:55 ON
OUTREC FIELDS=(1,8000,8001:C'11',16005:X)
ICE146I 0 END OF STATEMENTS FROM CTL1CNTL - PARAMETER LIST STATEMENTS FO
DEBUG NOABEND,ESTAE
OPTION MSGDDN=DFSMSG,LIST,MSGPRT=ALL,RESINV=0,SORTDD=CTL1,SORT
RTOUT=T1,DYNALLOC
SORT FIELDS=COPY
ICE201I 0 RECORD TYPE IS F - DATA STARTS IN POSITION 1
ICE162I 0 ICEIEXIT CHANGED ONE OR MORE OPTIONS IN EFFECT
ICE751I 0 C5-K05352 C6-Q95214 C7-K90000 C8-K05352 E9-K06751 C9-BASE E5
ICE193I 0 ICEAM2 ENVIRONMENT IN EFFECT - ICEAM2 INSTALLATION MODULE SELE
ICE088I 0 SRZPIJFK.S1 . , INPUT LRECL = 8000, BLKSIZE = 3200
ICE093I 0 MAIN STORAGE = (MAX,4194304,4189278)
ICE156I 0 MAIN STORAGE ABOVE 16MB = (4098889,4098889)
ICE127I 0 OPTIONS: OVFLO=RC0 ,PAD=RC0 ,TRUNC=RC0 ,SPANINC=RC16,VLSCMP=N,
ICE128I 0 OPTIONS: SIZE=4194304,MAXLIM=1048576,MINLIM=450560,EQUALS=N,LI
ICE129I 0 OPTIONS: VIO=Y,RESDNT=ALL ,SMF=NO ,WRKSEC=Y,OUTSEC=Y,VERIFY=
ICE130I 0 OPTIONS: RESALL=4096,RESINV=0,SVC=109 ,CHECK=Y,WRKREL=Y,OUTREL
ICE131I 0 OPTIONS: TMAXLIM=4194304,ARESALL=0,ARESINV=0,OVERRGN=16384,CIN
ICE132I 0 OPTIONS: VLSHRT=N,ZDPRINT=N,IEXIT=Y,TEXIT=N,LISTX=N,EFS=NONE
ICE133I 0 OPTIONS: HIPRMAX=OPTIMAL,DSPSIZE=50 ,ODMAXBF=0,SOLRF=Y,VLLONG
ICE235I 0 OPTIONS: NULLOUT=RC0
ICE084I 0 EXCP ACCESS METHOD USED FOR T1
ICE084I 0 BSAM ACCESS METHOD USED FOR IN1
ICE751I 1 EF-K90000 F0-Q84357 E8-K90000
ICE090I 0 OUTPUT LRECL = 16005, BLKSIZE = 16005, TYPE = FB (SDB)
ICE171I 0 SORTOUT LRECL OF 16005 IS DIFFERENT FROM SORTIN(NN) LRECL OF 8
ICE055I 0 INSERT 0, DELETE 0
ICE054I 0 RECORDS - IN: 3, OUT: 3
ICE052I 0 END OF DFSORT
ICE200I 0 IDENTIFIER FROM CALLING PROGRAM IS 0002
ICE143I 0 BLOCKSET COPY TECHNIQUE SELECTED
ICE250I 0 VISIT http://www.ibm.com/storage/dfsort FOR DFSORT PAPERS, EXA
ICE000I 0 - CONTROL STATEMENTS FOR 5694-A01, Z/OS DFSORT V1R5 - 10:55 ON
OUTREC FIELDS=(1,8000,8001:C'22',8005:362,7639)
ICE146I 0 END OF STATEMENTS FROM CTL2CNTL - PARAMETER LIST STATEMENTS FO
DEBUG NOABEND,ESTAE
OPTION MSGDDN=DFSMSG,LIST,MSGPRT=ALL,RESINV=0,SORTDD=CTL2,SORT
RTOUT=T1,DYNALLOC
SORT FIELDS=COPY
ICE201I 0 RECORD TYPE IS F - DATA STARTS IN POSITION 1
ICE162I 0 ICEIEXIT CHANGED ONE OR MORE OPTIONS IN EFFECT
ICE751I 0 C5-K05352 C6-Q95214 C7-K90000 C8-K05352 E9-K06751 C9-BASE E5
ICE193I 0 ICEAM2 ENVIRONMENT IN EFFECT - ICEAM2 INSTALLATION MODULE SELE
ICE088I 0 SRZPIJFK.S1 . , INPUT LRECL = 8000, BLKSIZE = 3200
ICE093I 0 MAIN STORAGE = (MAX,4194304,4189278)
ICE156I 0 MAIN STORAGE ABOVE 16MB = (4099251,4099251)
ICE127I 0 OPTIONS: OVFLO=RC0 ,PAD=RC0 ,TRUNC=RC0 ,SPANINC=RC16,VLSCMP=N,
ICE128I 0 OPTIONS: SIZE=4194304,MAXLIM=1048576,MINLIM=450560,EQUALS=N,LI
ICE129I 0 OPTIONS: VIO=Y,RESDNT=ALL ,SMF=NO ,WRKSEC=Y,OUTSEC=Y,VERIFY=
ICE130I 0 OPTIONS: RESALL=4096,RESINV=0,SVC=109 ,CHECK=Y,WRKREL=Y,OUTREL
ICE131I 0 OPTIONS: TMAXLIM=4194304,ARESALL=0,ARESINV=0,OVERRGN=16384,CIN
ICE132I 0 OPTIONS: VLSHRT=N,ZDPRINT=N,IEXIT=Y,TEXIT=N,LISTX=N,EFS=NONE
ICE133I 0 OPTIONS: HIPRMAX=OPTIMAL,DSPSIZE=50 ,ODMAXBF=0,SOLRF=Y,VLLONG
ICE235I 0 OPTIONS: NULLOUT=RC0
ICE084I 0 EXCP ACCESS METHOD USED FOR T1
ICE084I 0 BSAM ACCESS METHOD USED FOR IN2
ICE751I 1 EF-K90000 F0-Q84357 E8-K90000
ICE090I 0 OUTPUT LRECL = 16005, BLKSIZE = 16005, TYPE = FB
ICE171I 0 SORTOUT LRECL OF 16005 IS DIFFERENT FROM SORTIN(NN) LRECL OF 8
ICE055I 0 INSERT 0, DELETE 0
ICE054I 0 RECORDS - IN: 3, OUT: 3
ICE052I 0 END OF DFSORT
ICE200I 0 IDENTIFIER FROM CALLING PROGRAM IS 0003
ICE143I 0 BLOCKSET SORT TECHNIQUE SELECTED
ICE250I 0 VISIT http://www.ibm.com/storage/dfsort FOR DFSORT PAPERS, EXA
ICE000I 0 - CONTROL STATEMENTS FOR 5694-A01, Z/OS DFSORT V1R5 - 10:55 ON
* If id = 21 (record in both files) and 362-8000 from file1 does
* not match 362-8000 from file2, write record to file1.
OUTFIL FNAMES=OUT1,
INCLUDE=(8001,2,CH,EQ,C'21',AND,362,7638,CH,NE,8005,7638,CH),
* Remove IN2 key/data
OUTREC=(1,8000)
* If id = 22 (record in file2 only) or
* id=21 (record in both files) and 362-8000 from file1 does
* not match 362-8000 from file2, write record to file2.
OUTFIL FNAMES=OUT2,
INCLUDE=(8001,2,CH,EQ,C'22',OR,
(8001,2,CH,EQ,C'21',AND,362,7638,CH,NE,8005,7638,CH)),
OUTREC=(1,8000)
ICE146I 0 END OF STATEMENTS FROM CTL3CNTL - PARAMETER LIST STATEMENTS F
DEBUG NOABEND,ESTAE
OPTION MSGDDN=DFSMSG,LIST,MSGPRT=ALL,RESINV=0,SORTDD=CTL3,SOR
TOUT=OUT1,DYNALLOC,SZERO,EQUALS,NOVLSHRT,LOCAL
HECK
SORT FIELDS=(98,32,CH,A,230,32,CH,A,296,32,CH,A)
MODS E35=(ICE35DU,12288)
ICE201I 0 RECORD TYPE IS F - DATA STARTS IN POSITION 1
ICE162I 0 ICEIEXIT CHANGED ONE OR MORE OPTIONS IN EFFECT
ICE221A 1 INVALID FIELD OR CONSTANT IN OUT1 IFTHEN 0 CONDITION 2
ICE751I 0 C5-K05352 C6-Q95214 C7-K90000 C8-K05352 E4-K06751 E7-K90000
ICE052I 3 END OF DFSORT
Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
Posted: Thu Feb 23, 2006 12:16 pm Post subject:
Shuko,
When you get a DFSORT error message, the first thing you should do is look it up. You can use LOOKAT to do that, or go directly to "z/OS DFSORT Messages, Codes and Diagnosis" which you can access online from:
One of the following errors was detected:
o the length for a field with a format other than SS was greater than 256
That's the reason for the ICE221A - you have 362,7638,CH,NE,8005,7638,CH so both lengths are 7638 which is greater than 256. You'll need to break this up into 256 byte segments. I'd suggest using DFSORT Symbols to make this easier. Add:
_________________ Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum