//SYSIN DD *
SORT FIELDS=(1,5,CH,A,6,75,CH,A)
OUTFIL REMOVECC,NODETAIL,
BUILD=(1,80,96:X), * WE GO FROM LRECL=80 TO LRECL=96
SECTIONS=(1,5,6,75,
TRAILER3=(1:1,5,6,75,
81:COUNT=(EDIT=(TTTTTTTT)),
89:COUNT=(EDIT=(TTTTTTTT))))
/*
But it shows the same count twice. Kind Regards, Gerd.
Joined: 26 Nov 2002 Posts: 12375 Topics: 75 Location: San Jose
Posted: Fri Apr 15, 2016 10:13 am Post subject:
Gerd Hofmans,
You cannot have multiple key break counts with SECTIONS. So you need to use the JOINKEYS trick of reading the same file as both INA and INB, and use the second file to generate the record count for your primary key. Once you have the primary key count, you can use SECTIONS to generate the key counters.
Is your data already sorted? If so you don't need to use SORT FIELDS=(1,5,CH,A,6,75,CH,A) on the main task.
Use the following JCL which will give you the desired results.
OUTFIL REMOVECC,NODETAIL,
BUILD=(96X), * WE GO FROM LRECL=80 TO LRECL=96
SECTIONS=(01,80,
TRAILER3=(01:01,88,
89:COUNT=(EDIT=(TTTTTTTT))))
//*
//JNF2CNTL DD *
INREC BUILD=(1,5, * PRIMARY KEY
X'0000001C') * VALUE OF 1 IN PACKED FORMAT
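For readers following along, here is a rough sketch of how the pieces described above might fit together in one job. It is not the original posting: the step name, the dataset name YOUR.INPUT.FILE, the REFORMAT layout, and the OUTREC that converts the packed count to 8 display digits are all assumptions made for illustration.

```
//STEP0100 EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* YOUR.INPUT.FILE IS A HYPOTHETICAL NAME; INA AND INB READ THE SAME FILE
//INA      DD DISP=SHR,DSN=YOUR.INPUT.FILE
//INB      DD DISP=SHR,DSN=YOUR.INPUT.FILE
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
* MATCH BOTH COPIES OF THE FILE ON THE 5-BYTE PRIMARY KEY
  JOINKEYS F1=INA,FIELDS=(1,5,A)
  JOINKEYS F2=INB,FIELDS=(1,5,A)
* ASSUMED LAYOUT: 80-BYTE RECORD FROM F1 PLUS 4-BYTE PD COUNT FROM F2
  REFORMAT FIELDS=(F1:1,80,F2:6,4)
* ONLY NEEDED WHEN THE INPUT IS NOT ALREADY IN FULL KEY ORDER
  SORT FIELDS=(1,80,CH,A)
* ASSUMED STEP: SHOW THE PD COUNT AS 8 DIGITS IN POSITIONS 81-88
  OUTREC BUILD=(1,80,81,4,PD,EDIT=(TTTTTTTT))
  OUTFIL REMOVECC,NODETAIL,
  BUILD=(96X),
  SECTIONS=(01,80,
  TRAILER3=(01:01,88,
  89:COUNT=(EDIT=(TTTTTTTT))))
/*
//JNF2CNTL DD *
* ONE RECORD PER PRIMARY KEY, CARRYING THE SUMMED VALUE OF 1
  INREC BUILD=(1,5,X'0000001C')
  SUM FIELDS=(6,4,PD)
/*
```

With this layout, positions 81-88 of each trailer would hold the number of records for the primary key and positions 89-96 the number of records for the full 80-byte key. Note that a 4-byte packed field holds at most 9,999,999, so a larger field would be needed if a single primary key can occur more often than that.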
Thanks Kolusu!
After some testing, I decided to split the operations into 2 steps, because doing everything in 1 step took too much CPU (it's a very large file, and I start by splitting it into 30 subfiles of more than 90 million records each). I did this:
By the way, the example I gave first was just to indicate what I wanted to do. The actual input and output are different, but in my opinion that has no effect on the logic.
Again, thank you for your much appreciated effort and answer.
Kind Regards, Gerd.
You still have the SORT in the main task. All your data is already in that order. Since you have a later BUILD, you could consider F1:1,38,F2:6,4 on the REFORMAT and changing the locations on the BUILD.
It is a pity that your original data is not in sequence. Your sample data shows it in sequence. Is it "somewhat in sequence", i.e. all the data within a key contiguous and in sequence, with just the main keys not in key sequence?
Posted: Tue Apr 19, 2016 7:48 pm Post subject:
Gerd Hofmans,
As William pointed out, you do NOT need a SORT on the JOINKEYS main task, as the data is already sorted. Also, I do not see the point of having the count in PD format. I used the PD format because I was performing a sort with SUM, and you don't have to.
Since your sort fields are contiguous fields, there is no point in splitting them. Simply sort it as a single field. Also use INREC to reduce the sortwk/memory required to just the data you need.
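As a small illustration of the point about contiguous keys: two adjacent character fields sorted in the same direction collate the same way as one field spanning both. The INREC line below is only an assumption for the sketch, namely that the input records are longer than 80 bytes and only the first 80 bytes are needed.

```
* INSTEAD OF: SORT FIELDS=(1,5,CH,A,6,75,CH,A)
* ASSUMPTION: ONLY THE FIRST 80 BYTES ARE NEEDED FOR SORTING AND COUNTING
  INREC BUILD=(1,80)
  SORT FIELDS=(1,80,CH,A)
```

Trimming the record in INREC happens before the sort, so less data has to pass through memory and the sort work datasets.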
Hi Kolusu & Bill,
Many thanks for your suggestions (which I carefully implemented).
To answer Bill's question: the input data is not sorted.
Just for the record: sorting the entire input file uses a lot more CPU than splitting it and sorting the pieces afterwards. I guess this has to do with the region I submit jobs in. Also, this file contains 2.5 billion records and needs to be sorted and counted. Done in a straightforward fashion on the entire input file, the sorting and counting uses 31 minutes of CPU. Splitting, sorting and counting uses 18 minutes of CPU.
Many thanks and kind regards, Gerd.