MVSFORUMS.com

cobcurious · Beginner Joined: 04 Oct 2003 Posts: 68 Topics: 25

Hi Experts,

We have a job that has 4 DFSORT steps where each step feeds the one after it. We are trying to speed up the entire process by using batch pipes.
Current job:
DFSORT1
DFSORT2
DFSORT3
DFSORT4

To do so, we have split the 4 steps into 4 separate jobs, each containing a single step.
Job#1
DFSORT1

Job#2
DFSORT2

Job#3
DFSORT3

Job#4
DFSORT4

We observe that the jobs do not all start at the same time. There is a certain wait period or lag after Job#1 starts. I would like to understand why this is happening.

Also, can someone please advise whether SORT steps are good candidates for the use of batch pipes?

papadi · Supermod Joined: 20 Oct 2009 Posts: 594 Topics: 1

It is not appropriate to post the same question in multiple forums.

As you were told before (in another forum), this is not only NOT a good candidate for batch pipes, but batch pipes won't do what you want anyway.

Someone needs to look at why there are 4 steps that "feed the next". Is it because no one knew how to design the process so that one read would generate the actually needed output. . .?

If you explain what the 4 steps accomplish, it is most likely that there is a more efficient way to get to the answer that is actually needed. . .
_________________
All the best,

di

kolusu · Posted: Sat Aug 28, 2010 1:34 pm

cobcurious,

Show us your DFSORT jobs and maybe we can fine-tune them.
_________________
Kolusu
www.linkedin.com/in/kolusu

cobcurious · Beginner Joined: 04 Oct 2003 Posts: 68 Topics: 25

Hello Kolusu/Di,

Thanks for your comments. I will provide the details soon.

RonB · Posted: Mon Aug 30, 2010 8:17 am

When batch pipes are used, a receiving job must wait until data starts being WRITTEN to the batch pipe before it can start READING data from the batch pipe. Hence, Job#2 cannot start READING until Job#1 starts WRITING, etc.

Your Job #1 is a sort. If a MERGE operation is being done, then WRITING will begin almost immediately. Same with a COPY operation.
But, if SORTING/SUMMING/BUILDING, etc. is required, then there can be a substantial delay between READING and WRITING while the actual SORTING/SUMMING, etc. occurs. SORTing may take several passes and involve a great deal of I/O to SORTWK datasets. The actual WRITING to SORTOUT will not occur until the FINAL sort phase. The amount of the delay will depend on the amount of data being sorted, and the complexity of the sort.

Note: Batch pipe delays are not just a SORT issue - an extensive delay might occur for a batch COBOL/DB2 program, for example, where a long-running DB2 Query must complete before the program can begin to write output data.
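The difference between a COPY-like and a SORT-like stage can be sketched in a few lines of Python. This is only a toy model of the behaviour described above, not DFSORT or BatchPipes itself; the helper names (`producer`, `copy_stage`, `sort_stage`, `reads_before_first_write`) are made up for the illustration.

```python
from typing import Callable, Iterable, Iterator, List


def producer(records: Iterable[str], log: List[str]) -> Iterator[str]:
    """Stand-in for records arriving at a step; logs every record read."""
    for rec in records:
        log.append(f"read {rec}")
        yield rec


def copy_stage(source: Iterable[str]) -> Iterator[str]:
    """COPY-like stage: each record can be written as soon as it is read."""
    for rec in source:
        yield rec


def sort_stage(source: Iterable[str]) -> Iterator[str]:
    """SORT-like stage: the lowest record may be the last one read,
    so nothing can be written until all input has been consumed."""
    for rec in sorted(source):
        yield rec


def reads_before_first_write(
    stage: Callable[[Iterable[str]], Iterator[str]], records: List[str]
) -> int:
    """Count how many records a stage reads before it emits its first one."""
    log: List[str] = []
    out = stage(producer(records, log))
    next(out)  # ask for the first output record only
    return len(log)


records = ["C", "A", "D", "B"]
copy_reads = reads_before_first_write(copy_stage, records)
sort_reads = reads_before_first_write(sort_stage, records)
print("copy:", copy_reads, "sort:", sort_reads)
```

In this model the copy stage hands over its first record after one read, while the sort stage has to read everything first, which is the lag a downstream piped job would see.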
_________________
A computer once beat me at chess, but it was no match for me at kick boxing.

cobcurious · Beginner Joined: 04 Oct 2003 Posts: 68 Topics: 25

Hi All,

Thanks so much for all the information shared. Here are the descriptions about the SORT steps in more detail:

1. DFSORT1 - The input is a concatenation of 9 files (containing more than a million records). The output is sorted on two character fields, in ascending order on one field and descending order on the other.

2. DFSORT2 - Removes duplicates on the first 30 characters of the output dataset from the previous step.

3. DFSORT3 - Sorts on two fields, in ascending order, from the output of the previous step. One of the fields is character and the other is BI type.

4. DFSORT4 - Attaches a header and trailer to the output file from the previous step. Here MODS E35 = (BAC060,16384,MOD LR) is used in the control card to do so.

Let me know if you need any more information.

Thanks in advance.

RonB · Posted: Tue Aug 31, 2010 7:19 am

It would help if you provided:
1) The input RECFM and LRECL of the SORTIN file to the DFSORT1 job
2) The SYSIN Sort Statements to the DFSORT1 job
3) The SYSIN Sort Statements to the DFSORT2 job
4) The SYSIN Sort Statements to the DFSORT3 job
5) The SYSIN Sort Statements to the DFSORT4 job
6) A general description of what the E35 mod does - e.g. is it just adding a fixed header and a trailer to the file or is it doing a lot of other work. Is the header/trailer information provided by program constants/computations, or via an input file that the E35 module reads/accepts? Are you computing trailer counts/totals? etc.

papadi · Supermod Joined: 20 Oct 2009 Posts: 594 Topics: 1

Currently, it looks like the entire input will be processed 4 times (minus whatever dups). I believe you can accomplish what you want with a single pass of the input file and a single write of the sorted, de-duplicated data. . .

Why was this 4-step approach chosen. . .?

How long does each of the 4 processes currently run?

My "smaller" files are 8-10 million records of more than 14k bytes and this type of process typically takes a few minutes (once migrated data is recalled) . . .

RonB · Posted: Tue Aug 31, 2010 4:08 pm

That was my thinking as well, papadi. That's why I asked to see the sort control statements from each of the four runs - in order to see if "consolidation" might eliminate the need for at least some of those steps, and perhaps offer an opportunity to "tune" the steps that do need to be run.

papadi · Supermod Joined: 20 Oct 2009 Posts: 594 Topics: 1

This sounds very much like someone heard of a "solution" and went looking for a requirement that would use it.

This type of requirement is quite common (i've seen a few hundred) and it does not require several steps. Many of the systems i've been involved with would not accept this job for promotion because of the high amount of resources it would waste making the redundant copies of the data.

Maybe someone in the IT department also owns the hardware concession for the facility 8)

cobcurious · Beginner Joined: 04 Oct 2003 Posts: 68 Topics: 25

Hi,

Please find here the information about the different jobs:

JOB#1
SORT FIELDS=(1,30,CH,A,397,26,CH,D)

JOB#2
SORT FIELDS=(1,30,CH,A),EQUALS
SUM FIELDS=NONE
END

JOB#3
SORT FIELDS=(1,8,CH,A,70,2,BI,A,13,8,CH,A)
END

JOB#4
MODS E35=(Module name,16384,MODLIB)
END

Do you spot any improvements in the process? The current process takes more than 60 minutes. The bottleneck is at the 2nd and 3rd jobs.
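To make the ordering requested by a statement such as `SORT FIELDS=(1,30,CH,A,397,26,CH,D)` concrete, here is a small Python model. It is a sketch under assumptions, not DFSORT: fields are compared as raw bytes (no alternate collating sequence), start positions are 1-based as in the control statements, and the function name `sort_fields` and the toy 6-byte records are invented for the example.

```python
from typing import List, Tuple

# (start, length, ascending) with DFSORT-style 1-based start positions.
Field = Tuple[int, int, bool]


def sort_fields(records: List[bytes], fields: List[Field]) -> List[bytes]:
    """Order records by several fixed-position byte fields.

    Sorting on the least significant field first and relying on Python's
    stable sort gives the same order as one multi-key comparison.
    """
    result = list(records)
    for start, length, ascending in reversed(fields):
        result.sort(
            key=lambda r: r[start - 1:start - 1 + length],
            reverse=not ascending,
        )
    return result


# Toy records: key in bytes 1-2, a second field in bytes 4-6.
toy = [b"BB 001", b"AA 002", b"BB 003", b"AA 001"]

# Ascending on the first field, descending on the second,
# the same shape as the JOB#1 statement but with shortened fields.
ordered = sort_fields(toy, [(1, 2, True), (4, 3, False)])
print(ordered)
```

With the real layout, the JOB#1 fields would be written as `[(1, 30, True), (397, 26, False)]` in this model.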

RonB · Posted: Thu Sep 02, 2010 8:06 am

Unless you will NEED the output of JOB#1, JOB#2, and JOB#3 at some later date/time, I believe that you can combine all 4 jobs into 1 job by using ICETOOL (note: the following is NOT tested, and only represents what I THINK would work for this situation):

cobcurious · Beginner Joined: 04 Oct 2003 Posts: 68 Topics: 25

Hi RonB,

Thanks so much for the information. Let me try it out and get back to you with the results.

kolusu · Posted: Thu Sep 02, 2010 10:14 am

RonB,

Your second SELECT is NOT a valid statement. I guess you meant SORT instead of Select.

cobcurious,

Use the following DFSORT JCL

RonB · Posted: Thu Sep 02, 2010 10:58 am

kolusu,
Yes, the second statement should have been SORT, not SELECT. Thanks for the correction.
Just out of curiosity, why did you add the EQUALS parameter to the last sort statement, and remove it from the first sort statement? My impression was that there could be duplicates on 1,30,A AND on 397,26,D, but that it was important to only keep the FIRST record of any such set (for the sake of the OTHER fields in the record).
But there was no such requirement to keep the EQUALS order in the third sort (input to the last step).
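Why the order of equal-keyed records matters for the de-duplication can be shown with a small Python model. It is an illustration only, assuming byte-wise key comparison: Python's stable `sorted()` plays the role of EQUALS, and keeping the first record per key plays the role of SUM FIELDS=NONE. The function name `dedupe_keep_first` and the toy records are invented.

```python
from typing import List, Optional


def dedupe_keep_first(records: List[bytes], key_len: int) -> List[bytes]:
    """Model of a sort on the leading key with EQUALS, then SUM FIELDS=NONE.

    sorted() is stable, so records with equal keys keep their input order;
    only the first record of each key is then kept.
    """
    kept: List[bytes] = []
    last_key: Optional[bytes] = None
    for rec in sorted(records, key=lambda r: r[:key_len]):
        key = rec[:key_len]
        if key != last_key:
            kept.append(rec)
            last_key = key
    return kept


# Toy input, already ordered ascending on bytes 1-2 and descending on
# bytes 4-6, as the first job would leave it.
step1_output = [b"AA 002", b"AA 001", b"BB 003", b"BB 001"]
survivors = dedupe_keep_first(step1_output, key_len=2)
print(survivors)
```

Because the input order within each key is preserved, the survivor is the record with the highest second field; without a stable order, which duplicate survives would not be predictable.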