MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 

Split input file for record groups

 
Kamur
Beginner


Joined: 19 Dec 2002
Posts: 4
Topics: 3

Posted: Mon Aug 02, 2004 7:23 pm    Post subject: Split input file for record groups

Hello,

My input file is FB with LRECL 5000. The input records are organized in groups and are unsorted. That is, the records look like this:

Code:

HEADERAAAAAAAAAAA...............
HEADERBBBBBBBBBBBB..............
HEADERCCCCCCCCCCC...............
HEADEREEEEEEEEEE................
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
HEADERDDDDDDDDDDA...............
DDDDDDDDDDASDALLLLLLLLLL........
FFFFFFFFFDDDDDDDD...............
HEADERJJKLKJKKKKKKK.............
JJJJJJJJJJJJJJJJJJJJ............



As you can see, each new record group is identified by 'HEADER' starting in the first byte. For example, in the sample above,
Code:

HEADEREEEEEEEEEE................
EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE


is a record group. (The records from one record starting with 'HEADER' up to, but not including, the next record starting with 'HEADER' form a record group.)


My input file is huge. I need to split it into 25 smaller files, each consisting of 100 record groups.

For discussion purposes in this forum, let's assume I need to split the file into only 2 smaller files, each consisting of 10 record groups. To achieve this, I coded the following sort:

Code:

//STEP0100 EXEC PGM=SYNCTOOL                                   
//*                                                             
//TOOLMSG   DD SYSOUT=*                                         
//DFSMSG    DD SYSOUT=*                                         
//SYSPRINT  DD SYSOUT=*                                         
//IN        DD DSN=BJ67FT.X114804.ADHOC.SEC.BARCDTST.BKP,       
//             DISP=SHR                                         
//T1        DD DSN=&T1,DISP=(,PASS),SPACE=(CYL,(150,50),RLSE)   
//T2        DD DSN=&T2,DISP=(,PASS),SPACE=(CYL,(150,50),RLSE)   
//CON       DD DSN=*.T1,VOL=REF=*.T1,DISP=(OLD,PASS)           
//          DD DSN=*.T2,VOL=REF=*.T2,DISP=(OLD,PASS)           
//OUT       DD DSN=BJ67FT.DEVDAP.SEQNUM.DAPFEED.OUT,           
//             DCB=(RECFM=FB,LRECL=5016,BLKSIZE=0),             
//             UNIT=DASD,SPACE=(5016,(150,50),RLSE),AVGREC=K,   
//             DISP=(,CATLG),RETPD=20                           
//FILE1     DD DSN=BJ67FT.DEVDAP.SEQNUM.DAPFEED.FILE1,       
//             DCB=(RECFM=FB,LRECL=5000,BLKSIZE=0),           
//             UNIT=DASD,SPACE=(214,(150,50),RLSE),AVGREC=K, 
//             DISP=(,CATLG),RETPD=20                         
//FILE2     DD DSN=BJ67FT.DEVDAP.SEQNUM.DAPFEED.FILE2,       
//             DCB=(RECFM=FB,LRECL=5000,BLKSIZE=0),           
//             UNIT=DASD,SPACE=(214,(150,50),RLSE),AVGREC=K, 
//             DISP=(,CATLG),RETPD=20                         
//TOOLIN    DD *                                             
 COPY FROM(IN) USING(CTL1)                                   
 SORT FROM(CON) TO(OUT) USING(CTL2)                           
 COPY FROM(OUT) USING(CTL3)                                   
//CTL1CNTL  DD *                                             
 INREC FIELDS=(1,5000,SEQNUM,8,ZD,C'        ')               
 OUTFIL FNAMES=T1,INCLUDE=(1,6,CH,EQ,C'HEADER'),             
 OUTREC=(1,5008,SEQNUM,8,ZD)                                 
 OUTFIL FNAMES=T2,SAVE                                       
//CTL2CNTL  DD *                                             
 SORT FIELDS=(5001,8,ZD,A)                                   
 OUTREC FIELDS=(1,5016)                                       
//CTL3CNTL DD *                                               
 OUTFIL FNAMES=FILE1,INCLUDE=(5009,8,CH,LT,C'00000010'), 
 OUTREC=(1,5000)                                         
 OUTFIL FNAMES=FILE2,INCLUDE=(5009,8,CH,GE,C'00000010',   
 AND,5009,8,CH,LT,C'00000020'),                           
 OUTREC=(1,5000)                                         
/*                 



The processing of CTL1 and CTL2 seems fine and works as expected, but CTL3 does not produce the expected result. As you can see, CTL1 and CTL2 add two separate sequence numbers: one for each record (positions 5001-5008) and one for each record group (positions 5009-5016). I expected that CTL3 would only need to check the group sequence number and write each record to one of the two files. Apparently I am wrong: for every record that is not a 'HEADER', the group sequence number is spaces, so all of those records go to FILE1. Please let me know if my question is not clear.
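The symptom described above can be modeled outside of SORT. The following is only a rough Python sketch of the comparison logic, not SORT syntax; the sample rows, `passes_lt_test`, and `carry_forward` are made-up names and values for illustration. It assumes the CH comparison uses the EBCDIC collating sequence, where a space (X'40') is lower than the digit '0' (X'F0'), and it shows one possible remedy: copying the group number of the most recent header onto the records that follow it.

```python
# Sketch only: models the 16 bytes appended at positions 5001-5016.
# The sample values below are made up for illustration.

def ebcdic(text: str) -> bytes:
    """Encode text as EBCDIC (code page 037) so bytes compare like CH fields."""
    return text.encode("cp037")

LIMIT = ebcdic("00000010")

# (record sequence at 5001-5008, group sequence at 5009-5016)
rows = [
    ("00000001", "00000001"),  # HEADER record of group 1
    ("00000002", "        "),  # detail record, group field is spaces
    ("00000003", "00000015"),  # HEADER record of group 15
    ("00000004", "        "),  # detail record of group 15, field is spaces
]

def passes_lt_test(group_field: str) -> bool:
    """Mimic INCLUDE=(5009,8,CH,LT,C'00000010')."""
    return ebcdic(group_field) < LIMIT

# Spaces (X'40') compare lower than any digit (X'F0'-X'F9'),
# so every detail record passes the LT test and lands in FILE1.
before = [passes_lt_test(group) for _, group in rows]

def carry_forward(rows):
    """Give each detail record the group number of the latest header."""
    current = " " * 8
    filled = []
    for rec_seq, group in rows:
        if group.strip():
            current = group
        filled.append((rec_seq, current))
    return filled

after = [passes_lt_test(group) for _, group in carry_forward(rows)]

print(before)  # [True, True, False, True]
print(after)   # [True, True, False, False]
```

In the `before` list the last detail record is selected even though its header belongs to group 15; once the group number is carried forward, it is rejected along with its header.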

What I would like to know is how I can modify CTL3 (or the sort job itself) so that I can split the input file into 2 smaller files, each having 10 record groups.

I have Syncsort at my shop, but I think the same approach will work with DFSORT, so I am interested in seeing a DFSORT solution too.

Thanks
Kamur
kolusu
Site Admin


Joined: 26 Nov 2002
Posts: 12375
Topics: 75
Location: San Jose

Posted: Tue Aug 03, 2004 5:30 am

kamur,

I quickly glanced thru your job, and I don't think it will produce the desired results. You will need STARTREC and ENDREC parms. I will try to post something when I get to work.

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
kolusu
Site Admin


Joined: 26 Nov 2002
Posts: 12375
Topics: 75
Location: San Jose

Posted: Tue Aug 03, 2004 10:34 am

kamur,

It is getting too complicated to generate the STARTREC and ENDREC values for all the files. It involves at least 4 passes of the data. Even though I am a big fan of SORT, I prefer a program for this kind of splitting, and with Easytrieve it takes at most 30 lines of source code.

Here is an Easytrieve solution to split the records into different groups:

Code:

//STEP0100 EXEC PGM=EZTPA00                 
//STEPLIB  DD DSN=EASYTREV.LOADLIB,
//            DISP=SHR                       
//SYSPRINT DD SYSOUT=*                       
//SYSOUT   DD SYSOUT=*                       
//SYSSNAP  DD SYSOUT=*                       
//SYSUDUMP DD SYSOUT=*                       
//INFILE   DD DSN=YOUR INPUT FILE,
//            DISP=SHR                           
//OUTPUT01 DD DSN=YOUR FIRST GROUP RECORDS,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(X,Y),RLSE),
//            DCB=YOUR INPUT FILE
//OUTPUT02 DD DSN=YOUR SECOND GROUP RECORDS,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(X,Y),RLSE),
//            DCB=YOUR INPUT FILE
//OUTPUT03 DD DSN=YOUR THIRD GROUP RECORDS,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(X,Y),RLSE),
//            DCB=YOUR INPUT FILE
...
//SYSIN    DD *                       
                                     
  FILE INFILE                         
       REC-IND   01 06 A             
                                     
  FILE OUTPUT01  FB(0 0)             
  FILE OUTPUT02  FB(0 0)             
  FILE OUTPUT03  FB(0 0)             
  FILE OUTPUT04  FB(0 0)             
  FILE OUTPUT05  FB(0 0)             
....

  W-HDR-COUNT    W 08 N 0 VALUE 0     
                                     
********************************************************
* MAINLINE                                             *                 
********************************************************
                                                       
 JOB INPUT INFILE                                       
                                                       
     IF REC-IND      = 'HEADER'                         
        W-HDR-COUNT  = W-HDR-COUNT + 1                 
     END-IF                                             
                                                       
     IF W-HDR-COUNT < 101                                 
        PUT OUTPUT01 FROM INFILE                       
     END-IF                                             
                                                       
     IF W-HDR-COUNT >= 101 AND W-HDR-COUNT < 201           
        PUT OUTPUT02 FROM INFILE                       
     END-IF                                             
                                                       
     IF W-HDR-COUNT >= 201 AND W-HDR-COUNT < 301         
        PUT OUTPUT03 FROM INFILE                       
     END-IF                                             
                                                       
     IF W-HDR-COUNT >= 301 AND W-HDR-COUNT < 401         
        PUT OUTPUT04 FROM INFILE                       
     END-IF                                             

.... code for all your output files
                                   
/*


Hope this helps...

Cheers

Kolusu

PS: If your shop does not have Easytrieve, the same logic can also be coded in COBOL.
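The chain of IF blocks above follows a single pattern, so the output file number can be derived from the header counter with integer division instead of one IF per file. The following is a minimal Python sketch of that counting logic, not Easytrieve syntax; `assign_outputs`, `groups_per_file`, and the sample records are illustrative names and values only.

```python
from typing import Iterable, Iterator, Tuple

def assign_outputs(records: Iterable[str], groups_per_file: int,
                   marker: str = "HEADER") -> Iterator[Tuple[int, str]]:
    """Yield (file_number, record) pairs; file numbers start at 1."""
    header_count = 0
    for record in records:
        # A header starts a new group, exactly like W-HDR-COUNT + 1.
        if record.startswith(marker):
            header_count += 1
        # Headers 1..N map to file 1, N+1..2N to file 2, and so on.
        # Records seen before any header fall into file 1.
        file_number = max(header_count - 1, 0) // groups_per_file + 1
        yield file_number, record

sample = [
    "HEADERAAAA",
    "HEADERBBBB",
    "HEADEREEEE",
    "EEEEEEEEEE",
    "HEADERDDDD",
    "DDDDDDDDDD",
    "FFFFFFFFFF",
    "HEADERJJKL",
    "JJJJJJJJJJ",
]

# Two groups per file keeps the example small; the thread uses 100.
file_numbers = [n for n, _ in assign_outputs(sample, groups_per_file=2)]
print(file_numbers)  # [1, 1, 2, 2, 2, 2, 2, 3, 3]
```

Because the file number changes only when a header arrives, the detail records always stay with their header.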
_________________
Kolusu
www.linkedin.com/in/kolusu
Kamur
Beginner


Joined: 19 Dec 2002
Posts: 4
Topics: 3

Posted: Tue Aug 03, 2004 12:27 pm

Kolusu,

After seeing your post today, I also tried coding this using STARTREC and ENDREC. I was halfway through when I saw your second post. Yes, I agree it is getting too complicated to build dynamic control cards for each split. Unfortunately, I don't have Easytrieve at my shop, but guess what?

When there is no easy SORT solution, FILEAID always comes into picture.
And when there is no easy FILEAID solution, SORT always comes to rescue.

I added a FILEAID step and resolved it pretty easily. The job is now done in 2 steps: first a SYNCTOOL step, then a simple FILEAID step.

Here's my solution:

Code:


//STEP0100 EXEC PGM=SYNCTOOL                                   
//*                                                               
//TOOLMSG   DD SYSOUT=*                                           
//DFSMSG    DD SYSOUT=*                                           
//SYSPRINT  DD SYSOUT=*                                           
//IN        DD DSN=BJ67FT.X114804.ADHOC.SEC.BARCDTST.BKP,         
//             DISP=SHR                                           
//T1        DD DSN=&T1,DISP=(,PASS),SPACE=(CYL,(150,50),RLSE)     
//T2        DD DSN=&T2,DISP=(,PASS),SPACE=(CYL,(150,50),RLSE)     
//CON       DD DSN=*.T1,VOL=REF=*.T1,DISP=(OLD,PASS)             
//          DD DSN=*.T2,VOL=REF=*.T2,DISP=(OLD,PASS)             
//OUT       DD DSN=BJ67FT.DEVDAP.SEQNUM.DAPFEED.OUT,             
//             DCB=(RECFM=FB,LRECL=5016,BLKSIZE=0),               
//             UNIT=DASD,SPACE=(5016,(150,50),RLSE),AVGREC=K,     
//             DISP=(,CATLG),RETPD=20                             
//TOOLIN    DD *                                                 
 COPY FROM(IN) USING(CTL1)                                       
 SORT FROM(CON) TO(OUT) USING(CTL2)                               
//CTL1CNTL  DD *                                                 
 INREC FIELDS=(1,5000,SEQNUM,8,ZD,C'        ')                   
 OUTFIL FNAMES=T1,INCLUDE=(1,6,CH,EQ,C'HEADER'),     
 OUTREC=(1,5008,SEQNUM,8,ZD)                         
 OUTFIL FNAMES=T2,SAVE                               
//CTL2CNTL  DD *                                     
 SORT FIELDS=(5001,8,ZD,A)                           
 OUTREC FIELDS=(1,5016)                               
/*                                                   
//*-------------------------------------------------
//FAID0002 EXEC PGM=FILEAID                                       
//SYSUDUMP DD SYSOUT=*                                             
//SYSLIST  DD SYSOUT=*                                             
//MSGDX    DD SYSOUT=*                                             
//SYSPRINT DD SYSOUT=*                                             
//DD01   DD DSN=BJ67FT.DEVDAP.SEQNUM.DAPFEED.OUT,DISP=SHR         
//FILE1  DD DSN=BJ67FT.DEVDAP.SEQNUM.DAPFEED.FILE1,               
//          DCB=(RECFM=FB,LRECL=5000,BLKSIZE=0),                   
//          DISP=(,CATLG),UNIT=DASD,RETPD=60,                     
//          SPACE=(5000,(50,10),RLSE),AVGREC=K                     
//FILE2  DD DSN=BJ67FT.DEVDAP.SEQNUM.DAPFEED.FILE2,               
//          DCB=(RECFM=FB,LRECL=5000,BLKSIZE=0),                   
//          DISP=(,CATLG),UNIT=DASD,RETPD=60,                     
//          SPACE=(5000,(50,10),RLSE),AVGREC=K                     
//SYSIN DD *                                                       
$$DD01 USER STOP=(5009,EQ,C'00000011'),MOVE=(1,0,1),WRITE=FILE1   
$$DD01 USER STOP=(5009,EQ,C'00000021'),MOVE=(1,0,1),WRITE=FILE2   
//*--------------------------------------------------------------------



We can keep adding however many splits we need like this in the FILEAID step; it is pretty easy. Just one control card per file :)

Just to add: MVSFORUMS has been great since its beginning, and I have been an active browser of this site. In case anyone is wondering, I registered on this site in Dec 2002 and this is my first question/post here. Guess why? I always search, I always go through the manuals, and I always find solutions on this site (and another similar site). This forum is great; keep up the great work. I am as active as anyone can be on the MVSFORUMS site.

Regards
Kamur
coolman
Intermediate


Joined: 03 Jan 2003
Posts: 283
Topics: 27
Location: US

Posted: Mon Aug 09, 2004 6:49 pm

Kamur,

Does the solution you have proposed serve your purpose? If yes, how do you ensure that a "record group" is completely contained within a single output file?
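One way to check this after the fact is sketched below in Python. It assumes the output files were written in input order, so that a group can only be split where one file ends and the next begins; under that assumption every non-empty output file must start with a 'HEADER' record. The function names and sample data are hypothetical and not part of any job in this thread.

```python
from typing import List, Sequence

def groups_are_whole(output_files: Sequence[Sequence[str]],
                     marker: str = "HEADER") -> bool:
    """True if no record group is split across two output files.

    Assumes the files were written in input order, so a split group
    shows up as a file whose first record is not a header.
    """
    for records in output_files:
        if records and not records[0].startswith(marker):
            return False
    return True

def groups_per_output(output_files: Sequence[Sequence[str]],
                      marker: str = "HEADER") -> List[int]:
    """Count the header records (that is, the groups) in each file."""
    return [sum(1 for r in records if r.startswith(marker))
            for records in output_files]

good = [["HEADERA", "AAAA", "HEADERB"], ["HEADERC", "CCCC"]]
bad = [["HEADERA", "AAAA", "HEADERB"], ["BBBB", "HEADERC"]]

print(groups_are_whole(good))   # True
print(groups_are_whole(bad))    # False
print(groups_per_output(good))  # [2, 1]
```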

Cheers,
Coolman
________
Impreza WRX WRP10
All times are GMT - 5 Hours