MVSFORUMS.com
A Community of and for MVS Professionals

Split the Input file

Rahull
Beginner

Joined: 29 Jan 2004
Posts: 62
Topics: 19

Posted: Wed Jun 09, 2004 2:24 am    Post subject: Split the Input file

Hi,

I want to solve the following problem using the SORT/ICETOOL utility.

I have an input file and I can't predict the number of records in it. It varies from 0 to 20k.

I need to split the input file into 5 different files with the following requirements:
---- Count the total number of records, divide it by 5, and put that many records in each of the 5 files.
---- I don't want to change the order of the records. (I tried SPLIT, but it changes the order across the 5 files: it puts the 1st record in the 1st file, the 2nd record in the 2nd file, and so on.) For example, if I have 100 records as input, I need to put the first 20 in the first file, the next 20 in the second file, and so on.
---- If the total number of records is not evenly divisible by 5, put the extra records in the last file.

Rahull
Beginner

Posted: Wed Jun 09, 2004 5:40 am

No reply :(

Can we achieve the same in multiple steps?

kolusu
Site Admin

Joined: 26 Nov 2002
Posts: 12376
Topics: 75
Location: San Jose

Posted: Wed Jun 09, 2004 5:57 am

Rahul,

The following DFSORT/ICETOOL JCL will give you the desired results. You need to have the latest version of DFSORT (I forgot the PTF) for the horizontal math functions to work. If you have Syncsort at your shop, change the program name to SYNCTOOL. A brief explanation of the job: the first COPY operator takes the input file and creates a record with the total number of records in the input file.

The second COPY operator takes this count file and creates dynamic control cards to split the file.

The third COPY operator takes in the dynamic control cards and splits the input file into 5 files.

I did not have a chance to test the job, so bear with me for syntax errors.

Code:

//STEP0100 EXEC PGM=ICETOOL                               
//TOOLMSG   DD SYSOUT=*                                   
//DFSMSG    DD SYSOUT=*                                   
//IN        DD DSN=YOUR INPUT FILE,
//             DISP=SHR             
//T1        DD DSN=&T1,DISP=(,PASS),SPACE=(TRK,(1,1),RLSE)
//OUT1      DD DSN=YOUR OUTPUT FILE 1,
//             DISP=(NEW,CATLG,DELETE),
//             UNIT=SYSDA,
//             SPACE=(CYL,(X,Y),RLSE)
//OUT2      DD DSN=YOUR OUTPUT FILE 2,
//             DISP=(NEW,CATLG,DELETE),
//             UNIT=SYSDA,
//             SPACE=(CYL,(X,Y),RLSE)
//OUT3      DD DSN=YOUR OUTPUT FILE 3,
//             DISP=(NEW,CATLG,DELETE),
//             UNIT=SYSDA,
//             SPACE=(CYL,(X,Y),RLSE)
//OUT4      DD DSN=YOUR OUTPUT FILE 4,
//             DISP=(NEW,CATLG,DELETE),
//             UNIT=SYSDA,
//             SPACE=(CYL,(X,Y),RLSE)
//OUT5      DD DSN=YOUR OUTPUT FILE 5,
//             DISP=(NEW,CATLG,DELETE),
//             UNIT=SYSDA,
//             SPACE=(CYL,(X,Y),RLSE)
//TOOLIN    DD *                                   
  COPY FROM(IN) USING(CTL1)                         
  COPY FROM(T1) USING(CTL2)                         
  COPY FROM(IN) USING(CTL3)                         
//CTL1CNTL  DD *                                   
  OUTFIL FNAMES=T1,NODETAIL,REMOVECC,TRAILER1=(COUNT)       
//CTL2CNTL  DD *                                   
  INREC FIELDS=(1,8,FS,DIV,+5,EDIT=(TTTTTTTT))     
  OUTFIL FNAMES=CTL3CNTL,                           
  OUTREC=(C' OUTFIL FNAMES=OUT1,ENDREC=',1,8,/,               
          C' OUTFIL FNAMES=OUT2,STARTREC=',                   
          +1,ADD,1,8,ZD,EDIT=(TTTTTTTT),C',ENDREC=',         
          +2,MUL,1,8,ZD,EDIT=(TTTTTTTT),/,                   
          C' OUTFIL FNAMES=OUT3,STARTREC=',                   
          +1,ADD,(+2,MUL,1,8,ZD),EDIT=(TTTTTTTT),C',ENDREC=',
          +3,MUL,1,8,ZD,EDIT=(TTTTTTTT),/,                   
          C' OUTFIL FNAMES=OUT4,STARTREC=',                   
          +1,ADD,(+3,MUL,1,8,ZD),EDIT=(TTTTTTTT),C',ENDREC=',
          +4,MUL,1,8,ZD,EDIT=(TTTTTTTT),/,                   
          C' OUTFIL FNAMES=OUT5,STARTREC=',                   
          +1,ADD,(+4,MUL,1,8,ZD),EDIT=(TTTTTTTT),80:X)       
//CTL3CNTL  DD DSN=&C1,DISP=(,PASS),SPACE=(TRK,(1,1),RLSE)   
/*                                                           


Hope this helps...

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu

kolusu
Site Admin

Posted: Wed Jun 09, 2004 5:59 am

Rahul,

Dude, not everyone works in your timezone. My day has just started; it is just 7:00 AM here. Frank will not be here for another 4 hours, as he is on PST.

Kolusu

Rahull
Beginner

Posted: Wed Jun 09, 2004 6:12 am

Actually I was pretty badly in need of that, and I thought it would be pretty difficult to achieve this using SORT.

Thank you Kolusu... But I beg your pardon, I don't understand anything of what you have given in the control cards.

I need to explain it to my client. Could you please explain what we are doing in the control cards?

kolusu
Site Admin

Posted: Wed Jun 09, 2004 7:40 am

Rahul,

The job is pretty simple. The first COPY operator takes in the input file, counts the number of records in it, and writes the count to a temp file, T1.
Code:

//CTL1CNTL  DD *                                   
  OUTFIL FNAMES=T1,NODETAIL,REMOVECC,TRAILER1=(COUNT)       


The output file name is T1. NODETAIL means do not write any of the input records to the output file. REMOVECC removes the carriage control character that would otherwise precede the count. COUNT in TRAILER1 writes out the total number of records in the input file as an 8-byte field with leading zeroes suppressed.

Let us say your input file has 27 records; then the T1 file will be as follows:

Code:

---+----1----+----2---
      27             


Now we take this count file (T1) and create the dynamic control cards.
Code:

//CTL2CNTL  DD *                                   
  INREC FIELDS=(1,8,FS,DIV,+5,EDIT=(TTTTTTTT))     
  OUTFIL FNAMES=CTL3CNTL,                           
  OUTREC=(C' OUTFIL FNAMES=OUT1,ENDREC=',1,8,/,               
          C' OUTFIL FNAMES=OUT2,STARTREC=',                   
          +1,ADD,1,8,ZD,EDIT=(TTTTTTTT),C',ENDREC=',         
          +2,MUL,1,8,ZD,EDIT=(TTTTTTTT),/,                   
          C' OUTFIL FNAMES=OUT3,STARTREC=',                   
          +1,ADD,(+2,MUL,1,8,ZD),EDIT=(TTTTTTTT),C',ENDREC=',
          +3,MUL,1,8,ZD,EDIT=(TTTTTTTT),/,                   
          C' OUTFIL FNAMES=OUT4,STARTREC=',                   
          +1,ADD,(+3,MUL,1,8,ZD),EDIT=(TTTTTTTT),C',ENDREC=',
          +4,MUL,1,8,ZD,EDIT=(TTTTTTTT),/,                   
          C' OUTFIL FNAMES=OUT5,STARTREC=',                   
          +1,ADD,(+4,MUL,1,8,ZD),EDIT=(TTTTTTTT),80:X)       


Using INREC FIELDS, we first divide the total count by 5 and take the quotient.

So 27/5 = 5 (ignoring the remainder)

I used the edit mask (EDIT=(TTTTTTTT)) to keep the leading zeroes, so the value after INREC processing looks like this:

Code:

---+----1----+----2---
00000005


Usually, to split a file, we use the STARTREC and ENDREC parms. So using OUTREC we create the STARTREC and ENDREC values for all the output files.

Since the total number of records is 27, the first 4 files will have 5 records each and the last file will have the remaining 7 records.

Normally you would code like this

Code:

 OUTFIL FNAMES=OUT1,ENDREC=00000005                   
 OUTFIL FNAMES=OUT2,STARTREC=00000006,ENDREC=00000010
 OUTFIL FNAMES=OUT3,STARTREC=00000011,ENDREC=00000015
 OUTFIL FNAMES=OUT4,STARTREC=00000016,ENDREC=00000020
 OUTFIL FNAMES=OUT5,STARTREC=00000021                 


We are doing the same thing in CTL2: it generates the control cards shown above. Since we already have the quotient, it just takes a couple of arithmetic operations on the quotient.

For the first file we simply supply the quotient for the ENDREC parm, as STARTREC defaults to 1.

The '/' parm is used to start a new output record (line).

For the second file we add 1 to the quotient for the STARTREC parm (5+1) and multiply the quotient by 2 for the ENDREC parm (2*5).

For the third file we multiply the quotient by 2 and add 1 to the product for the STARTREC parm (1+(2*5)), and multiply the quotient by 3 for the ENDREC parm (3*5).

For the fourth file we multiply the quotient by 3 and add 1 to the product for the STARTREC parm (1+(3*5)), and multiply the quotient by 4 for the ENDREC parm (4*5).

For the last file we multiply the quotient by 4 and add 1 to the product for the STARTREC parm (1+(4*5)). We don't need to specify the ENDREC parm, as we want the rest of the records in the last file.

So the CTL3CNTL will look as follows:
Code:

 OUTFIL FNAMES=OUT1,ENDREC=00000005                   
 OUTFIL FNAMES=OUT2,STARTREC=00000006,ENDREC=00000010
 OUTFIL FNAMES=OUT3,STARTREC=00000011,ENDREC=00000015
 OUTFIL FNAMES=OUT4,STARTREC=00000016,ENDREC=00000020
 OUTFIL FNAMES=OUT5,STARTREC=00000021                 


Now using this control card we just copy the records from the input file to the output file.
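
As a further illustration, here is a sketch of what the generated control cards would look like for the 100-record case from the original question, assuming the job behaves as described above (quotient 100/5 = 20). The values are worked out by hand from the same arithmetic, not taken from an actual run, and the comment line is added only for illustration:

```jcl
* 100 INPUT RECORDS, QUOTIENT = 20 (HAND-COMPUTED, NOT FROM A REAL RUN)
 OUTFIL FNAMES=OUT1,ENDREC=00000020
 OUTFIL FNAMES=OUT2,STARTREC=00000021,ENDREC=00000040
 OUTFIL FNAMES=OUT3,STARTREC=00000041,ENDREC=00000060
 OUTFIL FNAMES=OUT4,STARTREC=00000061,ENDREC=00000080
 OUTFIL FNAMES=OUT5,STARTREC=00000081
```

With 103 records the quotient is still 20, so the same cards would be generated and OUT5 would receive records 81 through 103 (23 records). With fewer than 5 records the quotient is 0, so the generated ENDREC would be 00000000; that case is worth testing separately, since the input can be as small as 0 records.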

Hope this helps...

Cheers

Kolusu

Frank Yaeger
Sort Forum Moderator

Joined: 02 Dec 2002
Posts: 1618
Topics: 31
Location: San Jose

Posted: Wed Jun 09, 2004 9:15 am

Kolusu,

The DFSORT R14 PTF is UQ90053 (Feb, 2003).

Your job works, but there's an unintentional "trick" in it, that judging from your explanation, you're not aware of.

Because you didn't use REMOVECC, your COUNT value will look like this for 200 records:

Code:

1bbbbb200


b is for a blank. The 1 is the carriage control character - it's followed by the 8 byte count. (COUNT gives an 8 byte count with leading zeros suppressed.) Since you're using 1,9,FS, the 1 will be ignored as long as it's followed by a blank. If there were 20000000 records, the count record would have 120000000 and be misinterpreted. You can fix this either by using REMOVECC and using 1,8,FS, or by using 2,8,FS to skip the carriage control character.
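
The two fixes can be sketched as follows, using only statements already shown in this thread. Treat them as untested sketches: only the statements that differ are shown, the rest of CTL2CNTL stays as posted, and you would pick one option, not both.

```jcl
//* OPTION 1: REMOVECC DROPS THE CARRIAGE CONTROL, COUNT IS IN 1-8
//CTL1CNTL  DD *
  OUTFIL FNAMES=T1,NODETAIL,REMOVECC,TRAILER1=(COUNT)
//CTL2CNTL  DD *
  INREC FIELDS=(1,8,FS,DIV,+5,EDIT=(TTTTTTTT))
//* OPTION 2: KEEP THE CARRIAGE CONTROL IN 1, COUNT IS IN 2-9
//CTL1CNTL  DD *
  OUTFIL FNAMES=T1,NODETAIL,TRAILER1=(COUNT)
//CTL2CNTL  DD *
  INREC FIELDS=(2,8,FS,DIV,+5,EDIT=(TTTTTTTT))
```

Either way, the field that INREC reads covers exactly the 8-byte count and never includes the carriage control character.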
_________________
Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

Phantom
Data Mgmt Moderator

Joined: 07 Jan 2003
Posts: 1056
Topics: 91
Location: The Blue Planet

Posted: Wed Jun 09, 2004 9:20 am

Kolusu,

Just a clarification.

Quote:

The parm count will be a 9 byte field with leading zeroes suppressed.


I was under the impression that COUNT outputs an 8-digit count value. Please confirm. We have a 1999 version of Syncsort in our shop, and when I use the COUNT parm I get an 8-digit value.

Thanks,

kolusu
Site Admin

Posted: Wed Jun 09, 2004 9:36 am

Frank,

Thanks for pointing out the error. As it was early in the morning, I just wrote it without testing. I am going to edit the post to add the REMOVECC parm and adjust the fields.

Phantom: You are right about the count field being 8 bytes. Thanks for pointing that out. I am editing the posts to reflect the change.

Thanks

Kolusu

Ram22
Beginner

Joined: 09 Jun 2004
Posts: 33
Topics: 6

Posted: Fri Jun 11, 2004 9:11 am    Post subject: use of syncsort instead of icetool/synctool

Hey frank/kosula.. I need to split the file using only syncsort... as per the clients standards, we should not use icetool/synctool even these are products of the same.. is there any way doing this????

Ram22
Beginner

Posted: Fri Jun 11, 2004 9:13 am

I am waiting for your reply.

Ram22
Beginner

Posted: Fri Jun 11, 2004 9:22 am

No reply :)

kolusu
Site Admin

Posted: Fri Jun 11, 2004 9:22 am

Ram22,

You can split the job into 3 steps to achieve the desired results.

Code:

//STEP0100 EXEC PGM=SORT                               
//SYSOUT   DD SYSOUT=*                                   
//SORTIN   DD DSN=YOUR INPUT FILE,
//             DISP=SHR             
//SORTOUT  DD DSN=&T1,DISP=(,PASS),SPACE=(TRK,(1,1),RLSE)
//SYSIN    DD *
  SORT FIELDS=COPY
  OUTFIL NODETAIL,REMOVECC,TRAILER1=(COUNT)       
//*
//STEP0200 EXEC PGM=SORT                               
//SYSOUT   DD SYSOUT=*                                   
//SORTIN   DD DSN=&T1,
//            DISP=OLD             
//SORTOUT  DD DSN=&C1,DISP=(,PASS),SPACE=(TRK,(1,1),RLSE)
//SYSIN    DD *
  SORT FIELDS=COPY                                             
  INREC FIELDS=(1,8,FS,DIV,+5,EDIT=(TTTTTTTT))                 
  OUTFIL OUTREC=(C' OPTION COPY',/,                            
         C' OUTFIL FNAMES=OUT1,ENDREC=',1,8,/,                 
         C' OUTFIL FNAMES=OUT2,STARTREC=',                     
         +1,ADD,1,8,ZD,EDIT=(TTTTTTTT),C',ENDREC=',           
         +2,MUL,1,8,ZD,EDIT=(TTTTTTTT),/,                     
         C' OUTFIL FNAMES=OUT3,STARTREC=',                     
         +1,ADD,(+2,MUL,1,8,ZD),EDIT=(TTTTTTTT),C',ENDREC=',   
         +3,MUL,1,8,ZD,EDIT=(TTTTTTTT),/,                     
         C' OUTFIL FNAMES=OUT4,STARTREC=',                     
         +1,ADD,(+3,MUL,1,8,ZD),EDIT=(TTTTTTTT),C',ENDREC=',   
         +4,MUL,1,8,ZD,EDIT=(TTTTTTTT),/,                     
         C' OUTFIL FNAMES=OUT5,STARTREC=',                     
         +1,ADD,(+4,MUL,1,8,ZD),EDIT=(TTTTTTTT),80:X)         
//*
//STEP0300 EXEC PGM=SORT                               
//SYSOUT   DD SYSOUT=*                                   
//SORTIN   DD DSN=YOUR INPUT FILE,
//            DISP=SHR             
//OUT1     DD DSN=YOUR OUTPUT FILE 1,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(X,Y),RLSE)
//OUT2     DD DSN=YOUR OUTPUT FILE 2,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(X,Y),RLSE)
//OUT3     DD DSN=YOUR OUTPUT FILE 3,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(X,Y),RLSE)
//OUT4     DD DSN=YOUR OUTPUT FILE 4,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(X,Y),RLSE)
//OUT5     DD DSN=YOUR OUTPUT FILE 5,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(X,Y),RLSE)
//SYSIN    DD DSN=&C1,
//            DISP=OLD
//*



Hope this helps...

Cheers

Kolusu

kolusu
Site Admin

Posted: Fri Jun 11, 2004 9:26 am

Ram22,

The day you pay for the service you receive on this website, you will be entitled to DEMAND a solution. Till then, be patient.

I also work for someone else for my livelihood. This is only a part-time job. I am not here to answer your question the moment you post it.

Kolusu

Frank Yaeger
Sort Forum Moderator

Posted: Fri Jun 11, 2004 9:46 am

Ram22 wrote
Quote:
Hey frank/kosula.. I need to split the file using only syncsort... as per the clients standards, we should not use icetool/synctool even these are products of the same.. is there any way doing this????


Wow, this must be a new record for the most annoying short post.

It's Kolusu, not kosula!

I'm a DFSORT developer. DFSORT and Syncsort are competitive products. While I'm happy to answer questions on DFSORT/ICETOOL/ICEGENER, please don't expect me to answer questions on Syncsort.

ICETOOL and SYNCTOOL are NOT the same product. ICETOOL is a fully supported, fully documented feature of DFSORT. SYNCTOOL is undocumented, unsupported code in Syncsort.

Please make an effort to be less annoying in your posts.

As Kolusu said, this is a free board where people try to help each other as volunteers. Nobody is obligated to answer your questions at all, let alone instantly.
All times are GMT - 5 Hours
Page 1 of 2