MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 
 FAQFAQ   SearchSearch   Quick Manuals   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Splitting huge file into mutiple files and loading into exce

 
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities
View previous topic :: View next topic  
Author Message
Nabanita Das
Beginner


Joined: 09 Nov 2005
Posts: 33
Topics: 10

PostPosted: Tue Jan 10, 2006 8:02 am    Post subject: Splitting huge file into mutiple files and loading into exce Reply with quote

Hi,

Can you please suggest a way to split data into different files when the no of records exceeds 65,536. MY purpose is to ultimately upload them in excel sheets. Since excel sheet has a limitation of only 65,536 rows and 256 columns when we have a huge file we face problem in uploading the data. So I need to create a tool to split the files whenever number of records exceed 65,536. Please suggest a solution using sort.
Back to top
View user's profile Send private message  
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12376
Topics: 75
Location: San Jose

PostPosted: Tue Jan 10, 2006 8:06 am    Post subject: Reply with quote

Nabanita Das,

Please search before posting. Check this link

http://mvsforums.com/helpboards/viewtopic.php?t=12&highlight=split

Hope this helps...

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website  
Nabanita Das
Beginner


Joined: 09 Nov 2005
Posts: 33
Topics: 10

PostPosted: Wed Jan 18, 2006 8:06 am    Post subject: Reply with quote

Thanks for the reply Kolusu.
I may not have expressed my requirement properly in the first post. My requirement is to decide the number of Splits i.e. output file numbers and splitting the input file into that many files when the no of records exceeds 65,536.

ProbablyI can take care of determining the number of split required through my COBOL program which is actually creating the input file to be used for split.

Please guide me how I can pass that number to SORT step? Can we dynamically create the files depending on the number (no of files to be created) I am passing to the sort step?

So to be precise I will be passing the no of split required through a file to the sort step and sort should be able provide me that many splits(output files each having 65,536 records at the max)

Thanks and Regards,
Nabanita
Back to top
View user's profile Send private message  
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12376
Topics: 75
Location: San Jose

PostPosted: Wed Jan 18, 2006 8:43 am    Post subject: Reply with quote

Quote:

ProbablyI can take care of determining the number of split required through my COBOL program which is actually creating the input file to be used for split.


Why not split the files in the cobol program itself? for every 65,536 records create a new file.

You can use dynamic allocation (BPXWDYN) in cobol to dynamically allocate the datasets.

Search for BPXWDYN in the application programming forum and you will find many examples

Hope this helps...

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website  
German Castillo
Beginner


Joined: 23 Dec 2005
Posts: 83
Topics: 2
Location: Caracas, Venezuela

PostPosted: Wed Jan 18, 2006 9:32 am    Post subject: Reply with quote

Hello Nabanita,

It is more elaborated, but, You can also

a) SORT your input file to determine the number of records
b) Read that record count from (maybe) a rexx script, in order to produce two files. Fisrt file holding the SPLITBY sort statements while the Second file holds the DDnames for those files. The number of files is easy determined by doing something like COUNT / 65536 + COUNT // 65536. Check out if there are more than 255 DDNAMES generated here (JCL ddnames file amount limit/step).
c) Make (and submit) a dynamic JCL using those two files as part of the splitting sort step and maybe Xfer/FTPing the resulting file(s)
_________________
Best wishes,

German Castillo
Back to top
View user's profile Send private message  
Alain Benveniste
Beginner


Joined: 04 May 2003
Posts: 92
Topics: 4
Location: Paris, France

PostPosted: Wed Jan 18, 2006 10:29 am    Post subject: Reply with quote

Don't you know how many records you have approximately, then SPLITBY Dfsort feature could also be an option.

Alain
Back to top
View user's profile Send private message  
Nabanita Das
Beginner


Joined: 09 Nov 2005
Posts: 33
Topics: 10

PostPosted: Fri Jan 20, 2006 8:07 am    Post subject: Reply with quote

Hi German Castillo,

Would you please explain elaborately the steps you have suggested?
I am not familiar with REXX. Can sort be again used as an alternative in this case ?
Nabanita Das
Back to top
View user's profile Send private message  
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12376
Topics: 75
Location: San Jose

PostPosted: Fri Jan 20, 2006 10:14 am    Post subject: Reply with quote

Nabanita Das,

The following DFSORT/ICETOOL JCl will give you the desired results. If you have syncsort at your shop then use synctool as pgm in step0100.The first step reads the input dataset and creates dynamic control cards and also output files. The step0200 then submits another job to intrdr which actually splits the file into 'n' number of output files. You need to put your tid or the highlevel qualifier in place of USERID

Code:

//STEP0100 EXEC PGM=ICETOOL                               
//TOOLMSG  DD SYSOUT=*                                   
//DFSMSG   DD SYSOUT=*                                   
//IN       DD DSN=YOUR INPUT FILE,
//            DISP=SHR
//T1       DD DSN=&T1,DISP=(,PASS),SPACE=(CYL,(1,1),RLSE)
//T2       DD DSN=&T2,DISP=(,PASS),SPACE=(CYL,(1,1),RLSE)
//C1       DD DSN=&C1,DISP=(,PASS),SPACE=(CYL,(1,1),RLSE)
//D1       DD DSN=&D1,DISP=(,PASS),SPACE=(CYL,(1,1),RLSE)
//TOOLIN   DD *                                             
  COPY FROM(IN)  USING(CTL1)                               
  COPY FROM(T1)  USING(CTL2)
/*
//CTL1CNTL DD  *                                                 
  INREC FIELDS=(SEQNUM,8,ZD)                                     
  OUTREC FIELDS=(1,8,X,                                         
                 1,8,ZD,DIV,+65536,EDIT=(TTTTTTTT),80:X)         
  OUTFIL FNAMES=T1,                                             
  OUTREC=(+1,ADD,(+65536,MUL,(1,8,ZD,SUB,+1)),EDIT=(TTTTTTTT),X,
          +65536,MUL,1,8,ZD,EDIT=(TTTTTTTT))                     
  OUTFIL FNAMES=T2,                                             
  REMOVECC,NODETAIL,                                             
  TRAILER1=(' OPTION STOPAFT=',10,8,80:X)
/*
//CTL2CNTL DD DSN=&T2,DISP=OLD,VOL=REF=*.T2                 
//         DD *                                             
  OUTFIL FNAMES=C1,REMOVECC,                               
  OUTREC=(C'  OUTFIL FNAMES=OUT',SEQNUM,5,ZD,               
          C',STARTREC=',1,8,C',ENDREC=',10,8,80:X),         
  TRAILER1=('  OUTFIL FNAMES=LAST,SAVE',/,                 
            '/*',80:X)                                     
  OUTFIL FNAMES=D1,REMOVECC,                               
  OUTREC=(C'//OUT',SEQNUM,5,ZD,C' DD DSN=USERID.OUT',     
          SEQNUM,5,ZD,C',',/,                               
          C'//',12X,C'DISP=(NEW,CATLG,DELETE),',/,         
          C'//',12X,C'UNIT=SYSDA,',/,                       
          C'//',12X,C'SPACE=(CYL,(2,2),RLSE)',/,           
          C'/*',80:X),                                     
  TRAILER1=('//LAST    ',C' DD DSN=USERID.LAST',C',',/,   
          C'//',12X,C'DISP=(NEW,CATLG,DELETE),',/,         
          C'//',12X,C'UNIT=SYSDA,',/,                       
          C'//',12X,C'SPACE=(CYL,(2,2),RLSE)',/,           
          C'/*',80:X)                                       
/*                                                         
//STEP0200 EXEC  PGM=SORT               
//SYSOUT   DD SYSOUT=*                   
//SYSIN    DD *                         
  SORT FIELDS=COPY                       
//SORTOUT  DD SYSOUT=*                   
//SORTIN   DD DATA,DLM=$$               
//USERIDA  JOB 'SPLIT COPY',             
//             CLASS=A,                 
//             MSGCLASS=Y,               
//             MSGLEVEL=(1,1),           
//             NOTIFY=&SYSUID           
//*                                     
//STEP0100 EXEC  PGM=SORT               
//SYSOUT   DD SYSOUT=*                   
//SORTIN   DD DSN=YOUR INPUT FILE,       
//            DISP=SHR                   
//SYSIN    DD *                         
  SORT FIELDS=COPY                       
$$                                       
//         DD DSN=&C1,DISP=(OLD,PASS)   
//         DD DSN=&D1,DISP=(OLD,PASS)   
//*                                     


Now check your SORTOUT from step0200 and make sure that the generated JCL looks right. If you are satisfied that the jcl is ok then code the INTRDR statement.

i.e change this statement in step0200 from
Code:
 
//SORTOUT  DD SYSOUT=*



to

Code:

//SORTOUT  DD SYSOUT=(*,INTRDR)       



Hope this helps..

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website  
German Castillo
Beginner


Joined: 23 Dec 2005
Posts: 83
Topics: 2
Location: Caracas, Venezuela

PostPosted: Fri Jan 20, 2006 1:10 pm    Post subject: Reply with quote

Nabanita,

Also as an alternative you use this Rexx, call it for example MYSPLIT and place it in a permanent dataset:

Code:

    "EXECIO * DISKR COUNT (FINIS"                                       
    PULL MY_COUNT                                                       
    MY_FILES = MY_COUNT % 65536                                         
    MY_REMAINDER = MY_COUNT // 65536                                     
    IF MY_REMAINDER > 0 THEN MY_FILES = MY_FILES + 1                     
                                                                         
    J = 0                                                               
    DO I = 1 TO MY_FILES                                                 
       MY_SORTLINE = COPIES(' ', 80)                                     
       SELECT                                                           
         WHEN I = 1 THEN ,                                               
             MY_SORTLINE = "    OUTFIL FNAMES=(OUT"I                     
         WHEN I > 1 THEN ,                                               
             MY_SORTLINE = "                   OUT"I                     
       END                                                               
       IF I < MY_FILES THEN MY_SORTLINE = MY_SORTLINE||","               
       ELSE                 MY_SORTLINE = MY_SORTLINE||"),SPLITBY=65536"
       SORT.I = MY_SORTLINE                                             
       J = J + 1                                                         
       DDNAMES.J = "//OUT"I"   DD DSN=MYOUT"I",DISP=(,CATLG,DELETE),"   
       J = J + 1                                                         
       DDNAMES.J = "         SPACE=(CYL,(5,5)),UNIT=SYSDA"               
    END                                                                 
    SORT.0 = I - 1                                                       
    DDNAMES.0 = J                                                       
    QUEUE "//SPLITJOB  JOB (ACCOOUNT),CLASS=A,MSGCLASS=X,MSGLEVEL=(1,1)"
    QUEUE "//SPLITSTEP  EXEC PGM=ICEMAN"                                 
    QUEUE "//SYSOUT   DD SYSOUT=*"                                       
    DO I = 1 TO DDNAMES.0                                               
        QUEUE DDNAMES.I                                                 
    END                                                                 
    QUEUE "//SORTIN DD *"                                               
    QUEUE "    SORT FIELDS=COPY"                                         
    DO I = 1 TO SORT.0                                                   
        QUEUE SORT.I                                                     
    END                                                                 
   "EXECIO * DISKW JCL (FINIS"


Then, submit this JCL:

Code:

//COUNT    EXEC PGM=ICEMAN                                           
//SORTIN   DD DISP=(SHR,PASS),DSN=&&EXEC(TEST1)                       
//SORTOUT  DD SYSOUT=*                                               
//SYSIN    DD *                                                       
   SORT FIELDS=COPY                                                   
   OUTFIL FNAMES=T1,NODETAIL,REMOVECC,TRAILER1=(COUNT)               
//SYSOUT   DD SYSOUT=*                                               
//SYSPRINT DD SYSOUT=*                                               
//T1       DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(1,1)),DISP=(,PASS)     
//REXXBAT1 EXEC PGM=IRXJCL,                                           
//  PARM='MYSPLIT'                                                   
//SYSEXEC  DD DISP=(SHR,PASS),DSN=&&EXEC                             
//SYSTSIN  DD DUMMY                                                   
//COUNT    DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(1,1)),DISP=(OLD,DELETE)
//JCL      DD SYSOUT=*                                               
//*JCL      DD SYSOUT=(*,INTRDR)                                     
//SYSIN    DD DUMMY                                                   
//SYSTSPRT DD SYSOUT=*                                               


You should get the same result, Kolosus' suggestion is still valid, check the output at the JCL DD and change it to point to the internal reader.
_________________
Best wishes,

German Castillo
Back to top
View user's profile Send private message  
Nabanita Das
Beginner


Joined: 09 Nov 2005
Posts: 33
Topics: 10

PostPosted: Mon Jan 23, 2006 7:42 am    Post subject: Reply with quote

I tried out Kolusu's solution that works out great !
Kolusu,

I will try out creating these files using BPXWDYN also.
Thanks a lot!!!!!!!

Now I am going to try German Castillo's solution.

German Castillo,

Since I am a REXX illiterate I could not understand where to mention my input dataset. Please guide me how to use the REXX code provided by you. Meanwhile I am using this opportunity to learn REXX.

Thanks and Regards,
Nabanita Das
Back to top
View user's profile Send private message  
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12376
Topics: 75
Location: San Jose

PostPosted: Mon Jan 23, 2006 8:30 am    Post subject: Reply with quote

Quote:

Since I am a REXX illiterate I could not understand where to mention my input dataset. Please guide me how to use the REXX code provided by you. Meanwhile I am using this opportunity to learn REXX.

Nabanita Das,

Please go thru this excellent FAQ and all your questions on rexx are answered here

http://mvsforums.com/helpboards/viewforum.php?f=24

Hope this helps...

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website  
Frank Yaeger
Sort Forum Moderator
Sort Forum Moderator


Joined: 02 Dec 2002
Posts: 1618
Topics: 31
Location: San Jose

PostPosted: Tue May 23, 2006 12:11 pm    Post subject: Reply with quote

Here's a new way to do this kind of thing with DFSORT/ICETOOL using the SPLIT1R=n parameter available with z/OS DFSORT V1R5 PTF UK90007 or DFSORT R14 PTF UK90006 (April, 2006). I've used DFSORT Symbols to make it easy to plug in the variable information needed as indicated. You'll also need to plug in the input data set name (two places) and your jobcard statements where indicated.

Code:

//S1 EXEC PGM=ICETOOL
//** Dynamically splits LIMCT records contiguously among
//** the needed number of output data sets.
//** Creates OUTFIL statement like this:
//**   OUTFIL SPLIT1R=LIMCT,
//**     FNAMES=(OUT001,
//**       OUT002,
//**       ...
//**       OUTnnn)
//** Creates output DD statements like this:
//** //OUTnnn DD DSN=OUTDSNnnn,
//** // OUTPARMS
//SYMNAMES DD *
* Set LIMCT to +n where n is the number of records you
* want in each data set.
LIMCT,+65536
* Set OUTDSN to 'string' where string is your
* output data set name.  nnn (seqnum) will be
* added at the end of OUTDSN.
OUTDSN,'userid.OUT'
* Set OUTPARMS to the UNIT, SPACE and DISP parameters you
* want for each output data set.
OUTPARMS,'UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)'
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//*** Insert your input dsn information for the IN DD
//IN       DD DSN=...  input file
//CX       DD DSN=&CX,DISP=(,PASS),SPACE=(TRK,(1,1))
//CTL3CNTL DD DSN=&C3,DISP=(,PASS),SPACE=(TRK,(1,1))
//T1       DD DSN=&T1,DISP=(,PASS),SPACE=(TRK,(1,1))
//C1       DD DSN=&C1,DISP=(,PASS),SPACE=(TRK,(1,1))
//D1       DD DSN=&D1,DISP=(,PASS),SPACE=(TRK,(1,1))
//TOOLIN   DD *
  COPY FROM(IN)  USING(CTL1)
  COPY FROM(CX)  USING(CTL2)
  COPY FROM(CX)  USING(CTL3)
  COPY FROM(T1)  USING(CTL4)
/*
//CTL1CNTL DD *
  OUTFIL FNAMES=CX,NODETAIL,REMOVECC,
    TRAILER1=(COUNT=(TO=ZD,LENGTH=8))
//CTL2CNTL DD *
  INREC BUILD=((1,8,ZD,ADD,(LIMCT,SUB,+1)),DIV,LIMCT,
     TO=ZD,LENGTH=3)
  OUTFIL FNAMES=CTL3CNTL,
    BUILD=(C' OUTFIL FNAMES=T1,REPEAT=',1,3,C',',/,
     C'  BUILD=(SEQNUM,3,ZD,C''',1,3,C'''',
     C',LIMCT,TO=ZD,LENGTH=8)',80:X)
//CTL4CNTL DD  *
//CTL4CNTL DD  *
 OUTFIL FNAMES=C1,REMOVECC,
   HEADER1=(C'  OUTFIL SPLIT1R=',7,8,C','),
   IFTHEN=(WHEN=INIT,BUILD=(1,6,13:C'OUT',1,3,C',',80:X)),
   IFTHEN=(WHEN=(1,3,ZD,EQ,+1),OVERLAY=(1:4X,5:C'FNAMES=(')),
   IFTHEN=(WHEN=(1,3,ZD,EQ,4,3,ZD),OVERLAY=(1:6X,19:C')')),
   IFTHEN=(WHEN=NONE,OVERLAY=(1:6X))
 OUTFIL FNAMES=D1,REMOVECC,
   BUILD=(C'//OUT',1,3,C' DD DSN=',OUTDSN,1,3,C',',/,
          C'// ',OUTPARMS,80:X)
/*
//*
//S2 EXEC  PGM=SORT
//SYSOUT   DD SYSOUT=*
//SYSIN    DD *
  SORT FIELDS=COPY
//SORTOUT DD SYSOUT=(A,INTRDR)
//*** Insert your job information for the USERIDA JOB card
//SORTIN DD DATA,DLM=$$
//USERIDA  JOB ...
//SPLIT EXEC PGM=ICEMAN
//SYSOUT   DD SYSOUT=*
//*** Insert your input dsn information for the SORTIN DD
//SORTIN   DD DSN=...  input file
//SYSIN    DD *
  SORT FIELDS=COPY
$$
//         DD DSN=&C1,DISP=(OLD,PASS)
//         DD DSN=&D1,DISP=(OLD,PASS)
//*

_________________
Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Back to top
View user's profile Send private message Send e-mail Visit poster's website  
Display posts from previous:   
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities All times are GMT - 5 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


MVSFORUMS
Powered by phpBB © 2001, 2005 phpBB Group