Posted: Tue Jan 10, 2006 8:02 am Post subject: Splitting huge file into mutiple files and loading into exce
Hi,
Can you please suggest a way to split data into different files when the no of records exceeds 65,536. MY purpose is to ultimately upload them in excel sheets. Since excel sheet has a limitation of only 65,536 rows and 256 columns when we have a huge file we face problem in uploading the data. So I need to create a tool to split the files whenever number of records exceed 65,536. Please suggest a solution using sort.
Thanks for the reply Kolusu.
I may not have expressed my requirement properly in the first post. My requirement is to decide the number of Splits i.e. output file numbers and splitting the input file into that many files when the no of records exceeds 65,536.
ProbablyI can take care of determining the number of split required through my COBOL program which is actually creating the input file to be used for split.
Please guide me how I can pass that number to SORT step? Can we dynamically create the files depending on the number (no of files to be created) I am passing to the sort step?
So to be precise I will be passing the no of split required through a file to the sort step and sort should be able provide me that many splits(output files each having 65,536 records at the max)
Joined: 26 Nov 2002 Posts: 12376 Topics: 75 Location: San Jose
Posted: Wed Jan 18, 2006 8:43 am Post subject:
Quote:
ProbablyI can take care of determining the number of split required through my COBOL program which is actually creating the input file to be used for split.
Why not split the files in the cobol program itself? for every 65,536 records create a new file.
You can use dynamic allocation (BPXWDYN) in cobol to dynamically allocate the datasets.
Search for BPXWDYN in the application programming forum and you will find many examples
a) SORT your input file to determine the number of records
b) Read that record count from (maybe) a rexx script, in order to produce two files. Fisrt file holding the SPLITBY sort statements while the Second file holds the DDnames for those files. The number of files is easy determined by doing something like COUNT / 65536 + COUNT // 65536. Check out if there are more than 255 DDNAMES generated here (JCL ddnames file amount limit/step).
c) Make (and submit) a dynamic JCL using those two files as part of the splitting sort step and maybe Xfer/FTPing the resulting file(s) _________________ Best wishes,
Would you please explain elaborately the steps you have suggested?
I am not familiar with REXX. Can sort be again used as an alternative in this case ?
Nabanita Das
Joined: 26 Nov 2002 Posts: 12376 Topics: 75 Location: San Jose
Posted: Fri Jan 20, 2006 10:14 am Post subject:
Nabanita Das,
The following DFSORT/ICETOOL JCl will give you the desired results. If you have syncsort at your shop then use synctool as pgm in step0100.The first step reads the input dataset and creates dynamic control cards and also output files. The step0200 then submits another job to intrdr which actually splits the file into 'n' number of output files. You need to put your tid or the highlevel qualifier in place of USERID
Now check your SORTOUT from step0200 and make sure that the generated JCL looks right. If you are satisfied that the jcl is ok then code the INTRDR statement.
You should get the same result, Kolosus' suggestion is still valid, check the output at the JCL DD and change it to point to the internal reader. _________________ Best wishes,
I tried out Kolusu's solution that works out great !
Kolusu,
I will try out creating these files using BPXWDYN also.
Thanks a lot!!!!!!!
Now I am going to try German Castillo's solution.
German Castillo,
Since I am a REXX illiterate I could not understand where to mention my input dataset. Please guide me how to use the REXX code provided by you. Meanwhile I am using this opportunity to learn REXX.
Joined: 26 Nov 2002 Posts: 12376 Topics: 75 Location: San Jose
Posted: Mon Jan 23, 2006 8:30 am Post subject:
Quote:
Since I am a REXX illiterate I could not understand where to mention my input dataset. Please guide me how to use the REXX code provided by you. Meanwhile I am using this opportunity to learn REXX.
Nabanita Das,
Please go thru this excellent FAQ and all your questions on rexx are answered here
Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
Posted: Tue May 23, 2006 12:11 pm Post subject:
Here's a new way to do this kind of thing with DFSORT/ICETOOL using the SPLIT1R=n parameter available with z/OS DFSORT V1R5 PTF UK90007 or DFSORT R14 PTF UK90006 (April, 2006). I've used DFSORT Symbols to make it easy to plug in the variable information needed as indicated. You'll also need to plug in the input data set name (two places) and your jobcard statements where indicated.
Code:
//S1 EXEC PGM=ICETOOL
//** Dynamically splits LIMCT records contiguously among
//** the needed number of output data sets.
//** Creates OUTFIL statement like this:
//** OUTFIL SPLIT1R=LIMCT,
//** FNAMES=(OUT001,
//** OUT002,
//** ...
//** OUTnnn)
//** Creates output DD statements like this:
//** //OUTnnn DD DSN=OUTDSNnnn,
//** // OUTPARMS
//SYMNAMES DD *
* Set LIMCT to +n where n is the number of records you
* want in each data set.
LIMCT,+65536
* Set OUTDSN to 'string' where string is your
* output data set name. nnn (seqnum) will be
* added at the end of OUTDSN.
OUTDSN,'userid.OUT'
* Set OUTPARMS to the UNIT, SPACE and DISP parameters you
* want for each output data set.
OUTPARMS,'UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)'
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//*** Insert your input dsn information for the IN DD
//IN DD DSN=... input file
//CX DD DSN=&CX,DISP=(,PASS),SPACE=(TRK,(1,1))
//CTL3CNTL DD DSN=&C3,DISP=(,PASS),SPACE=(TRK,(1,1))
//T1 DD DSN=&T1,DISP=(,PASS),SPACE=(TRK,(1,1))
//C1 DD DSN=&C1,DISP=(,PASS),SPACE=(TRK,(1,1))
//D1 DD DSN=&D1,DISP=(,PASS),SPACE=(TRK,(1,1))
//TOOLIN DD *
COPY FROM(IN) USING(CTL1)
COPY FROM(CX) USING(CTL2)
COPY FROM(CX) USING(CTL3)
COPY FROM(T1) USING(CTL4)
/*
//CTL1CNTL DD *
OUTFIL FNAMES=CX,NODETAIL,REMOVECC,
TRAILER1=(COUNT=(TO=ZD,LENGTH=8))
//CTL2CNTL DD *
INREC BUILD=((1,8,ZD,ADD,(LIMCT,SUB,+1)),DIV,LIMCT,
TO=ZD,LENGTH=3)
OUTFIL FNAMES=CTL3CNTL,
BUILD=(C' OUTFIL FNAMES=T1,REPEAT=',1,3,C',',/,
C' BUILD=(SEQNUM,3,ZD,C''',1,3,C'''',
C',LIMCT,TO=ZD,LENGTH=8)',80:X)
//CTL4CNTL DD *
//CTL4CNTL DD *
OUTFIL FNAMES=C1,REMOVECC,
HEADER1=(C' OUTFIL SPLIT1R=',7,8,C','),
IFTHEN=(WHEN=INIT,BUILD=(1,6,13:C'OUT',1,3,C',',80:X)),
IFTHEN=(WHEN=(1,3,ZD,EQ,+1),OVERLAY=(1:4X,5:C'FNAMES=(')),
IFTHEN=(WHEN=(1,3,ZD,EQ,4,3,ZD),OVERLAY=(1:6X,19:C')')),
IFTHEN=(WHEN=NONE,OVERLAY=(1:6X))
OUTFIL FNAMES=D1,REMOVECC,
BUILD=(C'//OUT',1,3,C' DD DSN=',OUTDSN,1,3,C',',/,
C'// ',OUTPARMS,80:X)
/*
//*
//S2 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SYSIN DD *
SORT FIELDS=COPY
//SORTOUT DD SYSOUT=(A,INTRDR)
//*** Insert your job information for the USERIDA JOB card
//SORTIN DD DATA,DLM=$$
//USERIDA JOB ...
//SPLIT EXEC PGM=ICEMAN
//SYSOUT DD SYSOUT=*
//*** Insert your input dsn information for the SORTIN DD
//SORTIN DD DSN=... input file
//SYSIN DD *
SORT FIELDS=COPY
$$
// DD DSN=&C1,DISP=(OLD,PASS)
// DD DSN=&D1,DISP=(OLD,PASS)
//*
_________________ Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum