Posted: Tue May 18, 2004 6:41 pm Post subject: Efficiently Counting Records
I have a job that I currently run in SAS that opens about 300 flat mainframe files (FB, lrecl 700 to 1000), each having up to 500K records. It doesn't actually read any fields, but just loops through the file so that I can get a record count. That's all I need to know...the exact number of records in each file.
This approach does work, but is quite slow. I've got to think there's a quicker way, but after many attempts, I have yet to figure out a more efficient method. The job runs overnight, so memory usage is not really the issue; it's more the time constraints on behalf of the other jobs waiting for this information.
JFCB and other control blocks seem to only tell me space requested, not used, and I need an exact count.
DFSORT does give promising results, spitting out the record count into a file (using OUTFIL with a TRAILER) but trying to run that 300 times and get everything fed into one data set is not working so well.
Any ideas on a way to speed up this process? REXX? ICEMAN?
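For reference, the per-file DFSORT approach described above looks roughly like this. This is a minimal sketch; the dataset names are placeholders, and the OUT dataset attributes would need to match your shop's conventions:

```
//COUNT   EXEC PGM=ICEMAN
//SYSOUT  DD SYSOUT=*
//SORTIN  DD DISP=SHR,DSN=YOUR.INPUT.FILE
//OUT     DD DSN=YOUR.COUNT.FILE,DISP=(MOD,CATLG),
//           UNIT=SYSDA,SPACE=(TRK,(1,1),RLSE)
//SYSIN   DD *
* Copy without sorting; write only a trailer record with the count
  OPTION COPY
  OUTFIL FNAMES=OUT,NODETAIL,REMOVECC,
    TRAILER1=(COUNT)
/*
```

With DISP=MOD on OUT, each run appends one count record, which is what makes collecting 300 results into one dataset awkward when the jobs are built by hand.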
With the number of files you are reading and the number of records in some of them, I was drawn towards developing a solution that involves a measure of parallelism.
I developed a controlling job which:
1. Reads in a list of files
2. Executes a REXX exec which FTINCLs a JCL skeleton for every file in the list
3. Submits the temporary dataset created by the file tailoring process, which contains one job for every file in the list, i.e. in your case there would be 300 jobs in the dataset.
I pass a parameter to the controlling job which generates all the other jobs with a jobname controlled by a counter. Whatever the value of the counter is, that is the number of different jobnames that will be generated, i.e. if I pass 6 then the first job generated will be called COUNT1, the next COUNT2, and so on.
That way, when they all get submitted you will not flood the system with 300 jobs; rather you will have 300 jobs queued under 6 different jobnames, spread evenly across all names.
The job that is generated has two steps.
Step 1 uses ICETOOL to count the records.
Step 2 uses SORT to read the ICETOOL messages and reformat the ICE628I message, writing out the name of the dataset (which is passed to the OUTREC statement via the skeleton) and its record count to a catalogued dataset.
This dataset is allocated MOD, so the first job that runs creates it and every job after that appends to it. Contention on this dataset is minimal, as it is only held for as long as it takes to write one record.
The end result is a file which contains the names of all the files and their record counts.
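The two steps above might be sketched as follows. This is an assumption-laden outline, not the poster's actual skeleton: the dataset names are placeholders, and the column positions used to pick the count out of the ICE628I message line are illustrative and would need checking against real TOOLMSG output:

```
//* Step 1: ICETOOL counts the records in one input file;
//*         the messages (including ICE628I) go to a passed dataset
//STEP1   EXEC PGM=ICETOOL
//TOOLMSG  DD DSN=&&TOOLMSG,DISP=(,PASS),UNIT=SYSDA,SPACE=(TRK,(1,1))
//DFSMSG   DD SYSOUT=*
//IN       DD DISP=SHR,DSN=INPUT.FILE.FROM.SKELETON
//TOOLIN   DD *
  COUNT FROM(IN)
/*
//* Step 2: SORT picks the ICE628I record-count message out of TOOLMSG
//*         and appends "dsname count" to the shared MOD dataset
//STEP2   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=&&TOOLMSG,DISP=(OLD,DELETE)
//SORTOUT  DD DSN=ALL.COUNTS,DISP=MOD
//SYSIN    DD *
  SORT FIELDS=COPY
  INCLUDE COND=(1,7,CH,EQ,C'ICE628I')
  OUTREC FIELDS=(C'INPUT.FILE.FROM.SKELETON',X,28,15)
/*
```

Because SORTOUT is DISP=MOD, the 300 generated jobs each append one line, and the brief enqueue on ALL.COUNTS is what keeps contention low.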
I have tested this and it ran pretty quickly...hint hint. _________________ My opinions are exactly that.
Thanks for the two suggested solutions. I opted for Kolusu's, as my REXX experience is limited at best.
I wrote a program to dynamically create a file with the JCL code as suggested by Kolusu (using the second option, inserting the actual DSN into the trailer record using SAS macro variables), but I have now run out of Task I/O Table space:
IEF240I KJK4Z STEP00B - TASK I/O TABLE EXCEEDS TIOT LIMIT OF 0032K
IEF272I KJK4Z STEP00B - STEP WAS NOT EXECUTED.
The program was running extremely well against a subset of the files while I was testing, but the system message documentation doesn't offer much hope of clearing this problem up. I ended up with more files than I expected...over 450 now.
I guess I can just split this into two parts, and hope it doesn't grow much more.
Posted: Wed May 19, 2004 7:48 pm Post subject:
k2,
The error you are getting is due to the excessive number of DD statements per job. The maximum number of DD statements per job step is 3273, based on the number of single DD statements allowed for a TIOT (task input/output table) control block size of 64K. This limit can differ depending on the installation-defined TIOT size. The IBM-supplied default TIOT size is 32K.
The following table shows the relationship between the size of the TIOT and the maximum number of DDs allowed:
Code:
SIZE Value                 Maximum number    Maximum number of DDs
Dec (Hex)   Size of TIOT   of single DDs     allowed when every DD
                           allowed           requests the maximum
                                             number of units (59)
=========   ============   ==============    =====================
 16 (10)    16384 (16K)          819                  64
 24 (18)    24576 (24K)         1226                  97
 32 (20)    32768 (32K)         1635                 129
 40 (28)    40960 (40K)         2045                 162
 48 (30)    49152 (48K)         2454                 194
 56 (38)    57344 (56K)         2864                 227
 64 (40)    65536 (64K)         3273                 259
Your shop uses the default TIOT size of 32K, which allows only 1635 DD statements per job step.
Since you are dynamically generating the JCL to get the counts, why not submit it as a separate job? The total number of DD statements would then be 1352:
1352 = 450 (input) + 1 (output) + 1 (TOOLIN) + 450 (control card 1) + 450 (control card 2, for the trailer to have the DSN name)
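The generated deck from that approach would look roughly like this, with one INnnn DD, one COPY operator, and one CnnnCNTL DD per input file. The dataset names here are illustrative:

```
//COUNTS   EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//OUT      DD DSN=ALL.COUNTS,DISP=(MOD,CATLG),
//            UNIT=SYSDA,SPACE=(TRK,(5,5),RLSE)
//IN001    DD DISP=SHR,DSN=FIRST.INPUT.FILE
//IN002    DD DISP=SHR,DSN=SECOND.INPUT.FILE
//TOOLIN   DD *
  COPY FROM(IN001) USING(C001)
  COPY FROM(IN002) USING(C002)
/*
//C001CNTL DD *
* No data records are written; the trailer carries the DSN and count
  OUTFIL FNAMES=OUT,NODETAIL,REMOVECC,
    TRAILER1=(2:'FIRST.INPUT.FILE',26:COUNT)
/*
//C002CNTL DD *
  OUTFIL FNAMES=OUT,NODETAIL,REMOVECC,
    TRAILER1=(2:'SECOND.INPUT.FILE',26:COUNT)
/*
```

Every FROM dataset, every CnnnCNTL card, plus OUT and TOOLIN each consume a DD entry, which is how 450 files reach the 1352 total and why a 32K TIOT overflows at the original 450-files-plus-overhead-in-one-step attempt.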
Thanks, Kolusu, for the additional information about TIOT size. I ended up submitting separate jobs, as I'm not certain how much larger this project may grow. For posterity, here's the SAS Code I used to generate the JCL Code that I then submitted to the Internal Reader; perhaps it will be of use to someone else.
DATA _NULL_;                                /* generate one IN DD per file */
  LENGTH FILE $ 22;
  SET ALLFILES END=LAST;
  WHERE YEAR = &Y;
  FILE FLEXCODE MOD;                        /* append JCL lines to FLEXCODE */
  FILE = FN;
  NUM = PUT(_N_,Z3.);
  PUT "//IN" NUM " DD DISP=SHR,DSN=" FILE;
  CALL SYMPUT("FILE"||NUM,TRIM(FILE));      /* save each DSN for the trailer cards */
  IF LAST THEN DO;
    PUT "//TOOLIN DD *";
    CALL SYMPUT("NOBS",_N_);                /* remember the number of files */
  END;
RUN;

%MACRO COPYLOOP;                            /* one COPY operator per file */
  %DO I = 1 %TO &NOBS;
    %LET Z = %SYSFUNC(PUTN(&I,Z3.));
    PUT " COPY FROM(IN&Z) USING(C&Z)";
  %END;
%MEND COPYLOOP;

%MACRO CNTLLOOP;                            /* one CnnnCNTL DD with a TRAILER1 per file */
  %DO I = 1 %TO &NOBS;
    %LET Z = %SYSFUNC(PUTN(&I,Z3.));
    PUT "//C&Z.CNTL DD *";
    PUT " OUTFIL FNAMES=OUT,NODETAIL,REMOVECC,";
    PUT " TRAILER1=(2:'%STR(&&FILE&Z)',26:COUNT)";
  %END;
%MEND CNTLLOOP;

DATA _NULL_;
  FILE FLEXCODE MOD;
  %COPYLOOP
  %CNTLLOOP
RUN;
ICE000I 0 - CONTROL STATEMENTS FOR 5694-A01, Z/OS DFSORT V1R5 - 09:12 ON TUE AP
OUTREC FIELDS=(1,80)
OUTFIL FNAMES=OUT,NODETAIL,REMOVECC,
TRAILER1=('TOTAL NO: OF RECORDS IN THE FILE 002 :',COUNT,80:X)
ICE146I 0 END OF STATEMENTS FROM C002CNTL - PARAMETER LIST STATEMENTS FOLLOW
DEBUG NOABEND,ESTAE
OPTION MSGDDN=DFSMSG,LIST,MSGPRT=ALL,RESINV=0,SORTDD=C002,SORTIN=IN00
DYNALLOC
SORT FIELDS=COPY
ICE201I 0 RECORD TYPE IS F - DATA STARTS IN POSITION 1
ICE027A 3 END OF FIELD BEYOND MAXIMUM RECORD LENGTH
ICE751I 0 C5-K05352 C6-Q95214 C7-K90000 C8-K05352 E9-K06751 E7-K90000
ICE052I 3 END OF DFSORT
Please help me get the correct output file. _________________ Thanks
Madhu Sudhan
Posted: Tue Apr 01, 2008 8:47 am Post subject:
There are always solutions to problems created by bad design.
If the record count is so important, why waste cycles counting the records after the fact? Proper design/redesign/re-engineering would lead one to modify the jobs that create the files to output a record count.
If these flat files are coming from remote sources, you need a high-speed record counter.
Why are these record counts so important? _________________ Dick Brenholtz
American living in Varel, Germany
Per Kolusu's example, you must use a MOD data set for OUT so all of the counts will be appended to it. Without MOD, the last count will be the only one you see. _________________ Frank Yaeger - DFSORT Development Team (IBM)
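The MOD requirement boils down to the disposition on the OUT DD. A minimal sketch, with placeholder dataset name and attributes:

```
//* DISP=MOD appends each job's trailer record instead of
//* rewriting the dataset from the start
//OUT  DD DSN=ALL.COUNTS,DISP=(MOD,CATLG,CATLG),
//        UNIT=SYSDA,SPACE=(TRK,(1,1),RLSE),
//        DCB=(RECFM=FB,LRECL=80)
```

With DISP=OLD or DISP=(NEW,...) instead, each OUTFIL would start writing at the beginning of the dataset, so only the last count written would survive.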
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Thanks Frank. The solution given by you is working.
Kolusu,
Quote:
OUTREC OVERLAY=(80:X)
I had tried the above code, but it gave the same problem again. I will go ahead with Frank's code. Thanks for your fast responses. _________________ Thanks
Madhu Sudhan