Joined: 08 Aug 2007 Posts: 291 Topics: 2 Location: Chicago
Posted: Thu Aug 09, 2007 11:43 am Post subject:
There's no debate that loading/retrieving data within an internal table will outperform any I/O. But that would require modifications to the existing application architecture. Looking at the biggest bang for the buck, there is lower risk and effort (cost) in attempting relatively simple modifications that do not impact the existing application architecture. If changing the CI size and increasing buffers reduces the run times to acceptable levels, great. If not, there's little loss.
Also, referring to the suspected problem file, hellohatim stated:
Quote:
this file will not have any inserts/deletes & is purely meant for random read purpose
I was not thinking about tuning for sequential processing based on that statement.
The KSDS file has only 99,999 records; the 6.9M statistic was the number of hits on the KSDS, not the number of records. The input sequential file has more than 7M records, and for each record from this file the zip code field is used to key into the KSDS.
Code:
3700-QC-ZIPSMSA.
    MOVE I01-AD-5-DIG-US-ZIP-CD TO WS01-ACTUAL-CD.
    MOVE WS01-POSTAL-CD         TO ZIPSMSA-POST-CD.
    PERFORM 7400-READ-ZIPSMSA THRU 7400-EXIT.
    EVALUATE TRUE
        WHEN WS01-ZIPSMSA-KEYFND
            PERFORM 3710-QC-ZIPSMSA-PROC
                THRU 3710-EXIT
        WHEN WS01-ZIPSMSA-NOTFND
            MOVE '0004'          TO WW02-ERROR-CODE
            MOVE 'ZIP CODE'      TO WW02-DESCRIPTION(01:08)
            MOVE ZIPSMSA-POST-CD TO WW02-DESCRIPTION(10:06)
            MOVE 'MISSING IN ZIPSMSA '
                 TO WW02-DESCRIPTION(18:19)
            PERFORM 8100-WRITE-ERROR-PARA
                THRU 8100-EXIT
    END-EVALUATE.
3700-EXIT.
    EXIT.

3710-QC-ZIPSMSA-PROC.
    IF ZIPSMSA-SMSA-CD NOT = I01-SMSA-CD
        MOVE ZIPSMSA-SMSA-CD TO I01-SMSA-CD
        ADD 1 TO WA01-SMSA-UP
    ELSE
        ADD 1 TO WA01-SMSA-NU
    END-IF.
3710-EXIT.
    EXIT.
The increase in I/O, for the KSDS reads as well as the sequential reads/writes, based on the numbers I gave above, should be approximately 25% (reading the KSDS file and writing error records for not-found keys). But the total cost increased 200%. This is what I am not able to understand.
I think internal tables with a binary search will be the best bet. We are considering restructuring the application accordingly. The cost of the job has more than doubled, with increased elapsed time as well; I guess the cost to restructure will be much less.
I will also try using BUFNI & BUFND to check the results, as well as tweaking the CI size. Will post the results on this forum by next week.
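For the internal-table approach, here is a minimal sketch of what the lookup could become. The table and field names are assumptions (only ZIPSMSA-POST-CD, ZIPSMSA-SMSA-CD, and the WS01 condition names come from the posted code); the table would be loaded once from the KSDS at start-up, in ascending zip order:

```cobol
       WORKING-STORAGE SECTION.
      * Hypothetical table layout - names and PIC sizes are assumptions
       01  WS-ZIPSMSA-TABLE.
           05  WS-ZIP-ENTRY OCCURS 99999 TIMES
                   ASCENDING KEY IS WS-ZIP-CD
                   INDEXED BY ZIP-IDX.
               10  WS-ZIP-CD       PIC X(06).
               10  WS-ZIP-SMSA-CD  PIC X(04).

      * In 3700-QC-ZIPSMSA, a binary search would replace the READ:
           SEARCH ALL WS-ZIP-ENTRY
               AT END
                   SET WS01-ZIPSMSA-NOTFND TO TRUE
               WHEN WS-ZIP-CD (ZIP-IDX) = ZIPSMSA-POST-CD
                   SET WS01-ZIPSMSA-KEYFND TO TRUE
                   MOVE WS-ZIP-SMSA-CD (ZIP-IDX) TO ZIPSMSA-SMSA-CD
           END-SEARCH
```

SEARCH ALL does a binary search, so each lookup is O(log n) with no physical I/O at all.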
Joined: 31 May 2004 Posts: 391 Topics: 4 Location: Richfield, MN, USA
Posted: Thu Aug 09, 2007 4:05 pm Post subject:
Well, it's obvious that CI splits or CA splits are not the problem! I'm no VSAM expert, but BUFSPACE seems a bit small. Also, aren't the usual SHROPTNS (2,3) instead of (3,3)? For random access, DO NOT use DYNAMIC access, use RANDOM access. Also, random access files can benefit significantly from using BLSR. _________________ ....Terry
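Terry's BLSR suggestion can be tried with a JCL-only change, no program modification. A sketch of the usual wrapper pattern, assuming the program's DD name is ZIPSMSA (the DD and dataset names here are placeholders):

```jcl
//* Program DD points at the BLSR subsystem, which forwards to
//* the real dataset DD. All names below are assumptions.
//ZIPSMSA  DD  SUBSYS=(BLSR,'DDNAME=ZIPSMSA1','BUFND=20','BUFNI=3')
//ZIPSMSA1 DD  DSN=PROD.ZIPSMSA.KSDS,DISP=SHR
```

BLSR switches the file from NSR to LSR buffering, which is what random access wants: a shared buffer pool managed by LRU rather than look-ahead.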
Joined: 08 Aug 2007 Posts: 291 Topics: 2 Location: Chicago
Posted: Fri Aug 10, 2007 10:02 am Post subject:
Reviewing the LISTCAT, I noticed that you're allocating over 100 times more space than you're using. Data HI-A-RBA is 314081280 but HI-U-RBA is only 1474560. Index HI-A-RBA is 678912 but HI-U-RBA is only 4608. I know space is cheap but that's over 400 CYLS that can't be used for anything else. You could easily get by with an allocation of CYLINDERS(5 1).
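Shrinking the allocation means redefining the cluster and REPROing the records back in. A rough IDCAMS sketch; the dataset name, key length, and record size below are assumptions, so pull the real values from the LISTCAT first:

```jcl
//REORG    EXEC PGM=IDCAMS
//SYSPRINT DD  SYSOUT=*
//SYSIN    DD  *
  /* NAME, KEYS and RECORDSIZE are placeholders - use LISTCAT values */
  DEFINE CLUSTER (NAME(PROD.ZIPSMSA.KSDS) -
                  INDEXED -
                  KEYS(6 0) -
                  RECORDSIZE(15 15) -
                  CYLINDERS(5 1) -
                  CISZ(4096))
/*
```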
If you decide to experiment with the BUFNx parms, I would initially leave the data CI size as is (4096) and try BUFNI=3. For this file, there are only 3 index CIs: 1 root CI and 2 leaf CIs. The problem is that the default buffers allocated for VSAM are 1 index buffer and 2 data buffers. For every read, you're swapping out the root CI and one of the leaf CIs. That's 2 physical I/Os for every logical read just for the index. Then add 1 physical I/O if the data CI isn't in one of the two data buffers. With BUFNI=3, the entire index stays in the buffers.
For the BUFND value, I suspect that the nature of the input data results in something close to skip-sequential processing. Most likely, there are pockets of records on the input file where the zip values are similar. If that's the case, leave the data CI size alone and try BUFND=10 or even 20. If that's not the case and the distribution really is random, you could reduce the data CI size to 512 and get by with BUFND=5. Changing the data CI size would also change the index CI size, and you'd want to increase the BUFNI value to the number of index CIs.
If you see good results playing around with the BUFNx parms for this file, you might want to consider experimenting with these parms for the other VSAM files too. Just for fun.
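The BUFNx values can be supplied without redefining the cluster, via the AMP parameter on the DD statement. A sketch, with a hypothetical DD and dataset name:

```jcl
//* DD and dataset names are assumptions - substitute your own
//ZIPSMSA  DD  DSN=PROD.ZIPSMSA.KSDS,DISP=SHR,
//             AMP=('BUFNI=3,BUFND=10')
```

AMP overrides the catalog's BUFSPACE for that step only, which makes it easy to benchmark different values run over run.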
By the way, based on the interest generated by your post, you can tell what excites the masses. Thanks for the fun.
Joined: 20 Oct 2006 Posts: 1411 Topics: 26 Location: germany
Posted: Fri Aug 10, 2007 11:12 am Post subject:
jsharon1248,
great input. As you indicated, this file is so small that, with the right allocation for the index and data, it could be contained completely in memory, with no application logic change.
I have always found it easier to just load files into COBOL tables since I could never get the systems types to properly allocate the vsam structures.
The other three files: PLANIND, TM, and ZIPDMA might also receive performance gains if they were allocated differently.
Does not matter much if the file is in memory or a COBOL table - get rid of the physical I/O's. _________________ Dick Brenholtz
American living in Varel, Germany
Thanks a lot for your inputs. Yes, this was a major flaw; we have been allocating undue space to the zip code files. I will definitely tune the VSAM files for the various parameters.
One more thing: we merged all three zip code files (TM, DMA, and SMSA) into one zip file with three columns. We have observed a fourfold reduction in the job cost. This one was actually a design flaw; we could have done with just one file right from the beginning.
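That merged design amounts to one KSDS keyed on zip that carries all three codes, so each input record costs at most one read instead of three. A hypothetical record layout (field names and PIC sizes are assumptions):

```cobol
      * Merged zip-code record - one KSDS keyed on ZIP-POST-CD
       01  ZIPCODE-REC.
           05  ZIP-POST-CD   PIC X(06).
           05  ZIP-TM-CD     PIC X(04).
           05  ZIP-DMA-CD    PIC X(04).
           05  ZIP-SMSA-CD   PIC X(04).
```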
I will post the results by next week for suggestions posted by jsharon1248.
Thanks & Regards,
Hatim. _________________ -Hatim M P
Joined: 08 Aug 2007 Posts: 291 Topics: 2 Location: Chicago
Posted: Fri Aug 10, 2007 11:57 am Post subject:
Hatim
Thanks for the followup and looking forward to the results next week.
One more 'by the way'. I'm heading straight for the 'About MVSForums' to start a new thread to praise this site. I'm a newbie, but I've been around long enough to recognize exceptional quality.
The job costs for the last four runs were given because they include only the impacted step. One thing I observed was that even after increasing the buffers, the cost only increased. Another observation was that RRDS proved more costly than KSDS; I thought RRDS would have been cheaper. Further, using arrays, the CPU did not reduce much, but I/O was cut in half.
I have been reading the "VSAM Demystified" Redbook, checking out this topic to see if it can help...
Quote:
LSR with direct access
LSR buffering mode is designed for direct access. If your applications use NSR
buffering and direct access, and you are having performance problems, you can
take advantage of LSR buffering techniques using SMB or BLSR.
Refer to _________________ -Hatim M P
Joined: 08 Aug 2007 Posts: 291 Topics: 2 Location: Chicago
Posted: Mon Aug 20, 2007 9:12 am Post subject:
A few things. First, I don't see a cost for the first 2 runs. If we're going to make meaningful comparisons, we need to see the initial costs. Second, I don't know what the counts represent. I think one of them is EXCP's, but I'm not sure. I'd also like to see run times, a LISTCAT, and the actual BUFNx values used. From what I see, there is a downward trend. The last 3 runs show significant improvements over the original. As expected, the internal WS tables show the best results, but merging the files into 1 KSDS is pretty close.
Joined: 08 Aug 2007 Posts: 291 Topics: 2 Location: Chicago
Posted: Mon Aug 20, 2007 9:15 am Post subject:
I'm sorry, I missed your last post. I think you're way too high on the BUFNI values. Send a LISTCAT and we can see if we can squeeze a little more out of this.
The LISTCAT dumps are pretty big. Can you please let me know your email address so that I can mail you the dumps? Or, if you need only some of the stats, I can copy and paste them here.
I did not include the cost of the other two jobs because they included other steps as well. The statistics I had provided were only for STEP0090, the step which had the VSAM file issue. The remaining jobs I had run with STEP0090 only, for testing. I will send the elapsed times later as well.
Taking my last post as an example, 21447K is the EXCP count and 12.96 the CPU time, i.e. columns 5 and 7 respectively.
Joined: 08 Aug 2007 Posts: 291 Topics: 2 Location: Chicago
Posted: Mon Aug 20, 2007 12:14 pm Post subject:
I'd only want to see the values for BUFSPACE, CISIZE, REC-TOTAL, HI-A-RBA, HI-U-RBA, and SPACE-PRI for the DATA and INDEX sections. Also, LEVELS from the INDEX section. I don't think we'll see any drastic improvements, but there could still be slight gains.
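All of those fields appear in the output of a full LISTCAT, so there's no need to trim the command itself. A sketch, with a placeholder cluster name:

```jcl
//LISTC    EXEC PGM=IDCAMS
//SYSPRINT DD  SYSOUT=*
//SYSIN    DD  *
  /* Cluster name is a placeholder */
  LISTCAT ENTRIES(PROD.ZIPSMSA.KSDS) ALL
/*
```

You can then copy just the DATA and INDEX attribute/statistics sections from SYSPRINT into the post.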