sunilchik Beginner

Joined: 08 May 2003 Posts: 11 Topics: 4
Posted: Thu May 08, 2003 1:33 am Post subject: Comparing two big sequential files in VS COBOL II
I am on an MVS mainframe system (OS/390). We have two sequential files that need to be compared: a rate file F1 (which contains 20,00000 records) and an output file F2 (which contains around 10,00000 records). Both files are in sorted order.
The comparison logic is like this:

If F1 record = F2 record
    Write the record to the output file
Else
    If F1 record < F2 record
        F1 record does not exist
    Else
        If F1 record > F2 record
            Read the next record from F2 until EOF or a match is found
        End-if
    End-if
End-if

This process is taking a lot of time. Is there any way to make this program run faster?
kolusu Site Admin

Joined: 26 Nov 2002 Posts: 12377 Topics: 75 Location: San Jose
Posted: Thu May 08, 2003 5:39 am Post subject:
Sunil,
How long does this program run? Are you writing these files to tape or to DASD? If you are writing to tape, check the FILE SECTION and make sure you have BLOCK CONTAINS 0 RECORDS on the output file declaration.
You can also check this thread for solutions to your requirement:
http://www.mvsforums.com/helpboards/viewtopic.php?t=11
Kolusu
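As a minimal sketch of what kolusu means (file and field names here are invented), the FD entry would look something like this in the FILE SECTION:

```cobol
       FILE SECTION.
      * Hypothetical FD -- names are made up for illustration.
      * BLOCK CONTAINS 0 RECORDS tells the system to take the
      * block size from the JCL or the dataset label instead of
      * hard-coding it, which matters for tape throughput.
       FD  OUT-FILE
           BLOCK CONTAINS 0 RECORDS
           RECORDING MODE IS F.
       01  OUT-RECORD        PIC X(80).
```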
sunilchik Beginner

Joined: 08 May 2003 Posts: 11 Topics: 4
Posted: Thu May 08, 2003 7:00 am Post subject:
kolusu wrote:
    Sunil,
    How long does this program run? Are you writing these files to tape or to DASD? If you are writing to tape, check the FILE SECTION and make sure you have BLOCK CONTAINS 0 RECORDS on the output file declaration.
    You can also check this thread for solutions to your requirement:
    http://www.mvsforums.com/helpboards/viewtopic.php?t=11
    Kolusu

Hi Kolusu,
We are writing the output to DASD. We are getting a time-out abend after around 20 records have been processed from the first file. We are trying to run the JCL with a longer time limit (by changing the CLASS parameter). We are also using logic similar to the one in the thread you linked.
Thanks for the response.
vattikonda Beginner

Joined: 09 Jan 2003 Posts: 15 Topics: 5
Posted: Sat May 10, 2003 7:20 pm Post subject:
Sunil,
Your program looks pretty basic. I would check the buffers for the two input files. I recommend coding BUFNO=20 within the DCB parameter (the rule of thumb I was told works best is BUFNO * blocksize = 480K). One other thing to check is the region size for your program. You can look at the job characteristics to see how much region was requested and how much is being used, and increase it accordingly.
If this does not fix your problem, can you post the JCL?
Good luck
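A sketch of the DD statements vattikonda is suggesting (the dataset names are made up; only the BUFNO subparameter is the point):

```jcl
//* Hypothetical DD statements -- DSNs are placeholders.
//* BUFNO=20 asks QSAM for 20 buffers per file, cutting the
//* number of physical I/O waits on the sequential reads.
//RATEFILE DD DSN=YOUR.RATEFILE.F1,DISP=SHR,
//            DCB=BUFNO=20
//INFILE2  DD DSN=YOUR.INPUT.F2,DISP=SHR,
//            DCB=BUFNO=20
```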
Glenn Beginner

Joined: 23 Mar 2003 Posts: 56 Topics: 3
Posted: Sat May 10, 2003 9:26 pm Post subject:
Quote:
    I am on an MVS mainframe system (OS/390). We have two sequential files that need to be compared: a rate file F1 (which contains 20,00000 records) and an output file F2 (which contains around 10,00000 records). Both files are in sorted order.

Are you sure you have the logic correct? If what you posted is accurate, none of the logic Kolusu posted would be logically correct, I would think. Are you testing against the full data sets or against pared-down data sets? I have a feeling your time-out is caused by an endless loop somewhere in your code, not by the processing requirements. What happens to the F1 records that remain, if any, when F2 reaches EOF? Are you comparing entire records or sort keys?
From what I see here, the requirements haven't been fully specified. Have you tested this program with a small amount of data (say, 100 records in F1 and F2), as you really should anyway? Do you get the same result (a time-out)? I remember running production programs that processed 500,000 records in a sequential file on DASD in a few seconds.
I get the feeling the logic in your program isn't good.
semigeezer Supermod

Joined: 03 Jan 2003 Posts: 1014 Topics: 13 Location: Atlantis
Posted: Sun May 11, 2003 3:15 am Post subject:
If I read the logic correctly, the program is extremely inefficient. You said that both files are sorted, yet you are reading F2 to EOF for every record in F1 that does not exist in F2. A great deal of the logic is missing here, but if you are potentially rereading all of F2 for every F1 record that is not found in F2, you may be reading it millions of times. Rather than read to EOF, just read F2 until F2 >= F1 and resume the comparison. Then you read each file only once. Assuming the files are blocked so that I/O is efficient, this should not take more than a few minutes at most.
Or maybe I'm reading it wrong. You also say that F2 is an 'output file' and that you are writing the matched record to the 'output file'. Are you writing the matched records back to F2 for some reason (odd, since they are already there), and if so, where are you writing them (end, beginning, update in place)?
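The single-pass match semigeezer describes can be sketched in Python (illustrative only; the real program is VS COBOL II, and the records here are plain strings standing in for sort keys):

```python
def match_sorted(f1_records, f2_records):
    """Single-pass match of two sorted record streams.

    Yields records present in both F1 and F2, reading each
    stream exactly once -- F2 is never reread to EOF.
    """
    f2_iter = iter(f2_records)
    f2 = next(f2_iter, None)      # current F2 record; None means EOF
    for f1 in f1_records:
        # Advance F2 only until F2 >= F1, then resume comparing.
        while f2 is not None and f2 < f1:
            f2 = next(f2_iter, None)
        if f2 is None:
            break                 # F2 exhausted: no further matches
        if f1 == f2:
            yield f1              # matched: write to the output file
        # else f2 > f1: this F1 record does not exist in F2

f1 = ["A", "C", "E", "G"]
f2 = ["B", "C", "D", "G", "H"]
print(list(match_sorted(f1, f2)))   # -> ['C', 'G']
```

Because each file pointer only moves forward, the total work is one pass over F1 plus one pass over F2, instead of one pass over F2 per unmatched F1 record.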
sunilchik Beginner

Joined: 08 May 2003 Posts: 11 Topics: 4
Posted: Mon May 12, 2003 9:26 am Post subject:
Hi Guys,
Thanks for all the responses.
Both are input files, not output files. Sorry to confuse you.
I will check what can be done about the buffers for the two input files.
Meanwhile, I have implemented the change to the read-to-EOF logic (read until F2 >= F1). I think it has made the process a little faster.
As for the program, we had to go for a workaround by splitting one file at 5000000 records, because after 5400000 records were compared the program was giving an S322 abend even when the JCL was given the most time to execute (CLASS=H).