Find Unique Occurrence of Fields
Select messages from
# through # FAQ
[/[Print]\]
Goto page 1, 2  Next  :| |:
MVSFORUMS.com -> Utilities

#1: Find Unique Occurrence of Fields Author: mainframemouli PostPosted: Tue Dec 28, 2010 11:18 am
    —
I have the input as below :

Col1 Col2 Col3 Col4 Col5 Col6
Code:

519    20     1      3     1      3
519    20     2      2     2      2
519    20     3      3     3      3
519    21     1      3     1      3
519    21     2      2     2      2
519    21     3      3     2      2

Output should be :
Col1 Col2 Col3 Col4 Col5 Col6
519 20 3 2 3 2 (i.e) For Col1 519 and Col2 20 ==> The Unique occurrence of Col3 is '3' , Col4 is '2' , Col5 is '3' , Col6 is '2'.

Col1 Col2 Col3 Col4 Col5 Col6
519 21 3 2 2 2 (i.e) For Col1 519 and Col2 21 ==> The Unique occuerence of Col3 is '3' , Col4 is '2' , Col5 is '2' , Col6 is '2'.

I could do this by SAS program or by cobol internal sort.
1.With SAS the though the logic is simple , the input files will be in millions , which SAS may struggle to handle. and its a bigger file lot of other fields in it.

2. The cobol internal sort/Array logic is also dropped because of the file size.

Is there a way in SORT to do this if not in application programming?. (Eazytrieve is not in our shop)

#2:  Author: papadi PostPosted: Tue Dec 28, 2010 2:05 pm
    —
Suggest you re-post using the code tag to make your post more readable and preserve the column alignment. There is a Preview so you can see your post as we will see it. When it looks as you wish, Submit.

Then, explain the rules more completely. The "output should be" examples do not appear to follow the rules. . .


Last edited by papadi on Tue Dec 28, 2010 5:31 pm; edited 1 time in total

#3:  Author: mainframemouli PostPosted: Tue Dec 28, 2010 2:42 pm
    —
I have the input as below :
Code:

Col1 Col2  Col3   Col4   Col5   Col6
519  20    1      3      1      3
519  20    2      2      2      2
519  20    3      3      3      3
519  21    1      3      1      3
519  21    2      2      2      2
519  21    3      3      2      2


The Output should be :

Code:

Output should be :
Col1 Col2  Col3   Col4   Col5   Col6
519   20    3     2      3      2


Quote:


(i.e) For Col1 519 and Col2 20 ==> The Unique occuerence of Col3 is '3' , Col4 is '2' , Col5 is '3' , Col6 is '2'.



Code:

Col1 Col2  Col3   Col4   Col5   Col6
519   21    3     2      2      2


Quote:


(i.e) For Col1 519 and Col2 21 ==> The Unique occuerence of Col3 is '3' , Col4 is '2' , Col5 is '2' , Col6 is '2'.




I could do this by SAS program or by cobol internal sort.
1.With SAS the though the logic is simple , the input files will be in millions , which SAS may struggle to handle. and its a bigger file lot of other fields in it.

2. The cobol internal sort/Array logic is also dropped because of the file size.

Is there a way in SORT to do this if not in application programming?. (Eazytrieve is not in our shop)

#4:  Author: DibakarLocation: USA PostPosted: Tue Dec 28, 2010 2:43 pm
    —
What do you mean by unique? For 519/20 all values in Col3 are unique values, how do you decide which one to pick?

#5:  Author: kolusuLocation: San Jose PostPosted: Tue Dec 28, 2010 6:02 pm
    —
mainframemouli,

1. What is the LRECL and RECFM of the input file?
2. Is Col1 and Col2 considered a key? If so what is the position and format of these 2 fields?
3. What is the position and format of col3 , col4, col5?
4. What is the lrecl and recfm of the output file?

Dibakar,

I think OP needs a control break on col1 and col2 as a single key and he needs the unique count of col3, col4, and col5 for the combo of col1 and col2.

#6:  Author: mainframemouli PostPosted: Tue Dec 28, 2010 8:21 pm
    —
1. What is the LRECL and RECFM of the input file?
Though the main input file is variable file with LRECL of 2000 , I could actually derive the input in the format provided in my post above.
OK , The LRECL = 80 , RECFM is FB

2. Is Col1 and Col2 considered a key? If so what is the position and format of these 2 fields?
Yes , Col1 and Col2 forms a Key.


3. What is the position and format of col3 , col4, col5?
Col3 , 4 , 5 and 6 are all alphanumeric


4. What is the lrecl and recfm of the output file?
The output file will actually a report with each page for col1 and col2 combination , like as below

Code:

                                Header
                    Col1(519) and Col2(20) combination   - Page 1

                         Col3  Col4   Col5   Col6
                         3     2      3      2

                                  Header
                    Col1(519) and Col2(21) combination   - Page 2

                         Col3  Col4   Col5   Col6
                         3     2      2      2




are displayed.




Quote:

But Let me handle the report in a simple cobol program , because it requires some more data to provided in the report. So lets have the output as LRECL = 80 and RECFM = FB.


#7:  Author: mainframemouli PostPosted: Tue Dec 28, 2010 8:28 pm
    —
Kolusu ,
Please find the positions as below

Col1 --> 2 Bytes ( 1 to 2)
Col2 --> 3 Bytes( 3 to 5)

Col3 --> 10 Bytes ( 6 to 15)
Col4 --> 10 Bytes (16 to 25)
Col5 --> 10 Bytes (26 to 35)
Col6 --> 10 Bytes (36 to 45)

#8:  Author: kolusuLocation: San Jose PostPosted: Tue Dec 28, 2010 10:58 pm
    —
mainframemouli wrote:
1. What is the LRECL and RECFM of the input file?
Though the main input file is variable file with LRECL of 2000 , I could actually derive the input in the format provided in my post above.
OK , The LRECL = 80 , RECFM is FB


Nope You are doing an extra pass which is not needed. Explain the details with the original File. There is NO need to extract the file with the desired fields as you can do the reformatting of the desired fields using INREC.

#9:  Author: mainframemouli PostPosted: Wed Dec 29, 2010 11:22 am
    —
Thanks Kolusu.
OK , Let me give you the actuals.

1. What is the LRECL and RECFM of the input file?
The actual input file is a tape file with following attributes.
RECFM = VB LRECL = 020503 BLKSIZE = 027998

The part of layout which is of interest is as below :(The Layout is as it is from the beginning , Not altered)
Code:

01  IN-RECORD.                               
       05  COL1                      PIC 9(3)    COMP-3.
       05  XX                          PIC X.             
       05  COL2                      PIC 9(2).           
       05  XX                          PIC X.             
       05  OHR-COLS.                                   
           10  COL3             PIC X(10).         
           10  COL4             PIC X(10).         
           10  COL5             PIC X(10).         
           10  COL6            PIC X(10).
etc...... Followed by other fields .......         


The actual output should be :
Code:

                                SUMMARY REPORT                             PAGE 1
                                                                RUN DATE : XXXX
                                                                RUN TIME : XXXXX
                                                                DATA MONTH : 
                                                               
COL1 and COL2             COL3     COL4   COL5   COL6


                                 SUMMARY REPORT                             PAGE 2
                                                                RUN DATE : XXXX
                                                                RUN TIME : XXXXX
                                                                DATA MONTH : 
                                                               
COL1 and COL2             COL3     COL4   COL5   COL6

etc.....



Some of the fields are not required , the COL1 is in Comp-3 format , so just to make things staright and easy to understand for you to provide solution , I have provided the input layout in a simple way.

Thought , Once I get the solution , I would fit that init to my actual requirement.

Apologize , If that is not the convenient means for you.

#10:  Author: kolusuLocation: San Jose PostPosted: Wed Dec 29, 2010 5:23 pm
    —
mainframemouli,


Here is an untested DFSORT JCL which will give you the desired results. Since you wanted the break on 4 different fields , we concatenate the same input file 4 times each separated by a header. Using the header we add an ID number to each file and move the col3 , col4, col5 and col 6 to a common area based on the ID. We also trim your input to just 19 bytes and then perform the sort removing the duplicates. Once we do that we then have the unique counts for each.

I also assumed that you wanted to see the comp-3 field displayed , so I changed it to printable form of 3 bytes.

I am assuming that you will come back with more changes as your report is expecting the data month filed in the summary report. I guess you have to figure that out as to how to include it in the logic I proposed. And that is one of the reason I ask for complete details.

Code:

//STEP0100 EXEC PGM=SORT                                     
//SYSOUT   DD SYSOUT=*                                       
//SORTIN   DD *                                               
//SORTOUT  DD DSN=&&HD,DISP=(,PASS),SPACE=(TRK,(1,1),RLSE)   
//SYSIN    DD *                                               
  SORT FIELDS=COPY                                           
  OUTFIL REMOVECC,FTOV,HEADER1=('$$$')                       
//*                                                           
//STEP0200 EXEC PGM=SORT                                     
//SYSOUT   DD SYSOUT=*                                       
//SORTIN   DD DSN=&&HD,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT   
//         DD DSN=your input vb 20503 lrecl file,DISP=SHR
//         DD DSN=&&HD,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT
//         DD DSN=your input vb 20503 lrecl file,DISP=SHR   
//         DD DSN=&&HD,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT
//         DD DSN=your input vb 20503 lrecl file,DISP=SHR   
//         DD DSN=&&HD,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT   
//         DD DSN=your input vb 20503 lrecl file,DISP=SHR
//SORTOUT  DD SYSOUT=*                                   
//SYSIN    DD *                                                     
  INREC IFOUTLEN=19,                                               
  IFTHEN=(WHEN=INIT,BUILD=(1,4,X,5)),                               
  IFTHEN=(WHEN=GROUP,BEGIN=(6,3,CH,EQ,C'$$$'),PUSH=(5:ID=1)),       
  IFTHEN=(WHEN=(5,1,ZD,EQ,1),BUILD=(1,4,5,3,9,2,12,10)),           
  IFTHEN=(WHEN=(5,1,ZD,EQ,2),BUILD=(1,4,5,3,9,2,22,10)),           
  IFTHEN=(WHEN=(5,1,ZD,EQ,3),BUILD=(1,4,5,3,9,2,32,10)),           
  IFTHEN=(WHEN=(5,1,ZD,EQ,4),BUILD=(1,4,5,3,9,2,42,10))
             
  SORT FIELDS=(6,2,PD,A,8,2,CH,A,5,1,CH,A,10,10,CH,A)               
  SUM FIELDS=NONE                                                   
                                                                   
  OUTREC IFOUTLEN=13,                                               
  IFTHEN=(WHEN=(5,1,ZD,EQ,1,AND,6,2,CH,NE,C'$$'),                   
  BUILD=(1,4,6,2,PD,EDIT=(TTT),8,2,C'1000')),                       
  IFTHEN=(WHEN=(5,1,ZD,EQ,2,AND,6,2,CH,NE,C'$$'),                   
  BUILD=(1,4,6,2,PD,EDIT=(TTT),8,2,C'0100')),                       
  IFTHEN=(WHEN=(5,1,ZD,EQ,3,AND,6,2,CH,NE,C'$$'),                   
  BUILD=(1,4,6,2,PD,EDIT=(TTT),8,2,C'0010')),                       
  IFTHEN=(WHEN=(5,1,ZD,EQ,4,AND,6,2,CH,NE,C'$$'),                   
  BUILD=(1,4,6,2,PD,EDIT=(TTT),8,2,C'0001'))                       
                                                                   
  OUTFIL VTOF,REMOVECC,NODETAIL,BUILD=(80X),OMIT=(6,2,CH,EQ,C'$$'),
  SECTIONS=(5,5,SKIP=P,                                             
  HEADER3=(30:'SUMMARY REPORT ',56:'PAGE : ',PAGE,/,               
           50:'RUN DATE   : ',DATE=(4MD-),/,                       
           50:'RUN TIME   : ',TIME=(24:),/,                         
           50:'DATA MONTH : ',/),                                   
  TRAILER3=(5,3,X,8,2,X,                                           
            TOT=(10,1,ZD,M10,LENGTH=8),X,                           
            TOT=(11,1,ZD,M10,LENGTH=8),X,                           
            TOT=(12,1,ZD,M10,LENGTH=8),X,                           
            TOT=(13,1,ZD,M10,LENGTH=8),/))                         
//*

#11:  Author: mainframemouli PostPosted: Thu Dec 30, 2010 1:32 pm
    —
Thanks Kolusu.
I will execute this and let you know the results.

#12:  Author: mainframemouli PostPosted: Tue Jan 04, 2011 2:17 pm
    —
Thanks Kolusu , It worked.

Just did couple of modifications :
1. Replaced '//*' to '/*'
2. This change is for my requirement as below

Code:

 HEADER3=(25:'*** SUMMARY REPORT ',56:'PAGE : ',PAGE,/,               
          50:'RUN DATE   : ',DATE=(4MD-),/,                           
          50:'RUN TIME   : ',TIME=(24:),/,                           
          50:'                        ',/,                           
          04:'COL1     COL2       COL3    COL4     COL5     COL6',/, 
          04:'******  ******     ****    ****      ****     ****'),   
 TRAILER3=(4:5,3,5X,12:8,2,3X,                                       
           17:TOT=(10,1,ZD,M10,LENGTH=8),X,                           
           26:TOT=(11,1,ZD,M10,LENGTH=8),X,                           
           35:TOT=(12,1,ZD,M10,LENGTH=8),X,                           
           44:TOT=(13,1,ZD,M10,LENGTH=8),/))                         


3. Data Month may be a luxury to the requirement , because the same requirement should work for Weekly and monthly. For weekly it should Data week.

And finally thanks for all your help , It took me sometime understand the logic behind the code , it's very nice.

#13:  Author: kolusuLocation: San Jose PostPosted: Tue Jan 04, 2011 2:56 pm
    —
mainframemouli wrote:
Thanks Kolusu , It worked.

Just did couple of modifications :
1. Replaced '//*' to '/*'


Mainframemouli,

Is that your shop standard?

mainframemouli wrote:

3. Data Month may be a luxury to the requirement , because the same requirement should work for Weekly and monthly. For weekly it should Data week.


You can get the month names(Jan - Dec) easily and the week range(Monday- Sunday dates) easily.

#14:  Author: mainframemouli PostPosted: Wed Jan 05, 2011 1:25 pm
    —
Quote:

Is that your shop standard?


Yes , Kolusu , That's the shop standard.

Quote:

You can get the month names(Jan - Dec) easily and the week range(Monday- Sunday dates) easily.


What is required is
For Monthly : The Datamonth should be MM/YY (Like 01/11)
For weekly : The Dataweek should be Week Number for that year.
For example for 2011 Jan 1st week should be , WEEK 1
For example for 2011 Jan 2nd week should be , WEEK 2 etc...

#15:  Author: kolusuLocation: San Jose PostPosted: Wed Jan 05, 2011 2:06 pm
    —
mainframemouli wrote:
What is required is
For Monthly : The Datamonth should be MM/YY (Like 01/11)
For weekly : The Dataweek should be Week Number for that year.
For example for 2011 Jan 1st week should be , WEEK 1
For example for 2011 Jan 2nd week should be , WEEK 2 etc...


mainframemouli,

Monthly part is easy. You can just add SYMNAMES DD name and use them in the Header3 like this
Code:

//SYMNAMES DD *
YNUM,S'&LYR2'
MNUM,S'&LMON'


and Header3 we will change this line to
Code:

50:'DATA MONTH : ',/),


to
Code:

50:'DATA MONTH : ',MNUM,'/',YNUM,/),


However the Week Number part is not that easy. There are different forms of calculating the week number depending on the region.

WEEK DB2 Scalar function

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DSNSQJ12/3.2.121

WEEK_ISO DB2 Scalar function

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DSNSQJ12/3.2.122

You can add a step to unload the Week number from SYSIBM.SYSDUMMY1 and create it as symbol which you can use it in the report.
Code:

SELECT CHAR('WEEK#,C')                       
      ,CHAR('''')                           
      ,CHAR(SMALLINT(WEEK(CURRENT DATE)))   
      ,CHAR('''')                           
      ,CHAR(' ',65)                         
  FROM SYSIBM.SYSDUMMY1;                     


This will create an 80 byte symbol as
Code:

WEEK#,C'2'


and you can use that in the header3 as

Code:

50:'DATA WEEK  : ',WEEK#,/),



MVSFORUMS.com -> Utilities


output generated using printer-friendly topic mod. All times are GMT - 5 Hours

Goto page 1, 2  Next  :| |:
Page 1 of 2

Powered by phpBB © 2001, 2005 phpBB Group