MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 
 FAQFAQ   SearchSearch   Quick Manuals   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Find Unique Occurrence of Fields
Goto page 1, 2  Next
 
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities
View previous topic :: View next topic  
Author Message
mainframemouli
Beginner


Joined: 23 Feb 2005
Posts: 87
Topics: 33

PostPosted: Tue Dec 28, 2010 11:18 am    Post subject: Find Unique Occurrence of Fields Reply with quote

I have the input as below :

Col1 Col2 Col3 Col4 Col5 Col6
Code:

519    20     1      3     1      3
519    20     2      2     2      2
519    20     3      3     3      3
519    21     1      3     1      3
519    21     2      2     2      2
519    21     3      3     2      2

Output should be :
Col1 Col2 Col3 Col4 Col5 Col6
519 20 3 2 3 2 (i.e) For Col1 519 and Col2 20 ==> The Unique occurrence of Col3 is '3' , Col4 is '2' , Col5 is '3' , Col6 is '2'.

Col1 Col2 Col3 Col4 Col5 Col6
519 21 3 2 2 2 (i.e) For Col1 519 and Col2 21 ==> The Unique occuerence of Col3 is '3' , Col4 is '2' , Col5 is '2' , Col6 is '2'.

I could do this by SAS program or by cobol internal sort.
1.With SAS the though the logic is simple , the input files will be in millions , which SAS may struggle to handle. and its a bigger file lot of other fields in it.

2. The cobol internal sort/Array logic is also dropped because of the file size.

Is there a way in SORT to do this if not in application programming?. (Eazytrieve is not in our shop)
Back to top
View user's profile Send private message
papadi
Supermod


Joined: 20 Oct 2009
Posts: 594
Topics: 1

PostPosted: Tue Dec 28, 2010 2:05 pm    Post subject: Reply with quote

Suggest you re-post using the code tag to make your post more readable and preserve the column alignment. There is a Preview so you can see your post as we will see it. When it looks as you wish, Submit.

Then, explain the rules more completely. The "output should be" examples do not appear to follow the rules. . .
_________________
All the best,

di


Last edited by papadi on Tue Dec 28, 2010 5:31 pm; edited 1 time in total
Back to top
View user's profile Send private message
mainframemouli
Beginner


Joined: 23 Feb 2005
Posts: 87
Topics: 33

PostPosted: Tue Dec 28, 2010 2:42 pm    Post subject: Reply with quote

I have the input as below :
Code:

Col1 Col2  Col3   Col4   Col5   Col6
519  20    1      3      1      3
519  20    2      2      2      2
519  20    3      3      3      3
519  21    1      3      1      3
519  21    2      2      2      2
519  21    3      3      2      2


The Output should be :

Code:

Output should be :
Col1 Col2  Col3   Col4   Col5   Col6
519   20    3     2      3      2


Quote:


(i.e) For Col1 519 and Col2 20 ==> The Unique occuerence of Col3 is '3' , Col4 is '2' , Col5 is '3' , Col6 is '2'.



Code:

Col1 Col2  Col3   Col4   Col5   Col6
519   21    3     2      2      2


Quote:


(i.e) For Col1 519 and Col2 21 ==> The Unique occuerence of Col3 is '3' , Col4 is '2' , Col5 is '2' , Col6 is '2'.




I could do this by SAS program or by cobol internal sort.
1.With SAS the though the logic is simple , the input files will be in millions , which SAS may struggle to handle. and its a bigger file lot of other fields in it.

2. The cobol internal sort/Array logic is also dropped because of the file size.

Is there a way in SORT to do this if not in application programming?. (Eazytrieve is not in our shop)
Back to top
View user's profile Send private message
Dibakar
Advanced


Joined: 02 Dec 2002
Posts: 699
Topics: 63
Location: USA

PostPosted: Tue Dec 28, 2010 2:43 pm    Post subject: Reply with quote

What do you mean by unique? For 519/20 all values in Col3 are unique values, how do you decide which one to pick?
_________________
Regards,
Diba
Back to top
View user's profile Send private message Send e-mail
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12370
Topics: 75
Location: San Jose

PostPosted: Tue Dec 28, 2010 6:02 pm    Post subject: Reply with quote

mainframemouli,

1. What is the LRECL and RECFM of the input file?
2. Is Col1 and Col2 considered a key? If so what is the position and format of these 2 fields?
3. What is the position and format of col3 , col4, col5?
4. What is the lrecl and recfm of the output file?

Dibakar,

I think OP needs a control break on col1 and col2 as a single key and he needs the unique count of col3, col4, and col5 for the combo of col1 and col2.
_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
mainframemouli
Beginner


Joined: 23 Feb 2005
Posts: 87
Topics: 33

PostPosted: Tue Dec 28, 2010 8:21 pm    Post subject: Reply with quote

1. What is the LRECL and RECFM of the input file?
Though the main input file is variable file with LRECL of 2000 , I could actually derive the input in the format provided in my post above.
OK , The LRECL = 80 , RECFM is FB

2. Is Col1 and Col2 considered a key? If so what is the position and format of these 2 fields?
Yes , Col1 and Col2 forms a Key.


3. What is the position and format of col3 , col4, col5?
Col3 , 4 , 5 and 6 are all alphanumeric


4. What is the lrecl and recfm of the output file?
The output file will actually a report with each page for col1 and col2 combination , like as below

Code:

                                Header
                    Col1(519) and Col2(20) combination   - Page 1

                         Col3  Col4   Col5   Col6
                         3     2      3      2

                                  Header
                    Col1(519) and Col2(21) combination   - Page 2

                         Col3  Col4   Col5   Col6
                         3     2      2      2




are displayed.




Quote:

But Let me handle the report in a simple cobol program , because it requires some more data to provided in the report. So lets have the output as LRECL = 80 and RECFM = FB.

Back to top
View user's profile Send private message
mainframemouli
Beginner


Joined: 23 Feb 2005
Posts: 87
Topics: 33

PostPosted: Tue Dec 28, 2010 8:28 pm    Post subject: Reply with quote

Kolusu ,
Please find the positions as below

Col1 --> 2 Bytes ( 1 to 2)
Col2 --> 3 Bytes( 3 to 5)

Col3 --> 10 Bytes ( 6 to 15)
Col4 --> 10 Bytes (16 to 25)
Col5 --> 10 Bytes (26 to 35)
Col6 --> 10 Bytes (36 to 45)
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12370
Topics: 75
Location: San Jose

PostPosted: Tue Dec 28, 2010 10:58 pm    Post subject: Reply with quote

mainframemouli wrote:
1. What is the LRECL and RECFM of the input file?
Though the main input file is variable file with LRECL of 2000 , I could actually derive the input in the format provided in my post above.
OK , The LRECL = 80 , RECFM is FB


Nope You are doing an extra pass which is not needed. Explain the details with the original File. There is NO need to extract the file with the desired fields as you can do the reformatting of the desired fields using INREC.
_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
mainframemouli
Beginner


Joined: 23 Feb 2005
Posts: 87
Topics: 33

PostPosted: Wed Dec 29, 2010 11:22 am    Post subject: Reply with quote

Thanks Kolusu.
OK , Let me give you the actuals.

1. What is the LRECL and RECFM of the input file?
The actual input file is a tape file with following attributes.
RECFM = VB LRECL = 020503 BLKSIZE = 027998

The part of layout which is of interest is as below :(The Layout is as it is from the beginning , Not altered)
Code:

01  IN-RECORD.                               
       05  COL1                      PIC 9(3)    COMP-3.
       05  XX                          PIC X.             
       05  COL2                      PIC 9(2).           
       05  XX                          PIC X.             
       05  OHR-COLS.                                   
           10  COL3             PIC X(10).         
           10  COL4             PIC X(10).         
           10  COL5             PIC X(10).         
           10  COL6            PIC X(10).
etc...... Followed by other fields .......         


The actual output should be :
Code:

                                SUMMARY REPORT                             PAGE 1
                                                                RUN DATE : XXXX
                                                                RUN TIME : XXXXX
                                                                DATA MONTH : 
                                                               
COL1 and COL2             COL3     COL4   COL5   COL6


                                 SUMMARY REPORT                             PAGE 2
                                                                RUN DATE : XXXX
                                                                RUN TIME : XXXXX
                                                                DATA MONTH : 
                                                               
COL1 and COL2             COL3     COL4   COL5   COL6

etc.....



Some of the fields are not required , the COL1 is in Comp-3 format , so just to make things staright and easy to understand for you to provide solution , I have provided the input layout in a simple way.

Thought , Once I get the solution , I would fit that init to my actual requirement.

Apologize , If that is not the convenient means for you.
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12370
Topics: 75
Location: San Jose

PostPosted: Wed Dec 29, 2010 5:23 pm    Post subject: Reply with quote

mainframemouli,


Here is an untested DFSORT JCL which will give you the desired results. Since you wanted the break on 4 different fields , we concatenate the same input file 4 times each separated by a header. Using the header we add an ID number to each file and move the col3 , col4, col5 and col 6 to a common area based on the ID. We also trim your input to just 19 bytes and then perform the sort removing the duplicates. Once we do that we then have the unique counts for each.

I also assumed that you wanted to see the comp-3 field displayed , so I changed it to printable form of 3 bytes.

I am assuming that you will come back with more changes as your report is expecting the data month filed in the summary report. I guess you have to figure that out as to how to include it in the logic I proposed. And that is one of the reason I ask for complete details.

Code:

//STEP0100 EXEC PGM=SORT                                     
//SYSOUT   DD SYSOUT=*                                       
//SORTIN   DD *                                               
//SORTOUT  DD DSN=&&HD,DISP=(,PASS),SPACE=(TRK,(1,1),RLSE)   
//SYSIN    DD *                                               
  SORT FIELDS=COPY                                           
  OUTFIL REMOVECC,FTOV,HEADER1=('$$$')                       
//*                                                           
//STEP0200 EXEC PGM=SORT                                     
//SYSOUT   DD SYSOUT=*                                       
//SORTIN   DD DSN=&&HD,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT   
//         DD DSN=your input vb 20503 lrecl file,DISP=SHR
//         DD DSN=&&HD,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT
//         DD DSN=your input vb 20503 lrecl file,DISP=SHR   
//         DD DSN=&&HD,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT
//         DD DSN=your input vb 20503 lrecl file,DISP=SHR   
//         DD DSN=&&HD,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT   
//         DD DSN=your input vb 20503 lrecl file,DISP=SHR
//SORTOUT  DD SYSOUT=*                                   
//SYSIN    DD *                                                     
  INREC IFOUTLEN=19,                                               
  IFTHEN=(WHEN=INIT,BUILD=(1,4,X,5)),                               
  IFTHEN=(WHEN=GROUP,BEGIN=(6,3,CH,EQ,C'$$$'),PUSH=(5:ID=1)),       
  IFTHEN=(WHEN=(5,1,ZD,EQ,1),BUILD=(1,4,5,3,9,2,12,10)),           
  IFTHEN=(WHEN=(5,1,ZD,EQ,2),BUILD=(1,4,5,3,9,2,22,10)),           
  IFTHEN=(WHEN=(5,1,ZD,EQ,3),BUILD=(1,4,5,3,9,2,32,10)),           
  IFTHEN=(WHEN=(5,1,ZD,EQ,4),BUILD=(1,4,5,3,9,2,42,10))
             
  SORT FIELDS=(6,2,PD,A,8,2,CH,A,5,1,CH,A,10,10,CH,A)               
  SUM FIELDS=NONE                                                   
                                                                   
  OUTREC IFOUTLEN=13,                                               
  IFTHEN=(WHEN=(5,1,ZD,EQ,1,AND,6,2,CH,NE,C'$$'),                   
  BUILD=(1,4,6,2,PD,EDIT=(TTT),8,2,C'1000')),                       
  IFTHEN=(WHEN=(5,1,ZD,EQ,2,AND,6,2,CH,NE,C'$$'),                   
  BUILD=(1,4,6,2,PD,EDIT=(TTT),8,2,C'0100')),                       
  IFTHEN=(WHEN=(5,1,ZD,EQ,3,AND,6,2,CH,NE,C'$$'),                   
  BUILD=(1,4,6,2,PD,EDIT=(TTT),8,2,C'0010')),                       
  IFTHEN=(WHEN=(5,1,ZD,EQ,4,AND,6,2,CH,NE,C'$$'),                   
  BUILD=(1,4,6,2,PD,EDIT=(TTT),8,2,C'0001'))                       
                                                                   
  OUTFIL VTOF,REMOVECC,NODETAIL,BUILD=(80X),OMIT=(6,2,CH,EQ,C'$$'),
  SECTIONS=(5,5,SKIP=P,                                             
  HEADER3=(30:'SUMMARY REPORT ',56:'PAGE : ',PAGE,/,               
           50:'RUN DATE   : ',DATE=(4MD-),/,                       
           50:'RUN TIME   : ',TIME=(24:),/,                         
           50:'DATA MONTH : ',/),                                   
  TRAILER3=(5,3,X,8,2,X,                                           
            TOT=(10,1,ZD,M10,LENGTH=8),X,                           
            TOT=(11,1,ZD,M10,LENGTH=8),X,                           
            TOT=(12,1,ZD,M10,LENGTH=8),X,                           
            TOT=(13,1,ZD,M10,LENGTH=8),/))                         
//*

_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
mainframemouli
Beginner


Joined: 23 Feb 2005
Posts: 87
Topics: 33

PostPosted: Thu Dec 30, 2010 1:32 pm    Post subject: Reply with quote

Thanks Kolusu.
I will execute this and let you know the results.
Back to top
View user's profile Send private message
mainframemouli
Beginner


Joined: 23 Feb 2005
Posts: 87
Topics: 33

PostPosted: Tue Jan 04, 2011 2:17 pm    Post subject: Reply with quote

Thanks Kolusu , It worked.

Just did couple of modifications :
1. Replaced '//*' to '/*'
2. This change is for my requirement as below

Code:

 HEADER3=(25:'*** SUMMARY REPORT ',56:'PAGE : ',PAGE,/,               
          50:'RUN DATE   : ',DATE=(4MD-),/,                           
          50:'RUN TIME   : ',TIME=(24:),/,                           
          50:'                        ',/,                           
          04:'COL1     COL2       COL3    COL4     COL5     COL6',/, 
          04:'******  ******     ****    ****      ****     ****'),   
 TRAILER3=(4:5,3,5X,12:8,2,3X,                                       
           17:TOT=(10,1,ZD,M10,LENGTH=8),X,                           
           26:TOT=(11,1,ZD,M10,LENGTH=8),X,                           
           35:TOT=(12,1,ZD,M10,LENGTH=8),X,                           
           44:TOT=(13,1,ZD,M10,LENGTH=8),/))                         


3. Data Month may be a luxury to the requirement , because the same requirement should work for Weekly and monthly. For weekly it should Data week.

And finally thanks for all your help , It took me sometime understand the logic behind the code , it's very nice.
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12370
Topics: 75
Location: San Jose

PostPosted: Tue Jan 04, 2011 2:56 pm    Post subject: Reply with quote

mainframemouli wrote:
Thanks Kolusu , It worked.

Just did couple of modifications :
1. Replaced '//*' to '/*'


Mainframemouli,

Is that your shop standard?

mainframemouli wrote:

3. Data Month may be a luxury to the requirement , because the same requirement should work for Weekly and monthly. For weekly it should Data week.


You can get the month names(Jan - Dec) easily and the week range(Monday- Sunday dates) easily.
_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
mainframemouli
Beginner


Joined: 23 Feb 2005
Posts: 87
Topics: 33

PostPosted: Wed Jan 05, 2011 1:25 pm    Post subject: Reply with quote

Quote:

Is that your shop standard?


Yes , Kolusu , That's the shop standard.

Quote:

You can get the month names(Jan - Dec) easily and the week range(Monday- Sunday dates) easily.


What is required is
For Monthly : The Datamonth should be MM/YY (Like 01/11)
For weekly : The Dataweek should be Week Number for that year.
For example for 2011 Jan 1st week should be , WEEK 1
For example for 2011 Jan 2nd week should be , WEEK 2 etc...
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12370
Topics: 75
Location: San Jose

PostPosted: Wed Jan 05, 2011 2:06 pm    Post subject: Reply with quote

mainframemouli wrote:
What is required is
For Monthly : The Datamonth should be MM/YY (Like 01/11)
For weekly : The Dataweek should be Week Number for that year.
For example for 2011 Jan 1st week should be , WEEK 1
For example for 2011 Jan 2nd week should be , WEEK 2 etc...


mainframemouli,

Monthly part is easy. You can just add SYMNAMES DD name and use them in the Header3 like this
Code:

//SYMNAMES DD *
YNUM,S'&LYR2'
MNUM,S'&LMON'


and Header3 we will change this line to
Code:

50:'DATA MONTH : ',/),


to
Code:

50:'DATA MONTH : ',MNUM,'/',YNUM,/),


However the Week Number part is not that easy. There are different forms of calculating the week number depending on the region.

WEEK DB2 Scalar function

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DSNSQJ12/3.2.121

WEEK_ISO DB2 Scalar function

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DSNSQJ12/3.2.122

You can add a step to unload the Week number from SYSIBM.SYSDUMMY1 and create it as symbol which you can use it in the report.
Code:

SELECT CHAR('WEEK#,C')                       
      ,CHAR('''')                           
      ,CHAR(SMALLINT(WEEK(CURRENT DATE)))   
      ,CHAR('''')                           
      ,CHAR(' ',65)                         
  FROM SYSIBM.SYSDUMMY1;                     


This will create an 80 byte symbol as
Code:

WEEK#,C'2'


and you can use that in the header3 as

Code:

50:'DATA WEEK  : ',WEEK#,/),

_________________
Kolusu - DFSORT Development Team (IBM)
DFSORT is on the Web at:
www.ibm.com/storage/dfsort

www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities All times are GMT - 5 Hours
Goto page 1, 2  Next
Page 1 of 2

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


MVSFORUMS
Powered by phpBB © 2001, 2005 phpBB Group