MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 
 FAQFAQ   SearchSearch   Quick Manuals   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

How to count duplicate records

 
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities
View previous topic :: View next topic  
Author Message
chandra_vssv
Beginner


Joined: 08 May 2005
Posts: 26
Topics: 12
Location: pune

PostPosted: Thu Mar 09, 2006 8:52 am    Post subject: How to count duplicate records Reply with quote

Hi,

I have inputfile of rec length of 80.I have to count the total number of records as well as number of duplicate records.Could anybody please provide a solution either in syncsort or cobol program.

Input file:
Code:

======
CODE1
CODE1
CODE1
CODE2
CODE3
CODE3
CODE3


Outputfile:
Code:

=======
CODE1    3
CODE2    1
CODE3    3
TOTAL    7

_________________
Regards,
Chandu
Back to top
View user's profile Send private message Yahoo Messenger
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12369
Topics: 75
Location: San Jose

PostPosted: Thu Mar 09, 2006 9:14 am    Post subject: Reply with quote

chandra_vssv,

Try this job

Code:

//STEP0100 EXEC PGM=SORT       
//SYSOUT   DD SYSOUT=*         
//SORTIN   DD *               
CODE1                         
CODE1                         
CODE1                         
CODE2                         
CODE3                         
CODE3                         
CODE3                         
//SORTOUT  DD SYSOUT=*         
//SYSIN    DD *               
  SORT FIELDS=(1,5,CH,A)       
  OUTFIL REMOVECC,NODETAIL,   
  SECTIONS=(1,5,               
  TRAILER3=(1,5,2X,COUNT)),   
  TRAILER1=(C'TOTAL :',COUNT) 
/*                             


Hope this helps...

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
ynb
Beginner


Joined: 09 Mar 2006
Posts: 1
Topics: 0

PostPosted: Thu Mar 09, 2006 10:18 am    Post subject: Reply with quote

Hi,

In additon to the below query posed. I have a necessity to sum up the amounts of all duplicate records along with its occurences.Can anyone please provide me a solution.


Input file:
======
CODE1 200.0
CODE1 100.0
CODE1 300.0
CODE2 200.0
CODE3 100.0
CODE3 400.0
CODE3 200.0


Outputfile:
=======
CODE1 3 600.0
CODE2 1 200.0
CODE3 3 700.0
TOTAL 7
_________________
Thanks,
Basha
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12369
Topics: 75
Location: San Jose

PostPosted: Thu Mar 09, 2006 10:29 am    Post subject: Reply with quote

ynb,

Try this

Code:

//STEP0100 EXEC PGM=SORT                                   
//SYSOUT   DD SYSOUT=*                                     
//SORTIN   DD *                                             
CODE1 200.0                                                 
CODE1 100.0                                                 
CODE1 300.0                                                 
CODE2 200.0                                                 
CODE3 100.0                                                 
CODE3 400.0                                                 
CODE3 200.0                                                 
//SORTOUT  DD SYSOUT=*                                     
//SYSIN    DD *                                             
  SORT FIELDS=(1,5,CH,A)                                   
  OUTREC FIELDS=(1,6,7,3,11,1,80:X)                         
  OUTFIL REMOVECC,NODETAIL,                                 
  SECTIONS=(1,5,                                           
  TRAILER3=(1,5,2X,COUNT,TOT=(7,4,ZD,EDIT=(IIIIIIIT.T)))), 
  TRAILER1=(C'TOTAL :',COUNT,TOT=(7,4,ZD,EDIT=(IIIIIIIT.T)))
/*                                                         


Hope this helps...

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
hikernh
Beginner


Joined: 04 Apr 2006
Posts: 1
Topics: 0
Location: concord new hampshire

PostPosted: Tue Apr 04, 2006 10:24 am    Post subject: Reply with quote

what does removecc and nodetail mean. im trying to understand how that sort works exactly to perform the task.
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12369
Topics: 75
Location: San Jose

PostPosted: Tue Apr 04, 2006 10:47 am    Post subject: Reply with quote

Hikernh,

REMOVECC can be used to remove the ANSI control characters from a report.For ex: if you have 80 byte FB input dataset and when you use reporting features like header/trailer1, then sort automatically will create the output dataset with FBA and the LRECL would be 81. The extra 1 byte is for the ansi control character which is at the 1st byte.

Removecc suprress the creation of carriage control chararacter. So using it will make both the input and output have the same DCB parameters when you are using reporting features.

NODETAIL is used to Specify that data records are not to be written out for the reports produced for this OUTFIL group. With NODETAIL, the data records are completely processed with respect to input fields, statistics, counts, sections breaks, and so on, but are not written to the OUTFIL data set and are not included in line counts for determining the end of a page. You can use NODETAIL to summarize the data records without actually showing them

Hope this helps...

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
pzmohanty
Beginner


Joined: 20 May 2004
Posts: 97
Topics: 43
Location: hyderabad, India

PostPosted: Thu Apr 06, 2006 12:03 pm    Post subject: Reply with quote

hi hikernh,

here is another way of achieving the same goal.

Code:
//STEP1    EXEC PGM=SORT                                     
//SYSOUT   DD SYSOUT=*                                       
//SORTIN   DD *                                               
CODE1 200                                                     
CODE1 100                                                     
CODE1 300                                                     
CODE2 200                                                     
CODE3 100                                                     
CODE3 400                                                     
CODE3 200                                                     
//SORTOUT  DD SYSOUT=*                                       
//SYSIN    DD *                                               
 INREC FIELDS=(1,9,C'01')                                     
 SORT FIELDS=(1,5,CH,A)                                       
 SUM FIELDS=(7,3,ZD,10,2,ZD)                                 
 OUTREC FIELDS=(1,6,X,10,2,ZD,EDIT=(IT),X,7,3,ZD,EDIT=(IIT)) 
/*

_________________
Priya Ranjan Mohanty
Consultant
Kanbay Software (I) pvt. Ltd.
Hyderabad
Back to top
View user's profile Send private message Send e-mail Yahoo Messenger
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12369
Topics: 75
Location: San Jose

PostPosted: Thu Apr 06, 2006 12:08 pm    Post subject: Reply with quote

pzMohanty,

What happens if you have more than 10 duplicates per key? ex: run your job with this sample data
Code:

CODE1 200
CODE1 100
CODE1 300
CODE1 300
CODE1 300
CODE1 300
CODE1 300
CODE1 300
CODE1 300
CODE1 300
CODE1 300
CODE2 200
CODE3 100
CODE3 400
CODE3 200


Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
pzmohanty
Beginner


Joined: 20 May 2004
Posts: 97
Topics: 43
Location: hyderabad, India

PostPosted: Thu Apr 06, 2006 12:32 pm    Post subject: Reply with quote

Hi Kolusu,

Yeah , I got your point .

Now , I clearly understood why it is preferable to use TRAILER option instead of SUM FIELDS.

Thanks a lot
_________________
Priya Ranjan Mohanty
Consultant
Kanbay Software (I) pvt. Ltd.
Hyderabad
Back to top
View user's profile Send private message Send e-mail Yahoo Messenger
surendar
Beginner


Joined: 14 Apr 2006
Posts: 4
Topics: 1
Location: chennai

PostPosted: Fri Apr 14, 2006 5:13 am    Post subject: SUM of Duplicate fields Reply with quote

Hi,

TRAILER3=(1,5,2X,COUNT,TOT=(7,4,ZD,EDIT=(IIIIIIIT.T)))),
TRAILER1=(C'TOTAL :',COUNT,TOT=(7,4,ZD,EDIT=(IIIIIIIT.T)))


Can u guys please let me know How exactly this works ( An Overview)
Am bit confussed.
Whatz d difference in using SUMFILEDS and TRAILER option
Whatz iiiii.t??

Surendar Shocked
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12369
Topics: 75
Location: San Jose

PostPosted: Fri Apr 14, 2006 5:34 am    Post subject: Reply with quote

surendar,

Check this link to "z/OS DFSORT Application Programming Guide" which explains in detail about the trailer parm on OUTFIL

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ICE1CA10/3.13?DT=20050222160456

Quote:

Whatz d difference in using SUMFILEDS and TRAILER option
Whatz iiiii.t??


The difference is that when using sum fields you are restricted by the size of the summing field. In the above example mohanty is summing on 3 bytes and the max number it can hold is 999. Once the total is more than that sort writes it as a seperate record. So you would get incorrect results.

However with trailer option it would expand the field internally and does the summation for you and there by giving you the correct results.

If you want to understand more run the jobs shown by mohanty with the test data I supplied in this topic.

Hope this helps...

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
surendar
Beginner


Joined: 14 Apr 2006
Posts: 4
Topics: 1
Location: chennai

PostPosted: Sun Apr 16, 2006 1:45 am    Post subject: Reply with quote

Thanks Kolusu... Laughing
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities All times are GMT - 5 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


MVSFORUMS
Powered by phpBB © 2001, 2005 phpBB Group