MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 
 FAQFAQ   SearchSearch   Quick Manuals   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Removing duplicate records based on the timestamp

 
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities
View previous topic :: View next topic  
Author Message
yrcreddy
Beginner


Joined: 28 Dec 2005
Posts: 27
Topics: 12

PostPosted: Mon Aug 21, 2006 7:59 am    Post subject: Removing duplicate records based on the timestamp Reply with quote

Hi,

I have an input file which contains policyno,versionno,timestamp and some otherdata.The length of the input file is 1200 and is a fixed lenth formate.

The layout of the input file is:
PolicyNo ---x(10)--1 to 10
VersionNo --x(05)--11 to 15
TimeStamp ---x(26)--16 to 42
Some other data---------

I need to extract data based on the timestamp.


The input file records are as follow:
Code:

PolicyNo   VersionNo   Timestamp                some data
--------   ---------   ---------                ---------
Q12345        1       2006-08-19-11.44.43.36
Q34567        2       2006-05-02-11.44.13.23       
Q45678        3       2005-07-29-11.23.34.16       
Q78901        4       2005-07-29-11.23.34.16
Q01234        5       2005-07-29-11.23.34.16
Q98012        6       2005-08-29-11.23.34.16   
Q12345        1       2006-09-29-11.14.42.23
Q13456        1       2005-09-17-11.23.34.16       
Q45678       12       2005-09-17-11.23.34.16
Q98075       11       2005-09-17-11.23.34.16
Q78901        4       2005-09-20-11.23.34.16


There are two policies(Q12345& Q789014) repeted twice but the time stamp is differeant.I need to extract latest timestamp policy.
The order should not be changed.


The desired output is:
Code:

PolicyNo   VersionNo   Timestamp                some data
--------   ---------   ---------                ---------
Q34567         2       2006-05-02-11.44.13.23       
Q45678         3       2005-07-29-11.23.34.16       
Q78901         4       2005-07-29-11.23.34.16   
Q01234         5       2005-07-29-11.23.34.16
Q98012         6       2005-08-29-11.23.34.16   
Q12345         1       2006-09-29-11.14.42.23
Q13456         1       2005-09-17-11.23.34.16       
Q45678        12       2005-09-17-11.23.34.16
Q98075        11       2005-09-17-11.23.34.16
Q78901         4       2005-09-20-11.23.34.16
 


Can you please help me is there any way to do this by using DFSORT.
_________________
Thanks,
Ramachandra Reddy
Back to top
View user's profile Send private message
shekar123
Advanced


Joined: 22 Jul 2005
Posts: 528
Topics: 90
Location: Bangalore India

PostPosted: Mon Aug 21, 2006 8:52 am    Post subject: Reply with quote

ycreddy,

Is't that your output should be
Code:

Q34567 2  2006-05-02-11.44.13.23
Q45678 3  2005-07-29-11.23.34.16
Q01234 5  2005-07-29-11.23.34.16
Q98012 6  2005-08-29-11.23.34.16
Q12345 1  2006-09-29-11.14.42.23
Q13456 1  2005-09-17-11.23.34.16
Q45678 12 2005-09-17-11.23.34.16
Q98075 11 2005-09-17-11.23.34.16
Q78901 4  2005-09-20-11.23.34.16

instead of
Code:

Q34567 2  2006-05-02-11.44.13.23
Q45678 3  2005-07-29-11.23.34.16
Q78901 4  2005-07-29-11.23.34.16
Q01234 5  2005-07-29-11.23.34.16
Q98012 6  2005-08-29-11.23.34.16
Q12345 1  2006-09-29-11.14.42.23
Q13456 1  2005-09-17-11.23.34.16
Q45678 12 2005-09-17-11.23.34.16
Q98075 11 2005-09-17-11.23.34.16
Q78901 4  2005-09-20-11.23.34.16

_________________
Shekar
Grow Technically
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12375
Topics: 75
Location: San Jose

PostPosted: Mon Aug 21, 2006 10:06 am    Post subject: Reply with quote

yrcreddy,

try this job

Code:

//STEP0100 EXEC PGM=ICETOOL                           
//TOOLMSG   DD SYSOUT=*                               
//DFSMSG    DD SYSOUT=*                               
//IN        DD *                                       
Q12345        12006-08-19-11.44.43.36                 
Q34567        22006-05-02-11.44.13.23                 
Q45678        32005-07-29-11.23.34.16                 
Q78901        42005-07-29-11.23.34.16                 
Q01234        52005-07-29-11.23.34.16                 
Q98012        62005-08-29-11.23.34.16                 
Q12345        12006-09-29-11.14.42.23                 
Q13456        12005-09-17-11.23.34.16                 
Q45678       122005-09-17-11.23.34.16                 
Q98075       112005-09-17-11.23.34.16                 
Q78901        42005-09-20-11.23.34.16                 
//TOOLIN    DD *                                       
  SELECT FROM(IN) TO(OUT) ON(1,10,CH) FIRST USING(CTL1)
//OUT       DD SYSOUT=*                               
//CTL1CNTL  DD *                                       
  SORT FIELDS=(01,10,CH,A,                             
               16,26,CH,D)                             
/*


Hope this helps...

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
yrcreddy
Beginner


Joined: 28 Dec 2005
Posts: 27
Topics: 12

PostPosted: Mon Aug 21, 2006 9:57 pm    Post subject: Reply with quote

Hi kolusu,


Thank you very much for your response.The records order is changed accoding to the above jcl.In my desired output the records order should not be changed.

One more thing i forgot to specify,i want to extract only the records which are having same policy,version and diferrent timestamp.

The output of the above job is:

Q01234 5 2005-07-29-11.23.34.16
Q12345 1 2006-09-29-11.14.42.23
Q13456 1 2005-09-17-11.23.34.16
Q34567 2 2006-05-02-11.44.13.23
Q45678 12 2005-09-17-11.23.34.16
Q78901 4 2005-09-20-11.23.34.16
Q98012 6 2005-08-29-11.23.34.16
Q98075 11 2005-09-17-11.23.34.16


Desired output is:
Q34567 2 2006-05-02-11.44.13.23
Q45678 3 2005-07-29-11.23.34.16
Q78901 4 2005-07-29-11.23.34.16
Q01234 5 2005-07-29-11.23.34.16
Q98012 6 2005-08-29-11.23.34.16
Q12345 1 2006-09-29-11.14.42.23
Q13456 1 2005-09-17-11.23.34.16
Q45678 12 2005-09-17-11.23.34.16
Q98075 11 2005-09-17-11.23.34.16
Q78901 4 2005-09-20-11.23.34.16

For example:
In my first post there are two policies(Q12345&Q78901) having same version no with diferrent timestamp.The policy Q45678 has repeted twice with diferrent version no.i want to retain these records but the order should not be changed.


Can you please help me how to do it.
_________________
Thanks,
Ramachandra Reddy
Back to top
View user's profile Send private message
coolman
Intermediate


Joined: 03 Jan 2003
Posts: 283
Topics: 27
Location: US

PostPosted: Tue Aug 22, 2006 7:56 am    Post subject: Reply with quote

yrcreddy,

Your inputs and outputs are inconsistent - Take, Q12345, there are 2 records with version nos 1 and 1 - You have retained the record with has the latest timestamp 2006-09-29-11.14.42.23 whereas for Q78901 you have both the records in the output even though they have the same version

Please post a correct set of inputs and outputs inorder for us to suggest you a solution.
________
Vittorio Jano


Last edited by coolman on Sat Feb 05, 2011 1:51 am; edited 1 time in total
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12375
Topics: 75
Location: San Jose

PostPosted: Tue Aug 22, 2006 8:08 am    Post subject: Reply with quote

ycrreddy,

Try this job. I assumed that your input is 80 bytes LRECL and is of FB recfm. However your specs are confusing. You say you want to pick the latest timestamped record and yet you say the order should not be changed. This does not work for Policy no: Q78901 as it has 2 records with the same version no (record # 4 & 11) .record # 11 has the latest timestamp. Now if we are to retain the order then Q78901 will be at the bottom where as you show it as record #4.

Code:

//STEP0100 EXEC PGM=SORT                                 
//SYSOUT   DD SYSOUT=*                                   
//SORTIN   DD *                                         
Q12345        12006-08-19-11.44.43.36                   
Q34567        22006-05-02-11.44.13.23                   
Q45678        32005-07-29-11.23.34.16                   
Q78901        42005-07-29-11.23.34.16                   
Q01234        52005-07-29-11.23.34.16                   
Q98012        62005-08-29-11.23.34.16                   
Q12345        12006-09-29-11.14.42.23                   
Q13456        12005-09-17-11.23.34.16                   
Q45678       122005-09-17-11.23.34.16                   
Q98075       112005-09-17-11.23.34.16                   
Q78901        42005-09-20-11.23.34.16                   
//SORTOUT  DD DSN=&T1,DISP=(,PASS),SPACE=(CYL,(1,1),RLSE)
//SYSIN    DD *                                         
  INREC FIELDS=(01,80,SEQNUM,8,ZD)                       
  SORT FIELDS=(01,10,CH,A,                               
               11,05,CH,A,                               
               16,26,CH,A)                               
  OUTFIL REMOVECC,NODETAIL,                             
  SECTIONS=(1,15,TRAILER3=(01,88))                       
//*                                                     
//STEP0200 EXEC PGM=SORT                                 
//SYSOUT   DD SYSOUT=*                                   
//SORTIN   DD DSN=&T1,DISP=SHR                           
//SORTOUT  DD SYSOUT=*                                   
//SYSIN    DD *                                         
  SORT FIELDS=(81,08,CH,A)                               
  OUTREC FIELDS=(01,80)                                 
/*


Hope this helps...

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
yrcreddy
Beginner


Joined: 28 Dec 2005
Posts: 27
Topics: 12

PostPosted: Tue Aug 22, 2006 11:58 pm    Post subject: Reply with quote

Thank you very much kolusu.
This jcl is working for me what i expected.
_________________
Thanks,
Ramachandra Reddy
Back to top
View user's profile Send private message
jayram99
Beginner


Joined: 16 Aug 2004
Posts: 52
Topics: 21
Location: falls church.va,usa

PostPosted: Tue Oct 10, 2006 2:56 pm    Post subject: Reply with quote

Thanks Kolusu,

I have the following question, How do we get the following result.

1. the order of the policy number should not be changed and the record should be latest record based on timestamp and also removing the duplicates.

2. The order of the policy number should be in ascending order and the record should be latest record based on timestamp and also removing the duplicates.

Thanks in advance.
Back to top
View user's profile Send private message Yahoo Messenger MSN Messenger
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12375
Topics: 75
Location: San Jose

PostPosted: Tue Oct 10, 2006 3:01 pm    Post subject: Reply with quote

jayram99,

Did you try the job posted here in this topic ?

http://www.mvsforums.com/helpboards/viewtopic.php?p=33370#33370

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
jayram99
Beginner


Joined: 16 Aug 2004
Posts: 52
Topics: 21
Location: falls church.va,usa

PostPosted: Tue Oct 10, 2006 3:16 pm    Post subject: Reply with quote

Thanks Kolusu,

We can't use the ICETOOL utility but i got the result by using SYNCSORT utility by sorting on Policy number and Timestamp and in the next step removed the duplicates.

Thanks again.
Back to top
View user's profile Send private message Yahoo Messenger MSN Messenger
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12375
Topics: 75
Location: San Jose

PostPosted: Tue Oct 10, 2006 3:37 pm    Post subject: Reply with quote

jayram99,

run the job by changing pgm=icetool to pgm=synctool and see if it works

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12375
Topics: 75
Location: San Jose

PostPosted: Tue Oct 10, 2006 3:43 pm    Post subject: Reply with quote

Jayaram99,

If you still get an error try this job.

Code:

//STEP0100 EXEC PGM=SORT                 
//SYSOUT    DD SYSOUT=*                 
//SORTIN    DD *                         
Q12345        12006-08-19-11.44.43.36   
Q34567        22006-05-02-11.44.13.23   
Q45678        32005-07-29-11.23.34.16   
Q78901        42005-07-29-11.23.34.16   
Q01234        52005-07-29-11.23.34.16   
Q98012        62005-08-29-11.23.34.16   
Q12345        12006-09-29-11.14.42.23   
Q13456        12005-09-17-11.23.34.16   
Q45678       122005-09-17-11.23.34.16   
Q98075       112005-09-17-11.23.34.16   
Q78901        42005-09-20-11.23.34.16   
//SORTOUT   DD SYSOUT=*                 
//SYSIN     DD *                         
  SORT FIELDS=(01,10,CH,A,               
               16,26,CH,D)               
                                         
  OUTFIL REMOVECC,NODETAIL,             
  SECTIONS=(01,10,                       
  HEADER3=(01,80))                       
/*   


Hope this helps...

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities All times are GMT - 5 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


MVSFORUMS
Powered by phpBB © 2001, 2005 phpBB Group