MVSFORUMS.com Forum Index MVSFORUMS.com
A Community of and for MVS Professionals
 
 FAQFAQ   SearchSearch   Quick Manuals   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Extracting data using SORT

 
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities
View previous topic :: View next topic  
Author Message
sri50131
Beginner


Joined: 07 Oct 2004
Posts: 38
Topics: 15

PostPosted: Fri Dec 17, 2004 3:27 pm    Post subject: Extracting data using SORT Reply with quote

Hello,

I have a very large tape dataset that contains more than a million policies. I need to select 500 policies from this dataset. The problem that I have is a policy can have more than 1 occurance in the file. Is it possible to write a SORT to extract 500 policies not missing any occurance of any policy.

Thanks.
Back to top
View user's profile Send private message
Frank Yaeger
Sort Forum Moderator
Sort Forum Moderator


Joined: 02 Dec 2002
Posts: 1618
Topics: 31
Location: San Jose

PostPosted: Fri Dec 17, 2004 3:44 pm    Post subject: Reply with quote

You need to supply more information about what you're doing.

Are the 500 policy numbers you want to select in a different file? What is the starting position, length and format of the policy numbers in this select file?

What is the starting position, length and format of the policy numbers in the main file?

What is the RECFM and LRECL of the select file and the main file?

If the 500 policy numbers are not in a file, then you need to describe where they come from.
_________________
Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Back to top
View user's profile Send private message Send e-mail Visit poster's website
sri50131
Beginner


Joined: 07 Oct 2004
Posts: 38
Topics: 15

PostPosted: Fri Dec 17, 2004 3:56 pm    Post subject: Reply with quote

Frank, I am sorry for not giving the relevant information.

The 500 policies are in the input file,which is the main file. Policy no is X(10) and starts at position is 15. The format of the file is VB, LRECL is 3964.

Thanks.
Back to top
View user's profile Send private message
Frank Yaeger
Sort Forum Moderator
Sort Forum Moderator


Joined: 02 Dec 2002
Posts: 1618
Topics: 31
Location: San Jose

PostPosted: Fri Dec 17, 2004 4:25 pm    Post subject: Reply with quote

Huh? You said "I have a very large tape dataset that contains more than a million policies. I need to select 500 policies from this dataset.". So you have a main file with a million policies - that's the input file. What I'm asking is where you have the 500 policy numbers that you want to select. They have to be somewhere else other than the input file since the input file has a million policies. Are you going to hardcode the 500 policy numbers by hand or are you going to get them from somewhere like another file (a select file) or what?

At any rate, since the policy number a 10-byte CH field, you could actually use a DSORT INCLUDE statement with the 500 policy numbers as constants, something like:

Code:

   INCLUDE FORMAT=CH,
      COND=(19,10,EQ,C'0000000001',OR,
                  19,10,EQ,C'0000000004',OR,
                  ...
                  19,10,EQ,C'0000000800')


That would get you all of the occurrences for each policy number you code a constant for.

You could hardcode the constants or you could generate them dynamically from a file.

Note: Since your input file is VB, I assumed when you said the policy number starts at 15, that was without the RDW in positions 1-4, so I used 19 for the starting position. If 15 really did include the RDW, then use 15 instead of 19.
_________________
Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Back to top
View user's profile Send private message Send e-mail Visit poster's website
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12377
Topics: 75
Location: San Jose

PostPosted: Fri Dec 17, 2004 5:58 pm    Post subject: Reply with quote

Frank,

It seems Sri50131 has just one input file and he wants to extract the records for 500 policies from the million policies. How ever he cannot hard code the police numbers because he does not know them. As you mentioned he needs to generate the dynamic include cond's from the input file and then use them in the next pass.

Sri50131,

Here is an untested version of DFSORT/ICETOOL JCL for extracting the policy numbers from the file. If you have syncsort at your shop then change the pgm name to synctool. I assuming that your input file is sorted on the policy number which is at pos 19 ( 4 bytes of rdw).

First we add a seqnum to the file and create a report with the seqnum(last record for each policy) using the sections and trailer3 parms.

We take this file and now create a dynamic stopaft card by just reading the 500 th record.

We use that dynamic stopaft condition to extract the 500 policies from the input file.
Code:

//STEP0100 EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN       DD DSN=YOUR INPUT VB FILE,
//            DISP=SHR
//T1       DD DSN=&T1,DISP=(,PASS),SPACE=(CYL,(X,Y),RLSE)
//CTL3OUT  DD DSN=YOUR OUTPUT 500 POLICY FILE,
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,
//            SPACE=(CYL,(X,Y),RLSE)
//TOOLIN   DD *
  COPY FROM(IN) USING(CTL1)
  COPY FROM(T1) USING(CTL2)
  COPY FROM(IN) USING(CTL3)
//CTL1CNTL DD *
  INREC FIELDS=(01,04,        $ RDW
                19,10         $ POLICY NUMBER   
                SEQNUM,8,ZD)  $ SEQ NUMBER 
  OUTFIL CONVERT,REMOVECC,NODETAIL,     
  SECTIONS=(5,10,                       
  TRAILER3=(15,8))
//CTL2CNTL DD *
  OUTFIL FNAMES=CTL3CNTL,STARTREC=500,ENDREC=500,
  OUTREC=(C' OPTION STOPAFT=',1,8,80:X)
//CTL3CNTL DD DSN=&C1,DISP=(,PASS),SPACE=(TRK,(1,1),RLSE)
/*


Hope this helps...

Cheers

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Frank Yaeger
Sort Forum Moderator
Sort Forum Moderator


Joined: 02 Dec 2002
Posts: 1618
Topics: 31
Location: San Jose

PostPosted: Fri Dec 17, 2004 7:25 pm    Post subject: Reply with quote

Kolusu,

I can't figure out what you're trying to do or which 500 policies you're trying to select or why. There are several bugs in your job (well, you did say it was untested). I tried to fix them and finally got a RC=0, but CTL3CNTL didn't have an OPTION STOPAFT=n record, so the output was the same million records as the input.

Quote:
It seems Sri50131 has just one input file and he wants to extract the records for 500 policies from the million policies. How ever he cannot hard code the police numbers because he does not know them. As you mentioned he needs to generate the dynamic include cond's from the input file and then use them in the next pass.


How can you generate something dynamic if you don't know what you want to generate? What "rule" are you trying to use to select the 500 policies for the INCLUDE condition? Are you thinking it has something to do with duplicates or what? Apparently, you're able to get a lot more meaning from sri50131's posts then I am. I'm really baffled here.
_________________
Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Phantom
Data Mgmt Moderator
Data Mgmt Moderator


Joined: 07 Jan 2003
Posts: 1056
Topics: 91
Location: The Blue Planet

PostPosted: Fri Dec 17, 2004 11:56 pm    Post subject: Reply with quote

sri50131,

Please provide us clear information. You are making ppl guess.

Code:

The 500 policies are in the input file,which is the main file. Policy no is X(10) and starts at position is 15. The format of the file is VB, LRECL is 3964.


OK. Now, the main file contains millions of policies and the 500 you are referring to is a subset. Now HOW do u know which 500 policies you need to extract from the million records file ? Is it going to be the first 500 / last 500 / or do u have the list of 500 policies in a separate file ?

Also, tell us what SORT product you have. DFSORT / SYNCSORT.

If u have a list of 500 policies in a separate file and u have SYNCSORT, then u can't build a dynamic sort control card to INCLUDE the list of policies since SYNCSORT (as far as my knowledge) can support not more than 220 / 250 records in the dynamic control card. (KOLUSU Please correct me if I'm wrong. or does the LATER/LATEST version of syncsort supports more than 250 records ?)

Thanks,
Phantom
Back to top
View user's profile Send private message
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12377
Topics: 75
Location: San Jose

PostPosted: Sat Dec 18, 2004 8:31 am    Post subject: Reply with quote

Quote:

How can you generate something dynamic if you don't know what you want to generate? What "rule" are you trying to use to select the 500 policies for the INCLUDE condition? Are you thinking it has something to do with duplicates or what? Apparently, you're able to get a lot more meaning from sri50131's posts then I am. I'm really baffled here.


Frank,

Let me explain with an example. Sri said he had a file with million policies and he wants to pick up only 500 policies from the file. How ever he does not know them to hard code them as include conditions. He also cannot use the endrec and startrec features as the policies can have duplicates.

sample input file . Here the file has 5 polocies. Now I want to pick only 3 polocies(let's assume the first 3 policies.)

Code:

POLCIY -1
POLCIY -1
POLCIY -2
POLCIY -3
POLCIY -3
POLCIY -3
POLCIY -4
POLCIY -5
POLCIY -5
POLCIY -5


One way is to generate dynamic include cond cards and retrieve the desired records. But for syncsort you will get an excess cards error if the no: of control cards exceed 300 lines.

Now the other way is to get the record no: of the last record for each policy. This is where I use sections along with trailer3 to get the last record no: for each policy.

Now take this file and read just the 3rd record and create a dynamic stopaft card.

Code:

//STEP0100 EXEC PGM=ICETOOL                     
//TOOLMSG  DD SYSOUT=*                           
//DFSMSG   DD SYSOUT=*                           
//IN       DD *                                 
POLCIY -1                                       
POLCIY -1                                       
POLCIY -2                                       
POLCIY -3                                       
POLCIY -3                                       
POLCIY -3                                       
POLCIY -4                                       
POLCIY -5                                       
POLCIY -5                                       
POLCIY -5                                       
//T1       DD DSN=&T1,DISP=(,PASS),SPACE=(CYL,(X,Y),RLSE)
//CTL3CNTL DD DSN=&C1,DISP=(,PASS),SPACE=(TRK,(1,1),RLSE)
//CTL3OUT  DD SYSOUT=*                           
//TOOLIN   DD *                                 
  COPY FROM(IN) USING(CTL1)                     
  COPY FROM(T1) USING(CTL2)                     
  COPY FROM(IN) USING(CTL3)                     
//CTL1CNTL DD *                                 
  OUTREC FIELDS=(1,10,SEQNUM,8,ZD,80:X)         
  OUTFIL FNAMES=T1,REMOVECC,NODETAIL,           
  SECTIONS=(1,10,                               
  TRAILER3=(11,8))                               
//CTL2CNTL DD *                                 
  OUTFIL FNAMES=CTL3CNTL,STARTREC=3,ENDREC=3,   
  OUTREC=(C' OPTION STOPAFT=',1,8,80:X)         
/*


The output from the above job is

Code:

POLCIY -1
POLCIY -1
POLCIY -2
POLCIY -3
POLCIY -3
POLCIY -3



Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
kolusu
Site Admin
Site Admin


Joined: 26 Nov 2002
Posts: 12377
Topics: 75
Location: San Jose

PostPosted: Sat Dec 18, 2004 8:36 am    Post subject: Reply with quote

Quote:

KOLUSU Please correct me if I'm wrong. or does the LATER/LATEST version of syncsort supports more than 250 records ?)


Phantom,

You can have a max of 300 sysin cards for syncsort. That is the default installation limit. However you can override the mincore parameter and increase it to 500 , however I had trouble overriding it

Kolusu
_________________
Kolusu
www.linkedin.com/in/kolusu
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Frank Yaeger
Sort Forum Moderator
Sort Forum Moderator


Joined: 02 Dec 2002
Posts: 1618
Topics: 31
Location: San Jose

PostPosted: Sat Dec 18, 2004 11:25 am    Post subject: Reply with quote

Quote:
Now I want to pick only 3 polocies(let's assume the first 3 policies.)


Kolusu,

But what if he wants the first, fourth and fifth policies or the second, third and fifth policies. My point is that unless we know how he's picking the policies, we can't really come up with a good solution. We can certainly come up with any number of solutions based on various assumptions, but we don't know if our assumptions are correct. I thought you were somehow figuring out which policies he wanted based on what he said. That's what threw me. I think we still need to wait for him to tell us how he's picking the policies before offering a solution.

And if Syncsort has limitations that DFSORT doesn't have, then we also need to know which product he's using to come up with the best solution for him.

For the record, DFSORT can handle about 1629 INCLUDE conditions in a generated INCLUDE statement like this:

Code:

     INCLUDE FORMAT=CH,COND=(1,1,NE,1,1,OR, 
           19,10,EQ,C'0000000001',OR,             
           19,10,EQ,C'0000000003',OR,             
           ...
           19,10,EQ,C'0000003257',OR,   
           1,1,NE,1,1)               


The limiting factor is the number of conditions based on the length and format of the fields and constants. The number of records for the control statement is not a limiting factor for DFSORT.
_________________
Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Frank Yaeger
Sort Forum Moderator
Sort Forum Moderator


Joined: 02 Dec 2002
Posts: 1618
Topics: 31
Location: San Jose

PostPosted: Sat Dec 18, 2004 12:06 pm    Post subject: Reply with quote

I should mention that with DFSORT's new IFTHEN clauses in INREC, OUTREC and OUTFIL, the limit for the number of ORed conditions is now much, much higher, because you can have multiple IFTHEN clauses each with a WHEN parameter that has the limit for the number of conditions and an OVERLAY or BUILD parameter that sets a flag if any of the conditions are true. Then you can have an OUTFIL INCLUDE that keeps the record only if the true flag is set. Here's a DFSORT job that shows the idea:

Code:

//MULT  EXEC  PGM=ICEMAN
//SYSOUT    DD  SYSOUT=*
//SORTIN DD DSN=...   (FB/80)
//SORTOUT DD DSN=...
//SYSIN    DD    *
  OPTION COPY
* If a1,OR,a2,OR,...,an is true,
* set flag in 81 to 'Y'.
  INREC IFTHEN=(WHEN=(a1,OR,a2,OR,...,an),
          OVERLAY=(81:C'Y')),
* If b1,OR,b2,OR,...,bn is true,
* set flag in 81 to 'Y'.
     IFTHEN=(WHEN=(b1,OR,b2,OR,...,bn),
       OVERLAY=(81:C'Y')),
* If none of the conditions is true,
* set flag in 81 to 'N'.
     IFTHEN=(WHEN=NONE,OVERLAY=(81:C'N'))
* If the flag in 81 is 'Y', include the
* records.
  OUTFIL INCLUDE=(81,1,CH,EQ,C'Y'),
* Remove the flag byte.
    OUTREC=(1,80)
/*


To use IFTHEN, you'll need z/OS DFSORT V1R5 PTF UQ95214 or DFSORT R14 PTF UQ95213 (Dec, 2004). For complete information on the new DFSORT and ICETOOL functions available with these PTFs, see:

http://www.ibm.com/servers/storage/support/software/sort/mvs/pdug/
_________________
Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Phantom
Data Mgmt Moderator
Data Mgmt Moderator


Joined: 07 Jan 2003
Posts: 1056
Topics: 91
Location: The Blue Planet

PostPosted: Sun Dec 19, 2004 1:27 am    Post subject: Reply with quote

sri50131,

Please clarify your question. Already 11 replies have been posted for your thread without EXACTLY knowing what you are upto.

Kolusu,
Code:

You can have a max of 300 sysin cards for syncsort


I'm sitting at home now, so can't test this. But I remember getting the EXCESS CARDS error when my control card had around 220 - 230 records itself (with just 1 condition per line). Anyway, I'll try and get back to you. Now, I'have a question on this. Is this a limitation on the no. of lines in the control card or the no. of OR/AND conditions ? If it is just a limitation on the no. of lines, what will happen if I code like this :

Code:

  INCLUDE COND=(1,5,SS,EQ,C'ABCDE,12345,FGHIJ,67890,KLMNO',
                1,5,SS,EQ,C'15435,PQRST,38274,UVWXY,83940',
                .......,
                1,5,SS,EQ,C'ZHEJD,93624,XXXRE,34254,98394')   - 200th record


Here I used 'SS' operator to include more than 1 condition per record and I have 200 lines of such conditions. Will this work ?

I have one more question, when I use 'SS' do I have to give '1,5,SS,EQ' in every line or can I continue the operands on more than one line ? If yes, how ? (I'll try this in office tomorrow, but if you read my message today, kindly reply).

Thanks,
Phantom
Back to top
View user's profile Send private message
Frank Yaeger
Sort Forum Moderator
Sort Forum Moderator


Joined: 02 Dec 2002
Posts: 1618
Topics: 31
Location: San Jose

PostPosted: Sun Dec 19, 2004 11:13 am    Post subject: Reply with quote

Well, I can't help you with Syncsort's limitations, but I can tell you about SS in INCLUDE for DFSORT:

Code:

  INCLUDE COND=(1,5,SS,EQ,C'ABCDE,12345,FGHIJ,67890,KLMNO',
                1,5,SS,EQ,C'15435,PQRST,38274,UVWXY,83940',
                .......,
                1,5,SS,EQ,C'ZHEJD,93624,XXXRE,34254,98394')   - 200th record


This is invalid syntax. You need an OR between each condition, i.e.


Code:

  INCLUDE COND=(1,5,SS,EQ,C'ABCDE,12345,FGHIJ,67890,KLMNO',OR,
                1,5,SS,EQ,C'15435,PQRST,38274,UVWXY,83940',OR,
                .......,
                1,5,SS,EQ,C'ZHEJD,93624,XXXRE,34254,98394')   - 200th record


For DFSORT, each C'string' constant can have up to 256 characters and you can continue the string across lines in the usual way (code the string up to cc 71, put an * in cc 72 and continue the string on the next line anywhere between cc2 and cc16). You can also use FORMAT=SS,COND=(1,5,EQ,C'...',OR,...) to avoid coding SS every time.
_________________
Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic   printer-friendly view    MVSFORUMS.com Forum Index -> Utilities All times are GMT - 5 Hours
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


MVSFORUMS
Powered by phpBB © 2001, 2005 phpBB Group