Posted: Fri Oct 19, 2007 1:45 pm Post subject: Check for the records in same file
Hi guys,
I would like to know whether we can check, for a particular record, whether its corresponding record is present in the file.
Say my input file is:
The key here is first 7 bytes.
REC1: 1234567abcd .....sth sth....
Normally, for each record in the input file that has "abcd" in bytes 8-11, there is another record with the same key and "efgh" in bytes 8-11.
That is the normal case. I want to find the records for which it does not hold.
Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
Posted: Fri Oct 19, 2007 2:33 pm Post subject:
Your description of your requirement is not all that clear. It would have helped if you'd shown an example of your input records and what you expect for output. However, I'm guessing from what you've said that you want the output to contain any records that are not duplicates. If so, you can use a DFSORT/ICETOOL job like this:
If that's not what you want, then you need to explain more clearly what you do want with a good example of input and output. _________________ Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort
That means every record in the input file with "abcd" in bytes 8-11 will have another record (with the same key) with "efgh" in bytes 8-11. In this case, records 1 and 3 satisfy this, so there is no problem.
Similarly, for a record with "pqrs", there has to be another record with "stuv" (with the same key). In this case it's not there, so this record has to go to the output file. If there were one more record like the one below, there would be no problem.
Code:
2345678stuv...sth...sth.......
So, I want to find all such records whose corresponding records are not present.
FYI, there are only a few such combinations (say 5 to 6), like the ones below:
Code:
abcd efgh
pqrs stuv
...
I hope I am clearer this time.
Thanks a lot for your time. _________________ Thanks.
Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
Posted: Fri Oct 19, 2007 3:41 pm Post subject:
Please list all of the combinations.
Can there be records without any of the values in the combinations? For example, if there were only the two combinations you showed, would every record have abcd, pqrs, efgh or stuv, or could some records have another value like aaaa? If so, what would you want to do with those other records?
Also, what is the RECFM and LRECL of the input file?
It would help if you showed a better example of input records and expected output records with more of the possible variations. _________________ Frank Yaeger - DFSORT Development Team (IBM)
Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
Posted: Fri Oct 19, 2007 5:16 pm Post subject:
For the simple input case you show, an ICETOOL SELECT with NODUPS operator would get you the output you show. I suspect that won't work because there's more to it than that.
Quote:
It would help if you showed a better example of input records and expected output records with more of the possible variations.
I asked for that because I don't think you're covering all of the variations.
Example: would these be possible input records (same key but abcd and stuv instead of abcd and efgh or pqrs and stuv)? If so, would you want them both in the output file?
Another example: Can there be more than 2 records with the same key? What would those variations look like and what would you want for output? _________________ Frank Yaeger - DFSORT Development Team (IBM)
Code:
//GETMATCH EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN1      DD *
1111111ABCD HAS MATCH
1111111EFGH HAS MATCH
1111111HIJK HAS MATCH
2222222PQRS NO MATCH
1111111LMNO HAS MATCH
/*
//OUT      DD SYSOUT=*
//TMP1     DD DSN=&&TEMP1,DISP=(MOD,PASS),SPACE=(TRK,(5,5)),UNIT=SYSDA
//TOOLIN   DD *
  SELECT FROM(IN1) TO(OUT) ON(1,7,CH) ON(81,1,CH) NODUPS -
  USING(CP01)
/*
//CP01CNTL DD *
  INREC IFTHEN=(WHEN=(8,4,CH,EQ,C'ABCD',|,8,4,CH,EQ,C'EFGH'),
                OVERLAY=(81:C'1')),
        IFTHEN=(WHEN=(8,4,CH,EQ,C'HIJK',|,8,4,CH,EQ,C'LMNO'),
                OVERLAY=(81:C'2')),
        IFTHEN=(WHEN=(8,4,CH,EQ,C'PQRS',|,8,4,CH,EQ,C'TUVW'),
                OVERLAY=(81:C'3'))
  OUTFIL FNAMES=OUT,BUILD=(1,80)
/*
OUT contains:
Code:
2222222PQRS NO MATCH
In the above example if the record
Code:
2222222PQRS
had another record like
Code:
2222222TUVW
then you wouldn't want any records in the o/p, right?
If this is not what you are expecting, then, as Frank asked, give a set of examples showing a good range of combinations of i/p and o/p. _________________ cHEERs
krisprems
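To make the selection logic of the ICETOOL job above easier to follow, here is a minimal Python sketch of the same idea. It is only a model, not DFSORT: it assumes each record is a plain string with the key in bytes 1-7 and the code in bytes 8-11, that a code outside the table gets a blank pair id, and it keeps input order rather than sorting. The names PAIR_ID and select_nodups are made up for this illustration.

```python
from collections import Counter

# Pair ids as assigned by the three IFTHEN clauses in the job above.
PAIR_ID = {
    "ABCD": "1", "EFGH": "1",
    "HIJK": "2", "LMNO": "2",
    "PQRS": "3", "TUVW": "3",
}

def select_nodups(records):
    """Keep records whose (key, pair id) combination occurs exactly once."""
    def group(rec):
        # Bytes 1-7 are the key, bytes 8-11 the code.
        # Assumption: a code outside the table gets a blank pair id.
        return rec[0:7], PAIR_ID.get(rec[7:11], " ")

    counts = Counter(group(rec) for rec in records)
    return [rec for rec in records if counts[group(rec)] == 1]

records = [
    "1111111ABCD HAS MATCH",
    "1111111EFGH HAS MATCH",
    "1111111HIJK HAS MATCH",
    "2222222PQRS NO MATCH",
    "1111111LMNO HAS MATCH",
]
unmatched = select_nodups(records)
print(unmatched)  # ['2222222PQRS NO MATCH']
```

Adding a "2222222TUVW" record to the list puts two records in the (2222222, pair 3) group, so nothing is selected.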
Our company data is confidential and the input file is huge, hence I was always averting to send the real data and trying to create a dummy data which would as fine represent my situation. I apologise again for keeping you guys guessing.
and say this is second input file:(let's be simple. just two records)
Code:
abcd efgh
pqrs stuv
That means that if a record in input file1 contains "abcd" in bytes 8-11, there should be another record in input file1 with the same key (the key is bytes 1-7) but with "efgh" in bytes 8-11, and vice versa. In other words, the records have to come in pairs. If they don't, they have to go to the output file.
Similarly, if there's a record with "pqrs", there should be another record with "stuv". If not, it goes to the output file.
It is possible that there is a record with "abcd" and another record with "stuv" with the same key. In that case both should go to the output file, because the match for "abcd" is "efgh" (which is not present) and the match for "stuv" is "pqrs" (which is also not present).
It is also possible that input file1 contains only four records, all with the same key, with "abcd", "efgh", "pqrs" and "stuv" in bytes 8-11. This is perfectly fine; none should go to the output file.
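The rules described so far can be written down as a small Python sketch, just to pin down the expected output for the cases mentioned. This is an illustration of the requirement, not a DFSORT solution. It assumes the pair table has already been read from the second input file, that records are plain strings with the key in bytes 1-7 and the code in bytes 8-11, and that codes outside the table are simply skipped (the thread leaves that case open). PAIRS, PARTNER and unpaired are hypothetical names.

```python
# Pair table, as it might be read from the second input file.
PAIRS = [("abcd", "efgh"), ("pqrs", "stuv")]

# Map every code to its partner, in both directions.
PARTNER = {}
for left, right in PAIRS:
    PARTNER[left] = right
    PARTNER[right] = left

def unpaired(records):
    """Return records whose partner code is missing for the same key."""
    present = {(rec[0:7], rec[7:11]) for rec in records}
    result = []
    for rec in records:
        key, code = rec[0:7], rec[7:11]
        partner = PARTNER.get(code)
        # Assumption: codes outside the pair table are not reported.
        if partner is not None and (key, partner) not in present:
            result.append(rec)
    return result

mixed = ["1234567abcd", "1234567stuv"]
complete = ["1234567abcd", "1234567efgh", "1234567pqrs", "1234567stuv"]
print(unpaired(mixed))     # both records: neither partner is present
print(unpaired(complete))  # []: every record has its partner
```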
Solution:
I believe SELECT would not work here, because SELECT would only find duplicates or non-duplicates for a key, plus whether a particular string (in this case "abcd", "efgh", etc.) is present. It won't tell whether "efgh" is present for "abcd", or whether "stuv" is present for "pqrs".
The only thing I can think of is SPLICE with WITHALL, but I can't work out how to proceed after that. _________________ Thanks.
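Whether a "no duplicates" selection can express this depends on what is used as the grouping fields. The Python sketch below is a model only, not DFSORT. It compares grouping on key plus a pair id (the idea behind the ICETOOL job posted earlier in the thread) with an explicit partner lookup. In this model the two agree on the "abcd" plus "stuv" case, and differ only if the same code can occur twice for one key, which the thread has not settled. All names are hypothetical, and the record layout (key in bytes 1-7, code in bytes 8-11) is taken from the posts above.

```python
from collections import Counter

PAIRS = [("abcd", "efgh"), ("pqrs", "stuv")]
# Both codes of a pair share one pair id (its position in PAIRS).
PAIR_ID = {code: i for i, pair in enumerate(PAIRS) for code in pair}
# Each code maps to its partner, in both directions.
PARTNER = {a: b for x, y in PAIRS for a, b in ((x, y), (y, x))}

def by_nodups(records):
    """Model of 'no duplicates' on (key, pair id)."""
    counts = Counter((r[0:7], PAIR_ID.get(r[7:11])) for r in records)
    return [r for r in records
            if counts[(r[0:7], PAIR_ID.get(r[7:11]))] == 1]

def by_partner(records):
    """Explicit check that the partner code exists for the same key."""
    present = {(r[0:7], r[7:11]) for r in records}
    return [r for r in records
            if r[7:11] in PARTNER
            and (r[0:7], PARTNER[r[7:11]]) not in present]

mixed = ["1234567abcd", "1234567stuv"]      # different pairs, same key
repeated = ["1234567abcd", "1234567abcd"]   # same code twice, no "efgh"

print(by_nodups(mixed) == by_partner(mixed))  # True: both report both records
print(by_nodups(repeated))                    # []: the two records hide each other
print(by_partner(repeated))                   # both records are reported
```

So the questions about duplicates in positions 1-7 matter: if a code can repeat for a key, grouping on the pair id alone is not enough.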
Joined: 13 Dec 2006 Posts: 101 Topics: 4 Location: india
Posted: Sat Oct 20, 2007 10:08 pm Post subject:
Quote:
Our company data is confidential and the input file is huge,
If you can send it through offline, send it to Frank.
Quote:
I was always averting to send the real data and trying to create a dummy data which would as fine represent my situation
Everyone who posts in the forum has confidential data, so they post a made-up example of it. But if you keep posting examples with only 2-3 records, it's difficult for us to understand the situation.
Reading your description again, it doesn't sound like anything new. That's OK.
I would like to ask some questions:
1. Do you have duplicates in positions 1-7?
2. Did you try Frank's solution and mine? Any comments?
3. In the example that I considered:
Code:
1111111ABCD HAS MATCH
1111111EFGH HAS MATCH
1111111HIJK HAS MATCH
2222222PQRS NO MATCH
1111111LMNO HAS MATCH
I treated the matches as follows:
Match for ABCD is EFGH
Match for HIJK is LMNO
Match for PQRS is TUVW
Since TUVW, the match for PQRS, was not present in the i/p file for that key, PQRS was written to the o/p file.
Quote:
let's be simple. just two records
Don't be simple, be complex; it helps us know all the combinations/variations of your i/p. _________________ cHEERs
krisprems
Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
Posted: Sun Oct 21, 2007 10:46 am Post subject:
Quote:
Our company data is confidential and the input file is huge,
If you can send it through offline, send it to Frank.
Krisprems,
Please don't presume to tell anyone to send ME confidential data. I don't want anyone's confidential data!
seekaysk,
I don't need to see your actual data. I just need to see made up values that show the relevant variations. Three input records just isn't enough to make it clear what you want.
Quote:
and say this is second input file:(let's be simple. just two records)
seekaysk,
I don't recall you saying there was a second input file before. I thought the pairs were just hardcoded. What is the RECFM and LRECL of this second input file? What is the approx. maximum number of records in this second input file? _________________ Frank Yaeger - DFSORT Development Team (IBM)