videlord Beginner
Joined: 09 Dec 2004 Posts: 147 Topics: 19
|
Posted: Sun Mar 27, 2005 9:46 pm Post subject: Merge different fields of several input files to one output |
|
|
There are n input files, each with m records.
Is there a way to combine the i-th data field (not counting the key in column 1) of the i-th input file into one output file?
For example:
input1:
1 aaa xxx xxx
2 bbb xxx xxx
3 ccc xxx xxx
input2:
1 xxx 111 xxx
2 xxx 222 xxx
3 xxx 333 xxx
input3:
1 xxx xxx AAA
2 xxx xxx BBB
3 xxx xxx CCC
the output will be:
1 aaa 111 AAA
2 bbb 222 BBB
3 ccc 333 CCC
Using ICETOOL's SPLICE with WITHEACH, we can get this result.
However, it will sort n * m records by sequence number.
If m is very large, that will take a long time.
Is there a more efficient way?
Thanks.
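To make the approach described in the question concrete, here is a rough Python emulation of what a SPLICE with WITHEACH does on the sample data: the concatenated records are sorted by key (a stable sort, so the file order is kept within each key), the first record of each group is the base, and each following record supplies one WITH field. This is only a sketch for illustration, not DFSORT itself; the function name `splice_witheach` and the column positions are assumptions based on the three-file example, and options and edge cases (such as keys with no duplicates) are ignored.

```python
from itertools import groupby

def cut(record, field):
    """Return the (start, length) field of a fixed-width record; start is 1-based."""
    start, length = field
    return record[start - 1:start - 1 + length]

def splice_witheach(records, key, with_fields):
    """Emulate the overlay: sort by key, then overlay one field per extra record."""
    # Python's sort is stable, so records keep their input order within a key.
    ordered = sorted(records, key=lambda r: cut(r, key))
    output = []
    for _, group in groupby(ordered, key=lambda r: cut(r, key)):
        group = list(group)
        base = list(group[0])  # first record of the group is the base record
        # The i-th WITH field is taken from the i-th record after the base.
        for (start, length), donor in zip(with_fields, group[1:]):
            base[start - 1:start - 1 + length] = cut(donor, (start, length))
        output.append("".join(base))
    return output

input1 = ["1 aaa xxx xxx", "2 bbb xxx xxx", "3 ccc xxx xxx"]
input2 = ["1 xxx 111 xxx", "2 xxx 222 xxx", "3 xxx 333 xxx"]
input3 = ["1 xxx xxx AAA", "2 xxx xxx BBB", "3 xxx xxx CCC"]

# Concatenate the files, as the FROM data set would be concatenated.
for line in splice_witheach(input1 + input2 + input3, (1, 1), [(7, 3), (11, 3)]):
    print(line)
```

The sort over all n * m records is the step the question is worried about.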
kolusu Site Admin

Joined: 26 Nov 2002 Posts: 12378 Topics: 75 Location: San Jose
|
Posted: Mon Mar 28, 2005 9:03 am Post subject: |
|
|
videlord,
You can use Easytrieve to achieve the desired results, and it can be done in one pass of the data.
Hope this helps...
Cheers
Kolusu _________________ Kolusu
www.linkedin.com/in/kolusu |
Frank Yaeger Sort Forum Moderator

Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
|
Posted: Mon Mar 28, 2005 11:22 am Post subject: |
|
|
Quote: | Using ICETOOL "splice witheach", we can get the result.
It will sort n * m records by seqnum.
If m is too large, it will take a long long time. |
Did you actually run a job that took a "long long time"? If so, which sort product were you using, what was the RECFM and LRECL of the input files, how many records did each file have, and how "long" did it take?
Or are you just assuming that it will take a "long long time"? If so, then you might be surprised. _________________ Frank Yaeger - DFSORT Development Team (IBM)
Specialties: JOINKEYS, FINDREP, WHEN=GROUP, ICETOOL, Symbols, Migration
DFSORT is on the Web at:
www.ibm.com/storage/dfsort |
videlord Beginner
Joined: 09 Dec 2004 Posts: 147 Topics: 19
|
Posted: Tue Mar 29, 2005 5:05 am Post subject: |
|
|
Thanks kolusu.
I will look into Easytrieve, but if it's not an IBM product, I will not use it.
Frank,
I'm still testing the logic.
We will have more than 600 million records, and a total of 17 files need to be processed. I think it will take a long time.
First, expand each file to the same format (pad with spaces).
Then SPLICE 600M * 17 records.
I'm writing a PL/I program to compare with DFSORT.
I will post the test results later.
Mervyn Moderator

Joined: 02 Dec 2002 Posts: 415 Topics: 6 Location: Hove, England
|
Posted: Tue Mar 29, 2005 8:42 am Post subject: |
|
|
My money's on DFSORT  _________________ The day you stop learning the dinosaur becomes extinct |
Frank Yaeger Sort Forum Moderator

Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
|
Posted: Tue Mar 29, 2005 11:24 am Post subject: |
|
|
videlord,
I assumed from your example that you already had the files set up for splicing. If you need to do the additional COPY runs to get them set up for splicing, that will certainly affect the total time required.
If you're going to do a performance comparison, I'd like to see the DFSORT/ICETOOL job you use to make sure it's coded "correctly".
kolusu Site Admin

Joined: 26 Nov 2002 Posts: 12378 Topics: 75 Location: San Jose
|
Posted: Tue Mar 29, 2005 12:27 pm Post subject: |
|
|
Mervyn,
If the files are not set up for splicing, then I think a program will be the best option, as the merging is done in one pass.
Kolusu
videlord Beginner
Joined: 09 Dec 2004 Posts: 147 Topics: 19
|
Posted: Tue Mar 29, 2005 12:33 pm Post subject: |
|
|
Frank,
Actually, the input files are generated from previous DFSORT jobs.
If I use SPLICE in the next step, I will have to expand the records, padding the fields with spaces.
If I use a PL/I program, then only the sequence number and one field are needed in each file.
ICETOOL statements:
TOOLIN:
SPLICE FROM(CONCT) TO(OUT) ON(1,9,CH) WITHEACH -
WITH(xx,xx) ... WITH(xx,xx) USING(CTL1)
CTL1CNTL:
OPTION EQUALS
OUTFIL FNAMES=OUT,OUREC=(1,xxx) |
Frank Yaeger Sort Forum Moderator

Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
|
Posted: Tue Mar 29, 2005 1:08 pm Post subject: |
|
|
So you are only comparing the SPLICE operator to the PL/I program, and not the multiple COPY operators + the SPLICE operator to the PL/I program ... right?
Are you going to have your PL/I program avoid sorting by reading one record from each file in turn? If so, then it will have an advantage over SPLICE which has to sort the concatenated input files. (I'd like to allow SPLICE to work without sorting, when appropriate, in the future, but it can't do that now.)
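The "one record from each file in turn" idea can be sketched as follows. This is a minimal Python illustration, not videlord's PL/I program: the function name `merge_in_lockstep`, the column positions, and the single-space output layout are all assumptions taken from the three-file example at the top of the thread. It relies on every file being in the same key order with the same number of records, so no sort is needed.

```python
def cut(record, field):
    """Return the (start, length) field of a fixed-width record; start is 1-based."""
    start, length = field
    return record[start - 1:start - 1 + length]

def merge_in_lockstep(sources, key, fields):
    """Read one record from each source in turn and emit one combined record.

    sources: iterables of records (for example, open files), all in the same key order
    key:     (start, length) of the key, at the same position in every source
    fields:  one (start, length) per source, the field to take from that source
    """
    # zip() advances all sources together and stops at the shortest one.
    for records in zip(*sources):
        records = [r.rstrip("\n") for r in records]
        expected = cut(records[0], key)
        parts = [expected]
        for record, field in zip(records, fields):
            if cut(record, key) != expected:
                raise ValueError("sources are out of step at key " + expected)
            parts.append(cut(record, field))
        yield " ".join(parts)

input1 = ["1 aaa xxx xxx", "2 bbb xxx xxx", "3 ccc xxx xxx"]
input2 = ["1 xxx 111 xxx", "2 xxx 222 xxx", "3 xxx 333 xxx"]
input3 = ["1 xxx xxx AAA", "2 xxx xxx BBB", "3 xxx xxx CCC"]

for line in merge_in_lockstep([input1, input2, input3], (1, 1), [(3, 3), (7, 3), (11, 3)]):
    print(line)
```

Each input record is touched exactly once, which is why this kind of program avoids the sort that SPLICE needs.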
You don't need OPTION EQUALS ... SPLICE uses it automatically.
"OUREC" should be "OUTREC".
Frank Yaeger Sort Forum Moderator

Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
|
Posted: Tue Mar 29, 2005 1:31 pm Post subject: |
|
|
Hmmm ... why do you need the OUTFIL statement? If you pad all of the input files to a length of xxx, then the output file will automatically have a length of xxx, so an OUTFIL with OUTREC=(1,xxx) is NOT needed. Thus, you should be able to remove the USING(CTL1) and the //CTL1CNTL DD.
Or am I missing the reason for the OUTREC=(1,xxx) parameter?
videlord Beginner
Joined: 09 Dec 2004 Posts: 147 Topics: 19
|
Posted: Tue Mar 29, 2005 1:40 pm Post subject: |
|
|
Frank,
Yes, compare SPLICE only.
The input files are sorted already. No sort needed in PL/I program. |
videlord Beginner
Joined: 09 Dec 2004 Posts: 147 Topics: 19
|
Posted: Tue Mar 29, 2005 2:02 pm Post subject: |
|
|
Yes, Frank, the CNTL can be omitted. Thanks.
And one more question:
SPLICE WITHEACH
Can I replace 2 fields of one file?
For example:
BASE ON1
ON1 WITH1
ON1 WITH2A WITH2B
ON1 WITH3
result:
BASE ON1 WITH1 WITH2A WITH3 WITH2B |
Frank Yaeger Sort Forum Moderator

Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
|
Posted: Tue Mar 29, 2005 2:03 pm Post subject: |
|
|
Quote: | The input files are sorted already. No sort needed in PL/I program. |
Then I would be pleasantly surprised if SPLICE was faster than the PL/I program, since a merge (for the PL/I program) is generally faster than a sort (for SPLICE).
videlord Beginner
Joined: 09 Dec 2004 Posts: 147 Topics: 19
|
Posted: Wed Mar 30, 2005 5:47 am Post subject: |
|
|
I tested with input files of 7M records each, and the result shows SPLICE is faster!!!
Code:
                  EXCP    CPU    SRB    CLOCK     SERV
DFSORT/SPLICE     159K   4.52    .22    57.08    2471K
PL/I            11200K   5.85   2.64   264.57   15333K
I think the READ logic in my PL/I source is not efficient.
It reads the records of each input file one at a time.
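For a quick comparison of the two runs, the ratios between the measurements can be worked out with a few lines of Python. The numbers are copied from the table above; reading the K suffix as thousands is an assumption, and it does not affect the ratios. The names `measurements` and `ratios` are just for this illustration.

```python
# Figures copied from the posted table; "K" is assumed to mean thousands.
measurements = {
    "DFSORT/SPLICE": {"EXCP": 159_000, "CPU": 4.52, "SRB": 0.22,
                      "CLOCK": 57.08, "SERV": 2_471_000},
    "PL/I": {"EXCP": 11_200_000, "CPU": 5.85, "SRB": 2.64,
             "CLOCK": 264.57, "SERV": 15_333_000},
}

def ratios(slower, faster):
    """Return slower/faster for every measurement column."""
    return {name: slower[name] / faster[name] for name in slower}

for name, value in ratios(measurements["PL/I"], measurements["DFSORT/SPLICE"]).items():
    print(f"{name:5s} PL/I is {value:.1f}x the DFSORT/SPLICE figure")
```

The PL/I run issued roughly 70 times as many EXCPs and took about 4.6 times as long on the clock, while using only about 1.3 times the CPU. That pattern is at least consistent with the suspicion that the way the program reads its input, rather than the merge logic itself, is what costs the time.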
Frank Yaeger Sort Forum Moderator

Joined: 02 Dec 2002 Posts: 1618 Topics: 31 Location: San Jose
|
Posted: Wed Mar 30, 2005 11:08 am Post subject: |
|
|
Quote: | And one more question:
SPLICE WITHEACH
Can I replace 2 fields of one file?
For example:
BASE ON1
ON1 WITH1
ON1 WITH2A WITH2B
ON1 WITH3
result:
BASE ON1 WITH1 WITH2A WITH3 WITH2B |
If WITH2A and WITH2B are contiguous, then just use one WITH field for them, e.g. WITH(5,13). If they are not contiguous, then rearrange the fields so they are contiguous (e.g. copy them to the end of the record as contiguous fields).
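The "copy them to the end of the record as contiguous fields" suggestion can be pictured with a small Python sketch. This is not DFSORT control statement syntax, only an illustration of the record layout before and after; the function name `append_fields`, the sample record, and its column positions are made up for the example.

```python
def append_fields(record, fields):
    """Return the record with copies of the given fields appended back to back.

    fields: list of (start, length) pairs; start is 1-based.
    """
    copies = "".join(record[start - 1:start - 1 + length] for start, length in fields)
    return record + copies

# Hypothetical 15-byte record: key in 1-4, WITH2A in 6-7, WITH2B in 14-15.
record = "0001 2A junk 2B"
rearranged = append_fields(record, [(6, 2), (14, 2)])
print(rearranged)
```

After the rearrangement the two values sit side by side in columns 16 to 19 of this sample record, so a single field covering those four bytes picks up both of them.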