Posted: Thu Sep 23, 2010 5:22 am Post subject: Removing records with junk chars
Hi Experts,
I have a fixed-length file with free-form text in columns 10 through 246. The file has more than a million records. My requirement is to identify the records that contain junk characters (that is, characters other than A-Z and 0-9), preferably using DFSORT.
So far, I have tried the SS option, but I could only filter out low-values with it, not other junk characters.
I had used:
Code:
INCLUDE COND=(10,246,SS,EQ,X'00')
I have searched for posts that mention ALTSEQ, but I think that function is for converting one character to another. I do not want to convert any data.
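For reference, the requirement above can be stated outside DFSORT as a simple per-record check. The Python sketch below only illustrates the logic and is not DFSORT syntax. The function name `has_junk`, the sample records, and the decision to treat blanks as valid are assumptions that do not come from the post, and a real mainframe file would hold EBCDIC bytes rather than Python strings.

```python
import string

# Valid characters are A-Z and 0-9, as stated in the post.
# Assumption: blanks are also allowed, because free-form text is usually
# blank-padded. Remove " " from the set if blanks should count as junk.
VALID = set(string.ascii_uppercase + string.digits + " ")

def has_junk(record: str, start: int = 10, end: int = 246) -> bool:
    """Return True if columns start..end (1-based, inclusive) contain
    any character outside VALID."""
    field = record[start - 1:end]
    return any(ch not in VALID for ch in field)

# Hypothetical sample records (not taken from the thread).
clean = "KEY000001" + "ORDER 12345".ljust(237) + "TRAILER"
dirty = "KEY000002" + "ORDER\x0012#45".ljust(237) + "TRAILER"

print(has_junk(clean))  # False
print(has_junk(dirty))  # True
```

Unlike a substring search for one value such as X'00', this check flags any character that is not in the valid set.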
Thanks for the solution. I tried it on my junk data and I see a small issue. I have shown the input data here and the output data below it. All the records in the input are junk, so all of them should be reported in the output, but this is not happening.
Also, there is a small correction: the record length is 272 and the column range is 10 through 254. I have modified my code accordingly.
Note that the pipe "|" is used as a column delimiter and its positions are fixed, though it may not appear so, especially at the boundary between the second and third columns.
Posted: Sat Sep 25, 2010 2:27 pm Post subject:
Quote:
I see a small issue. I have shown the input data here and the output data below it. All the records in the input are junk, so all of them should be reported in the output, but this is not happening.
It is NOT my fault that you couldn't modify the job I gave you as per your requirements and then point fingers at me that the proposed solution does NOT work.
Problems with your code:
1. You copied the junk data to be validated to position 272 using WHEN=INIT. By doing so, you overlaid the last character of your record with your junk data. It should be overlaid at position 273.
2. You mention that your junk data runs from position 10 through 254, both positions inclusive. That is a total of 245 bytes, not 244 bytes.
3. The next FINDREP looks for valid characters starting from position 351, so all the records with junk characters in positions 272 to 350 are ignored. The STARTPOS should be 273.
4. If you want to treat the pipe | as a valid character, then add that to the FINDREP list as well.
Understand the job I gave you and try to make changes accordingly or, better yet, provide complete details when asked and copy the given control cards as is.
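The position arithmetic behind these four points can be checked with a small sketch. The Python below is not the DFSORT job from this thread (that job is not shown here) and is not DFSORT syntax. It only emulates the idea of copying the field to a work area past the end of the record, stripping the valid characters, and reporting whatever is left. The function name `is_junk_record`, the sample records, and the treatment of blanks as valid are assumptions made for illustration.

```python
import string

RECORD_LEN = 272                          # record length given in the thread
FIELD_START = 10                          # first column of the free-form text
FIELD_END = 254                           # last column, inclusive
FIELD_LEN = FIELD_END - FIELD_START + 1   # 245 bytes, not 244
WORK_START = RECORD_LEN + 1               # 273, the first byte past the record

# Assumption: A-Z, 0-9 and blank are valid. The pipe is included per point 4.
VALID = set(string.ascii_uppercase + string.digits + " |")

def is_junk_record(record: str) -> bool:
    """Return True if columns 10-254 hold any character outside VALID."""
    if len(record) != RECORD_LEN:
        raise ValueError("expected a 272-byte record")
    # Step 1: append a copy of the field starting at position 273,
    # so none of the original 272 bytes is overlaid.
    work = record + record[FIELD_START - 1:FIELD_END]
    # Step 2: strip valid characters from the work area only,
    # scanning from position 273 onward.
    leftover = "".join(ch for ch in work[WORK_START - 1:] if ch not in VALID)
    # Step 3: anything that survives the stripping is junk.
    return leftover != ""

# Hypothetical sample records (not taken from the thread).
good = "HDR|00001" + "FREE TEXT 123|MORE TEXT".ljust(FIELD_LEN) + "|" + "X" * 17
bad = "HDR|00001" + "FREE TEXT \x00\x01|MORE".ljust(FIELD_LEN) + "|" + "X" * 17

print(FIELD_LEN, WORK_START)   # 245 273
print(is_junk_record(good))    # False
print(is_junk_record(bad))     # True
```

Starting the scan anywhere after position 273 would skip the first part of the copied field, which is the effect described in point 3.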
Read this document for a better understanding of FINDREP
It is NOT my fault that you couldn't modify the job I gave you as per your requirements and then point fingers at me that the proposed solution does NOT work....
Hi Kolusu,
Please do not feel offended. The idea was not to offend you. I was pointing out a small issue with the code that I had provided, not with the one that you had provided.
I will certainly take your inputs and re-run the modified code. I will share the results with everyone.