Friday fun with REXX and PARSE

Peter Hunkeler

2017-02-24 09:32:18 UTC

This is some Friday fun with parsing in REXX. At first I was baffled by the result; now I understand it. So *no*, I will not join the TSO/REXX list ;-)
I've got a data set to process with REXX. The records have this format:

"word1 word2.word3 word4:word5.word6 word7 hh.mm.ss"

What I need is each record split into:

var1 = "word1"
var2 = "word2.word3"
var3 = "word4:word5.word6"
var4 = "word7"
var5 = "hh"
var6 = "mm"
var7 = "ss"

Easy, I thought and coded:

PARSE VAR input var1 var2 var3 var4 var5 "." var6 "." var7 .

The result baffled me and was far from anything I understood at first. Here is what the variables look like:

var1 ==> "word1"
var2 ==> "word2"
var3 ==> ""
var4 ==> ""
var5 ==> ""
var6 ==> "word3 word4:word5"
var7 ==> "word6"

Have fun.

--
Peter Hunkeler

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to ***@listserv.ua.edu with the message: INFO IBM-MAIN

Itschak Mugzach

2017-02-24 10:57:52 UTC

Peter,

Have a look at the data in hex format. There are probably non-printable
characters between word2 and word3 that cause this parsing result.

ITschak



Elardus Engelbrecht

2017-02-24 11:02:54 UTC

Post by Itschak Mugzach
Have a look at the data in hex format. There are probably non-printable characters between word2 and word3 that cause this parsing result.

No, I got the same results as Peter. No strange characters involved.

Peter needs to slightly, but easily modify his PARSE to get the correct results. Just watch your ':' and '.'... ;-)

Groete / Greetings
Elardus Engelbrecht


Peter Hunkeler

2017-02-24 12:44:58 UTC

Post by Elardus Engelbrecht
Peter needs to slightly, but easily modify his PARSE to get the correct results. Just watch your ':' and '.'... ;-)

Nope. Read the manual carefully, and throw away what you think you know about how PARSE works.

--
Peter Hunkeler


Steve Horein

2017-02-24 11:04:33 UTC

It looks to me like your literal delimiters took precedence over the blank delimiters.
In other words, PARSE looked for "." first, and found "word3 word4:word5"
between the specified literals.


Tony Thigpen

2017-02-24 11:53:14 UTC

Steve has it right. Literals take precedence. So it works like this:

Step 1) split as Temp1 '.' Temp2 '.' Temp3
so: Temp1 = word1 word2
    Temp2 = word3 word4:word5
    Temp3 = word6 word7 hh.mm.ss
Step 2) split the "temps" based on the parsing between literals:
so: Parse Var Temp1 var1 var2 var3 var4 var5
giving: var1 = word1
        var2 = word2
        var3-var5 = nulls because no more words in Temp1
so: Parse Var Temp2 var6
giving: var6 = word3 word4:word5 (because there is only one variable to parse into)
so: Parse Var Temp3 var7 .
giving: var7 = word6 (the trailing period is a placeholder that absorbs "word7 hh.mm.ss")
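
The two steps can be checked with a small sketch. It assumes the sample record from the original post; the names temp1 to temp3 and a1 to a7 are made up for illustration. The last step keeps the trailing period from the original template, so for this input the explicit two-stage split should give the same values as the single PARSE, including var7 = word6 as Peter observed:

```rexx
/* Sketch: single PARSE versus an explicit two-stage split.        */
/* temp1 to temp3 and a1 to a7 are illustrative names only.        */
input = "word1 word2.word3 word4:word5.word6 word7 hh.mm.ss"

/* The template from the original post. */
parse var input var1 var2 var3 var4 var5 "." var6 "." var7 .

/* Stage 1: cut the source at the literal period patterns only. */
parse var input temp1 "." temp2 "." temp3

/* Stage 2: word-parse each piece with its own group of variables. */
parse var temp1 a1 a2 a3 a4 a5
parse var temp2 a6
parse var temp3 a7 .

say "var1=["var1"]  a1=["a1"]"   /* var1=[word1]  a1=[word1] */
say "var2=["var2"]  a2=["a2"]"   /* var2=[word2]  a2=[word2] */
say "var3=["var3"]  a3=["a3"]"   /* var3=[]  a3=[]           */
say "var6=["var6"]  a6=["a6"]"   /* both [word3 word4:word5] */
say "var7=["var7"]  a7=["a7"]"   /* var7=[word6]  a7=[word6] */
```

The brackets in the SAY output only make empty values and embedded blanks visible.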

Tony Thigpen


scott Ford

2017-02-24 13:34:45 UTC

Peter,

Same here. I haven't seen this strangeness in REXX "the wonder horse", and I have been
using REXX since 1984.

Scott

Post by Peter Hunkeler
Yep. This is it. Literals split the source into multiple sources, then
PARSE applies "parsing into words" on the individual source parts.
I'm just astonished I have never before stumbled across this in my 33+
years with a lot of REXX programming.

Scott Ford
IDMWORKS
z/OS Development


Hardee, Chuck

2017-02-24 11:11:27 UTC

I'm probably going to explain this incorrectly, but the gist of the problem is the literal periods in the PARSE statement.

Due to the literal periods ("."), PARSE assigns the values of "word1" and "word2" to var1 and var2 respectively.
Then, PARSE assigns nulls to var3, var4 and var5 bringing the parse pattern and variable's value into sync with the first literal period (".") and PARSE then assigns "word3 word4:word5" to var6.
Finally, var7 is assigned "word6" and the trailing period is "assigned" the remainder, "word7 hh.mm.ss".

Is this what you concluded?

Charles (Chuck) Hardee

Paul Gilmartin

2017-02-24 18:07:44 UTC

Post by Tony Thigpen
Step 1) split as Temp1 '.' Temp2 '.' Temp3
...
...

The following performs the operation with a single PARSE:

    18 *-* parse value space( INPUT ) with var1 " " var2 " " var3 " " var4 " " var5 "." var6 "." var7
       >V>   " word1 word2.word3 word4:word5.word6 word7 hh.mm.ss "
       >>>   "word1"
       >>>   "word2.word3"
       >>>   "word4:word5.word6"
       >>>   "word7"
       >>>   "hh"
       >>>   "mm"
       >>>   "ss"

-- gil


Sri h Kolusu

2017-02-24 18:33:18 UTC

It can be done without the value SPACE

STR = "WORD1 WORD2.WORD3 WORD4:WORD5.WORD6 WORD7 HH.MM.SS"
PARSE VAR STR VAR1 ' ' VAR2 ' ' VAR3 ' ' VAR4 ' ' VAR5 '.' VAR6 '.' VAR7

Thanks,
Kolusu


Paul Gilmartin

2017-02-24 18:52:48 UTC

Post by Sri h Kolusu
It can be done without the value SPACE
STR = "WORD1 WORD2.WORD3 WORD4:WORD5.WORD6 WORD7 HH.MM.SS"
PARSE VAR STR VAR1 ' ' VAR2 ' ' VAR3 ' ' VAR4 ' ' VAR5 '.' VAR6 '.' VAR7

The SPACE() accounts for the likely eventuality that there are extra blanks, as in the traced input here:

Post by Paul Gilmartin
    18 *-* parse value space( INPUT ) with var1 " " var2 " " var3 " " var4 " " var5 "." var6 "." var7
       >V>   " word1 word2.word3 word4:word5.word6 word7 hh.mm.ss "
       >>>   "word1"
       >>>   "word2.word3"
       >>>   "word4:word5.word6"
       >>>   "word7"
       >>>   "hh"
       >>>   "mm"
       >>>   "ss"

Did I overdesign? I thought I was just avoiding IBM's bad habit of solving
only the most restricted problem the user reports and avoiding a desirable
solution of a more general case.
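
To make the point about extra blanks concrete, here is a small sketch. The padded record and the names p1 to p7 and s1 to s7 are invented for the example; the two templates are the ones from the posts above.

```rexx
/* Sketch: a literal blank pattern matches exactly one blank, so   */
/* extra blanks upset the template unless SPACE() removes them.    */
padded = "  word1   word2.word3 word4:word5.word6  word7  hh.mm.ss "

/* Without SPACE(): the record starts with a blank, so p1 is empty */
/* and the later variables no longer line up with the fields.      */
parse var padded p1 " " p2 " " p3 " " p4 " " p5 "." p6 "." p7
say "p1=["p1"] p5=["p5"] p7=["p7"]"

/* With SPACE(): leading and trailing blanks are dropped and runs  */
/* of blanks between words shrink to a single blank.               */
parse value space(padded) with s1 " " s2 " " s3 " " s4 " " s5 "." s6 "." s7
say "s1=["s1"] s2=["s2"]"    /* s1=[word1] s2=[word2.word3]        */
say "s3=["s3"] s4=["s4"]"    /* s3=[word4:word5.word6] s4=[word7]  */
say "s5=["s5"] s6=["s6"] s7=["s7"]"   /* s5=[hh] s6=[mm] s7=[ss]   */
```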

-- gil


Randy Hudson

2017-02-26 03:03:47 UTC


PARSE uses your literal "." as anchors. Because those anchor characters appear
inside some fields as well as separating others, you first have to isolate the
fields that contain them from the fields they separate.

PARSE VAR input var1 var2 var3 var4 var8 .
PARSE VAR var8 var5 "." var6 "." var7 .
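
A side effect of this two-step approach, sketched below under the assumption of a record with extra blanks (the padded value is an invented test input): because the first statement parses by words rather than by literal blanks, it should cope with leading, trailing, and repeated blanks without any preprocessing.

```rexx
/* Sketch: the two-step parse applied to a record with extra blanks. */
padded = "  word1   word2.word3 word4:word5.word6  word7  hh.mm.ss "

/* Step 1: plain word parsing; var8 receives the fifth word and     */
/* the placeholder period swallows anything that follows it.        */
parse var padded var1 var2 var3 var4 var8 .

/* Step 2: only the time field is split at the literal periods. */
parse var var8 var5 "." var6 "." var7 .

say "var1=["var1"] var2=["var2"]"  /* var1=[word1] var2=[word2.word3] */
say "var3=["var3"] var4=["var4"]"  /* var3=[word4:word5.word6] var4=[word7] */
say "var5=["var5"] var6=["var6"] var7=["var7"]"  /* [hh] [mm] [ss] */
```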


William W. Collier

2017-02-26 20:40:30 UTC

A recent note asked how, in REXX, to parse a record in this format:

"word1 word2.word3 word4:word5.word6 word7 hh.mm.ss"

into these variables.

var1 = "word1"
var2 = "word2.word3"
var3 = "word4:word5.word6"
var4 = "word7"
var5 = "hh"
var6 = "mm"
var7 = "ss"

A friend, Harry Elder (***@gmail.com), offers this solution:

input = "word1 word2.word3 word4:word5.word6 word7 hh mm ss";

do v = 1 to 7;
  var.v = word(input,v);
end;

do v = 1 to 7;
  say v || ")" var.v;
end;

/* results:

1) word1
2) word2.word3
3) word4:word5.word6
4) word7
5) hh
6) mm
7) ss

*/


Gerard Schildberger

2017-02-27 00:24:49 UTC

Post by William W. Collier
input = "word1 word2.word3 word4:word5.word6 word7 hh mm ss";
do v = 1 to 7;
var.v = word(input,v);
end;

Except that in the record parsed, the "hh.mm.ss"
part contained periods, not blanks (for separators),
and the VARn (variables) aren't a stemmed array,
but variables appended with a digit.

Here is my solution(s):

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
record = "word1 word2.word3 word4:word5.word6 word7 hh.mm.ss"
say 'record='record

/* the trailing period (below) is just for a sanity check */
/* in case there are trailing fields/comments/whatever. */

parse var record var1 var2 var3 var4 hhmmss
parse var hhmmss var5 '.' var6 "." var7

say ' var1='var1
say ' var2='var2
say ' var3='var3
say ' var4='var4
say 'hhmmss='hhmmss
say ' var5='var5
say ' var6='var6
say ' var7='var7
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

And the REXX program output is:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
record=word1 word2.word3 word4:word5.word6 word7 hh.mm.ss
var1=word1
var2=word2.word3
var3=word4:word5.word6
var4=word7
hhmmss=hh.mm.ss
var5=hh
var6=mm
var7=ss
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@

One could also use a single parse statement. Without
assuming anything about what is or isn't in those
"wordy" words in the record, such a statement is
harder to read and understand. But here goes:

///////////////////////////////////////////////////
record = "word1 word2.word3 word4:word5.word6 word7 hh.mm.ss"
say 'record='record

parse var record var1 var2 var3 var4 . '' -8 var5 "." var6 '.' var7

say ' var1='var1
say ' var2='var2
say ' var3='var3
say ' var4='var4
say ' var5='var5
say ' var6='var6
say ' var7='var7
///////////////////////////////////////////////////

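(The thingy before the minus eight is a null string: two adjacent quotes with nothing between them. As a pattern it matches the end of the record, and the -8 then backs up eight characters to the start of "hh.mm.ss".)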

And the REXX program output is:

###################################################
record=word1 word2.word3 word4:word5.word6 word7 hh.mm.ss
var1=word1
var2=word2.word3
var3=word4:word5.word6
var4=word7
var5=hh
var6=mm
var7=ss
###################################################
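
The null-string and minus-eight part of the single-statement template can be looked at in isolation with the sketch below. It relies on the time field being exactly eight characters long and sitting at the very end of the record, with no trailing blanks; the names tail and same are illustrative only.

```rexx
/* Sketch: the null-string pattern followed by a relative position. */
record = "word1 word2.word3 word4:word5.word6 word7 hh.mm.ss"

/* The null string matches the end of the record; -8 then moves    */
/* back eight characters, so tail starts at the first h.           */
parse var record '' -8 tail
say "tail=["tail"]"            /* tail=[hh.mm.ss] */

/* The same substring picked out with built-in functions. */
same = substr(record, length(record) - 7)
say "same=["same"]"            /* same=[hh.mm.ss] */
```

If the record could carry trailing blanks, stripping them first (for example with STRIP) would be needed before counting back from the end.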

Of course, it goes without saying that those "words"
don't have any imbedded blanks (except those shown
as separators), nor any characters that have no
visible glyphs.
______________________________ Gerard Schildberger

Gerard Schildberger

2017-02-27 00:29:19 UTC

Correction, I didn't cut and paste correctly;
the first REXX program had a period cropped
on the 1st parse statement:

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
record = "word1 word2.word3 word4:word5.word6 word7 hh.mm.ss"
say 'record='record

/* the trailing period (below) is just for a sanity check */
/* in case there are trailing fields/comments/whatever. */

parse var record var1 var2 var3 var4 hhmmss .
parse var hhmmss var5 '.' var6 "." var7

say ' var1='var1
say ' var2='var2
say ' var3='var3
say ' var4='var4
say 'hhmmss='hhmmss
say ' var5='var5
say ' var6='var6
say ' var7='var7

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\

_____________________________ Gerard Schildberger

Paul Gilmartin

2017-02-27 00:48:50 UTC

Post by Peter Hunkeler
"word1 word2.word3 word4:word5.word6 word7 hh.mm.ss"
...
input = "word1 word2.word3 word4:word5.word6 word7 hh mm ss";
...

Well, gee, I think he changed the statement of the problem. Isn't that cheating?

I admit, I did something similar myself in the compact solution I tendered,
and for which I was admonished for needless complexity. But my intent
was to generalize. Did Peter intend that the words be separated only by
single blanks, or did he want to handle the case of possible multiple blanks,
likely if the input data are column-aligned?

Yes, I was also dismayed to encounter the behavior of PARSE once.

-- gil


Peter Hunkeler

2017-02-27 06:50:10 UTC

Post by Paul Gilmartin
I admit, I did something similar myself in the compact solution I tendered,
and for which I was admonished for needless complexity.

I absolutely liked your solution, Gil. It's elegant. The one I chose (before the initial post) made use of the fact (which I did not tell you) that I'm not interested in 2-5, so I just split those using "." as well. Kind of cheating, I know :-)

--
Peter Hunkeler
