Discussion:
Why would LE not trap?
Charles Mills
2017-08-25 20:07:12 UTC
I have a C++ program compiled with

#pragma runopts( POSIX(ON),TRAP(ON,NOSPIE),NOEXECOPS )

I have my own ESTAEX. On an ABEND, if SDWACLUP is not set, I percolate,
presumably to LE's ESTAE and it drives my C Signal catcher.
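
A minimal sketch of the signal-catcher side of this arrangement, in portable C++. It assumes the S0C4 is presented to the program as SIGSEGV; the names segv_catcher, install_segv_catcher and demo_catch are hypothetical, and raise() only stands in for the condition LE would deliver. The ESTAEX and percolation steps are not shown.

```cpp
#include <cassert>
#include <csignal>

// Set by the catcher. volatile sig_atomic_t is the one object type a
// portable handler may safely write.
static volatile std::sig_atomic_t caught_signal = 0;

// The catcher itself: record which signal arrived and return.
extern "C" void segv_catcher(int signo) {
    caught_signal = signo;
}

// Hypothetical helper: register the catcher for SIGSEGV.
bool install_segv_catcher() {
    return std::signal(SIGSEGV, segv_catcher) != SIG_ERR;
}

// Hypothetical demo: raise() stands in for the condition that LE would
// turn into a signal; it is not a real addressing exception.
bool demo_catch() {
    const bool installed = install_segv_catcher();
    assert(installed);
    caught_signal = 0;
    std::raise(SIGSEGV);
    return caught_signal == SIGSEGV;
}
```

Calling demo_catch() returns true when the catcher was driven. A SYSUDUMP with no signal, as described in this post, corresponds to the catcher never being entered.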

It works. In testing, and at most customers, a S0Cx drives the Signal
routine, 100% of the time.

But one customer has twice gotten a S0C4 and in both cases we got an
old-fashioned SYSUDUMP with no Signal. I don't know if we came through the
ESTAEX exit or not. The ESTAE exit logic is not at all complex so some
subtle bug is unlikely. The reported S0C4 is in the main logic, not in the
ESTAE recovery. The fact that it is consistent at one customer leads me to
think it is an environmental factor, not a logic error.

There are no LE options in PARM= or CEEOPTS.

Environment is STC, current release of z/OS (not sure exact version but
probably V2R1).

What should I be looking for? What would effectively override TRAP(ON)?
Would SDWACLUP ever be set on a vanilla S0C4?

It's at a customer and the S0C4 is extremely infrequent so "try this/try
that" is not an option. Why my own ESTAE? So I can deal with unrecoverable
ABENDs, which LE will not. Why NOSPIE? So everything comes through the
ESTAE.

Charles

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to ***@listserv.ua.edu with the message: INFO IBM-MAIN
Peter Hunkeler
2017-08-26 17:29:59 UTC
Post by Charles Mills
What should I be looking for? What would effectively override TRAP(ON)?
Would SDWACLUP ever be set on a vanilla S0C4?
Post by Charles Mills
Why my own ESTAE? So I can deal with unrecoverable ABENDs, which LE will not. Why NOSPIE? So everything comes through the ESTAE.
A dangerous path you chose. Why do I say this? Because we learnt it the hard way.
We use a vendor product that installs its own ESTAEX on the init call from the application. Not only did it install its own ESTAE, it also cancelled LE's ESPIE (which has the same effect as running with TRAP(ON,NOSPIE)). It deregisters its ESTAE when the application calls the termination function. In between, the application runs and at some point calls some vendor product interface, but also does all the normal programming stuff under LE.
Installing your own ESTAE is not supported by LE! That's what the manual says. It seemed to work, but there are situations where you get into trouble with this setup. We learnt this step by step and from discussions I had with IBM LE people. Finally, I succeeded in convincing the vendor to change their product, and now they use the LE-supported way to get notified about problems, namely the CEEEXTAN user exit.


Note that we're a COBOL shop, and COBOL allows operations that lose significant digits in numbers. This causes trouble when the decimal overflow program mask bit is set, which it is if C code is also part of the application (implicitly or explicitly).


- If you run with TRAP(ON,NOSPIE), then your own ESTAE must recognize that an S0CA ABEND from a COBOL statement is *not* a problem, and your code must resume the COBOL code. Not easy, believe me.
In addition, this may become a total performance killer. Assume (and we have seen such jobs) that a COBOL program has some code that causes decimal overflow (losing significant digits). This is intentional, and proper COBOL coding. Assume such an overflow happens thousands of times during the batch run. Further, C code is also involved, so the decimal overflow program mask bit is set.
The result: the COBOL code causes thousands of decimal overflows (00A program checks). There is no ESPIE that could handle this with a short path length. The program check handler takes a snapshot of the system trace table in anticipation that a dump might be taken, then it percolates to RTM, which invokes the ESTAE routines. Even if the ESTAE knows how to handle this COBOL decimal overflow, taking the snapshot of the system trace takes a long time, depending on the trace table size. We have 15MB per processor, and this leads to an elapsed time of 0.2 to 0.5 seconds per snapshot! A nightmare; the application just never completes.


- If you run with TRAP(ON,SPIE), which you can't prevent because PARM='/TRAP(ON,SPIE)' would override your #pragma TRAP(ON,NOSPIE), then LE's ESPIE will get control for a program check, and LE may decide this is an unrecoverable error. According to the LE lab people, LE's error handler may choose not to percolate to (what it thinks is) its own ESTAE, but to cancel the ESTAE and terminate. However, LE, not knowing there is another ESTAE in front of its own, cancels your product's ESTAE. So your product will not get notified about the error. LE's ESTAE, which is still in place, will gain control and terminate the application.
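
To put the trace-snapshot cost from the first bullet into numbers, here is a small arithmetic sketch. The 0.2 to 0.5 second range is the figure quoted above; the function name snapshot_seconds and the count of 10,000 overflows used in the note below are hypothetical, chosen only to illustrate "thousands of times".

```cpp
#include <cassert>

// Seconds spent on system-trace snapshots alone, ignoring all other work.
double snapshot_seconds(long program_checks, double seconds_per_snapshot) {
    assert(program_checks >= 0 && seconds_per_snapshot >= 0.0);
    return static_cast<double>(program_checks) * seconds_per_snapshot;
}
```

With 10,000 overflows this gives about 2,000 seconds at 0.2 s each and about 5,000 seconds at 0.5 s each, that is, roughly 33 to 83 minutes of elapsed time spent only on snapshots.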


I understand this does not help much in finding the problem with that one customer.
LE's documentation is not harsh enough in saying that your own ESTAEs and ESPIEs can cause trouble. Of course, this applies only if the product installs the ESTAE/ESPIE at the beginning of the application and cancels it at the end. If you call some code that installs its own recovery, does its job, removes its own recovery, and returns, then that should be fine and not interfere with LE.


The vendor of our product has changed its code and all those problems are gone.
--
Peter Hunkeler

Bernd Oppolzer
2017-08-26 21:00:28 UTC
Post by Peter Hunkeler
Note that we're a COBOL shop, and COBOL allows operations that lose
significant digits in numbers. This causes trouble when the decimal
overflow program mask bit is set, which it is if C code is also part of
the application (implicitly or explicitly).
- If you run with TRAP(ON,NOSPIE), then your own ESTAE must recognize that an S0CA ABEND from a COBOL statement is *not* a problem, and your code must resume the COBOL code. Not easy, believe me.
In addition, this may become a total performance killer. Assume (and we have seen such jobs) that a COBOL program has some code that causes decimal overflow (losing significant digits). This is intentional, and proper COBOL coding. Assume such an overflow happens thousands of times during the batch run. Further, C code is also involved, so the decimal overflow program mask bit is set.
The result: the COBOL code causes thousands of decimal overflows (00A program checks). There is no ESPIE that could handle this with a short path length. ...
We had similar problems in the 1990s.

Our insurance applications, written mostly in PL/1, enabled the 0CA and
0C8 program mask bits, so that decimal and binary overflows didn't go
undetected.

This worked fine, even with C modules called.

But suddenly, in the 1990s, a C compiler version arrived in which certain
operations (like modulo) were implemented using arithmetic left shifts,
which led to 0C8 abends if the 0C8 mask bit was set. Of course, the 0C8
mask bit was never set in a pure C environment, but in our environment
involving PL/1 FIXEDOVERFLOW, it was.

Instead of repairing the C code generation (which would have been natural
from my point of view, given that the previous C compiler did it right),
IBM did two things:

- first they argued with us that it is not OK to enable the FIXEDOVERFLOW
condition when calling C from PL/1

- and then they changed LE so that the 0C8s are handled by the LE error
handler, which is a performance nightmare.

Because we have interface modules that get control on each (external)
call between two modules, and the interfaces know the programming
languages of caller and callee, we decided to switch off the 0C8 mask bit
there and to restore it when returning to PL/1.
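
The save, clear and restore pattern used by such an interface module can be sketched as a scope guard. This is only a simulation: a plain variable stands in for the program mask held in the PSW, and the names sim::program_mask, ScopedMaskClear and call_c_from_pli are hypothetical. The bit values assume the usual 4-bit program mask layout, with fixed-point overflow as the high-order bit and decimal overflow next to it.

```cpp
#include <cassert>
#include <cstdint>

namespace sim {
    // Assumed layout of the 4-bit program mask.
    constexpr std::uint8_t kFixedOverflow   = 0x8;  // enables 0C8
    constexpr std::uint8_t kDecimalOverflow = 0x4;  // enables 0CA
    // Stand-in for the real program mask in the PSW.
    std::uint8_t program_mask = 0;
}

// Clears the given mask bits on construction, restores the saved mask on exit.
class ScopedMaskClear {
public:
    explicit ScopedMaskClear(std::uint8_t bits) : saved_(sim::program_mask) {
        assert((bits & ~0xF) == 0);  // only the four mask bits are valid
        sim::program_mask = static_cast<std::uint8_t>(saved_ & ~bits);
    }
    ~ScopedMaskClear() { sim::program_mask = saved_; }
    ScopedMaskClear(const ScopedMaskClear&) = delete;
    ScopedMaskClear& operator=(const ScopedMaskClear&) = delete;
private:
    std::uint8_t saved_;
};

// Hypothetical interface stub: run the callee with fixed-point overflow off.
template <typename F>
auto call_c_from_pli(F&& callee) -> decltype(callee()) {
    ScopedMaskClear guard(sim::kFixedOverflow);
    return callee();
}
```

The guard restores the caller's mask even if the callee throws, which is why the restore sits in a destructor rather than after the call.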

Kind regards

Bernd

Charles Mills
2017-08-26 21:06:07 UTC
Thanks. No COBOL in this picture.

Charles


Binyamin Dissen
2017-08-26 20:41:55 UTC
The trace table in the SYSUDUMP should show if the ESTAE(x) got control.

But what do you want your ESTAE to do when the abend is unrecoverable (such as
CANCEL/DETACH)? Some failures will not go through an ESTAE at all.

--
Binyamin Dissen <***@dissensoftware.com>
http://www.dissensoftware.com

Director, Dissen Software, Bar & Grill - Israel


Should you use the mailblocks package and expect a response from me,
you should preauthorize the dissensoftware.com domain.

I very rarely bother responding to challenge/response systems,
especially those from irresponsible companies.

Charles Mills
2017-08-27 16:58:20 UTC
Post by Binyamin Dissen
But what do you want your ESTAE to do when the abend is unrecoverable
Short answer: to free some system-owned ECSA that we allocated.

Charles


Binyamin Dissen
2017-08-27 18:39:50 UTC
Then you cannot completely rely on ESTAE(X).

I would suggest reading on RESMGR.

Bernd Oppolzer
2017-08-26 21:25:24 UTC
Post by Charles Mills
There are no LE options in PARM= or CEEOPTS.
I don't know, but:

isn't there a third place where LE options come from?
Maybe an installation default?

I would take a look at your own installation; IIRC one of the LE reports
shows all LE options in effect and where they come from, and there should
be a third source besides PARM and CEEOPTS, that is:
installation defaults. And maybe the installation defaults are
TRAP(OFF)?

My former customer would have used TRAP(OFF) because he had his
own ESPIE and ESTAE routines (at least before that problem with the S0C8s
due to the wrong C code generation occurred). And he may well have used this
as an installation default.

Bernd Oppolzer
2017-08-26 21:40:50 UTC
https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.1.0/com.ibm.zos.v2r1.ceeam00/spetro.htm

Interesting:

For C++ applications, the following values are not allowed for compilation:

* NOEXECOPS | EXECOPS
* NOREDIR | REDIR
* NOARGPARSE | ARGPARSE

If NOEXECOPS is not allowed on #pragma runopts, will the other
options work anyway?

Obviously, there are many ways to specify run-time options;
it is not totally clear to me which ways take precedence ...

Kind regards

Bernd

Charles Mills
2017-08-26 22:13:07 UTC
Interesting about NOEXECOPS. Not sure what I was trying to accomplish there.
Post by Bernd Oppolzer
if NOEXECOPS is not allowed on #pragma runopts, will the other options work anyway?
I assure you this works 100% of the time in testing, and at most customers, so the NOEXECOPS is at least generally not hurting us.

Charles


Charles Mills
2017-08-26 22:11:20 UTC
Post by Bernd Oppolzer
isn't there a third place where LE options come from?
Yes, absolutely, there are installation defaults. Several sets: CICS, POSIX, I have forgotten what all. I guess that is part of my question here: aren't they defaults? Is there any way installation "stuff" of some sort overrides #pragma?

Charles


Peter Hunkeler
2017-08-27 19:27:36 UTC
Post by Charles Mills
Post by Bernd Oppolzer
isn't there a third place where LE options come from?
Yes, absolutely, there are installation defaults. Several sets: CICS, POSIX, I have forgotten what all. I guess that is part of my question here: aren't they defaults? Is there any way installation "stuff" of some sort overrides #pragma?
My first thought, too, but I was not sure about the "order of precedence", so I looked it up and found that the #pragma options could be overridden by PARM=, or //CEEOPTS DD, only. So in your case, it seems that the program is running with TRAP(ON,NOSPIE).


The fine IBM Knowledge Center did not want to help me post a direct link (arrrgh...), so please look up the "order of precedence" in the LE Programmer's Guide.
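
The override behaviour discussed here can be modelled as a simple "last source wins" merge. This is a sketch only: the names Options, resolve and effective are hypothetical, and the order used in the note below (installation default, then #pragma runopts, then CEEOPTS DD, then PARM=) is an assumption that should be checked against the "order of precedence" section of the LE Programming Guide.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One set of run-time options, e.g. {"TRAP", "ON,NOSPIE"}.
using Options = std::map<std::string, std::string>;

// Merge the sources, listed from lowest to highest precedence.
// A later source overrides an earlier one for the same option name.
Options resolve(const std::vector<Options>& lowest_to_highest) {
    Options merged;
    for (const auto& source : lowest_to_highest)
        for (const auto& entry : source)
            merged[entry.first] = entry.second;
    return merged;
}

// Look up one option that is expected to be set by at least one source.
const std::string& effective(const Options& merged, const std::string& name) {
    const auto it = merged.find(name);
    assert(it != merged.end());
    return it->second;
}
```

With an installation default of TRAP(OFF), a #pragma of TRAP(ON,NOSPIE), and nothing in CEEOPTS or PARM=, the merge yields TRAP(ON,NOSPIE), which matches the conclusion above. Adding TRAP(ON,SPIE) to the PARM= source would change the result to TRAP(ON,SPIE).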
--
Peter Hunkeler




Charles Mills
2017-08-27 19:44:15 UTC
That is my impression also. I was wondering if anyone here knew anything
different, or perhaps a fourth source that overrode the other three.

Charles


Charles Mills
2017-08-27 16:58:27 UTC
Well, I now know a little more and am a little mystified.

I had this sudden thought that perhaps the difference at the one customer
was that the two S0C4s we have experienced there would have happened in
assembler code running AMODE 64. (The C++ code is all AMODE 31.) So today I
coded up some test code to force a S0C4 in AMODE 64 and sure enough, same
results, system SYSUDUMP, no LE recovery.

I added some debugging WTOs to the ESTAE recovery so I could see what was
happening. My ESTAE recovery routine is getting driven. I am (as intended)
percolating. LE's recovery routine -- which admittedly I am abusing a bit --
apparently is boggled by AMODE 64 and is in turn percolating to MVS. That at
least would explain what I am seeing.

So my questions today to this august group are

1. In what AMODE will a recovery routine be entered if the ESTAE was issued
in AMODE 31 but the exception happened in AMODE 64? I don't see that in the
manuals. (It's probably there -- I just don't see it.)
2. If the answer to (1.) is AMODE 64, how do I change that? I tried SETRP
RETRYAMODE=31 but got an MNOTE that it was invalid with Percolate (and
admittedly, the parameter is RETRYAMODE, not PERCOLATEAMODE -- it is for
retry, not percolation).

Charles


B***@T-ONLINE.DE
2017-08-27 17:33:05 UTC
Is it possible to set the AMODE to 31 in the ESTAE routine? The ESTAE
routine should be able to detect that the problem occurred while executing
in AMODE 64.


------------------------------------------------------------------------
Gesendet mit der Telekom Mail App
<http://www.t-online.de/service/redir/email_app_android_sendmail_footer.htm>



--- Original-Nachricht ---
Von: Charles Mills
Betreff: Re: Why would LE not trap?
Datum: 27.08.2017, 18:59 Uhr
An: IBM-***@LISTSERV.UA.EDU





Well, I now know a little more and am a little mystified.

I had this sudden thought that perhaps the difference at the one customer
was that the two S0C4's we have experienced there would have happened in
assembler code running AMODE 64. (The C++ code is all AMODE 31.) So today I
coded up some test code to force a S0C4 in AMODE 64 and sure enough, same
results, system SYSUDUMP, no LE recovery.

I added some debugging WTOs to the ESTAE recovery so I could see what was
happening. My ESTAE recovery routine is getting driven. I am (as intended)
percolating. LE's recovery routine -- which admittedly I am abusing a bit --
apparently is boggled by AMODE 64 and is in turn percolating to MVS. That at
least would explain what I am seeing.

So my questions today to this august group are

1. In what AMODE will a recovery routine be entered if the ESTAE was issued
in AMODE 31 but the exception happened in AMODE 64? I don't see that in the
manuals. (It's probably there -- I just don't see it.)
2. If the answer to (1.) is AMODE 64, how do I change that? I tried SETRP
RETRYAMODE=31 but got an MNOTE that it was invalid with Percolate (and
admittedly, the parameter is RETRYAMODE, not PERCOLATEAMODE -- it is for
retry, not percolation).

Charles


Charles Mills
2017-08-27 17:58:32 UTC
The recovery routine is entered in the AMODE of the ESTAE(X) macro.

Yes, it should be able to determine the AMODE of the error from the z/Arch PSW but I have not tried that yet.
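
For what it's worth, here is a rough sketch of that check in C++. In a z/Architecture PSW, bit 31 is the extended-addressing bit and bit 32 is the basic-addressing bit, so the two together give the AMODE. The function name `amodeFromPsw16` and the `Amode` enum are made up for illustration, and the sketch assumes the caller already has a copy of the 16-byte PSW at the time of error; how to locate it in the SDWA is not shown.

```cpp
#include <cassert>
#include <cstdint>

// Addressing mode encoded in a z/Architecture PSW (hypothetical helper type).
enum class Amode { Invalid = 0, A24 = 24, A31 = 31, A64 = 64 };

// psw16: caller-supplied copy of the 16-byte PSW at the time of error.
// This is an assumed input; extracting it from the SDWA is out of scope here.
inline Amode amodeFromPsw16(const std::uint8_t* psw16) {
    assert(psw16 != nullptr);
    const bool ea = (psw16[3] & 0x01) != 0;  // PSW bit 31: extended addressing
    const bool ba = (psw16[4] & 0x80) != 0;  // PSW bit 32: basic addressing
    if (ea && ba)   return Amode::A64;
    if (!ea && ba)  return Amode::A31;
    if (!ea && !ba) return Amode::A24;
    return Amode::Invalid;                   // EA=1 with BA=0 is not a valid mode
}
```

A recovery routine could use a check like this only to decide what to do next (for example, to log that the failure was in AMODE 64 code); it does not change the AMODE in which any later recovery routine gets control.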

I am not sure what AMODE a retry routine is entered in by default, but you can control it with SETRP (RETRYAMODE).

A percolation routine is just a different recovery routine and it is entered in the AMODE of its ESTAE(X) macro.

I've got more of a handle on this -- going to start a new thread.

Charles


Jim Mulder
2017-08-27 17:41:23 UTC
AMODE

ESTAE-type recovery exits receive control in the AMODE that
was current at the time-of-set (time-of-PC AMODE for ARRs)
with the following exceptions:

- ARR, IEAARR, and ESTAEX exits receive control in AMODE 31
  instead of AMODE 24 when established for AMODE 24 programs

Jim Mulder z/OS Diagnosis, Design, Development, Test IBM Corp.
Poughkeepsie NY
Charles Mills
2017-08-27 17:58:25 UTC
Sorry. I meant "In what AMODE will a percolation routine be entered?" but I
guess the answer is "in the AMODE of its ESTAE (but not 24)."
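
To spell that rule out, here is a small C++ model of the text Jim quoted. It is only a sketch of the documented rule, not anything the system exposes: `ExitKind` and `recoveryExitAmode` are names invented for this example. Note that the AMODE at the time of the error is deliberately not an input, which is the point: a routine established in AMODE 31 gets control in AMODE 31 even if the S0C4 happened in AMODE 64.

```cpp
#include <cassert>

// Kinds of ESTAE-type recovery exits named in the quoted text (hypothetical enum).
enum class ExitKind { Estae, Estaex, Arr, Ieaarr };

// Returns the AMODE in which the exit receives control, per the quoted rule.
// amodeAtSet: AMODE current when the exit was established (time-of-PC for ARRs).
inline int recoveryExitAmode(ExitKind kind, int amodeAtSet) {
    assert(amodeAtSet == 24 || amodeAtSet == 31 || amodeAtSet == 64);
    // Exception from the quoted text: ARR, IEAARR and ESTAEX exits established
    // by AMODE 24 programs receive control in AMODE 31.
    const bool promotedFrom24 = kind == ExitKind::Estaex ||
                                kind == ExitKind::Arr ||
                                kind == ExitKind::Ieaarr;
    if (amodeAtSet == 24 && promotedFrom24) return 31;
    return amodeAtSet;  // otherwise: the AMODE current at the time of set
}
```

So `recoveryExitAmode(ExitKind::Estaex, 31)` gives 31 regardless of where the failure occurred, and the same reasoning applies to whichever routine the error percolates to next.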

Charles


Charles Mills
2017-08-27 19:43:35 UTC
And to close the loop: Why would LE not trap? Because it's AMODE 31 LE and
the S0C4 happened in AMODE 64.

Charles

