2018-05-30 09:27:10 UTC
<warning - long post>
We had the (admittedly not so) bright idea to accommodate the request from application development to activate the HEAPCHK LE option globally. We did this in conjunction with upgrading from 2.1 to 2.3.
As a result, the crappy design of the HWIBCPII address space revealed itself. (It had been running without problems under 2.1 for quite a while.)
During IPL (and *waaaay before* JES2 was active) we got these messages:
CEE3798I ATTEMPTING TO TAKE A DUMP FOR ABEND U4087 TO DATA SET: BCPII.D146.T1123091.HWIBCPII
Why is this thing even coming up when it relies on JES2? Why doesn't it wait for JES2?!?
CEA0603I The z/OS Diagnostic Snapshot option failed.
z/OS component CEA is unavailable for processing this request.
Diagnostic data will be missing for the following incident with:
DUMP TITLE: JOBNAME HWIBCPII STEPNAME HWIBCPII USER 4087
DATE AND TIME: 05/26/2018 11:23:09
DUMP DATA SET NAME: BCPII.D146.T1123091.HWIBCPII
REASON: The CEA address space is unavailable.
We do NOT run z/OSMF and we do NOT have CEA configured. Which component automatically calls CEA and then screams that it is not available?
*Then* insult gets added to injury:
ICH408I USER(BCPII ) GROUP($STCGRP ) NAME(BCPII )
SYS1.MCAT.xxxxxx CL(DATASET ) VOL(AWECAT)
INSUFFICIENT ACCESS AUTHORITY
FROM SYS1.MCAT.xxxxxx (G)
ACCESS INTENT(UPDATE ) ACCESS ALLOWED(READ )
IKJ56893I DATA SET BCPII.D146.T1123091.HWIBCPII NOT ALLOCATED+
Of course not! We do not allow every Tom, Dick and Harry update access to the master cat so they can pollute it with undefined HLQs. The HLQ BCPII (which is the same as the protected STC userid that HWIBCPII runs under) does not have an alias and is not allowed to allocate data sets. *That's* why our LE option DYNDUMP clearly specifies a general HLQ named TDUMP under which each and every transaction dump gets allocated. IBM in their infinite wisdom have chosen to override the installation-set DYNDUMP LE default and enforce the IBM default for the HLQ, namely the userid. That default only makes sense for actual TSO users, NOT for LE-enabled STCs. For security reasons, no protected STC is allowed to allocate data sets under its own userid, hence HLQ TDUMP, which also lets us actually *find* all of those transaction dumps in one place.
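To make the HLQ point concrete, here is a minimal Python sketch that splits the dump data set name from the messages above into its qualifiers. The layout hlq.Djjj.Thhmmsst.jobname (day of year, then time with a trailing tenth of a second) is my reading of the name shown in CEE3798I, not something stated in the messages, and the helper name parse_dump_dsn and the explicit year argument are purely illustrative.

```python
from datetime import datetime, timedelta


def parse_dump_dsn(dsn: str, year: int) -> dict:
    """Split a dump data set name of the assumed form hlq.Djjj.Thhmmsst.jobname."""
    hlq, day_q, time_q, jobname = dsn.split(".")
    if not (day_q.startswith("D") and time_q.startswith("T")):
        raise ValueError(f"unexpected qualifiers in {dsn!r}")

    day_of_year = int(day_q[1:])      # Djjj: day of the year (assumed)
    hh = int(time_q[1:3])             # Thhmmsst: hours
    mm = int(time_q[3:5])             # minutes
    ss = int(time_q[5:7])             # seconds
    tenth = int(time_q[7:8])          # assumed to be tenths of a second

    timestamp = datetime(year, 1, 1) + timedelta(
        days=day_of_year - 1,
        hours=hh,
        minutes=mm,
        seconds=ss,
        milliseconds=100 * tenth,
    )
    return {"hlq": hlq, "jobname": jobname, "timestamp": timestamp}


# The year is not part of the name; 2018 comes from the CEA0603I message.
info = parse_dump_dsn("BCPII.D146.T1123091.HWIBCPII", 2018)
print(info["hlq"], info["jobname"], info["timestamp"].isoformat())
```

With the name from the log, the first qualifier comes out as BCPII, the STC userid, and not the TDUMP we configured, and day 146 of 2018 is May 26, which agrees with the DATE AND TIME line in CEA0603I.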
IEA820I TRANSACTION DUMP REQUESTED BUT NOT TAKEN
AUTOMATIC ALLOCATION OF DUMP DATA SET FAILED
CEE3796I AN ATTEMPT TO DYNAMICALLY TAKE A DUMP WAS NOT SUCCESSFUL. 339
THE ERROR RETURN CODE WAS 00000008 AND THE REASON CODE WAS 00000026.
CEE0374C CONDITION=CEE3250C TOKEN=00040CB2 61C3C5C5 00000001 344
WHILE RUNNING PROGRAM CEEBPCAL WHICH STARTS AT 0000F600
AT THE TIME OF INTERRUPT
CEE0374C CONDITION=CEE3250C TOKEN=00040CB2 61C3C5C5 00000002 355
WHILE RUNNING PROGRAM CEEPIPI
AT THE TIME OF INTERRUPT
HWI018I THE BCPII COMMUNICATION RECOVERY HAS DETECTED AN UNEXPECTED
ERROR. SYSOUT MAY CONTAIN DIAGNOSTICS FOR THIS PROBLEM.
HWI008I BCPII FAILED TO CONNECT TO THE LOCAL CENTRAL PROCESSOR 367
COMPLEX (CPC). RC = 00000FFF, RSN = 00000000. BCPII INITIALIZATION
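For anyone wondering how TOKEN=00040CB2 61C3C5C5 00000001 relates to CONDITION=CEE3250C, here is a small Python sketch that decodes it. The field layout (2-byte severity, 2-byte message number, one byte of case/severity/control bits, 3-byte EBCDIC facility ID, 4-byte instance-specific information) is the LE condition token format as I understand it, so treat it as an assumption and check it against the LE documentation. The function name decode_condition_token is hypothetical.

```python
def decode_condition_token(token_hex: str) -> dict:
    """Decode a 12-byte LE condition token given as hex (blanks allowed)."""
    raw = bytes.fromhex(token_hex.replace(" ", ""))
    if len(raw) != 12:
        raise ValueError("expected 12 bytes (24 hex digits)")

    severity = int.from_bytes(raw[0:2], "big")   # 0..4
    msg_no = int.from_bytes(raw[2:4], "big")     # message number
    flags = raw[4]                               # case (2 bits), severity (3), control (3)
    facility = raw[5:8].decode("cp037")          # EBCDIC facility ID
    isi = int.from_bytes(raw[8:12], "big")       # instance-specific information

    # Severity letters as used in LE message IDs: I, W, E, S, C (assumed mapping).
    letter = "IWESC"[severity] if 0 <= severity <= 4 else "?"
    return {
        "severity": severity,
        "msg_no": msg_no,
        "case": flags >> 6,
        "control": flags & 0x07,
        "facility": facility,
        "isi": isi,
        "message": f"{facility}{msg_no:04d}{letter}",
    }


token = decode_condition_token("00040CB2 61C3C5C5 00000001")
print(token["message"], token["isi"])
```

X'0CB2' is 3250 decimal, X'C3C5C5' is "CEE" in EBCDIC, and severity 4 maps to the C suffix, so the token is just CEE3250C in binary form. The two CEE0374C messages differ only in the last word of the token.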
When we migrated to 2.3, we didn't have a clue what might be wrong with HWIBCPII, so after the system was up and running we just did a "start HWISTART". To our utter confusion, *now* it came up without a hitch.
We later found out why setting HEAPCHK globally wasn't a good idea and have now turned it off again. When we reproduced this on our test system, we did an S HWISTART,SUB=MSTR on the assumption that the thing runs sub=mstr when it gets started. We got a dump (abend042 with a 'severe internal error' reason code) for our pains, so quite obviously it does not work under sub=mstr. When started under JES, it *now* waits for JES to initialize, probably because something global in the system recognizes that JES2 isn't there yet.
There are a number of things wrong here:
1. If HWIBCPII needs JES2 to function (and obviously it does), it just cannot do all the shenanigans detailed above and *has to* wait for JES2 to come up. If it cannot do that, then it MUST work under the master subsystem. The wishy-washy way it behaves now is not acceptable.
2. Since this is an LE-enabled application, it MUST be able to tolerate any and all LE options set in an installation. Given that it overrides a number of the installation-set LE options, why doesn't it also override the HEAPCHK option to always be off?
3. No STC running with a protected user should ever be able to allocate data sets under its own userid (or so our auditors tell us). That is why it is *bad* design to override the DYNDUMP option that names the HLQ for transaction dumps, especially when running in a system address space deeply entrenched in z/OS.
Thanks for letting me vent!
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to ***@listserv.ua.edu with the message: INFO IBM-MAIN