Discussion:
APAR'd Re: What causes IPL/XCF to read the CFRM data set for the policy ?
J Ellis
2017-08-01 14:24:26 UTC
From IBM XCF:
"APAR OA53531 is now open to address this condition.
Our change will likely cause the pending policy to be activated upon
the next IPL rather than waiting until the last failed-persistent structure is
deleted from the to-be-deleted CF, as encountered by this customer."

Here's the explanation, as near as we have concluded:
(Note that because of the way our consoles are configured and the way we use a scripting language to 'one button' IPL from client workstations, we will never know what caused the original IPL to fail.)
In April we installed a new CEC; it IPL'd successfully using the CEC's internal CF.
Shortly after the IPL, a new policy with an incorrect serial number and partition number was introduced via a SETXCF operator command.
This policy went PENDING and no one noticed, for whatever reason.
It has been concluded that it went pending because the LOGREC structure was marked persistent.
Months go by with successful IPLs, with the 'bad' policy still pending because of the LOGREC structure ...

Come a Sunday morning when most of tech support is on vacation :-)

An IPL fails, and additional IPLs are attempted by operations (we and IBM are still researching the hardware logs to see if anyone can determine the original failure).
Tech support is called and has the CF bounced. This clears the connection issue, but the subsequent IPL fails because GRS can't find the CF, because the bad policy is now current.
We IPL with GRS=NONE to get the system up and see that the incorrect policy is in place (looking at the IXC messages from the IPL).
We write out the correct policy and activate it.
Re-IPL, and all is good.

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to ***@listserv.ua.edu with the message: INFO IBM-MAIN
Vernooij, Kees - KLM , ITOPT1
2017-08-01 14:43:23 UTC
After reading this several times, I wonder what the likely solution will solve:

- It will bring the activation of the new structure forward to the first subsequent IPL instead of an unpredictable IPL in the more distant future. Depending on IPL frequencies, this event might be several weeks or months in the future instead of many weeks or months. I doubt this will help in problem determination.

- How can the new policy be activated in a Sysplex on 'the next ipl' of one system, if other systems in the Sysplex still have a connection to the old CF and possibly to the logrec structure? Will the connections be forced? And will the CF be forced from the Sysplex?

- A solution could be to verify the new policy on activation and, if conflicting situations are encountered, drop the activation instead of leaving it pending. At that same moment, the customer is informed of the problem and can investigate what was wrong with the policy.

Thanks,
Kees.


Carmen Vitullo
2017-08-01 15:03:47 UTC
Agree. I could see a new IXC message, possibly a WTOR for the operator to respond to, along the lines of:
Verify automatic activation of CFRM policy CFRMPOLN. Reply 'U' to activate, or 'N' to keep CFRMPOLO as the active policy


J Ellis
2017-08-01 16:03:07 UTC
I agree with all your comments. I have asked for an operator command that shows exactly what is pending and why, and especially for a message at IPL time saying that something is wrong or inconsistent and asking: what do you want to do now?

van der Grijn, Bart , B
2017-08-01 17:29:21 UTC
Why not a health check that would report on a pending policy?
Bart

Carmen Vitullo
2017-08-01 17:36:28 UTC
I run the Health Checker in my test LPAR. The last two companies I worked with didn't want it, didn't like it, and don't run it. I like the idea and have used the checks, but that's not a big enough hammer for most sites.


my 2 cents


Carmen

van der Grijn, Bart , B
2017-08-01 18:32:58 UTC
I don't want to turn this into a Health Checker thread, but I believe the hammer is as big as you make it. When a check trips, it generates a console/syslog message. I assume most sites have some sort of method to escalate key messages. The same framework can be used for the Health Checker messages you deem worthy.
It doesn't rely on the user recognizing the message at command time, and it keeps on nagging to ensure you don't forget about it.
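
A minimal sketch of that escalation idea, assuming a site can feed console/syslog lines to a small filter. Everything here is illustrative: `WATCH_PATTERNS`, `find_escalations`, and the sample lines are made-up names and text, not real Health Checker or XCF message formats.

```python
import re
from typing import Iterable, List

# Hypothetical watch list: text a site might decide is worth escalating.
# Illustrative only; these are not actual message IDs or message texts.
WATCH_PATTERNS = [
    re.compile(r"POLICY CHANGE PENDING", re.IGNORECASE),
]


def find_escalations(lines: Iterable[str]) -> List[str]:
    """Return the log lines that match any watched pattern."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(pattern.search(line) for pattern in WATCH_PATTERNS)
    ]


if __name__ == "__main__":
    # Made-up sample lines, not real syslog output.
    sample = [
        "SAMPLE001I ROUTINE STATUS MESSAGE",
        "SAMPLE002E CFRM POLICY CHANGE PENDING FOR STRUCTURE EXAMPLE1",
    ]
    for hit in find_escalations(sample):
        print("ESCALATE:", hit)
```

Run on a schedule against recent log lines, a filter like this would keep flagging the condition until it is cleared, which is the "nagging" part.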

Bart

Jesse 1 Robinson
2017-08-01 23:49:25 UTC
I'm troubled by the suggestion that *anything* can be done at IPL time if a policy to be force-activated has an incorrect serial and/or partition number. If so, that policy is useless. If a workable policy was previously overlaid ('replaced') by a different policy of the *same name*, then it's brick wall time. The only resolution is what was actually done in this case: bring up the system in GRS ring mode, avoiding dependence on a nonexistent GRS structure, in order to recreate and activate a usable policy.

You should not have to IPL again at that point as all move-to-sysplex steps that I can recall from 20 years ago (!) are dynamic. However, our automation policy is designed to work from IPL up, so that might be simpler in the long run.

Again, I can't urge strongly enough to use a new name for a new policy. If POLICYA and POLICYB are both stored in the CFRM data set, you can point to the last working policy at IPL time. If XCF finds more than one policy in CFRM, it could prompt to use a different policy than the one last in use. That's the only useful workaround I can imagine.
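
As a rough sketch of the "new name for a new policy" habit, assuming a site keeps its own list of the policy names already defined: the helper below refuses to reuse a name and builds the SETXCF START,POLICY command text for falling back to a known-good policy once a system is up. The function names and the idea of a local name list are hypothetical, just for illustration; nothing here talks to XCF.

```python
from typing import Iterable


def reserve_policy_name(existing: Iterable[str], proposed: str) -> str:
    """Return the proposed name, or raise if it would overlay an existing policy."""
    taken = {name.upper() for name in existing}
    candidate = proposed.upper()
    if candidate in taken:
        raise ValueError(
            f"policy name {candidate} already exists; pick a new name"
        )
    return candidate


def start_policy_command(polname: str) -> str:
    """Build the operator command text that starts a named CFRM policy."""
    return f"SETXCF START,POLICY,TYPE=CFRM,POLNAME={polname.upper()}"


if __name__ == "__main__":
    defined = ["POLICYA"]  # hypothetical list of names already in use
    print(reserve_policy_name(defined, "policyb"))
    print(start_policy_command("POLICYA"))  # fall back to the last working policy
```

The point of the guard is only that POLICYA stays intact when POLICYB turns out to be wrong, so there is still something usable to start.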

.
.
J.O.Skip Robinson
Southern California Edison Company
Electric Dragon Team Paddler
SHARE MVS Program Co-Manager
323-715-0595 Mobile
626-543-6132 Office ⇐=== NEW
***@sce.com

