Message ID | 20180703061727.8789-15-hegdevasant@linux.vnet.ibm.com |
---|---|
State | Superseded |
Headers | show |
Series | MPIPL support | expand |
Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes: > Enhance reboot2 call to support FADUMP. Payload will call this interface > to initiate fadump. > > Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> > --- > core/platform.c | 4 ++++ > doc/opal-api/opal-cec-reboot-6-116.rst | 7 +++++++ > include/opal-api.h | 1 + > 3 files changed, 12 insertions(+) > > diff --git a/core/platform.c b/core/platform.c > index 3282227c1..08f05dc9b 100644 > --- a/core/platform.c > +++ b/core/platform.c > @@ -101,6 +101,10 @@ static int64_t opal_cec_reboot2(uint32_t reboot_type, char *diag) > case OPAL_REBOOT_FULL_IPL: > disable_fast_reboot("full IPL reboot requested"); > return opal_cec_reboot(); > + case OPAL_REBOOT_MPIPL: > + prlog(PR_ERR, "Kernel requested for fadump\n"); > + assert(false); > + break; > default: > prlog(PR_NOTICE, "OPAL: Unsupported reboot request %d\n", reboot_type); > return OPAL_UNSUPPORTED; > diff --git a/doc/opal-api/opal-cec-reboot-6-116.rst b/doc/opal-api/opal-cec-reboot-6-116.rst > index 516d4fc01..11cf54aa1 100644 > --- a/doc/opal-api/opal-cec-reboot-6-116.rst > +++ b/doc/opal-api/opal-cec-reboot-6-116.rst > @@ -63,6 +63,13 @@ OPAL_REBOOT_FULL_IPL = 2 > On platforms that don't support fast reboot, this is equivalent to a > normal reboot. > > +OPAL_REBOOT_MPIPL = 3 > + Request for fadump reboot. Firmware will reboot the system and collect > + dump. > + > + On platforms that don't support fadump, this is equivalent to a > + normal assert. I think a reboot type is the right way to go here, although I was thinking maybe something not tied to MPIPL itself but something a bit more generic.... I'm thinking along the lines of "should a dump be taken"... We have OPAL_REBOOT_PLATFORM_ERROR for when there's something gone wrong on a platform level (theoretically not the Operating System's fault), maybe we want OPAL_REBOOT_OS_ERROR ? With a dump being attempted for OPAL_REBOOT_PLATFORM_ERROR and OPAL_REBOOT_OS_ERROR reboot types? A bit of an unanswered question I have is if we should have a fast reboot path for gathering a dump, as I'm not sure there's much that would prevent us doing that?
On 08/07/2018 01:29 PM, Stewart Smith wrote: > Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes: >> Enhance reboot2 call to support FADUMP. Payload will call this interface >> to initiate fadump. >> >> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> >> --- >> core/platform.c | 4 ++++ >> doc/opal-api/opal-cec-reboot-6-116.rst | 7 +++++++ >> include/opal-api.h | 1 + >> 3 files changed, 12 insertions(+) >> >> diff --git a/core/platform.c b/core/platform.c >> index 3282227c1..08f05dc9b 100644 >> --- a/core/platform.c >> +++ b/core/platform.c >> @@ -101,6 +101,10 @@ static int64_t opal_cec_reboot2(uint32_t reboot_type, char *diag) >> case OPAL_REBOOT_FULL_IPL: >> disable_fast_reboot("full IPL reboot requested"); >> return opal_cec_reboot(); >> + case OPAL_REBOOT_MPIPL: >> + prlog(PR_ERR, "Kernel requested for fadump\n"); >> + assert(false); >> + break; >> default: >> prlog(PR_NOTICE, "OPAL: Unsupported reboot request %d\n", reboot_type); >> return OPAL_UNSUPPORTED; >> diff --git a/doc/opal-api/opal-cec-reboot-6-116.rst b/doc/opal-api/opal-cec-reboot-6-116.rst >> index 516d4fc01..11cf54aa1 100644 >> --- a/doc/opal-api/opal-cec-reboot-6-116.rst >> +++ b/doc/opal-api/opal-cec-reboot-6-116.rst >> @@ -63,6 +63,13 @@ OPAL_REBOOT_FULL_IPL = 2 >> On platforms that don't support fast reboot, this is equivalent to a >> normal reboot. >> >> +OPAL_REBOOT_MPIPL = 3 >> + Request for fadump reboot. Firmware will reboot the system and collect >> + dump. >> + >> + On platforms that don't support fadump, this is equivalent to a >> + normal assert. > > I think a reboot type is the right way to go here, although I was > thinking maybe something not tied to MPIPL itself but something a bit > more generic.... > > I'm thinking along the lines of "should a dump be taken"... We have > OPAL_REBOOT_PLATFORM_ERROR for when there's something gone wrong on a > platform level (theoretically not the Operating System's fault), maybe > we want OPAL_REBOOT_OS_ERROR ? Yes. OS_ERROR makes sense. Will fix it. > > With a dump being attempted for OPAL_REBOOT_PLATFORM_ERROR and > OPAL_REBOOT_OS_ERROR reboot types? By default we have disabled xscom_trigger_xstop() in P9. So OPAL_REBOOT_PLATFORM_ERROR will return to kernel and kernel will call panic path. I think we should plumb panic path to call OPAL_REBOOT_OS_ERROR. > > A bit of an unanswered question I have is if we should have a fast > reboot path for gathering a dump, as I'm not sure there's much that > would prevent us doing that? Nothing prevents us from doing it . But do we really need? My understanding is fast reboot is called in normal path (not when we hit some kernel/platform error). -Vasant
Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes: > On 08/07/2018 01:29 PM, Stewart Smith wrote: >> Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes: >>> Enhance reboot2 call to support FADUMP. Payload will call this interface >>> to initiate fadump. >>> >>> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> >>> --- >>> core/platform.c | 4 ++++ >>> doc/opal-api/opal-cec-reboot-6-116.rst | 7 +++++++ >>> include/opal-api.h | 1 + >>> 3 files changed, 12 insertions(+) >>> >>> diff --git a/core/platform.c b/core/platform.c >>> index 3282227c1..08f05dc9b 100644 >>> --- a/core/platform.c >>> +++ b/core/platform.c >>> @@ -101,6 +101,10 @@ static int64_t opal_cec_reboot2(uint32_t reboot_type, char *diag) >>> case OPAL_REBOOT_FULL_IPL: >>> disable_fast_reboot("full IPL reboot requested"); >>> return opal_cec_reboot(); >>> + case OPAL_REBOOT_MPIPL: >>> + prlog(PR_ERR, "Kernel requested for fadump\n"); >>> + assert(false); >>> + break; >>> default: >>> prlog(PR_NOTICE, "OPAL: Unsupported reboot request %d\n", reboot_type); >>> return OPAL_UNSUPPORTED; >>> diff --git a/doc/opal-api/opal-cec-reboot-6-116.rst b/doc/opal-api/opal-cec-reboot-6-116.rst >>> index 516d4fc01..11cf54aa1 100644 >>> --- a/doc/opal-api/opal-cec-reboot-6-116.rst >>> +++ b/doc/opal-api/opal-cec-reboot-6-116.rst >>> @@ -63,6 +63,13 @@ OPAL_REBOOT_FULL_IPL = 2 >>> On platforms that don't support fast reboot, this is equivalent to a >>> normal reboot. >>> >>> +OPAL_REBOOT_MPIPL = 3 >>> + Request for fadump reboot. Firmware will reboot the system and collect >>> + dump. >>> + >>> + On platforms that don't support fadump, this is equivalent to a >>> + normal assert. >> >> I think a reboot type is the right way to go here, although I was >> thinking maybe something not tied to MPIPL itself but something a bit >> more generic.... >> >> I'm thinking along the lines of "should a dump be taken"... We have >> OPAL_REBOOT_PLATFORM_ERROR for when there's something gone wrong on a >> platform level (theoretically not the Operating System's fault), maybe >> we want OPAL_REBOOT_OS_ERROR ? > > Yes. OS_ERROR makes sense. Will fix it. > >> >> With a dump being attempted for OPAL_REBOOT_PLATFORM_ERROR and >> OPAL_REBOOT_OS_ERROR reboot types? > > By default we have disabled xscom_trigger_xstop() in P9. So > OPAL_REBOOT_PLATFORM_ERROR > will return to kernel and kernel will call panic path. Yeah, I think we could at the least re-enable it and map it to MPIPL (at least for the time being)? At least then there'd be some kind of data capture of what went wrong rather than our current gathering of absolutely nothing. > I think we should plumb panic path to call OPAL_REBOOT_OS_ERROR. Yes, agree. >> A bit of an unanswered question I have is if we should have a fast >> reboot path for gathering a dump, as I'm not sure there's much that >> would prevent us doing that? > > Nothing prevents us from doing it . But do we really need? My understanding is > fast reboot is called in normal path (not when we hit some > kernel/platform error). We'll do a fast reboot even in the event of panic() as it's just hooked up to the normal reboot path. We do disable fast reboot in some paths though, namely: unrecoverable HMI, platform error, a bunch of internal OPAL problems etc.
On 08/17/2018 10:05 AM, Stewart Smith wrote: > Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes: >> On 08/07/2018 01:29 PM, Stewart Smith wrote: >>> Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes: >>>> Enhance reboot2 call to support FADUMP. Payload will call this interface >>>> to initiate fadump. >>>> >>>> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> >>>> --- >>>> core/platform.c | 4 ++++ >>>> doc/opal-api/opal-cec-reboot-6-116.rst | 7 +++++++ >>>> include/opal-api.h | 1 + >>>> 3 files changed, 12 insertions(+) >>>> >>>> diff --git a/core/platform.c b/core/platform.c >>>> index 3282227c1..08f05dc9b 100644 >>>> --- a/core/platform.c >>>> +++ b/core/platform.c >>>> @@ -101,6 +101,10 @@ static int64_t opal_cec_reboot2(uint32_t reboot_type, char *diag) >>>> case OPAL_REBOOT_FULL_IPL: >>>> disable_fast_reboot("full IPL reboot requested"); >>>> return opal_cec_reboot(); >>>> + case OPAL_REBOOT_MPIPL: >>>> + prlog(PR_ERR, "Kernel requested for fadump\n"); >>>> + assert(false); >>>> + break; >>>> default: >>>> prlog(PR_NOTICE, "OPAL: Unsupported reboot request %d\n", reboot_type); >>>> return OPAL_UNSUPPORTED; >>>> diff --git a/doc/opal-api/opal-cec-reboot-6-116.rst b/doc/opal-api/opal-cec-reboot-6-116.rst >>>> index 516d4fc01..11cf54aa1 100644 >>>> --- a/doc/opal-api/opal-cec-reboot-6-116.rst >>>> +++ b/doc/opal-api/opal-cec-reboot-6-116.rst >>>> @@ -63,6 +63,13 @@ OPAL_REBOOT_FULL_IPL = 2 >>>> On platforms that don't support fast reboot, this is equivalent to a >>>> normal reboot. >>>> >>>> +OPAL_REBOOT_MPIPL = 3 >>>> + Request for fadump reboot. Firmware will reboot the system and collect >>>> + dump. >>>> + >>>> + On platforms that don't support fadump, this is equivalent to a >>>> + normal assert. >>> >>> I think a reboot type is the right way to go here, although I was >>> thinking maybe something not tied to MPIPL itself but something a bit >>> more generic.... >>> >>> I'm thinking along the lines of "should a dump be taken"... We have >>> OPAL_REBOOT_PLATFORM_ERROR for when there's something gone wrong on a >>> platform level (theoretically not the Operating System's fault), maybe >>> we want OPAL_REBOOT_OS_ERROR ? >> >> Yes. OS_ERROR makes sense. Will fix it. >> >>> >>> With a dump being attempted for OPAL_REBOOT_PLATFORM_ERROR and >>> OPAL_REBOOT_OS_ERROR reboot types? >> >> By default we have disabled xscom_trigger_xstop() in P9. So >> OPAL_REBOOT_PLATFORM_ERROR >> will return to kernel and kernel will call panic path. > > Yeah, I think we could at the least re-enable it and map it to MPIPL (at > least for the time being)? Yes.. We can do that.. I will enable this patch later (once base patches merged). > > At least then there'd be some kind of data capture of what went wrong > rather than our current gathering of absolutely nothing. Agree. > >> I think we should plumb panic path to call OPAL_REBOOT_OS_ERROR. > > Yes, agree. > >>> A bit of an unanswered question I have is if we should have a fast >>> reboot path for gathering a dump, as I'm not sure there's much that >>> would prevent us doing that? >> >> Nothing prevents us from doing it . But do we really need? My understanding is >> fast reboot is called in normal path (not when we hit some >> kernel/platform error). > > We'll do a fast reboot even in the event of panic() as it's just hooked > up to the normal reboot path. We do disable fast reboot in some paths > though, namely: unrecoverable HMI, platform error, a bunch of internal > OPAL problems etc. Hmmm. may be we should plumb panic() path to call MPIPL. Will look into it later. -Vasant
diff --git a/core/platform.c b/core/platform.c index 3282227c1..08f05dc9b 100644 --- a/core/platform.c +++ b/core/platform.c @@ -101,6 +101,10 @@ static int64_t opal_cec_reboot2(uint32_t reboot_type, char *diag) case OPAL_REBOOT_FULL_IPL: disable_fast_reboot("full IPL reboot requested"); return opal_cec_reboot(); + case OPAL_REBOOT_MPIPL: + prlog(PR_ERR, "Kernel requested for fadump\n"); + assert(false); + break; default: prlog(PR_NOTICE, "OPAL: Unsupported reboot request %d\n", reboot_type); return OPAL_UNSUPPORTED; diff --git a/doc/opal-api/opal-cec-reboot-6-116.rst b/doc/opal-api/opal-cec-reboot-6-116.rst index 516d4fc01..11cf54aa1 100644 --- a/doc/opal-api/opal-cec-reboot-6-116.rst +++ b/doc/opal-api/opal-cec-reboot-6-116.rst @@ -63,6 +63,13 @@ OPAL_REBOOT_FULL_IPL = 2 On platforms that don't support fast reboot, this is equivalent to a normal reboot. +OPAL_REBOOT_MPIPL = 3 + Request for fadump reboot. Firmware will reboot the system and collect + dump. + + On platforms that don't support fadump, this is equivalent to a + normal assert. + Unsupported Reboot type For unsupported reboot type, this function will return with OPAL_UNSUPPORTED and no reboot will be triggered. diff --git a/include/opal-api.h b/include/opal-api.h index a099b0250..b7d4fac49 100644 --- a/include/opal-api.h +++ b/include/opal-api.h @@ -1233,6 +1233,7 @@ enum { OPAL_REBOOT_NORMAL = 0, OPAL_REBOOT_PLATFORM_ERROR, OPAL_REBOOT_FULL_IPL, + OPAL_REBOOT_MPIPL, }; /* Argument to OPAL_PCI_TCE_KILL */
Enhance reboot2 call to support FADUMP. Payload will call this interface to initiate fadump. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> --- core/platform.c | 4 ++++ doc/opal-api/opal-cec-reboot-6-116.rst | 7 +++++++ include/opal-api.h | 1 + 3 files changed, 12 insertions(+)