[2/2] errorlog: Increase the severity of abnormal reboot events

Message ID 20200304085807.10699-2-hegdevasant@linux.vnet.ibm.com
State Accepted
Series [1/2] eSEL: Make sure PANIC logs are sent to BMC before calling assert

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success Successfully applied on branch master (82aed17a5468aff6b600ee1694a10a60f942c018)
snowpatch_ozlabs/snowpatch_job_snowpatch-skiboot success Test snowpatch/job/snowpatch-skiboot on branch master
snowpatch_ozlabs/snowpatch_job_snowpatch-skiboot-dco success Signed-off-by present

Commit Message

Vasant Hegde March 4, 2020, 8:58 a.m. UTC
CC: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
---
 core/platform.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Klaus Heinrich Kiwi March 5, 2020, 1:33 p.m. UTC | #1
On 3/4/2020 5:58 AM, Vasant Hegde wrote:
> CC: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
> ---
>   core/platform.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/core/platform.c b/core/platform.c
> index 9593f3af2..85b2cd24f 100644
> --- a/core/platform.c
> +++ b/core/platform.c
> @@ -27,7 +27,7 @@ bool manufacturing_mode = false;
>   struct platform	platform;
> 
>   DEFINE_LOG_ENTRY(OPAL_RC_ABNORMAL_REBOOT, OPAL_PLATFORM_ERR_EVT, OPAL_CEC,
> -		 OPAL_CEC_HARDWARE, OPAL_PREDICTIVE_ERR_FAULT_RECTIFY_REBOOT,
> +		 OPAL_CEC_HARDWARE, OPAL_ERROR_PANIC,
>   		 OPAL_ABNORMAL_POWER_OFF);
> 
>   /*
> 

I think this is OK since OPAL_RC_ABNORMAL_REBOOT is always the result of an 
OPAL_CEC_REBOOT2 OPAL call, which itself causes an OPAL TI checkstop.

The fact that we respond to a "Reboot requested" with a checkstop/PANIC 
predates this patch, so it feels like setting the record straight.
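
To make the flow described above concrete, here is a minimal, self-contained sketch in C. It is not skiboot's code: every `sketch_*` / `SKETCH_*` name and every numeric value is a hypothetical stand-in, and the only things taken from this thread are that the abnormal-reboot log entry now carries panic severity and that the abnormal reboot path ends in a checkstop.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the two severities touched by the patch.
 * The numeric values are illustrative, not skiboot's real constants. */
enum sketch_severity {
	SKETCH_SEV_PREDICTIVE_REBOOT = 1,	/* severity before the patch */
	SKETCH_SEV_PANIC = 2,			/* severity after the patch */
};

/* Simplified model of a statically defined error log entry. */
struct sketch_log_entry {
	const char *reason;
	enum sketch_severity severity;
};

/* After the patch, the abnormal-reboot entry is declared as a panic. */
static const struct sketch_log_entry abnormal_reboot = {
	.reason = "ABNORMAL_REBOOT",
	.severity = SKETCH_SEV_PANIC,
};

static int log_count;
static int checkstop_triggered;
static int logged_before_checkstop;

/* Stand-in for committing an error log. */
static void sketch_log_error(const struct sketch_log_entry *e, const char *msg)
{
	printf("[sev=%d] %s: %s\n", (int)e->severity, e->reason, msg);
	log_count++;
}

/* Stand-in for triggering the checkstop; records whether a log came first. */
static void sketch_trigger_checkstop(void)
{
	logged_before_checkstop = (log_count > 0);
	checkstop_triggered = 1;
}

/* Model of the abnormal reboot path: log first, then checkstop. */
static int sketch_reboot_abnormal(void)
{
	sketch_log_error(&abnormal_reboot, "Reboot requested due to platform error");
	sketch_trigger_checkstop();
	return (int)abnormal_reboot.severity;
}
```

In this model, calling `sketch_reboot_abnormal()` commits one log at panic severity and only then triggers the checkstop, which is the sense in which the new severity matches what the path already does.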

Reviewed-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com>

I'm a bit curious, though, why we do this, because it looks like the 
platform is asking us to reboot, not to checkstop.

Thanks,

  -Klaus
Vasant Hegde March 9, 2020, 3:35 p.m. UTC | #2
On 3/5/20 7:03 PM, Klaus Heinrich Kiwi wrote:
> 
> 
> On 3/4/2020 5:58 AM, Vasant Hegde wrote:
>> CC: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
>> ---
>>   core/platform.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/core/platform.c b/core/platform.c
>> index 9593f3af2..85b2cd24f 100644
>> --- a/core/platform.c
>> +++ b/core/platform.c
>> @@ -27,7 +27,7 @@ bool manufacturing_mode = false;
>>   struct platform    platform;
>>
>>   DEFINE_LOG_ENTRY(OPAL_RC_ABNORMAL_REBOOT, OPAL_PLATFORM_ERR_EVT, OPAL_CEC,
>> -         OPAL_CEC_HARDWARE, OPAL_PREDICTIVE_ERR_FAULT_RECTIFY_REBOOT,
>> +         OPAL_CEC_HARDWARE, OPAL_ERROR_PANIC,
>>            OPAL_ABNORMAL_POWER_OFF);
>>
>>   /*
>>
> 
> I think this is OK since OPAL_RC_ABNORMAL_REBOOT is always the result of an 
> OPAL_CEC_REBOOT2 OPAL call, which itself causes an OPAL TI checkstop.
> 
> The fact that we respond to a "Reboot requested" with a checkstop/PANIC 
> predates this patch, so it feels like setting the record straight.
> 
> Reviewed-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com>
> 
> I'm a bit curious, though, why we do this, because it looks like the platform is 
> asking us to reboot, not to checkstop.

We call checkstop so that OCC can collect debug data before rebooting.

-Vasant
Klaus Heinrich Kiwi March 9, 2020, 7:49 p.m. UTC | #3
On 3/9/2020 12:35 PM, Vasant Hegde wrote:
> On 3/5/20 7:03 PM, Klaus Heinrich Kiwi wrote:
>> I think this is OK since OPAL_RC_ABNORMAL_REBOOT is always the result of an 
>> OPAL_CEC_REBOOT2 OPAL call, which itself causes an OPAL TI checkstop.
>>
>> The fact that we respond to a "Reboot requested" with a 
>> checkstop/PANIC predates this patch, so it feels like 
>> setting the record straight.
>>
>> Reviewed-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com>
>>
>> I'm a bit curious, though, why we do this, because it looks like the 
>> platform is asking us to reboot, not to checkstop.
> 
> We call checkstop so that OCC can collect debug data before rebooting.

Thanks! By the way, I should have RTFM first:
https://open-power.github.io/skiboot/doc/opal-api/opal-cec-reboot-6-116.html#opal-cec-reboot2

Reviewed-by: Klaus Heinrich Kiwi <klaus@linux.vnet.ibm.com>

Patch

diff --git a/core/platform.c b/core/platform.c
index 9593f3af2..85b2cd24f 100644
--- a/core/platform.c
+++ b/core/platform.c
@@ -27,7 +27,7 @@  bool manufacturing_mode = false;
 struct platform	platform;
 
 DEFINE_LOG_ENTRY(OPAL_RC_ABNORMAL_REBOOT, OPAL_PLATFORM_ERR_EVT, OPAL_CEC,
-		 OPAL_CEC_HARDWARE, OPAL_PREDICTIVE_ERR_FAULT_RECTIFY_REBOOT,
+		 OPAL_CEC_HARDWARE, OPAL_ERROR_PANIC,
 		 OPAL_ABNORMAL_POWER_OFF);
 
 /*