Message ID | 20160322101839.18759.14744.stgit@mars |
---|---|
State | Accepted |
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> writes:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>
> The current code sends partial hmi event (4 * 64bits instead of
> 5 * 64bits) to host. The last 64 bits contains chip id/pir info for
> reporting checkstop events. This bug affects only checkstop events.
>
> Host console o/p without this patch:
>
> [ 305.628283] Fatal Hypervisor Maintenance interrupt [Not recovered]
> [ 305.628341] Error detail: Malfunction Alert
> [ 305.628388] HMER: 8040000000000000
> [ 305.628423] CPU PIR: 00000000
> [ 305.628458] [Unit: VSU] Logic core check stop
>
>
> Host console o/p with this patch:
>
> [ 200.122883] Fatal Hypervisor Maintenance interrupt [Not recovered]
> [ 200.122941] Error detail: Malfunction Alert
> [ 200.122986] HMER: 8040000000000000
> [ 200.123021] CPU PIR: 000008e8
> [ 200.123055] [Unit: VSU] Logic core check stop
>
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

This looks like it should also go to stable too, right? As in, to 5.1.x
and 5.2.x ?
On 03/31/2016 12:07 PM, Stewart Smith wrote:
> Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> writes:
>> [...]
>> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>
> This looks like it should also go to stable too, right? As in, to 5.1.x
> and 5.2.x ?
>

Yes.
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> writes:
> [...]
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
>  core/hmi.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)

Thanks, applied to:
skiboot-5.1.x as of 2636009
skiboot-5.2.x as of d597168
skiboot master as of a56b9aa
diff --git a/core/hmi.c b/core/hmi.c
index d2cca90..a934438 100644
--- a/core/hmi.c
+++ b/core/hmi.c
@@ -217,7 +217,7 @@ static struct lock hmi_lock = LOCK_UNLOCKED;
 
 static int queue_hmi_event(struct OpalHMIEvent *hmi_evt, int recover)
 {
-	uint64_t *hmi_data;
+	size_t num_params;
 
 	/* Don't queue up event if recover == -1 */
 	if (recover == -1)
@@ -230,16 +230,17 @@ static int queue_hmi_event(struct OpalHMIEvent *hmi_evt, int recover)
 		hmi_evt->disposition = OpalHMI_DISPOSITION_NOT_RECOVERED;
 
 	/*
-	 * V2 of struct OpalHMIEvent is of (4 * 64 bits) size and well packed
+	 * V2 of struct OpalHMIEvent is of (5 * 64 bits) size and well packed
 	 * structure. Hence use uint64_t pointer to pass entire structure
-	 * using 4 params in generic message format.
+	 * using 5 params in generic message format. Instead of hard coding
+	 * num_params divide the struct size by 8 bytes to get exact
+	 * num_params value.
 	 */
-	hmi_data = (uint64_t *)hmi_evt;
+	num_params = ALIGN_UP(sizeof(*hmi_evt), sizeof(u64)) / sizeof(u64);
 
 	/* queue up for delivery to host. */
-	return opal_queue_msg(OPAL_MSG_HMI_EVT, NULL, NULL,
-				hmi_data[0], hmi_data[1], hmi_data[2],
-				hmi_data[3]);
+	return _opal_queue_msg(OPAL_MSG_HMI_EVT, NULL, NULL,
+				num_params, (uint64_t *)hmi_evt);
 }
 
 static int is_capp_recoverable(int chip_id)