Message ID | 20130710130155.4993.61577.stgit@mars (mailing list archive) |
---|---|
State | Accepted, archived |
Commit | ee1dd1e3dc774cf257012215d996e8e7e370c162 |
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> writes:

> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>
> During Machine Check interrupt on pseries platform, R3 generally points to
> memory region inside RTAS (FWNMI) area. We see r3 corruption because when RTAS
> delivers the machine check exception it passes the address inside FWNMI area
> with the top most bit set. This patch fixes this issue by masking top two bit
> in machine check exception handler.

I always got that error and used to wonder why I find FWNMI
corrupt. Is this an RTAS bug, or is it documented in PAPR?

>
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/ras.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
> index 7b3cbde..721c058 100644
> --- a/arch/powerpc/platforms/pseries/ras.c
> +++ b/arch/powerpc/platforms/pseries/ras.c
> @@ -287,6 +287,9 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs)
>  	unsigned long *savep;
>  	struct rtas_error_log *h, *errhdr = NULL;
>
> +	/* Mask top two bits */
> +	regs->gpr[3] &= ~(0x3UL << 62);
> +
>  	if (!VALID_FWNMI_BUFFER(regs->gpr[3])) {
>  		printk(KERN_ERR "FWNMI: corrupt r3 0x%016lx\n", regs->gpr[3]);
>  		return NULL;

-aneesh
On 07/10/2013 07:41 PM, Aneesh Kumar K.V wrote:
> Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> writes:
>
>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>
>> During Machine Check interrupt on pseries platform, R3 generally points to
>> memory region inside RTAS (FWNMI) area. We see r3 corruption because when RTAS
>> delivers the machine check exception it passes the address inside FWNMI area
>> with the top most bit set. This patch fixes this issue by masking top two bit
>> in machine check exception handler.
>
> I always got that error and used to wonder why I find FWNMI
> corrupt. Is this an RTAS bug, or is it documented in PAPR?

Nope. There is no mention of it in PAPR. It looks like a bug in RTAS.

>
>>
>> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/platforms/pseries/ras.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
>> index 7b3cbde..721c058 100644
>> --- a/arch/powerpc/platforms/pseries/ras.c
>> +++ b/arch/powerpc/platforms/pseries/ras.c
>> @@ -287,6 +287,9 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs)
>>  	unsigned long *savep;
>>  	struct rtas_error_log *h, *errhdr = NULL;
>>
>> +	/* Mask top two bits */
>> +	regs->gpr[3] &= ~(0x3UL << 62);
>> +
>>  	if (!VALID_FWNMI_BUFFER(regs->gpr[3])) {
>>  		printk(KERN_ERR "FWNMI: corrupt r3 0x%016lx\n", regs->gpr[3]);
>>  		return NULL;
>>
>
> -aneesh
On Thu, 2013-07-11 at 10:04 +0530, Mahesh Jagannath Salgaonkar wrote:
> > I always got that error and used to wonder why I find FWNMI
> > corrupt. Is this an RTAS bug, or is it documented in PAPR?
>
> Nope. There is no mention of it in PAPR. It looks like a bug in RTAS.

Typically, the top bit in real mode means to ignore the HRMOR... I think
it's some old bug in RTAS indeed.

Cheers,
Ben.
On 07/10/2013 06:32 PM, Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>
> During Machine Check interrupt on pseries platform, R3 generally points to
> memory region inside RTAS (FWNMI) area. We see r3 corruption because when RTAS
> delivers the machine check exception it passes the address inside FWNMI area
> with the top most bit set. This patch fixes this issue by masking top two bit
> in machine check exception handler.
>
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/ras.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
> index 7b3cbde..721c058 100644
> --- a/arch/powerpc/platforms/pseries/ras.c
> +++ b/arch/powerpc/platforms/pseries/ras.c
> @@ -287,6 +287,9 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs)
>  	unsigned long *savep;
>  	struct rtas_error_log *h, *errhdr = NULL;
>
> +	/* Mask top two bits */
> +	regs->gpr[3] &= ~(0x3UL << 62);

We need to replace this "62" with a shift macro specifying the significance
of these top two address bits in the real mode.
Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

> On 07/10/2013 06:32 PM, Mahesh J Salgaonkar wrote:
>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>
>> During Machine Check interrupt on pseries platform, R3 generally points to
>> memory region inside RTAS (FWNMI) area. We see r3 corruption because when RTAS
>> delivers the machine check exception it passes the address inside FWNMI area
>> with the top most bit set. This patch fixes this issue by masking top two bit
>> in machine check exception handler.
>>
>> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/platforms/pseries/ras.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
>> index 7b3cbde..721c058 100644
>> --- a/arch/powerpc/platforms/pseries/ras.c
>> +++ b/arch/powerpc/platforms/pseries/ras.c
>> @@ -287,6 +287,9 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs)
>>  	unsigned long *savep;
>>  	struct rtas_error_log *h, *errhdr = NULL;
>>
>> +	/* Mask top two bits */
>> +	regs->gpr[3] &= ~(0x3UL << 62);
>
> We need to replace this "62" with a shift macro specifying the significance
> of these top two address bits in the real mode.

huh??

(gdb) p/t 0x3ull << 62
$4 = 1100000000000000000000000000000000000000000000000000000000000000

Why do you need an extra comment to explain 62? But yes, we can possibly
write that this is an RTAS bug.

-aneesh
On 07/15/2013 11:36 AM, Aneesh Kumar K.V wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:
>
>> On 07/10/2013 06:32 PM, Mahesh J Salgaonkar wrote:
>>> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>>
>>> During Machine Check interrupt on pseries platform, R3 generally points to
>>> memory region inside RTAS (FWNMI) area. We see r3 corruption because when RTAS
>>> delivers the machine check exception it passes the address inside FWNMI area
>>> with the top most bit set. This patch fixes this issue by masking top two bit
>>> in machine check exception handler.
>>>
>>> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>>> ---
>>>  arch/powerpc/platforms/pseries/ras.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
>>> index 7b3cbde..721c058 100644
>>> --- a/arch/powerpc/platforms/pseries/ras.c
>>> +++ b/arch/powerpc/platforms/pseries/ras.c
>>> @@ -287,6 +287,9 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs)
>>>  	unsigned long *savep;
>>>  	struct rtas_error_log *h, *errhdr = NULL;
>>>
>>> +	/* Mask top two bits */
>>> +	regs->gpr[3] &= ~(0x3UL << 62);
>>
>> We need to replace this "62" with a shift macro specifying the significance
>> of these top two address bits in the real mode.
>
> huh??
>
> (gdb) p/t 0x3ull << 62
> $4 = 1100000000000000000000000000000000000000000000000000000000000000
>
> Why do you need an extra comment to explain 62? But yes, we can possibly
> write that this is an RTAS bug.

The "62" was just to point at the top two address bits in real mode. The
request for an extra comment was to specify what the RTAS behaviour (or bug)
is with respect to those top two bits, and how we are dealing with them in
this fix.
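
For illustration only, the macro-plus-comment idea discussed above could look roughly like the sketch below. The names FWNMI_R3_TOP_BITS_SHIFT, FWNMI_R3_TOP_BITS_MASK and fwnmi_clean_r3() are hypothetical and do not come from the patch or the kernel tree; fixed-width types are used so the sketch builds outside the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical names, for illustration only.
 *
 * Per the thread: RTAS hands over the FWNMI buffer address in r3 with
 * the top-most bit set, which PAPR does not document and which looks
 * like an old RTAS bug. The patch clears the top two bits before the
 * address is validated.
 */
#define FWNMI_R3_TOP_BITS_SHIFT	62
#define FWNMI_R3_TOP_BITS_MASK	(UINT64_C(0x3) << FWNMI_R3_TOP_BITS_SHIFT)

/* Return r3 with the two most significant bits cleared. */
static inline uint64_t fwnmi_clean_r3(uint64_t r3)
{
	return r3 & ~FWNMI_R3_TOP_BITS_MASK;
}
```

Whether a named macro reads better than the literal shift is exactly the point under debate in this thread; the sketch only shows what the alternative would look like.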
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 7b3cbde..721c058 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -287,6 +287,9 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs)
 	unsigned long *savep;
 	struct rtas_error_log *h, *errhdr = NULL;
 
+	/* Mask top two bits */
+	regs->gpr[3] &= ~(0x3UL << 62);
+
 	if (!VALID_FWNMI_BUFFER(regs->gpr[3])) {
 		printk(KERN_ERR "FWNMI: corrupt r3 0x%016lx\n", regs->gpr[3]);
 		return NULL;
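
To see why the unmasked value trips the "corrupt r3" check, here is a small user-space sketch. It assumes a simplified range check: addr_in_buffer() is a hypothetical stand-in for VALID_FWNMI_BUFFER(), and checked_r3() with its base and size parameters is made up for the example, not taken from ras.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel check: is addr in [base, base + size)? */
static bool addr_in_buffer(uint64_t addr, uint64_t base, uint64_t size)
{
	return addr >= base && addr - base < size;
}

/*
 * Mirror the order of operations in the patch: clear the top two bits
 * first, then validate. Returns the cleaned address, or 0 if it still
 * falls outside the buffer.
 */
static uint64_t checked_r3(uint64_t r3, uint64_t base, uint64_t size)
{
	r3 &= ~(UINT64_C(0x3) << 62);	/* same mask as the patch */
	return addr_in_buffer(r3, base, size) ? r3 : 0;
}
```

With the top bit set, the raw value is far above any plausible buffer range, so the range check fails on it; once the two top bits are cleared, the same address passes.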