Message ID | 20180326150233.23089-1-npiggin@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 741de617661794246f84a21a02fc5e327bffc9ad |
Headers | show |
Series | powerpc/powernv/nvram: opal_nvram_write handle unknown OPAL errors | expand |
On 03/26/2018 08:32 PM, Nicholas Piggin wrote: > opal_nvram_write currently just assumes success if it encounters an > error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO > on other errors instead. > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> > --- > arch/powerpc/platforms/powernv/opal-nvram.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/arch/powerpc/platforms/powernv/opal-nvram.c b/arch/powerpc/platforms/powernv/opal-nvram.c > index 9db4398ded5d..13bf625dc3e8 100644 > --- a/arch/powerpc/platforms/powernv/opal-nvram.c > +++ b/arch/powerpc/platforms/powernv/opal-nvram.c > @@ -59,6 +59,8 @@ static ssize_t opal_nvram_write(char *buf, size_t count, loff_t *index) > if (rc == OPAL_BUSY_EVENT) > opal_poll_events(NULL); Current code does continuous poller here. May be we have small breathing time here. What you say? > } > + if (rc) > + return -EIO; Good catch. Thanks! Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> -Vasant
On Tue, 27 Mar 2018 12:47:31 +0530 Vasant Hegde <hegdevasant@linux.vnet.ibm.com> wrote: > On 03/26/2018 08:32 PM, Nicholas Piggin wrote: > > opal_nvram_write currently just assumes success if it encounters an > > error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO > > on other errors instead. > > > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> > > --- > > arch/powerpc/platforms/powernv/opal-nvram.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/arch/powerpc/platforms/powernv/opal-nvram.c b/arch/powerpc/platforms/powernv/opal-nvram.c > > index 9db4398ded5d..13bf625dc3e8 100644 > > --- a/arch/powerpc/platforms/powernv/opal-nvram.c > > +++ b/arch/powerpc/platforms/powernv/opal-nvram.c > > @@ -59,6 +59,8 @@ static ssize_t opal_nvram_write(char *buf, size_t count, loff_t *index) > > if (rc == OPAL_BUSY_EVENT) > > opal_poll_events(NULL); > > Current code does continuous poller here. May be we have small breathing time > here. What you say? Yeah that's probably not a bad idea. I cc'ed skiboot list -- what's a reasonable delay between retries? Linux has a bunch of similar kind of loops if you grep for opal_poll_events and OPAL_BUSY. It would be good to convert them all to a standard form with a standard delay as recommended by OPAL, and where specific calls have different delay for a good reason, that would be documented in the OPAL API docs. Thanks, Nick
Nicholas Piggin <npiggin@gmail.com> writes: > opal_nvram_write currently just assumes success if it encounters an > error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO > on other errors instead. Does that ever happen with current skiboot? Even if it doesn't I think I'm inclined to tag this for stable. cheers > diff --git a/arch/powerpc/platforms/powernv/opal-nvram.c b/arch/powerpc/platforms/powernv/opal-nvram.c > index 9db4398ded5d..13bf625dc3e8 100644 > --- a/arch/powerpc/platforms/powernv/opal-nvram.c > +++ b/arch/powerpc/platforms/powernv/opal-nvram.c > @@ -59,6 +59,8 @@ static ssize_t opal_nvram_write(char *buf, size_t count, loff_t *index) > if (rc == OPAL_BUSY_EVENT) > opal_poll_events(NULL); > } > + if (rc) > + return -EIO; > *index += count; > return count; > } > -- > 2.16.1
On Tue, 27 Mar 2018 23:13:00 +1100 Michael Ellerman <mpe@ellerman.id.au> wrote: > Nicholas Piggin <npiggin@gmail.com> writes: > > > opal_nvram_write currently just assumes success if it encounters an > > error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO > > on other errors instead. > > Does that ever happen with current skiboot? I can now even using the mambo fake flash driver that never returns failure, because skiboot will return OPAL_INTERNAL_ERROR if we try to re-enter it. So I hit it when testing sreset-in-opal cases (the crash path wants to write something to nvram). Not sure about the skiboot flash layer. Aside from programming errors, it looks like perhaps ECC and BMC failure or unresponsive could cause errors to come back here. > Even if it doesn't I think I'm inclined to tag this for stable. It's turning some relatively minor types of errors into a system hang, so it seems like it could go in stable. I've hit the -EIO case in these basic tests and it hasn't had obvious bugs. Thanks, Nick
unscribed me On Tuesday, March 27, 2018 05:34:30 AM PDT, Michael Ellerman <mpe@ellerman.id.au> wrote: Nicholas Piggin <npiggin@gmail.com> writes: > opal_nvram_write currently just assumes success if it encounters an > error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO > on other errors instead. Does that ever happen with current skiboot? Even if it doesn't I think I'm inclined to tag this for stable. cheers > diff --git a/arch/powerpc/platforms/powernv/opal-nvram.c b/arch/powerpc/platforms/powernv/opal-nvram.c > index 9db4398ded5d..13bf625dc3e8 100644 > --- a/arch/powerpc/platforms/powernv/opal-nvram.c > +++ b/arch/powerpc/platforms/powernv/opal-nvram.c > @@ -59,6 +59,8 @@ static ssize_t opal_nvram_write(char *buf, size_t count, loff_t *index) > if (rc == OPAL_BUSY_EVENT) > opal_poll_events(NULL); > } > + if (rc) > + return -EIO; > *index += count; > return count; > } > -- > 2.16.1 <html><head></head><body><div style="font-family:Helvetica Neue, Helvetica, Arial, sans-serif;font-size:13px;"><div></div> <div>unscribed me <br></div><div><br></div> <div class="yahoo_quoted" id="yahoo_quoted_2267153160"> <div style="font-family:'Helvetica Neue', Helvetica, Arial, sans-serif;font-size:13px;color:#26282a;"> <div> On Tuesday, March 27, 2018 05:34:30 AM PDT, Michael Ellerman <mpe@ellerman.id.au> wrote: </div> <div><br></div> <div><br></div> <div><div dir="ltr">Nicholas Piggin <<a href="mailto:npiggin@gmail.com" ymailto="mailto:npiggin@gmail.com">npiggin@gmail.com</a>> writes:<br></div><div dir="ltr"><br></div><div dir="ltr">> opal_nvram_write currently just assumes success if it encounters an<br></div><div dir="ltr">> error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO<br></div><div dir="ltr">> on other errors instead.<br></div><div dir="ltr"><br></div><div dir="ltr">Does that ever happen with current skiboot?<br></div><div dir="ltr"><br></div><div dir="ltr">Even if it doesn't I think I'm inclined to tag this for stable.<br></div><div dir="ltr"><br></div><div dir="ltr">cheers<br></div><div dir="ltr"><br></div><div dir="ltr">> diff --git a/arch/powerpc/platforms/powernv/opal-nvram.c b/arch/powerpc/platforms/powernv/opal-nvram.c<br></div><div dir="ltr">> index 9db4398ded5d..13bf625dc3e8 100644<br></div><div dir="ltr">> --- a/arch/powerpc/platforms/powernv/opal-nvram.c<br></div><div dir="ltr">> +++ b/arch/powerpc/platforms/powernv/opal-nvram.c<br></div><div dir="ltr">> @@ -59,6 +59,8 @@ static ssize_t opal_nvram_write(char *buf, size_t count, loff_t *index)<br></div><div dir="ltr">> if (rc == OPAL_BUSY_EVENT)<br></div><div dir="ltr">> opal_poll_events(NULL);<br></div><div dir="ltr">> }<br></div><div dir="ltr">> + if (rc)<br></div><div dir="ltr">> + return -EIO;<br></div><div dir="ltr">> *index += count;<br></div><div dir="ltr">> return count;<br></div><div dir="ltr">> }<br></div><div dir="ltr">> -- <br></div><div dir="ltr">> 2.16.1<br></div></div> </div> </div></div></body></html>
On 03/27/2018 01:08 PM, Nicholas Piggin wrote: > On Tue, 27 Mar 2018 12:47:31 +0530 > Vasant Hegde <hegdevasant@linux.vnet.ibm.com> wrote: > >> On 03/26/2018 08:32 PM, Nicholas Piggin wrote: >>> opal_nvram_write currently just assumes success if it encounters an >>> error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO >>> on other errors instead. >>> >>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> >>> --- >>> arch/powerpc/platforms/powernv/opal-nvram.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/arch/powerpc/platforms/powernv/opal-nvram.c b/arch/powerpc/platforms/powernv/opal-nvram.c >>> index 9db4398ded5d..13bf625dc3e8 100644 >>> --- a/arch/powerpc/platforms/powernv/opal-nvram.c >>> +++ b/arch/powerpc/platforms/powernv/opal-nvram.c >>> @@ -59,6 +59,8 @@ static ssize_t opal_nvram_write(char *buf, size_t count, loff_t *index) >>> if (rc == OPAL_BUSY_EVENT) >>> opal_poll_events(NULL); >> >> Current code does continuous poller here. May be we have small breathing time >> here. What you say? > > Yeah that's probably not a bad idea. I cc'ed skiboot list -- what's a > reasonable delay between retries? I think it depends on individual API. Like in case of dump retrival I've 20ms delay... as FSP takes time to copy data to host memory. But may be here we can have smaller delay. > Linux has a bunch of similar kind of > loops if you grep for opal_poll_events and OPAL_BUSY. It would be good > to convert them all to a standard form with a standard delay as > recommended by OPAL, and where specific calls have different delay > for a good reason, that would be documented in the OPAL API docs. Yes. We should update API documentation. -Vasant
Nicholas Piggin <npiggin@gmail.com> writes: > opal_nvram_write currently just assumes success if it encounters an > error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO > on other errors instead. > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> > --- > arch/powerpc/platforms/powernv/opal-nvram.c | 2 ++ > 1 file changed, 2 insertions(+) Acked-by: Stewart Smith <stewart@linux.ibm.com>
On Mon, 2018-03-26 at 15:02:33 UTC, Nicholas Piggin wrote: > opal_nvram_write currently just assumes success if it encounters an > error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO > on other errors instead. > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> > Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> > Acked-by: Stewart Smith <stewart@linux.ibm.com> Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/741de617661794246f84a21a02fc5e cheers
diff --git a/arch/powerpc/platforms/powernv/opal-nvram.c b/arch/powerpc/platforms/powernv/opal-nvram.c index 9db4398ded5d..13bf625dc3e8 100644 --- a/arch/powerpc/platforms/powernv/opal-nvram.c +++ b/arch/powerpc/platforms/powernv/opal-nvram.c @@ -59,6 +59,8 @@ static ssize_t opal_nvram_write(char *buf, size_t count, loff_t *index) if (rc == OPAL_BUSY_EVENT) opal_poll_events(NULL); } + if (rc) + return -EIO; *index += count; return count; }
opal_nvram_write currently just assumes success if it encounters an error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO on other errors instead. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> --- arch/powerpc/platforms/powernv/opal-nvram.c | 2 ++ 1 file changed, 2 insertions(+)