diff mbox

OCC: Do not call occ_do_load if hostservice LID load is not complete

Message ID 20150309062651.2067.6529.stgit@localhost.localdomain
State Accepted
Headers show

Commit Message

Vasant Hegde March 9, 2015, 6:26 a.m. UTC
Commit 4db0c1e4f introduced occ_load_req queue. With that changes we queue
the occ load request if hostservice LID load is not complete. And we have
callback function (occ_poke_load_queue)...which takes care of calling
__occ_do_load().

But current code proceeds and calls __occ_do_load() after queueing....which
is not correct. So just return if we queue the occ load request.

Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
---
 hw/occ.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Vaidyanathan Srinivasan March 9, 2015, 6:39 a.m. UTC | #1
* Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [2015-03-09 11:56:53]:

> Commit 4db0c1e4f introduced occ_load_req queue. With that changes we queue
> the occ load request if hostservice LID load is not complete. And we have
> callback function (occ_poke_load_queue)...which takes care of calling
> __occ_do_load().
> 
> But current code proceeds and calls __occ_do_load() after queueing....which
> is not correct. So just return if we queue the occ load request.
> 
> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
> cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>

Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>


> ---
>  hw/occ.c |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/occ.c b/hw/occ.c
> index cc2d4b7..34d6de5 100644
> --- a/hw/occ.c
> +++ b/hw/occ.c
> @@ -458,8 +458,10 @@ static void occ_do_load(u8 scope, u32 dbob_id __unused, u32 seq_id)
>  	 * Check if hostservices lid caching is complete. If not, queue
>  	 * the load request.
>  	 */
> -	if (!hservices_lid_preload_complete())
> +	if (!hservices_lid_preload_complete()) {
>  		occ_queue_load(scope, dbob_id, seq_id);
> +		return;
> +	}

We should not proceed to call host services until the lid is
preloaded.

>  
>  	__occ_do_load(scope, dbob_id, seq_id);

This will get called from occ_poke_load_queue() after lid preload is
complete.

In addition to this fix, 

hservice_lid_load()
        if (list_empty(&hbrt_lid_list))	/* Should not happen */
		hservices_lid_preload(); <<, Change to abort()

If we come back via hostservices and find lid not preloaded, then we
abort. The preload should happen first and also be preserved for any
runtime reload.

--Vaidy
Ananth N Mavinakayanahalli March 9, 2015, 6:53 a.m. UTC | #2
On Mon, Mar 09, 2015 at 11:56:53AM +0530, Vasant Hegde wrote:
> Commit 4db0c1e4f introduced occ_load_req queue. With that changes we queue
> the occ load request if hostservice LID load is not complete. And we have
> callback function (occ_poke_load_queue)...which takes care of calling
> __occ_do_load().
> 
> But current code proceeds and calls __occ_do_load() after queueing....which
> is not correct. So just return if we queue the occ load request.
> 
> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
> cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>

OOPS! Good catch...

Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>

> ---
>  hw/occ.c |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/occ.c b/hw/occ.c
> index cc2d4b7..34d6de5 100644
> --- a/hw/occ.c
> +++ b/hw/occ.c
> @@ -458,8 +458,10 @@ static void occ_do_load(u8 scope, u32 dbob_id __unused, u32 seq_id)
>  	 * Check if hostservices lid caching is complete. If not, queue
>  	 * the load request.
>  	 */
> -	if (!hservices_lid_preload_complete())
> +	if (!hservices_lid_preload_complete()) {
>  		occ_queue_load(scope, dbob_id, seq_id);
> +		return;
> +	}
> 
>  	__occ_do_load(scope, dbob_id, seq_id);
>  }
Vasant Hegde March 9, 2015, 7:47 a.m. UTC | #3
On 03/09/2015 12:09 PM, Vaidyanathan Srinivasan wrote:
> * Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [2015-03-09 11:56:53]:
> 

.../...


>>  
>>  	__occ_do_load(scope, dbob_id, seq_id);
> 
> This will get called from occ_poke_load_queue() after lid preload is
> complete.
> 
> In addition to this fix, 
> 
> hservice_lid_load()
>         if (list_empty(&hbrt_lid_list))	/* Should not happen */
> 		hservices_lid_preload(); <<, Change to abort()

Vaidy,

So you don't want to try loading LID again here?

Also better return error instead of abort?

-Vasant

> 
> If we come back via hostservices and find lid not preloaded, then we
> abort. The preload should happen first and also be preserved for any
> runtime reload.
> 
> --Vaidy
>
Vaidyanathan Srinivasan March 9, 2015, 8:12 a.m. UTC | #4
* Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [2015-03-09 13:17:54]:

> On 03/09/2015 12:09 PM, Vaidyanathan Srinivasan wrote:
> > * Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [2015-03-09 11:56:53]:
> > 
> 
> .../...
> 
> 
> >>  
> >>  	__occ_do_load(scope, dbob_id, seq_id);
> > 
> > This will get called from occ_poke_load_queue() after lid preload is
> > complete.
> > 
> > In addition to this fix, 
> > 
> > hservice_lid_load()
> >         if (list_empty(&hbrt_lid_list))	/* Should not happen */
> > 		hservices_lid_preload(); <<, Change to abort()
> 
> Vaidy,
> 
> So you don't want to try loading LID again here?

No, this situation should never happen since you ensure that the lid
would have been preloaded before calling __occ_do_load().
 
> Also better return error instead of abort?

No, this will return error into host services and that will be hard to
debug.

If pre-load failed despite attempts to queue and callback host service
after preload completes, then our code is broken and we need to abort
and fix it.

If we continue to load lid at this time, then we are back to
square-one where other load request for capi or initrd can come
inbetween and get stuck on running poller with lock held.

--Vaidy
diff mbox

Patch

diff --git a/hw/occ.c b/hw/occ.c
index cc2d4b7..34d6de5 100644
--- a/hw/occ.c
+++ b/hw/occ.c
@@ -458,8 +458,10 @@  static void occ_do_load(u8 scope, u32 dbob_id __unused, u32 seq_id)
 	 * Check if hostservices lid caching is complete. If not, queue
 	 * the load request.
 	 */
-	if (!hservices_lid_preload_complete())
+	if (!hservices_lid_preload_complete()) {
 		occ_queue_load(scope, dbob_id, seq_id);
+		return;
+	}
 
 	__occ_do_load(scope, dbob_id, seq_id);
 }