Message ID | 20150309062651.2067.6529.stgit@localhost.localdomain |
---|---|
State | Accepted |
Headers | show |
* Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [2015-03-09 11:56:53]: > Commit 4db0c1e4f introduced occ_load_req queue. With that changes we queue > the occ load request if hostservice LID load is not complete. And we have > callback function (occ_poke_load_queue)...which takes care of calling > __occ_do_load(). > > But current code proceeds and calls __occ_do_load() after queueing....which > is not correct. So just return if we queue the occ load request. > > Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> > cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> > --- > hw/occ.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/hw/occ.c b/hw/occ.c > index cc2d4b7..34d6de5 100644 > --- a/hw/occ.c > +++ b/hw/occ.c > @@ -458,8 +458,10 @@ static void occ_do_load(u8 scope, u32 dbob_id __unused, u32 seq_id) > * Check if hostservices lid caching is complete. If not, queue > * the load request. > */ > - if (!hservices_lid_preload_complete()) > + if (!hservices_lid_preload_complete()) { > occ_queue_load(scope, dbob_id, seq_id); > + return; > + } We should not proceed to call host services until the lid is preloaded. > > __occ_do_load(scope, dbob_id, seq_id); This will get called from occ_poke_load_queue() after lid preload is complete. In addition to this fix, hservice_lid_load() if (list_empty(&hbrt_lid_list)) /* Should not happen */ hservices_lid_preload(); <<, Change to abort() If we come back via hostservices and find lid not preloaded, then we abort. The preload should happen first and also be preserved for any runtime reload. --Vaidy
On Mon, Mar 09, 2015 at 11:56:53AM +0530, Vasant Hegde wrote: > Commit 4db0c1e4f introduced occ_load_req queue. With that changes we queue > the occ load request if hostservice LID load is not complete. And we have > callback function (occ_poke_load_queue)...which takes care of calling > __occ_do_load(). > > But current code proceeds and calls __occ_do_load() after queueing....which > is not correct. So just return if we queue the occ load request. > > Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> > cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> OOPS! Good catch... Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> > --- > hw/occ.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/hw/occ.c b/hw/occ.c > index cc2d4b7..34d6de5 100644 > --- a/hw/occ.c > +++ b/hw/occ.c > @@ -458,8 +458,10 @@ static void occ_do_load(u8 scope, u32 dbob_id __unused, u32 seq_id) > * Check if hostservices lid caching is complete. If not, queue > * the load request. > */ > - if (!hservices_lid_preload_complete()) > + if (!hservices_lid_preload_complete()) { > occ_queue_load(scope, dbob_id, seq_id); > + return; > + } > > __occ_do_load(scope, dbob_id, seq_id); > }
On 03/09/2015 12:09 PM, Vaidyanathan Srinivasan wrote: > * Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [2015-03-09 11:56:53]: > .../... >> >> __occ_do_load(scope, dbob_id, seq_id); > > This will get called from occ_poke_load_queue() after lid preload is > complete. > > In addition to this fix, > > hservice_lid_load() > if (list_empty(&hbrt_lid_list)) /* Should not happen */ > hservices_lid_preload(); <<, Change to abort() Vaidy, So you don't want to try loading LID again here? Also better return error instead of abort? -Vasant > > If we come back via hostservices and find lid not preloaded, then we > abort. The preload should happen first and also be preserved for any > runtime reload. > > --Vaidy >
* Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [2015-03-09 13:17:54]: > On 03/09/2015 12:09 PM, Vaidyanathan Srinivasan wrote: > > * Vasant Hegde <hegdevasant@linux.vnet.ibm.com> [2015-03-09 11:56:53]: > > > > .../... > > > >> > >> __occ_do_load(scope, dbob_id, seq_id); > > > > This will get called from occ_poke_load_queue() after lid preload is > > complete. > > > > In addition to this fix, > > > > hservice_lid_load() > > if (list_empty(&hbrt_lid_list)) /* Should not happen */ > > hservices_lid_preload(); <<, Change to abort() > > Vaidy, > > So you don't want to try loading LID again here? No, this situation should never happen since you ensure that the lid would have been preloaded before calling __occ_do_load(). > Also better return error instead of abort? No, this will return error into host services and that will be hard to debug. If pre-load failed despite attempts to queue and callback host service after preload completes, then our code is broken and we need to abort and fix it. If we continue to load lid at this time, then we are back to square-one where other load request for capi or initrd can come inbetween and get stuck on running poller with lock held. --Vaidy
diff --git a/hw/occ.c b/hw/occ.c index cc2d4b7..34d6de5 100644 --- a/hw/occ.c +++ b/hw/occ.c @@ -458,8 +458,10 @@ static void occ_do_load(u8 scope, u32 dbob_id __unused, u32 seq_id) * Check if hostservices lid caching is complete. If not, queue * the load request. */ - if (!hservices_lid_preload_complete()) + if (!hservices_lid_preload_complete()) { occ_queue_load(scope, dbob_id, seq_id); + return; + } __occ_do_load(scope, dbob_id, seq_id); }
Commit 4db0c1e4f introduced occ_load_req queue. With that changes we queue the occ load request if hostservice LID load is not complete. And we have callback function (occ_poke_load_queue)...which takes care of calling __occ_do_load(). But current code proceeds and calls __occ_do_load() after queueing....which is not correct. So just return if we queue the occ load request. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> --- hw/occ.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)