diff mbox series

[v3,1/2] lib: sbi_init: Avoid thundering hurd problem with coldboot_lock

Message ID 20200818040728.528426-1-anup.patel@wdc.com
State Accepted
Headers show
Series [v3,1/2] lib: sbi_init: Avoid thundering hurd problem with coldboot_lock | expand

Commit Message

Anup Patel Aug. 18, 2020, 4:07 a.m. UTC
We can have thundering hurd problem with coldboot_lock where the
boot HART can potentially starve trying to acquire coldboot_lock
because some of the non-boot HARTs are continuously acquiring and
releasing coldboot_lock. This can happen if MIP.MSIP bit is already
set for some of the non-boot HARTs.

To avoid thundering hurd problem for coldboot_lock, we use the
__smp_load_acquire() and __smp_store_release() for coldboot_done
flag and use coldboot_lock only for coldboot_wait_hmask.

Signed-off-by: Anup Patel <anup.patel@wdc.com>
Tested-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
---
 lib/sbi/sbi_init.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

Comments

Bin Meng Aug. 18, 2020, 9:16 a.m. UTC | #1
On Tue, Aug 18, 2020 at 12:09 PM Anup Patel <anup.patel@wdc.com> wrote:
>
> We can have thundering hurd problem with coldboot_lock where the
> boot HART can potentially starve trying to acquire coldboot_lock
> because some of the non-boot HARTs are continuously acquiring and
> releasing coldboot_lock. This can happen if MIP.MSIP bit is already
> set for some of the non-boot HARTs.
>
> To avoid thundering hurd problem for coldboot_lock, we use the
> __smp_load_acquire() and __smp_store_release() for coldboot_done
> flag and use coldboot_lock only for coldboot_wait_hmask.
>
> Signed-off-by: Anup Patel <anup.patel@wdc.com>
> Tested-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
> Reviewed-by: Atish Patra <atish.patra@wdc.com>
> ---
>  lib/sbi/sbi_init.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
>

Reviewed-by: Bin Meng <bin.meng@windriver.com>
Sean Anderson Aug. 19, 2020, 12:03 p.m. UTC | #2
On 8/18/20 12:07 AM, Anup Patel wrote:
> We can have thundering hurd problem with coldboot_lock where the
> boot HART can potentially starve trying to acquire coldboot_lock
> because some of the non-boot HARTs are continuously acquiring and
> releasing coldboot_lock. This can happen if MIP.MSIP bit is already
> set for some of the non-boot HARTs.
> 
> To avoid thundering hurd problem for coldboot_lock, we use the
> __smp_load_acquire() and __smp_store_release() for coldboot_done
> flag and use coldboot_lock only for coldboot_wait_hmask.
> 
> Signed-off-by: Anup Patel <anup.patel@wdc.com>
> Tested-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
> Reviewed-by: Atish Patra <atish.patra@wdc.com>

Just curious, what are the changes from v1 and v2?

--Sean
Heinrich Schuchardt Aug. 19, 2020, 12:08 p.m. UTC | #3
On 19.08.20 14:03, Sean Anderson wrote:
> On 8/18/20 12:07 AM, Anup Patel wrote:
>> We can have thundering hurd problem with coldboot_lock where the
>> boot HART can potentially starve trying to acquire coldboot_lock
>> because some of the non-boot HARTs are continuously acquiring and
>> releasing coldboot_lock. This can happen if MIP.MSIP bit is already
>> set for some of the non-boot HARTs.
>>
>> To avoid thundering hurd problem for coldboot_lock, we use the
>> __smp_load_acquire() and __smp_store_release() for coldboot_done
>> flag and use coldboot_lock only for coldboot_wait_hmask.
>>
>> Signed-off-by: Anup Patel <anup.patel@wdc.com>
>> Tested-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
>> Reviewed-by: Atish Patra <atish.patra@wdc.com>
>
> Just curious, what are the changes from v1 and v2?

Barriers are realized in a different way in patch 1.

Except for the barriers patch 1 follows what I proposed in my original
patch and the following discussion.

And for sure Jessica was right that at least compiler barriers are needed.

Best regards

Heinrich
Anup Patel Aug. 20, 2020, 11:43 a.m. UTC | #4
> -----Original Message-----
> From: Heinrich Schuchardt <xypron.glpk@gmx.de>
> Sent: 19 August 2020 17:39
> To: Sean Anderson <seanga2@gmail.com>; Anup Patel
> <Anup.Patel@wdc.com>; Atish Patra <Atish.Patra@wdc.com>; Alistair
> Francis <Alistair.Francis@wdc.com>
> Cc: opensbi@lists.infradead.org; Anup Patel <anup@brainfault.org>
> Subject: Re: [PATCH v3 1/2] lib: sbi_init: Avoid thundering hurd problem with
> coldboot_lock
> 
> On 19.08.20 14:03, Sean Anderson wrote:
> > On 8/18/20 12:07 AM, Anup Patel wrote:
> >> We can have thundering hurd problem with coldboot_lock where the
> boot
> >> HART can potentially starve trying to acquire coldboot_lock because
> >> some of the non-boot HARTs are continuously acquiring and releasing
> >> coldboot_lock. This can happen if MIP.MSIP bit is already set for
> >> some of the non-boot HARTs.
> >>
> >> To avoid thundering hurd problem for coldboot_lock, we use the
> >> __smp_load_acquire() and __smp_store_release() for coldboot_done
> flag
> >> and use coldboot_lock only for coldboot_wait_hmask.
> >>
> >> Signed-off-by: Anup Patel <anup.patel@wdc.com>
> >> Tested-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
> >> Reviewed-by: Atish Patra <atish.patra@wdc.com>
> >
> > Just curious, what are the changes from v1 and v2?
> 
> Barriers are realized in a different way in patch 1.
> 
> Except for the barriers patch 1 follows what I proposed in my original patch
> and the following discussion.
> 
> And for sure Jessica was right that at least compiler barriers are needed.

Yes, v1->v2 is replacing use of atomic_read/write() with__smp_load/store_xyz().

v2->v3 is only a typo-fix in PATCH1 subject.

Regards,
Anup
Anup Patel Aug. 22, 2020, 3:51 a.m. UTC | #5
> -----Original Message-----
> From: Bin Meng <bmeng.cn@gmail.com>
> Sent: 18 August 2020 14:46
> To: Anup Patel <Anup.Patel@wdc.com>
> Cc: Atish Patra <Atish.Patra@wdc.com>; Alistair Francis
> <Alistair.Francis@wdc.com>; Heinrich Schuchardt <xypron.glpk@gmx.de>;
> OpenSBI <opensbi@lists.infradead.org>; Anup Patel <anup@brainfault.org>
> Subject: Re: [PATCH v3 1/2] lib: sbi_init: Avoid thundering hurd problem with
> coldboot_lock
> 
> On Tue, Aug 18, 2020 at 12:09 PM Anup Patel <anup.patel@wdc.com> wrote:
> >
> > We can have thundering hurd problem with coldboot_lock where the boot
> > HART can potentially starve trying to acquire coldboot_lock because
> > some of the non-boot HARTs are continuously acquiring and releasing
> > coldboot_lock. This can happen if MIP.MSIP bit is already set for some
> > of the non-boot HARTs.
> >
> > To avoid thundering hurd problem for coldboot_lock, we use the
> > __smp_load_acquire() and __smp_store_release() for coldboot_done flag
> > and use coldboot_lock only for coldboot_wait_hmask.
> >
> > Signed-off-by: Anup Patel <anup.patel@wdc.com>
> > Tested-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
> > Reviewed-by: Atish Patra <atish.patra@wdc.com>
> > ---
> >  lib/sbi/sbi_init.c | 20 +++++++++++++-------
> >  1 file changed, 13 insertions(+), 7 deletions(-)
> >
> 
> Reviewed-by: Bin Meng <bin.meng@windriver.com>

Applied this patch to the riscv/opensbi repo

Regards,
Anup
diff mbox series

Patch

diff --git a/lib/sbi/sbi_init.c b/lib/sbi/sbi_init.c
index a7fb848..c438eaa 100644
--- a/lib/sbi/sbi_init.c
+++ b/lib/sbi/sbi_init.c
@@ -9,6 +9,7 @@ 
 
 #include <sbi/riscv_asm.h>
 #include <sbi/riscv_atomic.h>
+#include <sbi/riscv_barrier.h>
 #include <sbi/riscv_locks.h>
 #include <sbi/sbi_console.h>
 #include <sbi/sbi_ecall.h>
@@ -85,9 +86,10 @@  static void sbi_boot_prints(struct sbi_scratch *scratch, u32 hartid)
 }
 
 static spinlock_t coldboot_lock = SPIN_LOCK_INITIALIZER;
-static unsigned long coldboot_done = 0;
 static struct sbi_hartmask coldboot_wait_hmask = { 0 };
 
+static unsigned long coldboot_done;
+
 static void wait_for_coldboot(struct sbi_scratch *scratch, u32 hartid)
 {
 	unsigned long saved_mie, cmip;
@@ -105,16 +107,20 @@  static void wait_for_coldboot(struct sbi_scratch *scratch, u32 hartid)
 	/* Mark current HART as waiting */
 	sbi_hartmask_set_hart(hartid, &coldboot_wait_hmask);
 
+	/* Release coldboot lock */
+	spin_unlock(&coldboot_lock);
+
 	/* Wait for coldboot to finish using WFI */
-	while (!coldboot_done) {
-		spin_unlock(&coldboot_lock);
+	while (!__smp_load_acquire(&coldboot_done)) {
 		do {
 			wfi();
 			cmip = csr_read(CSR_MIP);
 		 } while (!(cmip & MIP_MSIP));
-		spin_lock(&coldboot_lock);
 	};
 
+	/* Acquire coldboot lock */
+	spin_lock(&coldboot_lock);
+
 	/* Unmark current HART as waiting */
 	sbi_hartmask_clear_hart(hartid, &coldboot_wait_hmask);
 
@@ -132,12 +138,12 @@  static void wake_coldboot_harts(struct sbi_scratch *scratch, u32 hartid)
 {
 	const struct sbi_platform *plat = sbi_platform_ptr(scratch);
 
+	/* Mark coldboot done */
+	__smp_store_release(&coldboot_done, 1);
+
 	/* Acquire coldboot lock */
 	spin_lock(&coldboot_lock);
 
-	/* Mark coldboot done */
-	coldboot_done = 1;
-
 	/* Send an IPI to all HARTs waiting for coldboot */
 	for (int i = 0; i <= sbi_scratch_last_hartid(); i++) {
 		if ((i != hartid) &&