Message ID: 1472797976-24210-4-git-send-email-nikunj@linux.vnet.ibm.com
State: New
On Fri, Sep 02, 2016 at 12:02:55PM +0530, Nikunj A Dadhania wrote:
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

This really needs a comment indicating that this implementation isn't
strictly correct (although probably good enough in practice).

Specifically a racing store which happens to store the same value
which was already in memory should clobber the reservation, but won't
with this implementation.

I had a long discussion at KVM Forum with Emilio Costa about this, in
which I discovered just how hard it is to strictly implement
store-conditional semantics in terms of anything else.  So, this is
probably a reasonable substitute, but we should note the fact that
it's not 100%.

> ---
>  target-ppc/translate.c | 24 +++++++++++++++++++++---
>  1 file changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/target-ppc/translate.c b/target-ppc/translate.c
> index 4a882b3..447c13e 100644
> --- a/target-ppc/translate.c
> +++ b/target-ppc/translate.c
> @@ -72,6 +72,7 @@ static TCGv cpu_cfar;
>  #endif
>  static TCGv cpu_xer, cpu_so, cpu_ov, cpu_ca;
>  static TCGv cpu_reserve;
> +static TCGv cpu_reserve_val;
>  static TCGv cpu_fpscr;
>  static TCGv_i32 cpu_access_type;
>
> @@ -176,6 +177,9 @@ void ppc_translate_init(void)
>      cpu_reserve = tcg_global_mem_new(cpu_env,
>                                       offsetof(CPUPPCState, reserve_addr),
>                                       "reserve_addr");
> +    cpu_reserve_val = tcg_global_mem_new(cpu_env,
> +                                         offsetof(CPUPPCState, reserve_val),
> +                                         "reserve_val");
>
>      cpu_fpscr = tcg_global_mem_new(cpu_env,
>                                     offsetof(CPUPPCState, fpscr), "fpscr");
> @@ -3086,7 +3090,7 @@ static void gen_##name(DisasContext *ctx)                            \
>      }                                                                \
>      tcg_gen_qemu_ld_tl(gpr, t0, ctx->mem_idx, memop);                \
>      tcg_gen_mov_tl(cpu_reserve, t0);                                 \
> -    tcg_gen_st_tl(gpr, cpu_env, offsetof(CPUPPCState, reserve_val)); \
> +    tcg_gen_mov_tl(cpu_reserve_val, gpr);                            \
>      tcg_temp_free(t0);                                               \
>  }
>
> @@ -3112,14 +3116,28 @@ static void gen_conditional_store(DisasContext *ctx, TCGv EA,
>                                    int reg, int memop)
>  {
>      TCGLabel *l1;
> +    TCGv_i32 tmp = tcg_temp_local_new_i32();
> +    TCGv t0;
>
> +    tcg_gen_movi_i32(tmp, 0);
>      tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
>      l1 = gen_new_label();
>      tcg_gen_brcond_tl(TCG_COND_NE, EA, cpu_reserve, l1);
> -    tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 1 << CRF_EQ);
> -    tcg_gen_qemu_st_tl(cpu_gpr[reg], EA, ctx->mem_idx, memop);
> +
> +    t0 = tcg_temp_new();
> +    tcg_gen_atomic_cmpxchg_tl(t0, EA, cpu_reserve_val, cpu_gpr[reg],
> +                              ctx->mem_idx, DEF_MEMOP(memop));
> +    tcg_gen_setcond_tl(TCG_COND_EQ, t0, t0, cpu_reserve_val);
> +    tcg_gen_trunc_tl_i32(tmp, t0);
> +
>      gen_set_label(l1);
> +    tcg_gen_shli_i32(tmp, tmp, CRF_EQ);
> +    tcg_gen_or_i32(cpu_crf[0], cpu_crf[0], tmp);
>      tcg_gen_movi_tl(cpu_reserve, -1);
> +    tcg_gen_movi_tl(cpu_reserve_val, 0);
> +
> +    tcg_temp_free(t0);
> +    tcg_temp_free_i32(tmp);
>  }
>  #endif
>
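David's objection can be reproduced outside QEMU. The following is a hypothetical sketch in plain C11 (not the patch's code; all names here are illustrative) of a reservation emulated the same way: remember the value observed at load-reserve, then implement the conditional store as a compare-and-swap against that value. A racing store of the *same* value leaves the compare-and-swap able to succeed, where a real larx/stcx. reservation would have been lost — the classic ABA weakness.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical emulation state, in the style of the patch: the
 * "reservation" records the address and the value seen at larx time. */
static _Atomic uint32_t *reserve_addr;
static uint32_t reserve_val;

uint32_t load_reserved(_Atomic uint32_t *ea)
{
    reserve_addr = ea;
    reserve_val = atomic_load(ea);   /* remember the observed value */
    return reserve_val;
}

/* Returns 1 on success (CR0.EQ would be set), 0 on failure. */
int store_conditional(_Atomic uint32_t *ea, uint32_t new_val)
{
    uint32_t expected;
    int ok;

    if (ea != reserve_addr) {
        return 0;                    /* address mismatch: stcx. fails */
    }
    expected = reserve_val;
    /* Succeeds whenever *ea still equals the remembered value -- even
     * if another thread stored that same value in the meantime, which
     * a real reservation would have clobbered. */
    ok = atomic_compare_exchange_strong(ea, &expected, new_val);
    reserve_addr = NULL;             /* reservation consumed either way */
    return ok;
}
```

Under this sketch, a thread that stores the old value back between the load-reserved and the store-conditional goes unnoticed, which is exactly the "not 100%" caveat being discussed.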
David Gibson <david@gibson.dropbear.id.au> writes:

> [ Unknown signature status ]
> On Fri, Sep 02, 2016 at 12:02:55PM +0530, Nikunj A Dadhania wrote:
>> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>
> This really needs a comment indicating that this implementation isn't
> strictly correct (although probably good enough in practice).

Sure. And it also does not help if someone uses any store other than
store conditional, that isn't taken care.

Assumption here is the locking primitives use load with reservation and
store conditional. And no other ld/st variant touch this memory.

> Specifically a racing store which happens to store the same value
> which was already in memory should clobber the reservation, but won't
> with this implementation.
>
> I had a long discussion at KVM Forum with Emilio Costa about this, in
> which I discovered just how hard it is to strictly implement
> store-conditional semantics in terms of anything else.  So, this is
> probably a reasonable substitute, but we should note the fact that
> it's not 100%.

I will update the commit log.

Regards,
Nikunj
On Wed, 2016-09-07 at 10:17 +0530, Nikunj A Dadhania wrote:
> David Gibson <david@gibson.dropbear.id.au> writes:
>
> > [ Unknown signature status ]
> > On Fri, Sep 02, 2016 at 12:02:55PM +0530, Nikunj A Dadhania wrote:
> > > Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> >
> > This really needs a comment indicating that this implementation isn't
> > strictly correct (although probably good enough in practice).
>
> Sure. And it also does not help if someone uses any store other than
> store conditional, that isn't taken care.
>
> Assumption here is the locking primitives use load with reservation and
> store conditional. And no other ld/st variant touch this memory.

This is an incorrect assumption. spin_unlock for example uses a normal
store.

That being said, you will observe the difference in value which should
hopefully make things work...

I *hope* we don't have anything that relies on a normal store of the same
value as the atomic breaking the reservation, I *think* we might get away
with it, but it is indeed fishy.

> > Specifically a racing store which happens to store the same value
> > which was already in memory should clobber the reservation, but won't
> > with this implementation.
> >
> > I had a long discussion at KVM Forum with Emilio Costa about this, in
> > which I discovered just how hard it is to strictly implement
> > store-conditional semantics in terms of anything else.  So, this is
> > probably a reasonable substitute, but we should note the fact that
> > it's not 100%.
>
> I will update the commit log.
>
> Regards,
> Nikunj
On Wed, Sep 07, 2016 at 10:17:42AM +0530, Nikunj A Dadhania wrote:
> David Gibson <david@gibson.dropbear.id.au> writes:
>
> > [ Unknown signature status ]
> > On Fri, Sep 02, 2016 at 12:02:55PM +0530, Nikunj A Dadhania wrote:
> >> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> >
> > This really needs a comment indicating that this implementation isn't
> > strictly correct (although probably good enough in practice).
>
> Sure. And it also does not help if someone uses any store other than
> store conditional, that isn't taken care.
>
> Assumption here is the locking primitives use load with reservation and
> store conditional. And no other ld/st variant touch this memory.

So, a) I don't think this really relies on that: an ordinary store
(assuming it changes the value) will still get picked up by the cmpxchg.
Well.. at least after a suitable memory barrier.  Matching memory
models between emulated and host cpus is a whole other kettle of fish.

I think this does matter, IIRC a kernel spin unlock on ppc is a
barrier + plain store, no load locked or store conditional.

> > Specifically a racing store which happens to store the same value
> > which was already in memory should clobber the reservation, but won't
> > with this implementation.
> >
> > I had a long discussion at KVM Forum with Emilio Costa about this, in
> > which I discovered just how hard it is to strictly implement
> > store-conditional semantics in terms of anything else.  So, this is
> > probably a reasonable substitute, but we should note the fact that
> > it's not 100%.
>
> I will update the commit log.
>
> Regards,
> Nikunj
David Gibson <david@gibson.dropbear.id.au> writes:

> On Wed, Sep 07, 2016 at 10:17:42AM +0530, Nikunj A Dadhania wrote:
>> David Gibson <david@gibson.dropbear.id.au> writes:
>>
>> > [ Unknown signature status ]
>> > On Fri, Sep 02, 2016 at 12:02:55PM +0530, Nikunj A Dadhania wrote:
>> >> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>> >
>> > This really needs a comment indicating that this implementation isn't
>> > strictly correct (although probably good enough in practice).
>>
>> Sure. And it also does not help if someone uses any store other than
>> store conditional, that isn't taken care.
>>
>> Assumption here is the locking primitives use load with reservation and
>> store conditional. And no other ld/st variant touch this memory.
>
> So, a) I don't think this really relies on that: an ordinary store
> (assuming it changes the value) will still get picked up by the cmpxchg.
> Well.. at least after a suitable memory barrier.  Matching memory
> models between emulated and host cpus is a whole other kettle of fish.

Have you seen Pranith's memory barrier patches?

> I think this does matter, IIRC a kernel spin unlock on ppc is a
> barrier + plain store, no load locked or store conditional.
>
>> > Specifically a racing store which happens to store the same value
>> > which was already in memory should clobber the reservation, but won't
>> > with this implementation.
>> >
>> > I had a long discussion at KVM Forum with Emilio Costa about this, in
>> > which I discovered just how hard it is to strictly implement
>> > store-conditional semantics in terms of anything else.  So, this is
>> > probably a reasonable substitute, but we should note the fact that
>> > it's not 100%.
>>
>> I will update the commit log.
>>
>> Regards,
>> Nikunj

--
Alex Bennée
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Wed, 2016-09-07 at 10:17 +0530, Nikunj A Dadhania wrote:
>> David Gibson <david@gibson.dropbear.id.au> writes:
>>
>> > [ Unknown signature status ]
>> > On Fri, Sep 02, 2016 at 12:02:55PM +0530, Nikunj A Dadhania wrote:
>> > > Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>> >
>> > This really needs a comment indicating that this implementation isn't
>> > strictly correct (although probably good enough in practice).
>>
>> Sure. And it also does not help if someone uses any store other than
>> store conditional, that isn't taken care.
>>
>> Assumption here is the locking primitives use load with reservation and
>> store conditional. And no other ld/st variant touch this memory.
>
> This is an incorrect assumption. spin_unlock for example uses a normal
> store.
>
> That being said, you will observe the difference in value which should
> hopefully make things work...
>
> I *hope* we don't have anything that relies on a normal store of the same
> value as the atomic breaking the reservation, I *think* we might get away
> with it, but it is indeed fishy.

In arch/powerpc/include/asm/spinlock.h, we have __arch_spin_trylock(),
which uses the lwarx and "stwcx." instructions to acquire the lock. And
later, during the unlock, a normal store will do. IMHO, that is the
reason for it to work.

Regards,
Nikunj
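The pattern Nikunj describes can be sketched in plain C11 (illustrative only, not the kernel's actual arch/powerpc code): the lock is acquired with an atomic read-modify-write, standing in for the lwarx/stwcx. loop, while unlock is just a release barrier followed by an ordinary store — no store-conditional involved, which is why the same-value caveat matters for the emulation.

```c
#include <stdatomic.h>

/* Simplified C11 analogue of a powerpc-style spinlock: acquire via an
 * atomic compare-and-swap (the role lwarx/stwcx. play in
 * __arch_spin_trylock), release via a plain store with release
 * semantics (barrier + normal store in the kernel). */
typedef struct {
    _Atomic unsigned int slock;      /* 0 = free, 1 = held */
} arch_spinlock_t;

static int arch_spin_trylock(arch_spinlock_t *lock)
{
    unsigned int expected = 0;
    /* Atomic 0 -> 1 transition; acquire ordering on success. */
    return atomic_compare_exchange_strong_explicit(
        &lock->slock, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

static void arch_spin_unlock(arch_spinlock_t *lock)
{
    /* Release barrier + ordinary store: no ll/sc on the unlock path. */
    atomic_store_explicit(&lock->slock, 0, memory_order_release);
}
```

Because the unlock path writes a *different* value (0 over 1), the cmpxchg-based store-conditional emulation still observes the change — which is Ben's "observe the difference in value" point above.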
On Wed, Sep 07, 2016 at 08:13:31AM +0100, Alex Bennée wrote:
>
> David Gibson <david@gibson.dropbear.id.au> writes:
>
> > On Wed, Sep 07, 2016 at 10:17:42AM +0530, Nikunj A Dadhania wrote:
> >> David Gibson <david@gibson.dropbear.id.au> writes:
> >>
> >> > [ Unknown signature status ]
> >> > On Fri, Sep 02, 2016 at 12:02:55PM +0530, Nikunj A Dadhania wrote:
> >> >> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> >> >
> >> > This really needs a comment indicating that this implementation isn't
> >> > strictly correct (although probably good enough in practice).
> >>
> >> Sure. And it also does not help if someone uses any store other than
> >> store conditional, that isn't taken care.
> >>
> >> Assumption here is the locking primitives use load with reservation and
> >> store conditional. And no other ld/st variant touch this memory.
> >
> > So, a) I don't think this really relies on that: an ordinary store
> > (assuming it changes the value) will still get picked up by the cmpxchg.
> > Well.. at least after a suitable memory barrier.  Matching memory
> > models between emulated and host cpus is a whole other kettle of fish.
>
> Have you seen Pranith's memory barrier patches?

I have not.

> > I think this does matter, IIRC a kernel spin unlock on ppc is a
> > barrier + plain store, no load locked or store conditional.
> >
> >> > Specifically a racing store which happens to store the same value
> >> > which was already in memory should clobber the reservation, but won't
> >> > with this implementation.
> >> >
> >> > I had a long discussion at KVM Forum with Emilio Costa about this, in
> >> > which I discovered just how hard it is to strictly implement
> >> > store-conditional semantics in terms of anything else.  So, this is
> >> > probably a reasonable substitute, but we should note the fact that
> >> > it's not 100%.
> >>
> >> I will update the commit log.
> >>
> >> Regards,
> >> Nikunj
David Gibson <david@gibson.dropbear.id.au> writes:

> On Wed, Sep 07, 2016 at 08:13:31AM +0100, Alex Bennée wrote:
>>
>> David Gibson <david@gibson.dropbear.id.au> writes:
>>
>> > On Wed, Sep 07, 2016 at 10:17:42AM +0530, Nikunj A Dadhania wrote:
>> >> David Gibson <david@gibson.dropbear.id.au> writes:
>> >>
>> >> > [ Unknown signature status ]
>> >> > On Fri, Sep 02, 2016 at 12:02:55PM +0530, Nikunj A Dadhania wrote:
>> >> >> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
>> >> >
>> >> > This really needs a comment indicating that this implementation isn't
>> >> > strictly correct (although probably good enough in practice).
>> >>
>> >> Sure. And it also does not help if someone uses any store other than
>> >> store conditional, that isn't taken care.
>> >>
>> >> Assumption here is the locking primitives use load with reservation and
>> >> store conditional. And no other ld/st variant touch this memory.
>> >
>> > So, a) I don't think this really relies on that: an ordinary store
>> > (assuming it changes the value) will still get picked up by the cmpxchg.
>> > Well.. at least after a suitable memory barrier.  Matching memory
>> > models between emulated and host cpus is a whole other kettle of fish.
>>
>> Have you seen Pranith's memory barrier patches?
>
> I have not.

They are now in Richard's tcg-next queue:

  Message-Id: <1473282648-23487-1-git-send-email-rth@twiddle.net>
  Subject: [Qemu-devel] [PULL 00/18] tcg queued patches

All the backends support the new fence op; so far only ARM, Alpha and
x86 emit the fence TCGOps, as these are best added by someone who
understands the frontend well.

>> >
>> > I think this does matter, IIRC a kernel spin unlock on ppc is a
>> > barrier + plain store, no load locked or store conditional.
>> >
>> >> > Specifically a racing store which happens to store the same value
>> >> > which was already in memory should clobber the reservation, but won't
>> >> > with this implementation.
>> >> >
>> >> > I had a long discussion at KVM Forum with Emilio Costa about this, in
>> >> > which I discovered just how hard it is to strictly implement
>> >> > store-conditional semantics in terms of anything else.  So, this is
>> >> > probably a reasonable substitute, but we should note the fact that
>> >> > it's not 100%.
>> >>
>> >> I will update the commit log.
>> >>
>> >> Regards,
>> >> Nikunj

--
Alex Bennée
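The fence TCGOp Alex mentions models guest memory barriers on the host; at the C level the analogous primitive is atomic_thread_fence. A minimal standalone sketch (plain C11, not QEMU code; the publish/consume names are illustrative) of the ordering a release/acquire fence pair provides — the same guarantee a ppc barrier-before-plain-store unlock path depends on:

```c
#include <stdatomic.h>

static _Atomic int data;
static _Atomic int flag;

/* Publisher: the release fence orders the data store before the flag
 * store, the job a guest barrier (lowered to a TCG fence op) does. */
void publish(int value)
{
    atomic_store_explicit(&data, value, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&flag, 1, memory_order_relaxed);
}

/* Consumer: if the flag is observed set, the acquire fence guarantees
 * the data written before the publisher's fence is visible too.
 * Returns 1 and fills *out when the flag was seen, 0 otherwise. */
int consume(int *out)
{
    if (atomic_load_explicit(&flag, memory_order_relaxed)) {
        atomic_thread_fence(memory_order_acquire);
        *out = atomic_load_explicit(&data, memory_order_relaxed);
        return 1;
    }
    return 0;
}
```

Without the fences, nothing stops either the compiler or the host CPU from reordering the two relaxed accesses — which is exactly the emulated-vs-host memory-model gap David calls "a whole other kettle of fish."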
On Mon, 2016-09-12 at 09:39 +0100, Alex Bennée wrote:
>
> They are now in Richard's tcg-next queue
>
> Message-Id: <1473282648-23487-1-git-send-email-rth@twiddle.net>
> Subject: [Qemu-devel] [PULL 00/18] tcg queued patches
>
> All the backends support the new fence op, so far only ARM, Alpha and
> x86 emit the fence TCGOps as these are best added by someone who
> understands the frontend well.

Hrm, I should probably have a look ;-) A bit swamped this week, I'll
see what I can do.

Cheers,
Ben.

> > > > I think this does matter, IIRC a kernel spin unlock on ppc is a
> > > > barrier + plain store, no load locked or store conditional.
> > > >
> > > > > > Specifically a racing store which happens to store the same
> > > > > > value which was already in memory should clobber the
> > > > > > reservation, but won't with this implementation.
> > > > > >
> > > > > > I had a long discussion at KVM Forum with Emilio Costa
> > > > > > about this, in which I discovered just how hard it is to
> > > > > > strictly implement store-conditional semantics in terms of
> > > > > > anything else. So, this is probably a reasonable
> > > > > > substitute, but we should note the fact that it's not 100%.
> > > > >
> > > > > I will update the commit log.
> > > > >
> > > > > Regards,
> > > > > Nikunj
>
> --
> Alex Bennée
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 4a882b3..447c13e 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -72,6 +72,7 @@ static TCGv cpu_cfar;
 #endif
 static TCGv cpu_xer, cpu_so, cpu_ov, cpu_ca;
 static TCGv cpu_reserve;
+static TCGv cpu_reserve_val;
 static TCGv cpu_fpscr;
 static TCGv_i32 cpu_access_type;
 
@@ -176,6 +177,9 @@ void ppc_translate_init(void)
     cpu_reserve = tcg_global_mem_new(cpu_env,
                                      offsetof(CPUPPCState, reserve_addr),
                                      "reserve_addr");
+    cpu_reserve_val = tcg_global_mem_new(cpu_env,
+                                         offsetof(CPUPPCState, reserve_val),
+                                         "reserve_val");
 
     cpu_fpscr = tcg_global_mem_new(cpu_env,
                                    offsetof(CPUPPCState, fpscr), "fpscr");
@@ -3086,7 +3090,7 @@ static void gen_##name(DisasContext *ctx)                            \
     }                                                                \
     tcg_gen_qemu_ld_tl(gpr, t0, ctx->mem_idx, memop);                \
     tcg_gen_mov_tl(cpu_reserve, t0);                                 \
-    tcg_gen_st_tl(gpr, cpu_env, offsetof(CPUPPCState, reserve_val)); \
+    tcg_gen_mov_tl(cpu_reserve_val, gpr);                            \
     tcg_temp_free(t0);                                               \
 }
 
@@ -3112,14 +3116,28 @@ static void gen_conditional_store(DisasContext *ctx, TCGv EA,
                                   int reg, int memop)
 {
     TCGLabel *l1;
+    TCGv_i32 tmp = tcg_temp_local_new_i32();
+    TCGv t0;
 
+    tcg_gen_movi_i32(tmp, 0);
     tcg_gen_trunc_tl_i32(cpu_crf[0], cpu_so);
     l1 = gen_new_label();
     tcg_gen_brcond_tl(TCG_COND_NE, EA, cpu_reserve, l1);
-    tcg_gen_ori_i32(cpu_crf[0], cpu_crf[0], 1 << CRF_EQ);
-    tcg_gen_qemu_st_tl(cpu_gpr[reg], EA, ctx->mem_idx, memop);
+
+    t0 = tcg_temp_new();
+    tcg_gen_atomic_cmpxchg_tl(t0, EA, cpu_reserve_val, cpu_gpr[reg],
+                              ctx->mem_idx, DEF_MEMOP(memop));
+    tcg_gen_setcond_tl(TCG_COND_EQ, t0, t0, cpu_reserve_val);
+    tcg_gen_trunc_tl_i32(tmp, t0);
+
     gen_set_label(l1);
+    tcg_gen_shli_i32(tmp, tmp, CRF_EQ);
+    tcg_gen_or_i32(cpu_crf[0], cpu_crf[0], tmp);
     tcg_gen_movi_tl(cpu_reserve, -1);
+    tcg_gen_movi_tl(cpu_reserve_val, 0);
+
+    tcg_temp_free(t0);
+    tcg_temp_free_i32(tmp);
 }
 #endif
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
 target-ppc/translate.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)