Message ID | 2fbcf76e4ff63d8527edd3662342948276e2cd37.1502474835.git.alistair.francis@xilinx.com |
---|---|
State | New |
Headers | show |
On 08/11/2017 11:19 AM, Alistair Francis wrote: > The exclusive store operation should return 0 if the operation updates > memory and 1 if it doesn't. This means that storing tmp in the rd > register is incorrect. I'm confused as to what you believe is wrong. > tcg_gen_atomic_cmpxchg_i64(tmp, addr, val, tmp, > get_mem_index(s), > - size | MO_ALIGN | s->be_data); > - tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, val); Yes, we load the result of the cmpxchg into tmp, but then we immediately overwrite tmp with 0/1 depending on whether the operation succeeded. > > This patch updates the succesful opertion to store 0 into the rd > register instead of tmp. It also adds a branch to fail if the memory > isn't updated. Since we use setcond, a branch is not required. > + tcg_gen_ext_i64(val, val, memop); What is this addition intended to accomplish? Because of the position within the code, you know that memop contains MO_64, so that this is a no-op. r~
On Fri, Aug 11, 2017 at 12:46 PM, Richard Henderson <richard.henderson@linaro.org> wrote: > On 08/11/2017 11:19 AM, Alistair Francis wrote: >> The exclusive store operation should return 0 if the operation updates >> memory and 1 if it doesn't. This means that storing tmp in the rd >> register is incorrect. > > I'm confused as to what you believe is wrong. > >> tcg_gen_atomic_cmpxchg_i64(tmp, addr, val, tmp, >> get_mem_index(s), >> - size | MO_ALIGN | s->be_data); >> - tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, val); > > Yes, we load the result of the cmpxchg into tmp, but then we immediately > overwrite tmp with 0/1 depending on whether the operation succeeded. Hmmm... When I looked at the values in tmp I was seeing non 0/1 values in there. > >> >> This patch updates the succesful opertion to store 0 into the rd >> register instead of tmp. It also adds a branch to fail if the memory >> isn't updated. > > Since we use setcond, a branch is not required. > >> + tcg_gen_ext_i64(val, val, memop); > > What is this addition intended to accomplish? Because of the position within > the code, you know that memop contains MO_64, so that this is a no-op. This is when size == 2 so it's a 32bit operation so memop contains MO_32. Thanks, Alistair > > > r~
On 08/11/2017 01:13 PM, Alistair Francis wrote: >>> + tcg_gen_ext_i64(val, val, memop); >> >> What is this addition intended to accomplish? Because of the position within >> the code, you know that memop contains MO_64, so that this is a no-op. > > This is when size == 2 so it's a 32bit operation so memop contains MO_32. It's a paired 32-bit operation, so we're operating on a 64-bit quantity. So extending from 32-bits would be actively wrong. r~
On Fri, Aug 11, 2017 at 1:24 PM, Richard Henderson <richard.henderson@linaro.org> wrote: > On 08/11/2017 01:13 PM, Alistair Francis wrote: >>>> + tcg_gen_ext_i64(val, val, memop); >>> >>> What is this addition intended to accomplish? Because of the position within >>> the code, you know that memop contains MO_64, so that this is a no-op. >> >> This is when size == 2 so it's a 32bit operation so memop contains MO_32. > > It's a paired 32-bit operation, so we're operating on a 64-bit quantity. So > extending from 32-bits would be actively wrong. From what I can see though, the 32bit memop is carried into the tcg_gen_atomic_cmpxchg_i64() call so the value returned to tmp is masked by the 32bit operation. Is passing down MO_32 into tcg_gen_atomic_cmpxchg_i64() wrong then as it ends up as a 64-bit operation? My TCG knowledge is pretty limited here. Thanks, Alistair > > > r~ >
On 08/11/2017 01:29 PM, Alistair Francis wrote: > On Fri, Aug 11, 2017 at 1:24 PM, Richard Henderson > <richard.henderson@linaro.org> wrote: >> On 08/11/2017 01:13 PM, Alistair Francis wrote: >>>>> + tcg_gen_ext_i64(val, val, memop); >>>> >>>> What is this addition intended to accomplish? Because of the position within >>>> the code, you know that memop contains MO_64, so that this is a no-op. >>> >>> This is when size == 2 so it's a 32bit operation so memop contains MO_32. >> >> It's a paired 32-bit operation, so we're operating on a 64-bit quantity. So >> extending from 32-bits would be actively wrong. > > From what I can see though, the 32bit memop is carried into the > tcg_gen_atomic_cmpxchg_i64() call so the value returned to tmp is > masked by the 32bit operation. > > Is passing down MO_32 into tcg_gen_atomic_cmpxchg_i64() wrong then as > it ends up as a 64-bit operation? If we're passing MO_32 down to cmpxchg_i64 for this case, you have indeed found a bug. I'll investigate this further on Monday. r~
On Fri, Aug 11, 2017 at 1:38 PM, Richard Henderson <richard.henderson@linaro.org> wrote: > On 08/11/2017 01:29 PM, Alistair Francis wrote: >> On Fri, Aug 11, 2017 at 1:24 PM, Richard Henderson >> <richard.henderson@linaro.org> wrote: >>> On 08/11/2017 01:13 PM, Alistair Francis wrote: >>>>>> + tcg_gen_ext_i64(val, val, memop); >>>>> >>>>> What is this addition intended to accomplish? Because of the position within >>>>> the code, you know that memop contains MO_64, so that this is a no-op. >>>> >>>> This is when size == 2 so it's a 32bit operation so memop contains MO_32. >>> >>> It's a paired 32-bit operation, so we're operating on a 64-bit quantity. So >>> extending from 32-bits would be actively wrong. >> >> From what I can see though, the 32bit memop is carried into the >> tcg_gen_atomic_cmpxchg_i64() call so the value returned to tmp is >> masked by the 32bit operation. >> >> Is passing down MO_32 into tcg_gen_atomic_cmpxchg_i64() wrong then as >> it ends up as a 64-bit operation? > > If we're passing MO_32 down to cmpxchg_i64 for this case, you have indeed found > a bug. I'll investigate this further on Monday. Maybe that is why I'm seeing these failures. I'll have a look as well to see if this fixes my problems. Thanks, Alistair > > > r~ >
On Fri, Aug 11, 2017 at 1:39 PM, Alistair Francis <alistair.francis@xilinx.com> wrote: > On Fri, Aug 11, 2017 at 1:38 PM, Richard Henderson > <richard.henderson@linaro.org> wrote: >> On 08/11/2017 01:29 PM, Alistair Francis wrote: >>> On Fri, Aug 11, 2017 at 1:24 PM, Richard Henderson >>> <richard.henderson@linaro.org> wrote: >>>> On 08/11/2017 01:13 PM, Alistair Francis wrote: >>>>>>> + tcg_gen_ext_i64(val, val, memop); >>>>>> >>>>>> What is this addition intended to accomplish? Because of the position within >>>>>> the code, you know that memop contains MO_64, so that this is a no-op. >>>>> >>>>> This is when size == 2 so it's a 32bit operation so memop contains MO_32. >>>> >>>> It's a paired 32-bit operation, so we're operating on a 64-bit quantity. So >>>> extending from 32-bits would be actively wrong. >>> >>> From what I can see though, the 32bit memop is carried into the >>> tcg_gen_atomic_cmpxchg_i64() call so the value returned to tmp is >>> masked by the 32bit operation. >>> >>> Is passing down MO_32 into tcg_gen_atomic_cmpxchg_i64() wrong then as >>> it ends up as a 64-bit operation? >> >> If we're passing MO_32 down to cmpxchg_i64 for this case, you have indeed found >> a bug. I'll investigate this further on Monday. > > Maybe that is why I'm seeing these failures. I'll have a look as well > to see if this fixes my problems. That's it. That wrong mask was causing all my breakages. I'll send out a new series, thanks for your help. Thanks, Alistair > > Thanks, > Alistair > >> >> >> r~ >>
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 245175e2f1..ea7c61bc6f 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -1894,10 +1894,11 @@ static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2, * } * env->exclusive_addr = -1; */ + TCGMemOp memop = size | MO_ALIGN | s->be_data; TCGLabel *fail_label = gen_new_label(); TCGLabel *done_label = gen_new_label(); TCGv_i64 addr = tcg_temp_local_new_i64(); - TCGv_i64 tmp; + TCGv_i64 tmp, val; /* Copy input into a local temp so it is not trashed when the * basic block ends at the branch insn. @@ -1907,15 +1908,15 @@ static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2, tmp = tcg_temp_new_i64(); if (is_pair) { + val = tcg_temp_new_i64(); if (size == 2) { - TCGv_i64 val = tcg_temp_new_i64(); tcg_gen_concat32_i64(tmp, cpu_reg(s, rt), cpu_reg(s, rt2)); tcg_gen_concat32_i64(val, cpu_exclusive_val, cpu_exclusive_high); tcg_gen_atomic_cmpxchg_i64(tmp, addr, val, tmp, get_mem_index(s), - size | MO_ALIGN | s->be_data); - tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, val); - tcg_temp_free_i64(val); + memop); + tcg_gen_ext_i64(val, val, memop); + tcg_gen_brcond_i64(TCG_COND_NE, tmp, val, fail_label); } else if (s->be_data == MO_LE) { gen_helper_paired_cmpxchg64_le(tmp, cpu_env, addr, cpu_reg(s, rt), cpu_reg(s, rt2)); @@ -1924,22 +1925,23 @@ static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2, cpu_reg(s, rt2)); } } else { - TCGv_i64 val = cpu_reg(s, rt); + val = cpu_reg(s, rt); tcg_gen_atomic_cmpxchg_i64(tmp, addr, cpu_exclusive_val, val, get_mem_index(s), - size | MO_ALIGN | s->be_data); - tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, cpu_exclusive_val); + memop); + tcg_gen_brcond_i64(TCG_COND_NE, tmp, cpu_exclusive_val, fail_label); } tcg_temp_free_i64(addr); - tcg_gen_mov_i64(cpu_reg(s, rd), tmp); - tcg_temp_free_i64(tmp); + tcg_gen_movi_i64(cpu_reg(s, rd), 0); tcg_gen_br(done_label); gen_set_label(fail_label); tcg_gen_movi_i64(cpu_reg(s, rd), 1); gen_set_label(done_label); + tcg_temp_free_i64(tmp); + tcg_temp_free_i64(val); tcg_gen_movi_i64(cpu_exclusive_addr, -1); }
The exclusive store operation should return 0 if the operation updates memory and 1 if it doesn't. This means that storing tmp in the rd register is incorrect. This patch updates the succesful opertion to store 0 into the rd register instead of tmp. It also adds a branch to fail if the memory isn't updated. In order to add a branch for the pair case when size equals 2 we first need to apply the same memory operation on the exclusive value in order for the comparison to work. There is still no value checks added if we are doing a 64-bit store with pairs. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> --- This was caught with an internal fuzzy tester. These patches fix the Xilinx 2.10-rc2 tree. I tested with the fuzzy tester (single CPU) and Linux boot (4 CPUs) on the Xilinx tree. I don't have a good test case to run on mainline at the moment, but I'm working on getting one. Also linux-user is fully untested. All tests were with MTTCG enabled. target/arm/translate-a64.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)