Message ID | 1361779487-9274-1-git-send-email-peter.crosthwaite@xilinx.com |
---|---|
State | New |
Headers | show |
On 2013-02-25 00:04, Peter Crosthwaite wrote: > commits 49b4c31efcce45ab714f286f14fa5d5173f9069d and > 2de68a4900ef6eb67380b0c128abfe1976bc66e8 reworked the implementation of adc_CC > and sub_CC. The new implementations (on the TCG_TARGET_HAS_add2_i32 code path) > are incorrect. The new logic is: > > CF:NF = 0:A +/- 0:CF > CF:NF = CF:A +/- 0:B > > The lower 32 bits of the intermediate result stored in NF needs to be passes > into the second addition in place of A (s/CF:A/CF:NF): > > CF:NF = 0:A +/- 0:CF > CF:NF = CF:NF +/- 0:B > > This patch fixes the issue. > > Signed-off-by: Peter Crosthwaite<peter.crosthwaite@xilinx.com> > --- > target-arm/translate.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) Reviewed-by: Richard Henderson <rth@twiddle.net> Sorry for the breakage. Blue, please apply asap. r~
On Mon, Feb 25, 2013 at 3:43 PM, Richard Henderson <rth@twiddle.net> wrote: > On 2013-02-25 00:04, Peter Crosthwaite wrote: >> >> commits 49b4c31efcce45ab714f286f14fa5d5173f9069d and >> 2de68a4900ef6eb67380b0c128abfe1976bc66e8 reworked the implementation of >> adc_CC >> and sub_CC. The new implementations (on the TCG_TARGET_HAS_add2_i32 code >> path) >> are incorrect. The new logic is: >> >> CF:NF = 0:A +/- 0:CF >> CF:NF = CF:A +/- 0:B >> >> The lower 32 bits of the intermediate result stored in NF needs to be >> passes >> into the second addition in place of A (s/CF:A/CF:NF): >> >> CF:NF = 0:A +/- 0:CF >> CF:NF = CF:NF +/- 0:B >> >> This patch fixes the issue. >> >> Signed-off-by: Peter Crosthwaite<peter.crosthwaite@xilinx.com> >> --- >> target-arm/translate.c | 4 ++-- >> 1 files changed, 2 insertions(+), 2 deletions(-) > > > Reviewed-by: Richard Henderson <rth@twiddle.net> > > Sorry for the breakage. Blue, please apply asap. I'm afraid this fix is not enough as I still can't get my Linux image to boot after applying it. Running this, my image boots: git checkout 49b4c31efcce45ab714f286f14fa5d5173f9069d target-arm Looking at the new sbc_cc, it seems that if t0=t1 and CF=1, then CF will be cleared while the old code in the helper did set it. Laurent PS: My image is the vexpress found here: http://releases.linaro.org/images/linaro-n/alip/11.08
On 02/25/2013 09:15 AM, Laurent Desnogues wrote: > Looking at the new sbc_cc, it seems that if t0=t1 and CF=1, > then CF will be cleared while the old code in the helper did > set it. Sigh, yes I see it. While the transform x + ~y + cf -> x - y + cf - 1 works for the low 32-bits, it gets the carry wrong: x - x + (1-1) = 0 whereas x + ~x + 1 = 0xffffffff + 1 = 0x1_00000000 I need to just do exactly what the spec says, not "optimize" it. r~
On 25 February 2013 17:15, Laurent Desnogues
<laurent.desnogues@gmail.com> wrote:
> I'm afraid this fix is not enough
I've confirmed with my risu test tool that sbcs/adcs are indeed
broken even with this patch applied. Patterns used:
ADC_imm A1 cond:4 0010101 s:1 rn:4 rd:4 imm:12
ADC_reg A1 cond:4 0000101 s:1 rn:4 rd:4 imm:5 type:2 0 rm:4
ADC_rsr A1 cond:4 0000101 s:1 rn:4 rd:4 rs:4 0 type:2 1 rm:4
SBC_imm A1 cond:4 0010110 s:1 rn:4 rd:4 imm:12
SBC_reg A1 cond:4 0000110 s:1 rn:4 rd:4 imm:5 type:2 0 rm:4
SBC_rsr A1 cond:4 0000110 s:1 rn:4 rd:4 rs:4 0 type:2 1 rm:4
I really must get round to implementing the record-n-replay
functionality so you can run this kind of test without ARM
hardware to hand.
-- PMM
diff --git a/target-arm/translate.c b/target-arm/translate.c index 9993aea..6d91b70 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -428,7 +428,7 @@ static void gen_adc_CC(TCGv dest, TCGv t0, TCGv t1) if (TCG_TARGET_HAS_add2_i32) { tcg_gen_movi_i32(tmp, 0); tcg_gen_add2_i32(cpu_NF, cpu_CF, t0, tmp, cpu_CF, tmp); - tcg_gen_add2_i32(cpu_NF, cpu_CF, t0, cpu_CF, t1, tmp); + tcg_gen_add2_i32(cpu_NF, cpu_CF, cpu_NF, cpu_CF, t1, tmp); } else { TCGv_i64 q0 = tcg_temp_new_i64(); TCGv_i64 q1 = tcg_temp_new_i64(); @@ -472,7 +472,7 @@ static void gen_sbc_CC(TCGv dest, TCGv t0, TCGv t1) if (TCG_TARGET_HAS_add2_i32) { tcg_gen_movi_i32(tmp, 0); tcg_gen_add2_i32(cpu_NF, cpu_CF, t0, tmp, cpu_CF, tmp); - tcg_gen_sub2_i32(cpu_NF, cpu_CF, t0, cpu_CF, t1, tmp); + tcg_gen_sub2_i32(cpu_NF, cpu_CF, cpu_NF, cpu_CF, t1, tmp); } else { TCGv_i64 q0 = tcg_temp_new_i64(); TCGv_i64 q1 = tcg_temp_new_i64();
commits 49b4c31efcce45ab714f286f14fa5d5173f9069d and 2de68a4900ef6eb67380b0c128abfe1976bc66e8 reworked the implementation of adc_CC and sub_CC. The new implementations (on the TCG_TARGET_HAS_add2_i32 code path) are incorrect. The new logic is: CF:NF = 0:A +/- 0:CF CF:NF = CF:A +/- 0:B The lower 32 bits of the intermediate result stored in NF needs to be passes into the second addition in place of A (s/CF:A/CF:NF): CF:NF = 0:A +/- 0:CF CF:NF = CF:NF +/- 0:B This patch fixes the issue. Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com> --- target-arm/translate.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-)